2008/05/18

CI and NCoverage

On our current project the continuous integration process demands that the code base has a minimum test coverage of 85%. If this condition is not met – guess what – the build fails. Initially we started with a limit of 95%. We all knew that this was not quite realistic, but we thought there could be no harm in starting with a noble attitude. As expected, the effective coverage of the code slowly dropped towards the configured limit.
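The gate itself does not need to be more than a small check over the coverage report. Here is a minimal sketch in C#, assuming the classic NCover report format (one seqpnt element with a visitcount attribute per sequence point); the file name and the wiring into the build are simplified:

    // CoverageGate.cs - fails the build (exit code 1) when coverage
    // drops below the agreed limit. Assumes the classic NCover
    // coverage.xml format: one <seqpnt visitcount="..."/> element
    // per sequence point.
    using System;
    using System.Linq;
    using System.Xml.Linq;

    class CoverageGate
    {
        static int Main(string[] args)
        {
            const double limit = 85.0;            // the agreed minimum
            var report = XDocument.Load(args[0]); // e.g. coverage.xml

            var points = report.Descendants("seqpnt").ToList();
            if (points.Count == 0)
                return 1;                         // no data counts as a failure

            int visited = points.Count(p => (int)p.Attribute("visitcount") > 0);
            double coverage = 100.0 * visited / points.Count;

            Console.WriteLine("Coverage: {0:F1}% (limit: {1}%)", coverage, limit);
            return coverage >= limit ? 0 : 1;     // non-zero breaks the build
        }
    }

The CI server simply runs such a check after the tests and treats a non-zero exit code as a failed build.

Up to this point I am still satisfied with the result, but some issues came up during the daily stand-up meetings: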

Who’s in charge of fixing the build if the limit is not met and the build fails?

From my point of view this is a no-brainer. The person who actually checks in is responsible. But one member of my team had a different view. After some days off the project he joined us again, created a new class, checked it in, and the build failed. From his point of view the coverage before his check-in was already insufficient, because there should always be enough reserve for an empty class with a constructor to be checked in without the build failing. So he suggested we should all take some time to collectively increase the overall coverage.
I replied that if you consistently do test-first, there should not even be such a class without a corresponding test. Unfortunately, up to this point not every member of our team has reached the point where he consistently does test-first. Also, we usually did no coverage analysis on our local machines before checking in. This means that if the person who checks the code in is not a test hardliner, the build might fail. It further means that the closer you are to the breaking limit of the test coverage, the more often your build will fail, in proportion to the number of members in your team who do not have enough experience in testing. This is really not a good option, because a failing build does not only mean that subsequent check-ins will also fail if the build has not been fixed in the meantime; it also means more turbulence in the team.
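To make the test-first point concrete: even a brand-new, still empty class has a natural first test. A sketch with NUnit (the class name is made up for illustration):

    using NUnit.Framework;

    // The very first, trivial test: the class exists and can be
    // constructed. It also covers the (so far empty) constructor,
    // so checking the class in does not eat into the coverage reserve.
    [TestFixture]
    public class OrderValidatorTest
    {
        [Test]
        public void CanBeCreated()
        {
            Assert.IsNotNull(new OrderValidator());
        }
    }

    public class OrderValidator
    {
        // still empty; behaviour and further tests follow test-first
    }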

For me this means that even if I manage to convince the team that whoever checks in a failing build is in charge of fixing it, I have to live with the fact that some flow gets lost whenever we are near the limit.

Do we set the limit low enough to leave room for the code which the team agrees does not have to be tested (like declarative code, where a test would only duplicate the declaration)? Or do we tag that code with the corresponding attributes so it is ignored in the analysis, and are therefore able to set the limit to a higher percentage?

From my point of view I do not like my code to be cluttered with ignore attributes just to be able to meet a self-imposed limit in our build process. But there were other opinions. One team member suggested that if we all agree that some code does not need to be tested, why should we not tag it with such an attribute. I think this is a legitimate argument, although I would not like the additional typing if we could avoid it simply by lowering the coverage limit accordingly. After all, this is why we are configuring such a limit in the first place: because there is code which we all agree cannot be efficiently or meaningfully tested.
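For completeness, the attribute approach would not need more than a marker class. A sketch, assuming NCover's exclusion-attribute switch (//ea) is available in the version used; the attribute and class names are our own invention:

    using System;

    // A plain marker attribute; the name is entirely our own choice.
    [AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
    public class CoverageExcludeAttribute : Attribute
    {
    }

    // Declarative code the team agrees not to test gets tagged explicitly:
    [CoverageExclude]
    public class PersonMapping
    {
        // purely declarative mapping; a test would only duplicate it
    }

NCover would then be started with the attribute's full name, something like NCover.Console.exe MyTests.exe //ea MyProject.CoverageExcludeAttribute (the exact switch syntax depends on the NCover version, so check its documentation).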

Bottom line: we are still gathering experience in how to properly handle the additional metric of coverage analysis in our development process.

3 comments:

Anonymous said...

Often people do not think about the long term of a software project.
I agree that testing a new empty class is not a very challenging task, but we might think about the following points:
1.) There will be more complicated code in the future. So please prepare a TestFixture for it and don't be lazy and leave that task to another colleague in your team! In this case, e.g. calling an empty method is as easy as having NCover ignore that piece of code.
2.) Writing a test for an empty method is not wasted time. Hopefully this method will one day do something meaningful, and then you have already written the test body.
3.) Lowering the acceptance level cannot be a solution, as we can't foresee the amount of untestable (mostly declarative) code in the long term. I think lowering the acceptance level leads to testable code being left untested without justification until the coverage drops to the lowered level. I think it's better when code is explicitly excluded from the coverage analysis (e.g. by attributes). In this case a team member states to the other team members: 'I think this code is not worth testing.'

Conclusion: The rule that the one who breaks the build has to fix it applies without exception. Anything else is just not fair to the other team members.

Christoph Hilty said...

I'm glad that we largely agree. But maybe I did not make myself clear enough on one point. I am not saying 'Let us lower the coverage limit, we have so much code which cannot be tested!'. All I'm saying is that up to now there's no need to exclude anything from the analysis, because that's exactly why we did not specify a limit of 100% in the first place. If the declarative code really becomes so frequent that the limit cannot be maintained for this reason, then I totally agree with you about excluding that code from the analysis. Apart from that, I'm assuming that every piece of code which is not tested is communicating 'This code is not worth testing', whether there's such an attribute or not. ;) In dubio pro reo.

Anonymous said...

Hey - I am certainly glad to find this. cool job!