Note: this article assumes you’re somewhat familiar with the idea of Test-Driven Development.
At a minimum, automated tests improve the quality of your code by revealing some of its defects. If one of your tests fails, in theory this points to a defect in your code. You make a fix, the test passes, and the quality of your software has improved by some small amount as a result.
Another way to think about this is that the tests apply evolutionary selection pressure to your code. Your software needs to continually adapt to the harsh and changing conditions imposed by your test suite. Versions of the code that don’t pass the selection criteria don’t survive (read: make it into production).
There’s something missing from this picture, though. So far, the selection pressure only applies in one direction: from the tests onto the production code. What about the tests themselves? Chances are, they have defects of their own, just like any other code. They may also leave big gaps in their coverage of the business requirements. What, if anything, keeps the tests up to scratch?
If tests are actually an important tool for maintaining code quality, then this is an important question to get right. Low-quality tests can’t be expected to bring about higher quality software. In order to extract the most value out of automated tests, we need a way to keep them up to a high standard.
What could provide this corrective feedback? You could write tests for your original tests. But this quickly leads to an infinite regress. Now you need tests for those tests, and tests for those tests, and so on, for all eternity.
What if the production code itself could somehow apply selection pressure back onto the tests? What if you could set up an adversarial process, where the tests force the production code to improve and the production code, in turn, forces the tests to improve? This avoids the infinite regress problem.
It turns out this kind of thing is built into the TDD process. Here are the three laws of TDD:
1. You must write a failing test before you write any production code.
2. You must not write more of a test than is sufficient to fail, or fail to compile.
3. You must not write more production code than is sufficient to make the currently failing test pass (emphasis mine).
It’s the third law that applies selection pressure back onto the tests. By writing only the bare minimum of code needed to make a test pass, you’re forced to write another test to show that your code is actually half-baked. You then write just enough production code to address the newly failing test, and so on. It’s a positive feedback loop.
You end up jumping between two roles that are pitted against each other: the laziest developer on the planet and a test engineer who is constantly trying to show the developer up with failing tests.
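As a rough illustration of that back-and-forth, here is a minimal Python sketch. The leap-year problem, the `is_leap_year_v*` names and the `first_failure` helper are all assumptions made up for this example, not part of any library. Each version is one plausible "lazy" response to the tests written so far; a lazier developer could do even less.

```python
# Hypothetical example: four rounds of the lazy developer vs. the tester.

def is_leap_year_v1(year):
    # Round 1: the only test says 2024 is a leap year, so always say yes.
    return True

def is_leap_year_v2(year):
    # Round 2: a test for 2023 fails, so add the least logic that fixes it.
    return year % 4 == 0

def is_leap_year_v3(year):
    # Round 3: a test for 1900 fails, so handle century years.
    return year % 4 == 0 and year % 100 != 0

def is_leap_year_v4(year):
    # Round 4: a test for 2000 fails, so handle years divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# Each test is an (input, expected) pair, in the order the tester wrote them.
TESTS = [(2024, True), (2023, False), (1900, False), (2000, True)]

def first_failure(impl, tests):
    """Return the first input the implementation gets wrong, or None."""
    for year, expected in tests:
        if impl(year) != expected:
            return year
    return None

if __name__ == "__main__":
    versions = (is_leap_year_v1, is_leap_year_v2, is_leap_year_v3, is_leap_year_v4)
    for impl in versions:
        print(impl.__name__, "first failing test:", first_failure(impl, TESTS))
```

Each version survives only until the tester adds the next case. The final version is also the first one that needed every test in the list in order to be written.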
Another benefit of being lazy is that it produces lean code. At some point, there are no more tests to write; you’ve implemented the complete specification as it’s currently understood. When this happens, you will often find that you’ve written far less code than expected. This is a win because, all else being equal, less code is easier to understand.
Reading about this is one thing, but it needs to be tried out to really grasp its benefits. It turns out there is an exercise/game called Evil Coder that was created to practise this part of TDD. You pair up with another developer, with one person writing tests and the other taking the evil coder role:
Evil mute A/B pairing: Pairs are not allowed to talk. One person writes tests. The other person is a “lazy evil” programmer who writes the minimal code to pass the tests (but the code doesn’t need to actually implement the specification).
You can try this out by heading along to the next Global Day of Coderetreat event in your city. They are a lot of fun.
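To make the evil coder role a little more concrete, here is a small Python sketch. The FizzBuzz problem and the names `evil_fizzbuzz`, `honest_fizzbuzz` and `passes` are assumptions chosen for illustration. The evil version passes the initial tests by hard-coding exactly the cases they mention, which pushes the tester to add a case it did not anticipate.

```python
# Hypothetical example: an "evil" implementation that games the tests.

def evil_fizzbuzz(n):
    # Hard-codes only the inputs the current tests mention.
    return {3: "Fizz", 5: "Buzz", 15: "FizzBuzz"}.get(n, str(n))

def honest_fizzbuzz(n):
    # Implements the actual rule rather than memorising answers.
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

# The tester's first attempt: (input, expected) pairs.
INITIAL_TESTS = [(1, "1"), (3, "Fizz"), (5, "Buzz"), (15, "FizzBuzz")]

# The tester's response once the evil code passes: cases it never saw.
STRONGER_TESTS = INITIAL_TESTS + [(6, "Fizz"), (10, "Buzz"), (30, "FizzBuzz")]

def passes(impl, tests):
    """True if the implementation returns the expected value for every test."""
    return all(impl(n) == expected for n, expected in tests)

if __name__ == "__main__":
    print("evil vs initial tests:   ", passes(evil_fizzbuzz, INITIAL_TESTS))
    print("evil vs stronger tests:  ", passes(evil_fizzbuzz, STRONGER_TESTS))
    print("honest vs stronger tests:", passes(honest_fizzbuzz, STRONGER_TESTS))
```

The evil code passing the initial tests says more about the tests than about the code: they never pinned down the behaviour the specification actually asks for.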
TL;DR: Improve your tests and your production code as a result, by being lazy and evil.
Thanks to Ali and Xiao for proofreading and providing feedback on a draft of this essay.