As the tests get more specific, the code gets more generic.
Posted by Uncle Bob on 08/06/2009
I tweeted this not too long ago. The basic idea is that as you add tests, the tests get more and more specific. This makes sense since tests are, after all, specifications. The more specifications you have, the more specific the whole body of specifications becomes.
As a general rule, good design dictates that the more specific your requirements become, the more general your code needs to be. This says roughly the same thing as Greenspun’s Tenth Rule of Programming: “Any sufficiently complicated [...] program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.” Or rather, as more and more constraints pile up on a program, the designers look for ways to push those constraints out of the program itself and into the data.
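Here is a small, contrived sketch of that move (the shipping zones and rates are invented purely for illustration): instead of adding a new branch of code for every new rule, the rules become entries in a table.

    import java.util.HashMap;
    import java.util.Map;

    public class ShippingRates {
        // The constraints live in data: adding a zone means adding a row, not a branch.
        private static final Map<String, Double> RATE_PER_KG = new HashMap<String, Double>();
        static {
            RATE_PER_KG.put("domestic", 1.50);
            RATE_PER_KG.put("europe",   4.00);
            RATE_PER_KG.put("overseas", 7.25);
        }

        public static double costFor(String zone, double kilograms) {
            return RATE_PER_KG.get(zone) * kilograms;
        }
    }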
In response to that tweet, people asked for examples.
One of the better examples (though perhaps a bit trivial) is the Prime Factors Kata. This lovely little experiment grows and grows as you add test cases, and then suddenly collapses into an elegant three-line algorithm.
The tests continue to become ever more specific. The production code starts out just as specific as the tests. But with the second or third test the programmer must make a decision: he can write the production code to mirror the tests (i.e. as an if/else chain that detects which test is running and supplies the expected answer), or he can come up with some kind of more general algorithm that satisfies the tests without looking anything like them.
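The first choice looks something like this – a hypothetical snapshot of the kata after the first few tests, where the production code is nothing more than the test cases echoed back:

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.List;

    public class PrimeFactors {
        // Production code that merely mirrors the first few tests:
        // each new test case would force yet another branch.
        public static List<Integer> of(int n) {
            if (n == 2) return Arrays.asList(2);
            if (n == 3) return Arrays.asList(3);
            if (n == 4) return Arrays.asList(2, 2);
            return Collections.emptyList();
        }
    }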
The algorithm grows and warps and twists; and then, just when it looks like it’s destined to become a wretched mess, it simply evaporates into a lovely little three-line nested loop.
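Here is a sketch of the generic algorithm the kata typically collapses into (this particular rendering of it is mine): three lines of logic that satisfy every test, yet look nothing like any individual test case.

    import java.util.ArrayList;
    import java.util.List;

    public class PrimeFactors {
        // The generic solution: the loops discover the factors instead of
        // branching on the specific inputs the tests happen to use.
        public static List<Integer> of(int n) {
            List<Integer> factors = new ArrayList<Integer>();
            for (int divisor = 2; n > 1; divisor++)
                for (; n % divisor == 0; n /= divisor)
                    factors.add(divisor);
            return factors;
        }
    }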
We see the principle at work in other ways as well. Often the programmers have a whole list of tests that they know must pass. As they write them one by one, they write the production code that satisfies them. Then, as in the Bowling Game Kata, the tests start to pass unexpectedly. You were done with the code and didn’t know it. You continue writing tests, expecting one to fail, but they all pass. The test code grows, but the production code remains the same.
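For instance, a later, more specific test like this one (it assumes the kata’s Game class with its roll() and score() methods) often passes the moment it is written, because the generic scoring loop built for the earlier tests already covers it:

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class BowlingGameTest {
        // Twelve strikes in a row: the loop that already handles strikes and
        // their bonuses needs no new production code to make this pass.
        @Test
        public void perfectGameScores300() {
            Game g = new Game();
            for (int i = 0; i < 12; i++)
                g.roll(10);
            assertEquals(300, g.score());
        }
    }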
Sometimes this happens in a less surprising way. Sometimes you know that you have implemented an algorithm that will pass all the remaining tests, but you write those tests anyway because they are part of the specification.
The point is that test code and production code do not grow at the same rate. Indeed, as the application increases in complexity, the test code grows faster than the production code. Sometimes the production code actually shrinks as the test code grows, because the programmers have moved a chunk of functionality out of the code and into the data.
Consider FitNesse. A year ago there were 45,000 lines of code, of which 15,000 were tests, so 33% of the total were tests.
Now FitNesse is 58,000 lines of code, of which 26,000 are tests. We added 13,000 lines of code overall, and 8,000 of those (61%) are tests! The tests have grown to over 44% of the total.
Comments
Mark Nijhof about 21 hours later:
I’d like to see something about the maintainability of your test suite. When making functional changes to the code, what is a good way to change the tests? I thought of two approaches (I also blogged about it here: http://blog.fohjin.com/blog/2009/5/14/How_to_Re_factor_or_Change_Behavior_using_TDD).
The first approach would be to make the changes to the code and adjust the tests afterward, but then you are not doing TDD anymore. This might be OK for small functional changes, but for larger ones I would rather follow the second approach.
Create new tests to cover your changed functionality/behavior using TDD, and keep going until you feel that you have implemented it correctly with the correct tests. While doing this, ignore any broken tests, and if a test doesn’t compile, comment its body out and add Assert.fail(). Then, when you are done, you should have some number of failing tests which, after proper investigation, you could delete.
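Something like this (a made-up example) is what I mean by that placeholder trick:

    import org.junit.Assert;
    import org.junit.Test;

    public class LegacyPricingTest {
        // The old test no longer compiles against the changed code, so its body
        // is commented out and replaced with a failing assertion to be
        // investigated (and probably deleted) once the new behavior is done.
        @Test
        public void discountIsAppliedToLargeOrders() {
            // Order order = new Order(150);                   // old API, no longer compiles
            // Assert.assertEquals(90.0, order.total(), 0.01);
            Assert.fail("Behavior changed - revisit or delete this test");
        }
    }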
Now, the second thing is how you group your tests together. The classic way in TDD (as I understand it) is to group them by the class that they test, and while this is fine for smaller systems, I feel you run into test maintenance issues rather quickly. After Jeremy Miller did a presentation at NDC we talked about this; he groups the tests by functionality/behavior (more like BDD), and I could see great benefit in this. Doing this will, I believe, help when making functional changes to your code and tests. What is your opinion on this?
Btw, I have your book (Clean Code) in front of me and am about to start reading it, so if the answer is in there, that would suffice as a reply as well :)
Esko Luontola 1 day later:
When changing the behaviour in a big way, I would write completely new classes, and then, when they are complete enough to replace the old classes, I would change the system to begin using the new classes. This is what Kent Beck calls the Parallel strategy in http://www.infoq.com/presentations/responsive-design, starting around 37 minutes in (great presentation, btw).
After doing the change, I would delete all the old code and tests that were replaced. When TDD’ing the new code, I would probably have a look at the old tests, to make sure that I cover all the wanted behaviour and corner cases.
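As a rough, made-up sketch of what I mean: the new class grows behind the same interface as the old one, and only one line of wiring changes when it is ready to take over; the old class and its tests are then deleted.

    // Both calculators implement the same interface while the new one is being TDD'd.
    interface PriceCalculator {
        double priceFor(int quantity);
    }

    class OldPriceCalculator implements PriceCalculator {
        public double priceFor(int quantity) {
            return quantity * 10.0;                         // legacy behaviour, soon to be deleted
        }
    }

    class NewPriceCalculator implements PriceCalculator {
        public double priceFor(int quantity) {
            double gross = quantity * 10.0;
            return quantity >= 100 ? gross * 0.9 : gross;   // new bulk-discount behaviour
        }
    }

    class Checkout {
        // Switching this single assignment moves the system over to the new class;
        // afterwards OldPriceCalculator and its old tests can be removed.
        private final PriceCalculator calculator = new NewPriceCalculator();

        double total(int quantity) {
            return calculator.priceFor(quantity);
        }
    }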
The way that I like to group and organize tests is closer to BDD. I would not write tests like Uncle Bob in his Bowling Game Kata – that style might be called “tests as examples”. Instead I would write “tests as specification”.
In the case of the bowling game, the test names would be such that you could print them in a book under the title “rules of scoring in bowling”, similar to page 2 (“Scoring Bowling”) of Uncle Bob’s Bowling Game Kata presentation. I haven’t yet tried writing the Bowling Game Kata in my style (maybe I should try it), but you can see my style in the case of a Tetris game here: http://github.com/orfjackal/tdd-tetris-tutorial/tree/master/src/test/java/tetris
The reason I use this style is that when the requirements have changed and some tests fail, I can read the name of the test and know why that test was written – what behaviour that test specifies. Then it’s easy to evaluate whether that test is still needed and its code just has to be fixed, or whether the test is outdated and should be removed. It also helps when writing new code, to check that all corner cases and combinations have been covered by the tests.
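A small, made-up illustration of the naming style I mean – each test name states a rule of the behaviour, so when a test fails you immediately know why it was written:

    import static org.junit.Assert.assertEquals;

    import java.util.ArrayDeque;
    import java.util.Deque;

    import org.junit.Test;

    public class LastInFirstOutStackSpec {

        // Each test name reads like a sentence in a specification document.
        @Test
        public void theLastItemPushedIsTheFirstItemPopped() {
            Deque<String> stack = new ArrayDeque<String>();
            stack.push("first");
            stack.push("second");
            assertEquals("second", stack.pop());
        }

        @Test
        public void poppingAnItemRemovesItFromTheStack() {
            Deque<String> stack = new ArrayDeque<String>();
            stack.push("only");
            stack.pop();
            assertEquals(0, stack.size());
        }
    }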