TDD in Clojure
Posted by Uncle Bob on Thursday, June 03, 2010
OO is a tell-don’t-ask paradigm. Yes, I know people don’t always use it that way, but one of Kay’s original concepts was that objects were like cells in a living creature. The cells in a living creature do not ask any questions. They simply tell each other what to do. Neurons are tellers, not askers. Hormones are tellers not askers. In biological systems, (and in Kay’s original concept for OO) communication was half-duplex.
Clojure is a functional language. Functional languages are ask-don’t-tell. Indeed, the whole notion of “tell” is to change the state of the system. In a functional program there is no state to change. So “telling” makes little sense.
When we use TDD to develop a tell-don’t-ask system, we start at the high level and write tests using mocks to make sure we are issuing the correct “tells”. We proceed from the top of the system to the bottom of the system. The last tests we write are for the utilities at the very bottom.
In an ask-don’t-tell system, data starts at the bottom and flows upwards. The operation of each function depends on the data fed to it by the lower level functions. There is no mocking framework. So we write tests that start at the bottom, and we work our way up to the top.
Therein lies the rub.
In a tell-don’t-ask system, the tells at the high level are relatively complex. They branch out into lower subsystems getting simpler, but more numerous as they descend. Testing these tells using mocks is not particularly difficult because we don’t need to depend on the lower level functions being there. The mocks make them irrelevant.
In an ask-don’t-tell system the asks at the low level are simple, but as the data moves upwards it gets grouped and composed into lists, maps, sets, and other complex data structures. At the top the data is in its most complex form. Writing tests against that complex data is difficult at best. And there is currently no way to mock out the lower levels[1], so all tests written at the high level depend on all the functions below.
The perception of writing tests from the bottom to the top can be horrific at first. Consider, for example, the Orbit program I just wrote. This program simulates N-body gravitation. Imagine that I am writing tests at the top level. I have three bodies at position Pa, Pb, and Pc. They have masses Ma, Mb, and Mc. They have velocity vectors of Va, Vb, Vc. The test I want to write needs to make sure that new positions Pa’, Pb’, Pc’, and new Velocity vectors Va’, Vb’, and Vc’ are computed correctly. How do I do that?
Should I write a test that looks like this?
test-update {
  Pa = (1,1)
  Ma = 2
  Va = (0,0)
  Pb = (1,2)
  Mb = 3
  Vb = (0,0)
  Pc = (4,5)
  Mc = 4
  Vc = (0,0)
  update-all
  Pa should == (1.096, 4.128)
  Va should == (0.096, 3.128)
  Pb should == (1.1571348402636772, 0.1571348402636774)
  Vb should == (0.15713484026367727, -1.8428651597363226)
  Pc should == (3.834148869802242, 4.818148869802242)
  Vc should == (-0.16585113019775796, -0.18185113019775795)
}
A test like this is awful. It’s loaded with magic numbers, and secret information. It tells me nothing about how the update-all function is working. It only tells me that it generated certain numbers. Are those numbers correct? How would I know?
But wait! I’m working in a functional language. That means that every function I call with certain inputs will always return the same value; no matter how many times I call it. Functions don’t change state! And that means that I can write my tests quite differently.
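To make that property concrete, here is a minimal sketch. The reposition below is a simplified stand-in that works on plain numbers (it is not the Orbit program's code), and reposition! is a hypothetical stateful counterpart included only for contrast.

```clojure
;; Pure: the result depends only on the argument.
;; (Simplified stand-in; positions and velocities are plain numbers here.)
(defn reposition [o]
  (update-in o [:position] + (:velocity o)))

;; Hypothetical stateful counterpart: every call changes shared state.
(def position (atom 0))
(defn reposition! [velocity]
  (swap! position + velocity))

(def o {:position 0 :velocity 5})

(= (reposition o) (reposition o))    ; true, no matter how often it is called
(= (reposition! 5) (reposition! 5))  ; false: the first call returns 5, the second 10
```

Because the pure version can be called again at any time without disturbing anything, a test is free to call it just to find out what the answer ought to be.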
How does update-all work? Simple, given a list of objects it performs the following operations (written statefully):
update-all(objects) {
  for each object in objects {
    accumulate-forces(object, objects)
  }
  for each object in objects {
    accelerate(object)
    reposition(object)
  }
}
This is written in stateful form to make it easier for our non-functional friends to follow. First we accumulate the force of gravity between all the objects. This amounts to evaluating Newton’s F = G·m1·m2/r^2 formula for each pair of objects, and adding up the force vectors.
Then, for each object we accelerate that object by applying the force vector to its mass, and adding the resultant delta-v vector to its velocity vector.
Then, for each object we reposition that object by applying the velocity vector to its current position.
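The three steps can be sketched in Clojure as follows. This is an illustration under stated assumptions, not the Orbit program's code: objects are assumed to be maps with :position, :velocity, :force, and :mass keys, vectors are two-element [x y] vectors, G is set to 1, and each update covers one unit of time. The helper names (v+, v-, scale, magnitude, force-between) are hypothetical.

```clojure
;; ASSUMPTIONS (not from the Orbit source): map-based objects, 2D [x y]
;; vectors, G = 1, and a time step of 1.
(def G 1.0)

(defn v+ [[ax ay] [bx by]] [(+ ax bx) (+ ay by)])
(defn v- [[ax ay] [bx by]] [(- ax bx) (- ay by)])
(defn scale [[x y] k] [(* x k) (* y k)])
(defn magnitude [[x y]] (Math/sqrt (+ (* x x) (* y y))))

(defn force-between
  "Force on a due to b: magnitude G*ma*mb/r^2, pointing from a toward b."
  [a b]
  (let [d (v- (:position b) (:position a))
        r (magnitude d)
        f (/ (* G (:mass a) (:mass b)) (* r r))]
    (scale d (/ f r))))            ; unit direction (d/r) times magnitude f

(defn accumulate-forces
  "Returns o with :force set to the sum of the forces from every other object."
  [o os]
  (assoc o :force
         (reduce v+ [0.0 0.0]
                 (for [other os :when (not= other o)]   ; skip o itself
                   (force-between o other)))))

(defn accelerate
  "Adds delta-v = force / mass to the velocity."
  [o]
  (assoc o :velocity (v+ (:velocity o) (scale (:force o) (/ 1.0 (:mass o))))))

(defn reposition
  "Adds the velocity to the position."
  [o]
  (assoc o :position (v+ (:position o) (:velocity o))))
```

Each function returns a new map and leaves its argument untouched, which is what makes the testing approach described below possible. Note that this sketch skips an object by value equality, so two identical objects would ignore each other; a real implementation would need to handle that case.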
Here’s the Clojure code for update-all:
(defn update-all [os]
  (reposition-all (accelerate-all (calculate-forces-on-all os))))
In this code you can clearly see the bottom-to-top flow of the application. First we calculate forces, then we accelerate, and finally we reposition.
Now, what do these -all functions look like? Here they are:
(defn calculate-forces-on-all [os]
  (map #(accumulate-forces % os) os))

(defn accelerate-all [os]
  (map accelerate os))

(defn reposition-all [os]
  (map reposition os))
If you don’t read Clojure, don’t worry. The map function simply creates a new list from an old list by applying a function to each element of the old list. So in the case of reposition-all it simply calls reposition on each element of the list of objects (os), producing a new list of objects that have been repositioned.
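A small standalone illustration of map may help. The data and the double-vector function are made up for the example; the second call shows the #(... % ...) shorthand used in calculate-forces-on-all, where % stands for the current element.

```clojure
;; Hypothetical data: three velocity vectors.
(def velocities [[0 0] [1 2] [3 4]])

(defn double-vector [[x y]]
  [(* 2 x) (* 2 y)])

;; map returns a new sequence; the input vector is left unchanged.
(def doubled (map double-vector velocities))
;; doubled    => ([0 0] [2 4] [6 8])
;; velocities => [[0 0] [1 2] [3 4]]

;; #(...) is an anonymous function and % is its argument.
(def shifted (map #(+ % 10) [1 2 3]))
;; shifted => (11 12 13)
```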
From this we can determine that the function of update-all is to call the three functions (accumulate-forces, accelerate, and reposition) on each element of the input list, producing a new list.
Notice how similar that is to a statement we might make about a high level method in an OO program. (It’s got to call these three functions on each element of the list). In an OO language we would mock out the three functions and just make sure they’d been called for each element. The calculations would be bypassed as irrelevant.
Oddly, we can make the same statement in Clojure. Here’s the test for update-all:
(testing "update-all"
  (let [o1 (make-object ...)
        o2 (make-object ...)
        o3 (make-object ...)
        os [o1 o2 o3]
        us (update-all os)]
    ;; Argument order matches calculate-forces-on-all: the object first, then the list.
    (is (= (nth us 0) (reposition (accelerate (accumulate-forces o1 os)))))
    (is (= (nth us 1) (reposition (accelerate (accumulate-forces o2 os)))))
    (is (= (nth us 2) (reposition (accelerate (accumulate-forces o3 os)))))))
If you don’t read clojure don’t worry. All this is saying is that we test the update-all function by calling the appropriate functions for each input object, and then see if the elements in the output list match them.
In an OO program we’d find this dangerous because of side-effects. We couldn’t be sure that the functions could safely be called without changing the state of some object in the system. But in a functional language it doesn’t matter how many times you call a function. So long as you pass in the same data, you will get the same result.
So this test simply checks that the appropriate three functions are getting called on each element of the list. This is exactly the same thing an OO programmer would do with a mock object!
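For a version of this test that actually runs, the sketch below swaps in deliberately trivial stand-ins for accumulate-forces, accelerate, and reposition, and uses plain maps in place of make-object, whose arguments the listing above elides. The stand-ins are assumptions made only so the example is self-contained; the shape of the test is the point.

```clojure
(require '[clojure.test :refer [deftest is testing]])

;; Hypothetical stand-ins: trivially simple, but pure like the real functions.
(defn accumulate-forces [o os] (assoc o :force (count os)))
(defn accelerate [o] (update-in o [:velocity] + (:force o)))
(defn reposition [o] (update-in o [:position] + (:velocity o)))

(defn calculate-forces-on-all [os] (map #(accumulate-forces % os) os))
(defn accelerate-all [os] (map accelerate os))
(defn reposition-all [os] (map reposition os))

(defn update-all [os]
  (reposition-all (accelerate-all (calculate-forces-on-all os))))

(deftest update-all-test
  (testing "update-all"
    (let [o1 {:position 0  :velocity 0}
          o2 {:position 10 :velocity 1}
          o3 {:position 20 :velocity 2}
          os [o1 o2 o3]
          us (update-all os)]
      ;; Each output element must equal the three functions applied by hand.
      (is (= (nth us 0) (reposition (accelerate (accumulate-forces o1 os)))))
      (is (= (nth us 1) (reposition (accelerate (accumulate-forces o2 os)))))
      (is (= (nth us 2) (reposition (accelerate (accumulate-forces o3 os))))))))
```

The test never states what the resulting numbers should be. It only states which functions must have produced them, so the calculations themselves still need their own tests lower down.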
Is TDD necessary in Clojure?
If you follow the code in the Orbit example, you’ll note that I wrote tests for all the computations, but did not write tests for the Swing-Gui. This is typical of the way that I work. I try to test all business rules, but I “fiddle” with the GUI until I like it.
If you look carefully you’ll find that amidst the GUI functions there are some “presentation” functions that could have been tested, but that I neglected to write with TDD[2]. These functions were the worst to get working. I continuously encountered NPEs and Illegal Cast exceptions while trying to get them to work.
My conclusion is that Clojure without TDD is just as much a nightmare as Java or Ruby without TDD.
Summary
In OO we tend to TDD our way from the top to the bottom by using mocks. In Clojure we tend to TDD our way from the bottom to the top. In either case we can compose our tests in terms of the functions they should call on the lower level objects. In the case of OO we use mocks to tell us if the functions have been called properly. This protects us from side-effects and allows us to decouple our tests from the whole system. In Clojure we can rely on the fact that the language is functional, and that no matter how many times you call a function it will return the same value.
[1] Brian Marick is working on something that looks a lot like a mocking framework for Clojure. If his ideas pan out, we may be able to TDD from the top to the bottom in Clojure.
[2] This is an unconscious game we all play with ourselves. When we have a segment of code that we consider to be immune to TDD (like GUI) then we unconsciously move lots of otherwise testable code into that segment. Yes, I heard my green band complain every time I did it; but I ignored it because I was in the GUI. Whoops.
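As a rough illustration of what the first footnote anticipates, here is a minimal sketch of stubbing the lower-level functions so that update-all can be tested from the top down. It assumes with-redefs, which was added to clojure.core in Clojure 1.3, after this post was written. The stub bodies, the vector-based "objects", and the update-all-with-stubs helper are all hypothetical.

```clojure
;; Placeholders for lower levels that have not been written yet.
(defn accumulate-forces [o os] (throw (UnsupportedOperationException. "not written yet")))
(defn accelerate [o] (throw (UnsupportedOperationException. "not written yet")))
(defn reposition [o] (throw (UnsupportedOperationException. "not written yet")))

(defn update-all [os]
  (map reposition (map accelerate (map #(accumulate-forces % os) os))))

(defn update-all-with-stubs
  "Runs update-all with each lower-level function replaced by a stub that
  appends its own name to the object, recording the order of the calls."
  [os]
  (with-redefs [accumulate-forces (fn [o _] (conj o :forces))
                accelerate        (fn [o] (conj o :accelerated))
                reposition        (fn [o] (conj o :repositioned))]
    ;; doall matters: map is lazy, and the stubs are only in place
    ;; while the body of with-redefs is executing.
    (doall (update-all os))))

(update-all-with-stubs [[:a] [:b]])
;; => ([:a :forces :accelerated :repositioned]
;;     [:b :forces :accelerated :repositioned])
```

The stubs play the role that mocks play in the OO case: the test confirms that the three functions were applied to each element, in order, without any of the real calculations existing yet.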
Comments
Patrick 21 minutes later:
When I write tests, I find myself struggling with white- versus black-box testing. You’re proposing what I think of as whitebox testing, where the test writer encodes intimate knowledge of update-all into the test. An alternate strategy is to approach this as, “I don’t care how update-all does it, but for a given input, it should apply forces and acceleration and calculate a new position for the elements”. Each of the other functions would be tested in the same way. I tend to prefer this because it allows me to refactor upper-level functions like update-all more easily, because the test does not encode knowledge (or as I think of it, does not violate encapsulation) of the function being tested. Is this something you yourself think about?
Thanks Patrick
Daniel Martins about 1 hour later:
Nice article!
One question though. I understand when you say that, in the first example, the code is “loaded with magic numbers, and secret information”. But isn’t that necessary to test the calculations, especially in math-heavy software like the Orbit program?
The last piece of code just tests, in my view, whether the “asks” are issued correctly; it doesn’t test the calculations themselves.
Since you duplicate the body of the update-all function in its test code, you wouldn’t be able to detect any miscalculation in its delegate functions reposition, accelerate and accumulate-forces.
Or maybe I’m missing something… :)
zvolkov about 1 hour later:
Very interesting! Where can I read more on “tell-don’t-ask”?
Colin Jones about 1 hour later:
Interesting, I wouldn’t have thought of just repeating those lower-level function calls, but it makes a lot of sense for pure functions, and it makes the test really simple and easy to understand.
Regarding footnote [1], there actually are a couple of mechanisms available for stubbing/mocking, using the built-in binding (see http://blog.n01se.net/?p=134) or the more fully-featured clojure.contrib.mock (http://richhickey.github.com/clojure-contrib/mock-api.html). I haven’t used clojure.contrib.mock, but it seems promising.
techbehindtech.com about 3 hours later:
There is a way to mock/stub low level. Check out http://github.com/amitrathore/conjure
Ed Bowler about 6 hours later:
I am unclear as to what features a mocking framework would bring to clojure, other than syntax. The binding macro surely provides the ability to TDD top down. Please expand on your thoughts Uncle Bob.
Jason Y about 9 hours later:
So, is it correct that mocking functions is the needed, missing feature? That makes sense to me.
Being relatively new to TDD, I’m probably missing something. It appears to me that the example in the OP is not an example of where mocks are needed.
“It’s loaded with magic numbers, and secret information.”
They are arbitrary input and resulting data. The arbitrary input is necessary (unless you apply a certain refactoring unrelated to this discussion) in both cases (it’s what goes in the ”...” in the latter example, correct?). The resulting answers are required to verify correct results… unless you replace them with a reimplementation of the SUT, as in the latter example calling the 3 underlying methods itself. (I’m not saying the latter example is stupid, but that this is a tradeoff at best, as you seem to indicate.)
“It tells me nothing about how the update-all function is working. It only tells me that it generated certain numbers. Are those numbers correct? How would I know?”
I can’t think of a non-sarcastic way to say that they are calculated with a calculator based on the laws of physics. BUT, since the question “Which laws do we apply?” comes up, you do have a point, since we never apply all laws of physics (e.g. friction due to nearby gases). Still, I would think that those looking at the tests know what laws the product being tested deals with. To be sure, a comment giving the formula, or even a list of factors included, will clear up all questions.
Again, I’m probably missing something really basic, such as why we care about what the update-all method calls to begin with. Why does it matter as long as we get the right answer?
To clarify, I understand why, in more complicated examples, it simplifies unit tests dramatically to mock away lower layers, and also why one would want to mock things like DateTime.Now or SendEmail. I’m just trying to understand what’s so “awful” about the first example test given.
Phil about 13 hours later:
“My conclusion is that Clojure without TDD is just as much a nightmare as Java or Ruby without TDD.”
I may be missing the point here, but TDD is (surely!) not language specific. It is all about intent. Regardless of the syntax or even the semantics of the language of implementation, the intent remains the same. TDD’s power lies in ensuring intent is maintained as functionality is added or modified.
Is there a way, even, that we can write that intent in its own language, such that it can be verified regardless of the language of implementation?
SI Hayakawa about 20 hours later:
Please check your “it’s”. A couple of them should be “its”—no apostrophe for the possessive pronoun.
“applying the force vector to it’s mass, and adding the resultant delta-v vector to it’s velocity vector”
Nils Wloka about 20 hours later:
Thanks for sharing your thoughts. I was struggling with TDDing clojure code mainly because I am used to working outside-in when doing Java development. After reading your article, I redid a kata I was practicing recently, evolving code and test bottom-up, which felt a lot more natural.
Nevertheless I noticed that I was mainly using the tests for covering corner cases and regression while doing most of the design in the REPL.
As an aside, you might want to consider using clojure.test/are instead of repeating is for less “noise”.
Matt about 20 hours later:
I wrote clojure.contrib.mock which I believe is sufficient for top down testing, as that is how I tend to develop in clojure. It lets you set parameter expectations, return values, call counts and even replace given functions with simple stub functions that can do calculations on the input based on the parameters given.
The only real downside to c.c.mock of which I am currently aware is that it doesn’t really hold up in multi-threaded tests as it fundamentally uses clojure.core/binding to do the function replacements, and that is thread-local by nature.
If you can find any missing features in c.c.mock, I’d be more than happy to hear from you, as I am eager to enhance it as best I can. I think I may actually write a detailed post on how to use it later this week just so everyone is at least aware of its existence ;)
Joe Gutierrez 4 days later:
Pretty good. I really like the distinctions between OOP and functional. I really think that you missed an abstraction, though. All your `-all` functions are maps with functions to data, then you might want to create a new abstraction:
(defn for-all [os, fn] (map fn os))
(for-all os accelerate)
(for-all os reposition)
(for-all os #(accumulate-forces % os))
I think you also get a better language! Also you would be able to reduce the facade noise.
Brian Marick 8 days later:
I’ve started talking about stub-driving Clojure here: http://www.exampler.com/blog/2010/06/11/tdd-in-clojure-a-sketch-part-1/