Writing specifications that double as tests with Spock

posted Aug 7, 2013, 3:48 AM by Renato Athaydes   [ updated Aug 7, 2013, 5:01 AM ]
I love writing software. Especially software that actually works! That's why I have become a testing freak. If I write a single line of code, I feel rather uncomfortable if I don't have a unit test covering it. At the very least, one automated integration test must be running that line of code for me to feel at peace.
However, tests are, at the best of times, tedious to write. At their worst, they're a productivity drain that may be more trouble than they're worth (you probably think this is blasphemy, but be honest: haven't you ever felt that maybe you shouldn't spend a week writing a comprehensive test for that hard-to-test feature that's not even that important to your customers?). That's even more true if you write your tests in Java...

The problem with writing tests in Java

Java is a verbose language. Being a statically typed, class-based, object-oriented language has many advantages, of course, but conciseness is most certainly not one of them. Java test classes are treated just like any other class, so you have to write quite a lot of boilerplate code to get anything done. Don't get me wrong, Java is a fine language, but in testing, this is a huge drawback. A test has to be so easy to write and read that it's trivial! Otherwise, you would have to write tests for your tests!

Let's look at some examples with what is probably the most widely used Java testing framework, JUnit.

Example - toTimeDuration( long )

Suppose you have to test the implementation of the following method (in a class called StringFormatHelper):

String toTimeDuration( long timeInMs )
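
The post does not show the implementation of this method, so here is a minimal sketch of what it might look like, written only to be consistent with the example values used in the tests below. The class name StringFormatHelperSketch is made up for illustration, and details such as always using the plural ("hours", "minutes") and ignoring negative inputs are assumptions, not the real implementation.

```java
import java.util.Locale;

// Hypothetical stand-in for StringFormatHelper; not the author's actual code.
class StringFormatHelperSketch {

    private static final long MS_PER_MINUTE = 60_000L;
    private static final long MS_PER_HOUR = 60 * MS_PER_MINUTE;

    String toTimeDuration(long timeInMs) {
        if (timeInMs == 0) {
            return "0";
        }
        long hours = timeInMs / MS_PER_HOUR;
        long minutes = (timeInMs % MS_PER_HOUR) / MS_PER_MINUTE;
        long remainingMs = timeInMs % MS_PER_MINUTE;

        StringBuilder result = new StringBuilder();
        // Assumption: zero-valued hours and minutes are left out entirely
        if (hours > 0) {
            result.append(hours).append(" hours, ");
        }
        if (minutes > 0) {
            result.append(minutes).append(" minutes, ");
        }
        // Seconds are always shown, with exactly three decimal places
        result.append(String.format(Locale.ROOT, "%d.%03d seconds",
                remainingMs / 1000, remainingMs % 1000));
        return result.toString();
    }

    public static void main(String[] args) {
        StringFormatHelperSketch formatter = new StringFormatHelperSketch();
        System.out.println(formatter.toTimeDuration(250));
        System.out.println(formatter.toTimeDuration(2 + 4 * 1000 + 5 * 1000 * 60 + 8 * 1000 * 60 * 60));
    }
}
```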

The most sensible way I can see to test this would be to run a few examples and see if the results are as expected. So, with the help of JUnit you can write this:

    public void testToTimeDuration() {
        StringFormatHelper formatter = new StringFormatHelper();
        assertEquals( "0", formatter.toTimeDuration( 0 ) );
        assertEquals( "0.001 seconds", formatter.toTimeDuration( 1 ) );
        assertEquals( "0.250 seconds", formatter.toTimeDuration( 250 ) );
        assertEquals( "1.000 seconds", formatter.toTimeDuration( 1000 ) );
        assertEquals( "8 hours, 5 minutes, 4.002 seconds", formatter.toTimeDuration(
                2 +                  // ms
                4 * 1000 +           // sec
                5 * 1000 * 60 +      // min
                8 * 1000 * 60 * 60   // hour
        ) );
    }

Looks easy enough. Even concise... but I see several problems with this.

First of all, if the first example fails, the test as a whole fails, so I have no idea whether the other examples also fail or not.

Secondly, if I want to show my colleagues (programmers, testers, managers or anyone else) the test for this method, I must show them the code. Even if they are all coders, they will still have to pick out of all that text the few things that really matter, which are the values for the examples. In this simple example, that may still work fine... but as you get more and more tests, including higher level integration tests and so on, things quickly get hairy...

Finally, there's no context here... what exactly are we going to use this method for? To get a printable time duration of a certain transaction? Or maybe how long a customer has been with your company? No clue. Again, in this constrained example, this may not sound like a big deal, but imagine a complex integration test where, without context, nothing else makes much sense. You definitely should add context to all your tests - including unit tests! What is crystal clear today may not be after a year of big changes in your team.

Of course, we can improve a little on this test. For example, we might separate each example into its own test method, so that all the examples always run. But that's not very scalable, and you could be excused for writing fewer examples than you originally wanted.
You might also try JUnit's Theories, which let you declare static DataPoints that are automatically passed as arguments to the test methods. I have tried that myself, but found it far too inflexible if you want to test more than one method, with different data points, in the same test case, and (to make matters worse) very verbose!
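
To make the "all the examples always run" idea concrete without any framework, here is a rough plain-Java sketch that keeps the examples in a table and collects every failure instead of stopping at the first one. ExampleRunner and runAll are names invented for this illustration, and the lambda in main is a deliberately naive stand-in for the real method under test.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical helper, not part of JUnit: runs every example before reporting.
class ExampleRunner {

    /** Returns one message per failing example (an empty list means all passed). */
    static List<String> runAll(Map<Long, String> examples, LongFunction<String> methodUnderTest) {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<Long, String> example : examples.entrySet()) {
            String actual = methodUnderTest.apply(example.getKey());
            if (!example.getValue().equals(actual)) {
                failures.add("input " + example.getKey()
                        + ": expected \"" + example.getValue()
                        + "\" but was \"" + actual + "\"");
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // input in milliseconds -> expected text
        Map<Long, String> examples = new LinkedHashMap<>();
        examples.put(0L, "0");
        examples.put(1L, "0.001 seconds");
        examples.put(1000L, "1.000 seconds");

        // Naive stand-in implementation, used only to have something to run
        LongFunction<String> naive = ms -> ms == 0 ? "0" : (ms / 1000.0) + " seconds";

        // Every failing example is reported, not just the first one
        runAll(examples, naive).forEach(System.out::println);
    }
}
```

Notice how much plumbing this takes just to get a table of examples, which is rather the point being made here.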
One more slight improvement you might get in Java is to use Hamcrest matchers to improve readability:

// using Hamcrest matchers
assertThat( formatter.toTimeDuration( 0 ), equalTo( "0" ) );

Quite honestly, I find that "improvement" less than impressive, but it does allow for more flexibility in your assertions (instead of having to add more methods such as assertEquals, you can just have more matchers, like equalTo above, to allow concise comparisons - see TestFX, a framework for testing JavaFX applications which makes extensive use of matchers to make GUI testing much more concise).
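
To illustrate why matchers scale better than adding assertion methods, here is a tiny hand-rolled sketch of the idea in plain Java. This is not Hamcrest's actual API: MiniMatchers, its Matcher class and the endsWith matcher are simplified stand-ins invented for this example.

```java
import java.util.function.Predicate;

// Simplified illustration of the matcher idea; not the real Hamcrest classes.
class MiniMatchers {

    /** A condition plus a human-readable description of it. */
    static final class Matcher<T> {
        final String description;
        final Predicate<T> condition;

        Matcher(String description, Predicate<T> condition) {
            this.description = description;
            this.condition = condition;
        }
    }

    static <T> Matcher<T> equalTo(T expected) {
        return new Matcher<>("equal to \"" + expected + "\"",
                actual -> expected.equals(actual));
    }

    static Matcher<String> endsWith(String suffix) {
        return new Matcher<>("a string ending with \"" + suffix + "\"",
                actual -> actual != null && actual.endsWith(suffix));
    }

    /** One assertion method serves every matcher, present or future. */
    static <T> void assertThat(T actual, Matcher<T> matcher) {
        if (!matcher.condition.test(actual)) {
            throw new AssertionError("Expected " + matcher.description
                    + " but was \"" + actual + "\"");
        }
    }

    public static void main(String[] args) {
        assertThat("0", equalTo("0"));
        assertThat("0.250 seconds", endsWith("seconds"));
        System.out.println("all assertions passed");
    }
}
```

Supporting a new kind of comparison means writing one more small factory method such as endsWith, while assertThat itself never changes.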

Tests should be written in a scripting language

The problems with using Java for testing stem from the fact that Java is not a scripting language. It was not designed to be one. And I am claiming here as boldly as I can:

"Every automated test is a script."

Wikipedia states that a "scripting language (...) supports the writing of scripts, programs written for a special runtime environment that can interpret and automate the execution of tasks which could alternatively be executed one-by-one by a human operator".

To make that the definition of automated tests, you just need to specify that the goal is to test a piece of software.

So why not consider using a scripting language which runs on the JVM to write tests for Java?

Note: there is an interesting argument claiming that, in the case of TDD, you may prefer to use the target language of your API to write tests... however, as I will show, Groovy is so similar to Java that, together with the Java compiler (which will take care of type-checking for free), there's no valid case I know of where your unit test could be less reliable than if it had been written in Java, even if the consuming code is written in Java.

Introducing Groovy

Groovy is a scripting language for the JVM. Arguably, it offers the best integration with Java that is possible. Groovy can be seen as a superset of Java: nearly all Java code can be run without modification in Groovy. For example, the test we saw previously could be run as is in Groovy... But you probably want to use the power of Groovy to make the test better and easier to read and write.

So let's see what the example test could look like in Groovy:

    public void testToTimeDuration( ) {
        def formatter = new StringFormatHelper()
        // method pointer: lets us call the method under test without repeating the object name
        def testMethod = formatter.&toTimeDuration
        assert testMethod( 0 ) == "0"
        assert testMethod( 1 ) == "0.001 seconds"
        assert testMethod( 250 ) == "0.250 seconds"
        assert testMethod( 1000 ) == "1.000 seconds"
        assert testMethod(
                2 +                  // ms
                4 * 1000 +           // sec
                5 * 1000 * 60 +      // min
                8 * 1000 * 60 * 60   // hour
        ) == "8 hours, 5 minutes, 4.002 seconds"
    }

Notice that we do not need an assertion API:
  1. the assert keyword is always enabled in Groovy. If the assertion fails, you get an awesome error message, as we'll see.
  2. the "primitive" operator == in Groovy actually invokes the equals method.
Not having an assertion API is already a win. All you need to know is the language basics - how the operators work, which should be intuitive for any Java developer. This is better for the same reason that writing a == b is better than a.equals( b ).
We also made the test less verbose by using a reference to the method under test (using the .& operator) rather than repeating the object name every time (don't even think of making methods like this static unless you're keen on making your code an untestable mess).
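
For contrast, here is a small Java sketch of the two points above. In Java, == on objects compares references, so two Strings with the same characters can be unequal under == while equals returns true. Java 8 and later also have method references (the :: syntax), which play a similar role to Groovy's .& operator. The class and method names here are made up for the illustration.

```java
import java.util.function.LongFunction;

// Illustrative only: shows Java's reference equality and method references.
class EqualityAndMethodRefs {

    static boolean sameReference(String a, String b) {
        return a == b;       // Java: compares object identity
    }

    static boolean sameContent(String a, String b) {
        return a.equals(b);  // what Groovy's == does for you
    }

    // Hypothetical stand-in for a method under test
    String describe(long timeInMs) {
        return timeInMs + " ms";
    }

    public static void main(String[] args) {
        String literal = "1.000 seconds";
        String copy = new String("1.000 seconds"); // distinct object, same characters

        System.out.println(sameReference(literal, copy)); // false
        System.out.println(sameContent(literal, copy));   // true

        // Java's rough equivalent of Groovy's formatter.&toTimeDuration
        EqualityAndMethodRefs helper = new EqualityAndMethodRefs();
        LongFunction<String> testMethod = helper::describe;
        System.out.println(testMethod.apply(250));        // 250 ms
    }
}
```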

Should an assertion fail in Groovy, you always get a wonderfully clear error message explaining what went on... check this example:

Assertion failed: 

assert testMethod( 1000 ) == "1 second"
       |                  |
       1.000 seconds      false
<followed by the stack-trace here>

This is one of my favourite things about Groovy. Small things can make a huge difference in the long run.

But this still does not solve the first problem mentioned in the previous section: if one of the assertions fails, the next ones are not even executed.
Arguably, it also does not solve the second problem completely: even though the code is less verbose, it's still code.

The solution to all our testing problems :) Spock!

Enter Spock. Spock is basically an open-source BDD framework for Groovy that allows you to write Specifications that double as tests. It's easiest to just show how the example test could be implemented in Spock:

    def "Time amounts should look adequate for Test Reports"( ) {
        given: "The example evaluates to a number"
        // the examples are Strings, so evaluate them into numbers first
        def example = Eval.me( timeDurationInMillis )

        when: "I Convert the time duration to a presentable String"
        def result = new StringFormatHelper().toTimeDuration( example )

        then: "The result is as expected"
        result == expected

        where:
        timeDurationInMillis                         | expected
        '0'                                          | '0'
        '1'                                          | '0.001 seconds'
        '250'                                        | '0.250 seconds'
        '1000'                                       | '1.000 seconds'
        '''2 + //ms
            4 * 1000 + // sec
            5 * 1000 * 60 + // min
            8 * 1000 * 60 * 60 // hour''' | '8 hours, 5 minutes, 4.002 seconds'
    }

Note: I used Strings (and evaluated them into Numbers before passing them on to the test method) in the example table specifically to make the last example more readable in the report - see below.

It may look even longer than the previous examples, but have a closer look at this test and see just how much more information it conveys. There's a clear context: this method will be used to make the duration of tests presentable in a Test Report (incidentally, this is part of the unit tests of my own Spock Report Extension).

There's also a clear distinction between what assumptions are made, what action is being performed, and what the result is expected to look like (a benefit of the so-called Gherkin language more than of Spock itself). The Strings explaining what the test is doing are optional, but recommended to make the intention extra clear, and also as a source of information for the reports which will be generated (see below). Examples are given in the most natural way: a table which you can easily write in plain text using | between columns.

Notice that the word assert is left out of the assertion line. That's because in Spock, you may omit it in the then: block, because clearly that's where your assertions are supposed to go.

Now, you may object that this is still code - so your business managers (or even the testers) may not be willing to go through it with your development team (even though they should, and this is one of the main pillars of BDD, i.e. the participation of all business stakeholders in determining detailed Specifications).

That's where I got a little unhappy about Spock. It's still a very new framework (at version 0.7 at the time of writing), so it still has some way to go to be complete. But I have loved Spock so much that I was willing to spend some of my own time to contribute by writing an extension to it to address what I saw, after using it to write lots of tests, as its main problem: its lack of reports.

Spock Report Extension

Writing extensions to Spock is pretty easy. Although it's not very well documented yet, Spock is open-source, so I just checked out the built-in extensions they included in the source and started off from there.
What this extension does is generate reports when you run Spock tests. Initially, it can only generate HTML reports but I have plans to add other formats in the future.

Here's the part of a report regarding the example test:

Example report

I used Groovy to write both production and test code, so feel free to look at the source code and see much more complex examples of tests written in Groovy and Spock.

That's it for now! Please leave comments below and let me know what you think.