Testing

Why test?

Testing is often an afterthought of development: a check once we finish that everything is correct. We build something, we want it to pass a test, and then we build the test. This makes writing the code harder, as we don't have a clear idea of what the code should do.

Instead we can write the tests as part of defining the code. If we test first, we define both the what and the how of the code. In the test we decide what the input to a function should be and what output we want from it. We can also define the limits of the code by testing multiple inputs, including inputs that should purposely raise errors or be rejected as incorrect.

Once we have the code we can tell straight away whether it works correctly by running the tests. For any errors found we have better localisation, based on which tests failed. This can make debugging faster and more focused. Extra tests can always be added later for errors not caught by the original testing. This method of writing tests before code is called test driven development.

Test Driven Development

In Test Driven Development we spend more steps, if not more time, on the tests than on the code. We don't just start with the tests: the tests are always the reference for any changes to the code. If we think of a reason to make a change, we modify the tests first and then the code.

The concept behind this method is that we identify our goals first. When we write our tests we think about the shape of our software: what inputs should each function accept? What outputs should it produce? How should it behave at its limits, or when given invalid input? The answers to these questions will shape the interfaces and behaviour of our code.

The tests are a programmed form of a specification, and writing the specification after the work is never a good idea. The exact expression of the tests will vary between languages. For some examples and guidance on test writing in Python see unittest Examples.
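As a minimal sketch of what this looks like in Python's built-in unittest module (the addition function here is a stand-in, included only so the example runs):

```python
import unittest

# Stand-in implementation; under test driven development this would be
# written only after the test below had defined its behaviour.
def addition(a, b):
    return a + b

class TestAddition(unittest.TestCase):
    def test_addition(self):
        # The test is the specification: two numbers in, their sum out.
        self.assertEqual(addition(1, 2), 3)
```

Saved to a file, this would be run with python -m unittest from the directory containing it.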

Unit Testing

In general, though, tests will take the form of one or a series of assert statements. An assert statement simply states that at this point in the code something must be true, or else an error has occurred and the test has failed. This means that we first prepare input and then run the code, though we might assert something both before and after running a function to check for a correct update. For instance, in pseudocode with some basic functions:

    test_addition():
        inputA = 1
        inputB = 2
        assertTrue(inputA + inputB == 3)
        output = addition(inputA, inputB)
        assertEquals(output, 3)

Here we test that an addition function correctly adds two values. We have not defined the addition function yet, because we are instead defining its behaviour through the tests it must pass. It would be sensible in this case to add an extra couple of tests to check that it can handle negative and non-integer numbers.

    test_addition():
        inputA = 1
        inputB = 2
        assertTrue(inputA + inputB == 3)
        output = addition(inputA, inputB)
        assertEquals(output, 3)
        assertEquals(addition(5, -2), 3)
        assertEquals(addition(-2, -1), -3)
        assertEquals(addition(1, 0.5), 1.5)
        assertEquals(addition(-0.25, 0.2), -0.05)

We don't have to define the inputs and check them every time. Here we check a good range of combinations of signs and of integer/float inputs. When we come to write the addition function we will need to think about these different possibilities. If we forget one, though, we will be reminded when we run the tests. If we add messages to our assert methods for when they fail, we will even know which test has failed.

    test_addition():
        inputA = 1
        inputB = 2
        assertTrue(inputA + inputB == 3, "Basic mathematics engine is broken")
        output = addition(inputA, inputB)
        assertEquals(output, 3, "Positive int to positive int addition is broken")
        assertEquals(addition(5, -2), 3, "Positive int to negative int addition is broken")
        assertEquals(addition(-2, -1), -3, "Negative int to negative int addition is broken")
        assertEquals(addition(1, 0.5), 1.5, "Positive int to positive float addition is broken")
        assertEquals(addition(-0.25, 0.2), -0.05, "Negative float to positive float addition is broken")

These tests may take longer to write than a simple function like addition itself. Consider, though, how often you use addition and what would happen if it were incorrect. This is why it can be worth writing good tests even for very small functions.
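In Python's unittest the same tests might be written as below. This is a sketch, with one deliberate change: the final comparison uses assertAlmostEqual, because floating-point arithmetic is approximate and an exact equality check on -0.25 + 0.2 can fail even when addition is correct.

```python
import unittest

def addition(a, b):
    # Trivial implementation so the tests below can run.
    return a + b

class TestAddition(unittest.TestCase):
    def test_addition(self):
        input_a, input_b = 1, 2
        self.assertTrue(input_a + input_b == 3,
                        "Basic mathematics engine is broken")
        self.assertEqual(addition(input_a, input_b), 3,
                         "Positive int to positive int addition is broken")
        self.assertEqual(addition(5, -2), 3,
                         "Positive int to negative int addition is broken")
        self.assertEqual(addition(-2, -1), -3,
                         "Negative int to negative int addition is broken")
        self.assertEqual(addition(1, 0.5), 1.5,
                         "Positive int to positive float addition is broken")
        # Floats are approximate: compare to 7 decimal places, not exactly.
        self.assertAlmostEqual(addition(-0.25, 0.2), -0.05,
                               msg="Negative float to positive float addition is broken")
```

Note that the message must be passed by keyword to assertAlmostEqual, since its third positional argument is the number of decimal places to compare.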

Integration Testing

So far we have looked at unit testing. It is important that we test our individual components before putting them together. Indeed, the effort to design these tests can help to make sure that functions are well modularised, so that building inputs for them does not take so much work. Once we have these units, though, we need to put them together and test the result. This is called integration testing.

Integration testing can be harder, and take longer to run, than unit testing. Often it will require so much input that a file of testing data might need to be generated. This is useful, as it is an opportunity to introduce purposeful errors into the system to test that it returns the correct errors.

This input will then be run through a real application of the units in the system, by testing larger functions, scripts or combinations of objects. As well as large inputs, these tests can have large outputs, which may need to be saved to file for comparison.

Integration testing forms a large amount of the testing added after implementation and used to maintain the code over time as things change, but it still fundamentally relies on asserting that something is true at a particular point in the code.
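As a sketch of these ideas (process_file and the JSON file format are invented for illustration), an integration test might generate a file of testing data, run a larger function over it, compare the result against the expected output, and check that a purposeful error is reported correctly:

```python
import json
import os
import tempfile

def addition(a, b):
    return a + b

# Hypothetical larger function built from unit-tested pieces:
# reads pairs of numbers from a JSON file and adds each pair.
def process_file(path):
    with open(path) as f:
        pairs = json.load(f)
    return [addition(a, b) for a, b in pairs]

def write_data(pairs):
    # Generate a file of testing data.
    handle, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(handle, "w") as f:
        json.dump(pairs, f)
    return path

def test_process_file():
    path = write_data([[1, 2], [5, -2], [-2, -1]])
    try:
        assert process_file(path) == [3, 3, -3]
    finally:
        os.remove(path)

def test_process_file_bad_input():
    # A purposeful error: a string where a number should be.
    path = write_data([["one", 2]])
    try:
        try:
            process_file(path)
            assert False, "expected a TypeError"
        except TypeError:
            pass
    finally:
        os.remove(path)

test_process_file()
test_process_file_bad_input()
```

In a real system the generated inputs, and the expected outputs, would usually be much larger, but the structure is the same: build the data, run the combined units, and assert on the result.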

Development

In the graphic above we show the development cycle of test driven development. We write the tests, then the code. We then test the code by running the tests. This allows us to correct the code until it passes the tests. This is not the end of development: now that we think the code is correct, we use it. As we do, it is likely we will find errors that were not checked by our tests, or which did not occur on our test data.

It would be natural to think that the next step on finding those errors would be to start trying to repair the code. In test driven development, though, we start by isolating the error: we try to write a test that raises the same error. This helps us to know when the error is fixed, not just for the exact code we were using but in general. Then, as we repair the code, we might adjust the test to limit its scope as we eliminate possible sources of the error. By the time we have repaired the code, we should have a new test which captures the cause of the error and will check for it in future versions of the software.

Debugging, Maintenance and Compatibility

Debugging

The process of finding errors in the code and repairing them is essential to all programming. It is a skill that embodies the scientific method. We start with a theory; this should come from the original error message, or from the test in which the error occurred. This will point us to a particular piece of code.

We can then inspect the code, either by writing a test to check that its response to input is correct or, in many IDEs, by using breakpoints to pause the code while it runs and inspect the variables. This allows us to see, for instance, whether the correct values are being passed in, whether the types of the variables are as expected, and at which line of the code something looks wrong. In many systems it is possible after a breakpoint to step through (go to the next line of code) or step into (look inside nested functions) to check the action of the code on the input in fine detail. This can allow us to find errors easily. It can also help us to write correct tests to check for the error. If I discover that the input is incorrect, I can test the input before it is handed in; this would be an integration problem. If the output is an integer rather than a float (losing accuracy), then we can test the type of the output as part of a unit test.
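The last check can be sketched directly (assuming the addition function from earlier; the isinstance assertion is the unit-test version of inspecting a variable at a breakpoint):

```python
def addition(a, b):
    return a + b

# In an IDE, or by calling Python's built-in breakpoint() here, we could
# pause and inspect these variables; in a test we assert on them instead.
output = addition(1, 0.5)

# Check the type: an int here would mean accuracy has been lost.
assert isinstance(output, float), "Expected a float, got %s" % type(output)
assert output == 1.5
```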

Maintenance and Compatibility

Testing can contribute to maintenance by providing a quick set of checks. When the packages and/or language your code uses are updated, the functionality of your code under the updates can be checked just by running the tests. This can make checking compatibility a quick and easy task. It can also mean that when there is a compatibility issue, it can be quickly pinned down to a particular piece of code. This does not prevent major changes to languages and packages from breaking your code (which is why a good requirements definition is important), but it will make the smaller ones easier to weather and the problems quicker to identify.