Many programming assignments encourage us to test our code. For example, a lab might ask us to write a function such as this one:
def rec_product(a, b):
    '''Calculate product a * b recursively'''
The lab might then ask us, when we are done coding, to test our code by checking that our program does the following:
rec_product(0, 5) returns 0
rec_product(1, 5) returns 5
rec_product(-1, 5) returns -5
Of course, we can do this by just typing the function calls into the Shell window in repl.it. But there is a more professional way, one that we can use to automate the testing any time we change our code. We can set up what are called unit tests and then use the pytest program to run them. This article explains how.
To illustrate pytest, we'll use a different function so that we aren't giving away the answer to the lab where you are asked to write the function definition for rec_product(a, b).
Instead, we'll use the functions below. Note that each function definition below does return "wrong answer". This is on purpose. Before we can rely on our tests, we need to be sure that they fail when we return a wrong answer. So the first version of each function definition always returns a wrong answer, and we set up our tests before we write the real function definitions. This technique is called "test-driven development".
So here are our function definitions:
def larger(a,b):
    ''' return a if a > b, otherwise return b '''
    return "wrong answer"

def add_em(a,b):
    ''' return the sum of a + b '''
    return "wrong answer"

def is_string(x):
    ''' return True if x is a string, otherwise return false '''
    return "wrong answer"
When using pytest, you need to install it in your repl. This only needs to be done once.
If you have import pytest at the top of one of your files, the first time you click Run it should install pytest automatically.
Or, you can simply type this command in the Shell window: python3 -m poetry add pytest
Let's set up a repl with these functions in main.py. You can find the code for this demo in this repl, which you can fork and try for yourselves: https://replit.com/@PhillipConrad/spis2022-demo-pytest
The animation below shows us copying/pasting the code above into the repl, and then fixing the indentation errors and stray characters we sometimes get when copying/pasting code from the web into a repl.
Note that each function definition has:
a docstring (e.g. ''' return a if a > b, otherwise return b ''' )
and then has return "wrong answer"
This is our starting point for writing tests.
We now create a second file called tests.py as shown in the animation below.
In this file, we start with the line
import pytest
You might get a warning from repl.it that pytest is "unused", but you can ignore that warning; we actually need this at the top of the file to ensure that the first time we click Run, the pytest module is loaded.
Then, we type these lines to import each of the functions that we want to test, as shown below.
from main import larger
from main import add_em
from main import is_string
(Some of you may know that we could also just do from main import * but I'll explain later why that isn't as good a choice.)
In tests.py, we'll now define our test cases, which means a set of function calls, and what we expect each to return.
As an example, if we were to define test cases in plain English, we might say:
we expect larger(3,4) to return 4
we expect larger(10, 10) to return 10
we expect larger("UC Berkeley", "UC San Diego") to return "UC San Diego" (since S for San Diego comes later in the alphabet than B for Berkeley, it has a larger ASCII/Unicode value: 83 for S, 66 for B)
we expect larger("UC San Diego", "Stanford") to return "UC San Diego" since U's value is 85 and S's value is 83
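If you want to convince yourself of the string comparisons above, you can check them directly in Python. The sketch below is only an illustration and is not part of main.py or tests.py; the variable names are made up for this example. It uses the built-in ord function to look up the Unicode value of a single character, and the > operator to compare whole strings.

```python
# Unicode values of the letters that decide each comparison
s_value = ord("S")  # 83
b_value = ord("B")  # 66
u_value = ord("U")  # 85

# Strings are compared one character at a time, from left to right.
# Both strings start with "UC ", so the result is decided by "S" vs "B".
san_diego_beats_berkeley = "UC San Diego" > "UC Berkeley"

# Here the very first characters already differ: "U" vs "S".
san_diego_beats_stanford = "UC San Diego" > "Stanford"

print(s_value, b_value, u_value)
print(san_diego_beats_berkeley, san_diego_beats_stanford)
```

Both comparisons come out True, which is why we expect "UC San Diego" in both test cases.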
Converting these into the syntax of pytest, we have a few rules:
Each test case becomes a single function definition
Each function definition needs a different name, and all must start with test_
Each function definition uses assert to check that the value we expect is the one we get
While it's not required, a good practice is to name the function this way:
test_name_of_function_parameters_separated_by_underscores
The example below should illustrate the idea:
def test_larger_3_4():
    assert larger(3,4)==4

def test_larger_10_10():
    assert larger(10,10)==10

def test_larger_UC_San_Diego__UC_Berkeley():
    assert larger("UC San Diego","UC Berkeley")=="UC San Diego"

def test_larger_UC_San_Diego__Stanford():
    assert larger("UC San Diego","Stanford")=="UC San Diego"
Add this to tests.py as shown below
The first time you click Run in a repl after adding import pytest to the top of a file, it will take an extra long time (60-90 seconds) to install pytest in the repl. This delay only happens once, so try to be patient with it. It looks like the animation below.
You are waiting for the > prompt to show up in the Console window, or for the code in main to finish running (if there is any).
Run the tests
After you've installed pytest (as shown above), you should be able to run pytest in the Shell window (not the Console window) by typing:
pytest tests.py
(Note that there is nothing special about the filename tests.py; we could have named that file anything with .py at the end; but it's sensible to call it tests.py since that's what is inside of the file.)
The first time, when we have function definitions that deliberately return the wrong answer, all of our tests should fail. That looks like this. Look at the output, then we'll discuss a few things about it.
A few things to notice:
In the output, we see this near the top
The "collected 4 items" part is important; it shows us that it found four tests. We should do a quick check to make sure that the number there matches the number of tests we are expecting. Sometimes you can make an error like this one. Do you see the error? See if you can spot it before you read on.
def test_larger_3_4():
    assert larger(3,4)==4

def test_larger_10_10():
    assert larger(10,10)==10

def test_larger_UC_San_Diego__UC_Berkeley():
    assert larger("UC San Diego","UC Berkeley")=="UC San Diego"

def test_larger_UC_San_Diego__UC_Berkeley():
    assert larger("UC San Diego","Stanford")=="UC San Diego"
If we run with this, we get the following—only three tests! Why?
The reason is that we made a copy/paste error while defining our tests. Note that the last two have the same name, i.e.
test_larger_UC_San_Diego__UC_Berkeley. It can be a little annoying, but Python doesn't treat this as an error; it assumes that you are redefining what the name test_larger_UC_San_Diego__UC_Berkeley means; thus only the second function definition is actually used. So it's a good idea to keep track of how many tests you have.
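The redefinition behavior described above is easy to see in a tiny experiment. This is a minimal sketch with a made-up function name (test_example is hypothetical, not one of our real tests); it shows that when two definitions share a name, only the last one survives.

```python
def test_example():
    return "first definition"

def test_example():  # same name: this silently replaces the definition above
    return "second definition"

# Only one function named test_example exists now, and it is the second one.
result = test_example()
print(result)
```

Python raises no error and gives no warning here, which is why pytest only finds one test under that name.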
Returning to the correct output, the second thing to notice is the string FFFF, i.e. four F's, and if you can see color, they are all red. This is an easy way to see at a glance that all four tests failed—and at this point, since we set up four function definitions with incorrect answers, this is exactly what we want!
Next, we can look at the detail in the FAILURES section to see if they failed for the right reasons. What we want to see is something like this.
Focus on the three lines:
> assert larger(3,4)==4
E AssertionError: assert 'wrong answer' == 4
E + where 'wrong answer' = larger(3, 4)
What these three lines show is this:
assert larger(3,4)==4 is the test that failed
It failed because 'wrong answer' == 4 is false
Where did 'wrong answer' come from? It was the result of evaluating larger(3, 4)
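To see what assert is doing underneath, here is a small sketch that runs the same check by hand, without pytest. The try/except wrapper and the outcome variable are assumptions made for this illustration; pytest does its own, more detailed reporting.

```python
def larger(a, b):
    ''' deliberately wrong first version, as in main.py '''
    return "wrong answer"

try:
    # assert does nothing if the condition is True,
    # and raises AssertionError if the condition is False
    assert larger(3, 4) == 4
except AssertionError:
    outcome = "failed"
else:
    outcome = "passed"

print(outcome)
```

Since "wrong answer" is not equal to 4, the assert raises AssertionError and the outcome is "failed". That is the same AssertionError pytest reports on the line starting with E.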
We can see that all of the rest of the output is similar in this case. In each of the four tests, the reason for failure was an assertion error, because "wrong answer" isn't the right answer.
But there is another possibility. What if we tweak one of our function definitions to have a typo? For instance, instead of this:
def is_string(x):
    ''' return True if x is a string, otherwise return false '''
    return "wrong answer"
What if we had this?
def is_string(a,b):
    ''' return True if x is a string, otherwise return false '''
    return "wrong answer"
That would be bad, because the function call and the function definition need to match. Let's see what happens when we write tests. Let's try these tests:
def test_is_string_UCSD_in_quotes():
    assert is_string("UCSD")==True

def test_is_string_3_in_quotes():
    assert is_string("3")==True

def test_is_string_3_not_in_quotes():
    assert is_string(3)==False

def test_is_string_True_in_quotes():
    assert is_string("True")==True

def test_is_string_True_not_in_quotes():
    assert is_string("True")==False
Add them to the file tests.py, so it looks like this:
Now we run the tests again with pytest tests.py along with the incorrect definition of is_string in place,
i.e. with the first line def is_string(a,b): instead of is_string(x)
Here's what we get for the output (I'm just showing two, because all five of the new test cases are the same)
The error message about the missing argument b directs our attention to the mismatch between the function definition which is asking for two arguments (a and b), and the function calls that only supply one. So we fix the function definition to say is_string(x) and then the error returns to what we expect, i.e. that "wrong answer" is not the correct result:
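The argument mismatch described above can also be reproduced outside of pytest. The sketch below is only an illustration: it defines the mistaken two-parameter version of is_string, calls it with one argument the way our tests do, and captures the error message. The try/except wrapper and the message variable are assumptions made for this example.

```python
def is_string(a, b):  # deliberately wrong: two parameters instead of one
    ''' return True if x is a string, otherwise return false '''
    return "wrong answer"

try:
    is_string("UCSD")  # our tests call it with a single argument
except TypeError as error:
    # Python refuses the call before the function body ever runs
    message = str(error)
else:
    message = "no error"

print(message)
```

This kind of failure is a TypeError, not an AssertionError. The function never gets far enough to return "wrong answer", so the test fails for a different reason than the one we intended.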
Now, at long last, we can write the correct function definitions. Here is what it looks like to put in the correct function definition for larger, and then run the tests. This is shown in this repl:
https://replit.com/@PhillipConrad/spis2022-demo-pytest-1#main.py and in the animation below:
Note that at the top of the output, we see tests.py ....FFFFF as shown below (with the dots in green and the F's in red). The green dots represent test cases that passed. All of the test cases for larger passed, so now we can turn our attention to is_string
Here is what it looks like to change the code for is_string to correct code and run the tests again (also see this repl: https://replit.com/@PhillipConrad/spis2022-demo-pytest-2). But wait, this isn't what we expected. We expected all of our tests to pass, and one of them still fails. So what's going on here?
Let's take a closer look at the test that failed:
The message tells us that it's on line 32 of tests.py, which is helpful. Let's see what that line looks like. Aha! The name of the test is test_is_string_True_not_in_quotes, but True is in quotes!
Let's fix that. Here's a new version of the test (and a repl showing it here: https://replit.com/@PhillipConrad/spis2022-demo-pytest-3#tests.py). This illustrates that sometimes the bug is not in the code itself, but in the tests.
def test_is_string_True_not_in_quotes():
    assert is_string(True)==False
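If the difference between "True" and True seems subtle, a quick check with the built-in type function makes it concrete. This is just a sketch for illustration; the variable names quoted and unquoted are made up for this example.

```python
quoted = "True"   # a string made of the four characters T, r, u, e
unquoted = True   # the boolean value True

print(type(quoted))    # <class 'str'>
print(type(unquoted))  # <class 'bool'>
```

So a correct is_string should give True for the first value and False for the second, which is what the fixed test checks.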
You might think that with our tests passing, we are done. But in fact, there's one more step.
Now that we have a good test suite, and a first version of the code, what we should do is:
Commit that good version to GitHub
See if we can improve the code (refactoring)
Make sure that the refactored version still passes the tests
If so, we commit that new improved version to GitHub
If not, we undo our changes (or if they are complicated, we can revert to the earlier version.)
As an example, consider this function:
def is_string(x):
    ''' return True if x is a string, otherwise return false '''
    if type(x)==str:
        return True
    else:
        return False
This is an opportunity for improvement! This can be replaced with a single line of code. Before you go on, can you think of what that line of code would be?
I'm leaving some blank space so that you have to scroll down for the answer.
Scroll down for the answer
↓
↓
↓
↓
So, any time you see the code if condition return True else return False in pretty much any programming language, you can just replace that with return condition.
(Note that the use of Comic Sans above for return True else return False is deliberate and intended to be ironic)
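Applying that rule to is_string gives a one-line body. The sketch below is one possible answer, assuming we keep the same type(x)==str condition from the longer version; it then repeats the five checks from tests.py as plain comparisons so we can see that the behavior is unchanged.

```python
def is_string(x):
    ''' return True if x is a string, otherwise return false '''
    # the condition is already True or False, so we can return it directly
    return type(x) == str

# the same five cases that tests.py checks
results = [
    is_string("UCSD"),
    is_string("3"),
    is_string(3),
    is_string("True"),
    is_string(True),
]
print(results)
```

After a change like this, running pytest tests.py again is what tells us the refactored version still passes.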
So here's how we refactor, and make sure that our code still works (repl here: https://replit.com/@PhillipConrad/spis2022-demo-pytest-4#main.py )
Starting with this repl, can you add the tests for add_em? First see the tests fail, then see them pass.
If you have any questions, ask a mentor during lab time.