Checking and Testing Code#

Learning Objectives#

  • Understand the importance and limitations of software testing

  • Recognize different types of testing such as system-level and defect testing

  • Learn how to implement and interpret assert statements in Python code

  • Understand when and why to use assert statements

  • Understand the concept of unit testing and Test Driven Development (TDD)

  • Write and execute unit tests using the unittest module

  • Learn the purpose and implementation of fixtures and mocks in testing

  • Apply these concepts to test code that depends on external resources

  • Understand the role of code linting and type checking

  • Use tools like flake8 and mypy to ensure code quality and adherence to style guidelines

Testing software#

Although we cannot prove our code is free of defects, or bugs, we can, and should, establish that it behaves as intended.

System level testing#

Once we have completed our software we should ensure that it works as intended. This is referred to as validation testing. Typically this involves taking a sample of input data and checking that the software produces the expected output for that input.

Validation testing tells us whether the overall system runs and produces valid results. Such tests should be repeated whenever changes are made to the software, to ensure the changes have not introduced errors.
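
A validation test of this kind can be sketched as follows. The function `summarise` is a hypothetical stand-in for a complete program, and the sample input and expected output are invented for illustration.

```python
def summarise(rows):
    """Hypothetical 'whole system': total and mean of a list of numbers."""
    total = sum(rows)
    return {"total": total, "mean": total / len(rows)}


# Sample input whose correct output has been checked by hand
sample_input = [2, 4, 6, 8]
expected_output = {"total": 20, "mean": 5.0}


def validate():
    """Return True if the system reproduces the known-good output."""
    return summarise(sample_input) == expected_output


print("validation passed" if validate() else "validation FAILED")
```

Re-running a check like this after a Python or library upgrade is a cheap way to notice that behaviour has changed.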

Changes that can impact your software are diverse and include

  • new Python language releases

  • upgrades to imported libraries

  • operating system updates.

Defect testing#

In a research environment it is often the case that there is no explicit specification for the software we create.

By specification we mean something like

  • Written statement of user requirements - typically “user stories”

  • Functional requirements - e.g. what file formats are to be supported

  • Non-functional requirements - e.g. subject data must be encrypted

Discussion#

If you don’t have a specification for your software, how might you establish suitable tests to find and resolve defects?

Assert statement#

The built-in Python assert statement looks like this:

# Try modifying this code to deliberately fail the assert statements

def my_add_two(a):
    return a + 2.0

assert my_add_two(1) == 3
# Better to include a message in case of failure
assert my_add_two(3) == 5, f"my_add_two(3) failed with {my_add_two(3)}, expected 5"

When to use assert#

assert should never be used to modify control flow.

Assertions allow you to verify that parts of your program are correct, but they are only evaluated if the built-in constant __debug__ is True. __debug__ is True by default, but it is False when Python is started with the -O or -OO option, in which case assert statements are skipped entirely.

# This is approximately what the assert statement does

def my_assert(condition, message):
    if __debug__ and not condition:
        raise AssertionError(message)

my_assert(my_add_two(1) == 3, "my_add_two(1) failed")
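
The following sketch shows why assert should not be relied on for control flow or input validation. It runs the same small script in a fresh interpreter twice, once normally and once with the -O option. The function names `with_assert` and `with_raise` are invented for this illustration.

```python
import subprocess
import sys

# A tiny script that guards its input in two different ways
script = """
def with_assert(x):
    assert x >= 0, "x must be non-negative"
    return x ** 0.5

def with_raise(x):
    if x < 0:
        raise ValueError("x must be non-negative")
    return x ** 0.5

for func in (with_assert, with_raise):
    try:
        func(-1)
        print(func.__name__, "accepted -1")
    except (AssertionError, ValueError):
        print(func.__name__, "rejected -1")
"""


def run(*flags):
    """Run the script in a fresh interpreter and return its printed lines."""
    result = subprocess.run(
        [sys.executable, *flags, "-c", script],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


normal = run()           # assert statements are active
optimised = run("-O")    # assert statements are skipped

print(normal)
print(optimised)
```

Under -O the assert-based guard disappears and the invalid value is accepted, while the explicit `raise` still rejects it.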

Why might we want different behaviour from our assert statements?#

What would you want your assert statements to do?#

Unit-tests#

Unit-tests are small tests that test the behaviours of our functions and classes.

Unit-tests are typically run within a testing framework or test-runner that automates testing, often inside our IDE.

Test Driven Development (TDD)#

TDD is an approach to software design; it is not a testing technique. In TDD, unit-tests are written before the code they exercise, and the tests drive the design. This suits designs that are created incrementally, as with Agile.
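
One TDD cycle might look like the sketch below. The function `is_leap_year` is a hypothetical example chosen for illustration. In practice the tests would be written and run (and seen to fail) before the function exists.

```python
import unittest


# Step 1 (red): write the tests first. They describe the behaviour we want
# and fail until is_leap_year exists and behaves correctly.
class TestIsLeapYear(unittest.TestCase):
    def test_divisible_by_four(self):
        self.assertTrue(is_leap_year(2024))

    def test_century_is_not_leap(self):
        self.assertFalse(is_leap_year(1900))

    def test_fourth_century_is_leap(self):
        self.assertTrue(is_leap_year(2000))


# Step 2 (green): write just enough code to make the tests pass.
def is_leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Step 3 (refactor): tidy the implementation, re-running the tests each time.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestIsLeapYear)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test here records one design decision about how the function should behave, which is the sense in which the tests shape the design.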

Refactoring#

Whether or not you adopt TDD, refactoring (changing the implementation of your code without changing its behaviour) is something that you are certain to do, if only to remove print statements or rename variables.

Refactoring code without appropriate tests can easily introduce new errors.
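
As a minimal sketch, the same checks can be applied before and after a refactoring to confirm that behaviour is unchanged. The two `mean_*` functions and `check_mean` are invented for illustration.

```python
def mean_original(values):
    # Original, more verbose implementation
    total = 0
    count = 0
    for v in values:
        total = total + v
        count = count + 1
    return total / count


def mean_refactored(values):
    # Refactored implementation: same behaviour, simpler code
    return sum(values) / len(values)


def check_mean(mean):
    """Apply the same checks to whichever implementation is passed in."""
    assert mean([1, 2, 3]) == 2
    assert mean([10]) == 10
    assert mean([1.5, 2.5]) == 2.0


check_mean(mean_original)
check_mean(mean_refactored)
```

If the refactored version changed a result, the second call would fail immediately rather than the error surfacing later in an analysis.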

unittest#

Python 3 distributions include the unittest module. See https://docs.python.org/3/library/unittest.html

import unittest

class TestMyAddTwo(unittest.TestCase):
    def test_my_add_two(self):
        self.assertEqual(my_add_two(1), 3)
    def test_my_add_two_3(self):
        self.assertEqual(my_add_two(3), 5)

# unittest.main(argv=[''], exit=False)
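
Beyond assertEqual, TestCase offers other assertion methods, and a test run returns a result object that can be inspected. The sketch below repeats my_add_two so that it is self-contained. The class name `TestMyAddTwoMore` is invented for illustration.

```python
import unittest


def my_add_two(a):
    return a + 2.0


class TestMyAddTwoMore(unittest.TestCase):
    def test_float_input(self):
        # Floating point results are compared to a tolerance, not exactly
        self.assertAlmostEqual(my_add_two(0.1), 2.1)

    def test_rejects_string(self):
        # Adding a float to a str raises TypeError
        with self.assertRaises(TypeError):
            my_add_two("1")


# Load and run only this test class, then inspect the outcome
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMyAddTwoMore)
result = unittest.TextTestRunner(verbosity=2).run(suite)
print(result.testsRun, len(result.failures), len(result.errors))
```

In the result, a failure means an assertion did not hold, whereas an error means the test raised an unexpected exception.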

Fixtures and mocks#

Ideally each unit of code should be tested independently.

Why is this?#

However, there are situations where testing code might require data read from a file or a database connection. If only one test requires this external data, then opening the file and reading the data will be part of the test. If several tests require this data, then we use a fixture.

# Module level fixture setup and teardown
def setUpModule():
    global sample_data
    sample_data = open("data/rows.txt", "r")

def tearDownModule():
    sample_data.close()

# unittest.main(argv=[''], exit=False)

# Class level fixture setup and teardown
class TestMyAddTwo(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        cls.sample_data = open("data/rows.txt", "r")

    @classmethod
    def tearDownClass(cls):
        cls.sample_data.close()

    def test_file_parsing(self):
        for line in self.sample_data:
            # Do something with the line
            pass
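
Fixtures can also be created per test with setUp and tearDown. The sketch below is self-contained: it writes its own temporary data file rather than assuming that data/rows.txt exists. The class name and file contents are invented for illustration.

```python
import os
import tempfile
import unittest


class TestRowParsing(unittest.TestCase):
    def setUp(self):
        # Runs before every test: create a small data file to read
        handle, self.path = tempfile.mkstemp(suffix=".txt")
        with os.fdopen(handle, "w") as f:
            f.write("1,2\n3,4\n")

    def tearDown(self):
        # Runs after every test, even if the test failed
        os.remove(self.path)

    def test_row_count(self):
        with open(self.path) as f:
            self.assertEqual(len(f.readlines()), 2)

    def test_first_row(self):
        with open(self.path) as f:
            self.assertEqual(f.readline().strip().split(","), ["1", "2"])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestRowParsing)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

A per-test fixture gives each test a fresh file. A file object opened once in setUpClass is shared between tests, so a test that reads it to the end leaves nothing for later tests unless it is rewound with seek(0).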

Mocks#

Mock and MagicMock objects create attributes and methods as you access them, and store details of how they have been used. You can configure them to specify return values or to limit which attributes are available.

See https://docs.python.org/3/library/unittest.mock.html
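
The sketch below illustrates these features using a mock in place of an external resource. The names `db`, `fetch_rows`, `drop_table` and `count_rows` are hypothetical and do not belong to any real database library.

```python
from unittest.mock import MagicMock

# A stand-in for an external resource, e.g. a database connection
db = MagicMock()
db.fetch_rows.return_value = [(1, "a"), (2, "b")]  # configured return value


def count_rows(connection, table):
    """Code under test: depends on an external connection."""
    return len(connection.fetch_rows(table))


n = count_rows(db, "subjects")

# The mock recorded how it was used
db.fetch_rows.assert_called_once_with("subjects")

# spec limits the mock to the listed attribute names
limited = MagicMock(spec=["fetch_rows"])
try:
    limited.drop_table
    missing_attribute_allowed = True
except AttributeError:
    missing_attribute_allowed = False

print(n, missing_attribute_allowed)
```

Because the mock replaces the connection, the test needs no database and runs the same way on any machine.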

How can we test a function that does not return a value?#

def show_results():
    arr = [1, 2, 3]
    print(arr)
    print()

show_results()
# Output: [1, 2, 3] followed by an empty line

Here is a possible test#

import builtins
from unittest.mock import MagicMock

class TestShowResults(unittest.TestCase):
    def setUp(self):
        # Shadow the built-in print with a mock in this module's namespace
        global print
        print = MagicMock()
    def tearDown(self):
        # Restore the real print function
        global print
        print = builtins.print
    def test_show_results(self):
        show_results()
        self.assertEqual(print.call_count, 2)

# unittest.main(argv=[''], exit=False)

How does this test work?#
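
An alternative to rebinding print by hand is `unittest.mock.patch`, which replaces a named object for the duration of a with block and restores it afterwards. The sketch below repeats show_results so that it is self-contained. The test class name is invented for illustration.

```python
import unittest
from unittest.mock import patch


def show_results():
    arr = [1, 2, 3]
    print(arr)
    print()


class TestShowResultsWithPatch(unittest.TestCase):
    def test_show_results(self):
        # builtins.print is replaced only inside the with block
        with patch("builtins.print") as mock_print:
            show_results()
        self.assertEqual(mock_print.call_count, 2)
        # The mock also records the arguments of each call
        mock_print.assert_any_call([1, 2, 3])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestShowResultsWithPatch)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Patching builtins.print also works when the function under test lives in a different module from the test, which rebinding a module-level name does not.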

‘Linting’ code with flake8 and mypy#

There are various tools that can analyse Python code and suggest fixes or improvements without running the code.

These are called ‘static code checkers’ or ‘linters’, because they help you remove fluff!

mypy#

We saw mypy briefly before. It checks that your code is consistent with its type hints, and can be configured to require type hints if desired.

https://mypy.readthedocs.io/en/stable/getting_started.html
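
As a small illustration, the code below runs correctly and carries type hints. The function `describe` is invented for this sketch. The commented-out call at the end is the kind of mistake a type checker such as mypy is expected to report without running the code.

```python
def my_add_two(a: float) -> float:
    return a + 2.0


def describe(value: float) -> str:
    return f"result is {value}"


print(describe(my_add_two(1)))

# A type checker such as mypy is expected to flag a call like the one below,
# because a str is passed where a float is declared. Python itself would only
# fail when the line is executed, with a TypeError.
# my_add_two("1")
```

Saved to a file, the code could be checked with a command of the form `mypy <filename>`.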

flake8#

Flake8 runs a variety of checks on your Python scripts, and can be used with IDEs such as VS Code to help you write clearer, more readable code. The optional, but highly recommended, style guide for Python is PEP 8.

https://peps.python.org/pep-0008/

https://flake8.pycqa.org/en/latest/index.html
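
The snippet below is valid Python that runs correctly, but it contains two issues of the kind flake8 reports. The function `area` is invented for illustration, and the codes in the comments are the ones expected with flake8's default settings. The exact output depends on your version and configuration.

```python
import os  # F401: module imported but unused


def area(width, height):
    result=width * height  # E225: missing whitespace around operator
    return result


print(area(2, 3))
```

Removing the unused import and writing `result = width * height` would be expected to clear both reports.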

Test coverage#

Coverage.py works in three phases:#

  • Execution: Coverage.py runs your code, and monitors it to see what lines were executed.

  • Analysis: Coverage.py examines your code to determine what lines could have run.

  • Reporting: Coverage.py combines the results of execution and analysis to produce a coverage number and an indication of missing execution.

See https://coverage.readthedocs.io/en/7.5.3/api.html
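
To illustrate what a coverage report shows, the sketch below has a function with two branches and a test that exercises only one of them. The names `classify` and `test_classify.py` are invented for illustration.

```python
import unittest


def classify(n):
    if n < 0:
        return "negative"  # never executed by the test below
    return "non-negative"


class TestClassify(unittest.TestCase):
    def test_positive(self):
        self.assertEqual(classify(5), "non-negative")


# Saved as, say, test_classify.py, this could be measured with:
#   coverage run -m unittest test_classify
#   coverage report -m
# The report would be expected to list the 'return "negative"' line as missing.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestClassify)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The test passes, yet part of the function is untested. High coverage does not prove the code is correct, but a missing line is a clear prompt to add a test.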

Pytest#

Pytest makes it easier to write and run tests.

Pytest uses file and function naming conventions to discover tests. You will rarely need to run a test individually, as the framework finds and runs the tests for you, and IDE integrations can re-run them when you modify your code.

Follow the instructions here to install pytest, and use a virtual environment.

https://docs.pytest.org/en/8.2.x/getting-started.html
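
A pytest test can be as simple as a function whose name starts with test_ and which contains plain assert statements. The sketch below shows the contents of a hypothetical file named test_add_two.py.

```python
# Contents of a hypothetical file named test_add_two.py


def my_add_two(a):
    return a + 2.0


def test_adds_two_to_int():
    assert my_add_two(1) == 3


def test_adds_two_to_negative():
    assert my_add_two(-2) == 0
```

Running `pytest` in the directory containing this file collects files named test_*.py or *_test.py and runs the functions in them whose names start with test.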

Coverage#

https://pypi.org/project/pytest-cov/

pytest-notebook#

See https://pytest-notebook.readthedocs.io/en/latest/

Resources#

See the testing section of https://alan-turing-institute.github.io/rse-course/html/module01_introduction_to_python/index.html

Testing Practical Exercise#

Python 3 distributions include the unittest module. See https://docs.python.org/3/library/unittest.html

Here is the example included in the Python documentation.
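
The example itself does not appear on this page. The basic example from the linked unittest documentation is along the following lines, so treat the linked page as the authoritative version. The final commented line follows this page's notebook convention rather than the documentation's own way of starting the tests.

```python
import unittest


class TestStringMethods(unittest.TestCase):

    def test_upper(self):
        self.assertEqual('foo'.upper(), 'FOO')

    def test_isupper(self):
        self.assertTrue('FOO'.isupper())
        self.assertFalse('Foo'.isupper())

    def test_split(self):
        s = 'hello world'
        self.assertEqual(s.split(), ['hello', 'world'])
        # check that s.split fails when the separator is not a string
        with self.assertRaises(TypeError):
            s.split(2)

# unittest.main(argv=[''], exit=False)
```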

Exercise 1#

Using the above as a template create a test class for the Upper class we used earlier.

class Upper(str):
    def __new__(cls, text=""):
        return super().__new__(cls, text.upper())

Important#

What should (and can) be tested?

See https://docs.python.org/3/library/unittest.html

Exercise 2#

Design a new capability for the class using TDD.

Here are some suggestions:

  • Do not allow strings without at least one letter

  • Only allow strings that begin with a letter

  • Limit the length of the string to 10 characters