13. Testing

Part of CS:2820 Object Oriented Software Development Notes, Spring 2021
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

 

An aside on Java enumerations

Consider the following declaration of an enumeration in Java:

enum Color { red, green, blue }

If you wrote the above in C or C++, it would be equivalent to something like this:

const int red = 0;
const int green = 1;
const int blue = 2;

Enumerations are significantly more complex in Java. To Java, the above enum declaration is equivalent to something like this:

final class Color {
    private final int value; // the only instance variable

    private Color( int v ) { value = v; }

    public static final Color red = new Color(0);
    public static final Color green = new Color(1);
    public static final Color blue = new Color(2);

    private static final String[] names = { "red", "green", "blue" };
    public String toString() { return names[ this.value ]; }
}

The above is oversimplified, since several other methods are provided for each enumeration, but it illustrates some useful code patterns:

Comparing values from an enumeration class is easy if all you want to do is compare for equality. The == and != operators work as a naive programmer expects and it is never necessary to use equals(). That is, if a and b are two variables of the same enumeration class, a==b and a.equals(b) will always return the same result.

Comparing for order is also allowed. a.compareTo(b) compares the hidden underlying integer values. These values are always assigned in the same order as the textual order of the enumeration values. In this example, red was declared first, so it is less than everything after it. However, in Java, you can't use a>b.
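The comparison rules above can be tried out with a short program. This is only a sketch: the class name ColorDemo and the helper before() are invented for illustration, while compareTo(), equals() and ordinal() are standard methods of every Java enumeration.

```java
enum Color { red, green, blue }

class ColorDemo {
    /** true if a was declared before b in the enumeration */
    static boolean before( Color a, Color b ) {
        return a.compareTo( b ) < 0; // a > b or a < b would not compile
    }

    public static void main( String[] args ) {
        Color a = Color.red;
        Color b = Color.blue;

        System.out.println( a == b );         // false
        System.out.println( a.equals( b ) );  // false, always agrees with ==
        System.out.println( before( a, b ) ); // true, red was declared first
        System.out.println( a.ordinal() );    // 0, the hidden integer value
        System.out.println( a );              // red, by way of toString()
    }
}
```

The ordinal() method exposes the hidden integer directly, but code that depends on the numeric value is fragile, since reordering the declaration changes it.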

Testing

How do you go about testing a program? There are numerous approaches to this problem ranging from using random input to carefully reading the code and then designing tests based on that reading.

Fuzz Testing

Random input is actually useful enough that it has a name: Fuzz testing. In an early study of the quality of various implementations of Unix and Linux, fuzz testing forced a remarkable number of standard utility programs into failure, and showed that reimplementing a utility from the specifications in its manuals generally tended to produce a more solid result than the original code that had evolved from some cruder starting point, sometimes decades ago. If a program reads a text file, a fuzz test will simply fill a text file with random text for testing purposes. Programs that respond to mouse clicks can be fuzz tested with random sequences of mouse clicks. A program is deemed to fail a fuzz test if it goes into an infinite loop or fails with an unhandled exception. A program passes if it responds with appropriate outputs and error messages.

Here is a little Java program that generates a fuzz to standard output:

import java.util.Random;
import java.lang.NumberFormatException;

/** Fuzz test generator
 * output a random length string of gibberish to standard output
 * this gibberish is in ASCII, the 128 character subset of Unicode
 * takes 1 command line argument, an integer,
 * this controls the expected output length.
 * @author Douglas W. Jones
 * @version Sept. 18, 2020
 */
class Fuzz {

    public static void main( String arg[] ) {
        Random rand = new Random();
        int n = 0; // controls length of output
        if (arg.length != 1) {
            System.err.println( "argument required -- length of output file" );
            System.exit( 1 );
        } else try {
            n = Integer.parseInt( arg[0] );
        } catch (NumberFormatException e) {
            System.err.println( "non numeric argument -- length of output" );
            System.exit( 1 );
        }
        if (n < 1) { // rand.nextInt(n) throws an exception unless n is positive
            System.err.println( "argument must be positive -- length of output" );
            System.exit( 1 );
        }
        // each iteration stops with probability 1/n, so expect about n characters
        while (rand.nextInt( n ) > 0) {
            System.out.print( (char)rand.nextInt( 128 ) );
        }
        System.out.print( '\n' );
    }
}

Fuzz testing is an example of black-box testing. Black-box testing works from the outside. The designers of black-box tests work from the external specifications for the behavior of the system, trying out everything the system is documented as being able to do, and trying variations in order to see that the system behaves appropriately when the inputs are outside the specifications. Black-box testing does not rely on any knowledge of the code or internal details of how the system is actually built.

In the case of fuzz testing, the test designers don't even need to read the specifications very closely, since the goal is to simply throw junk at the program to see if it breaks. Black-box testing is unlikely to find backdoors or other undocumented features of software, and is therefore not a very strong tool for security assessment. However, fuzz testing breaks many real programs, even programs that are sold commercially. Fuzz testing finds surprisingly many very stupid programming errors.

If you wanted to fuzz test the road network simulator, you might run the following shell commands repeatedly:

java Fuzz 100 > test
java RoadNetwork test

The trouble with this is that the fuzz is too fuzzy. You'd have to run the test thousands of times just to have a chance that the test would begin with the word intersection, so the test wouldn't be very thorough. On the other hand, if your program does throw an exception on fuzz test input, it is seriously defective. We could use fuzz testing where a file begins sensibly and then ends with fuzz. For example, consider this:

echo "intersection " > test
java Fuzz 10 >> test
java RoadNetwork test

Or this:

echo "intersection a" > test
echo "intersection b" >> test
java Fuzz 10 >> test
java RoadNetwork test

This allows us to construct a series of tests that add selective fuzz at various places in the input. Still, fuzz testing is too fuzzy. You really don't know how thorough the test was. Nonetheless, fuzz testing can be easily automated to run thousands of trials, ignoring normal output from the program and paying attention only to outputs that indicate unhandled exceptions or other fairly easy to detect failures.
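Automating thousands of trials might look something like the following sketch, which runs the fuzz inside a single Java program instead of through the shell. Everything here is an assumption made for illustration: the class FuzzHarness, its methods, and the deliberately fragile fragileParser() that stands in for the program under test. The harness ignores normal output and counts only trials that end in an unhandled exception. It cannot detect infinite loops.

```java
import java.util.Random;
import java.util.Scanner;
import java.util.function.Consumer;

class FuzzHarness {
    /** Make a random ASCII string; as in Fuzz, n controls the expected length */
    static String fuzz( Random rand, int n ) {
        StringBuilder sb = new StringBuilder();
        while (rand.nextInt( n ) > 0) {
            sb.append( (char)rand.nextInt( 128 ) );
        }
        return sb.toString();
    }

    /** Feed prefix plus fuzz to target, trials times;
     *  return the number of trials that threw an unhandled exception
     */
    static int run( Consumer<String> target, String prefix, int trials, long seed ) {
        Random rand = new Random( seed ); // fixed seed makes failures repeatable
        int failures = 0;
        for (int i = 0; i < trials; i++) {
            String input = prefix + fuzz( rand, 10 );
            try {
                target.accept( input );
            } catch (RuntimeException e) {
                failures = failures + 1;
                System.err.println( "FAILED on trial " + i + ": " + e );
            }
        }
        return failures;
    }

    /** Hypothetical program under test; it wrongly assumes that
     *  a name follows every intersection keyword
     */
    static void fragileParser( String input ) {
        Scanner sc = new Scanner( input );
        while (sc.hasNext()) {
            if (sc.next().equals( "intersection" )) {
                sc.next(); // throws NoSuchElementException if nothing follows
            }
        }
    }

    public static void main( String[] args ) {
        int failures = run( FuzzHarness::fragileParser, "intersection ", 1000, 1L );
        System.out.println( failures + " of 1000 trials failed" );
    }
}
```

Using a fixed seed is a design choice: when a trial fails, rerunning with the same seed reproduces the same input, which makes the failure much easier to debug.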

Path Testing

One common way to design a systematic test is called path testing. To design a path test, you read the code and try to provide test inputs that force the program to execute every possible alternative path through the code.

Path testing is an example of white-box testing (which really ought to be called transparent-box testing). White-box testing is the opposite of black-box testing. White-box testing uses access to the internal structure of the system in order to design tests. In effect, the test designer looks at the mechanism and then creates tests to assure that the mechanism actually does what it seems intended to do. White-box testing can find undocumented features of a system, but it is more labor intensive than black-box testing.

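Looking at just the main program we have been working on, there are 4 paths through the code, so our path test will have at least 4 test cases: no file name argument, more than one argument, a file name that cannot be opened, and a file that can be opened and read.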

Here is a transcript of a successful path-testing session that tests 3 of the above cases:

[HawkID@serv15 ~/project]$ java RoadNetwork
Missing filename argument
[HawkID@serv15 ~/project]$ java RoadNetwork a b
Unexpected extra arguments
[HawkID@serv15 ~/project]$ java RoadNetwork nothing
Can't open file 'nothing'

Two of the above test cases were completely tested, but we did a bad job on the final case because we only tested for nonexistent files, not unreadable files. We really ought to test for files that exist but cannot be read. We can do this as follows:

[HawkID@serv15 ~/project]$ echo "nonsense" > testfile
[HawkID@serv15 ~/project]$ chmod -r testfile
[HawkID@serv15 ~/project]$ java RoadNetwork testfile
Can't open file 'testfile'
[HawkID@serv15 ~/project]$ rm testfile

This series of Unix/Linux shell commands creates a file called testfile containing some nonsense, and then changes the access rights to that file so it is not readable before passing it to the program under test. Finally, after testing the program, the final shell command deletes the test file.

The final test we are missing above is one to test how the main program handles a file that contains a valid description of a road network. Reading the code for readNetwork(), the outer loop is a while loop, that terminates when there is no next token in the input file. As a result, an empty input file is a valid (but trivial) road network, so we can finish the path testing of our main program hoping to pass a test something like this:

[HawkID@serv15 ~/project]$ echo "" > testfile
[HawkID@serv15 ~/project]$ java RoadNetwork testfile
[HawkID@serv15 ~/project]$ rm testfile
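To see why these tests cover every path, it helps to look at the shape of the code being tested. The following is a hypothetical sketch, not the actual RoadNetwork main program: the class MainPaths and its run() method are invented here, and run() returns its message instead of printing it so that each path is easy to check. Only the message texts are taken from the transcripts above.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class MainPaths {
    /** Hypothetical stand-in for the main program;
     *  returns the error message, or an empty string on success
     */
    static String run( String[] args ) {
        if (args.length < 1) {
            return "Missing filename argument";         // path 1
        } else if (args.length > 1) {
            return "Unexpected extra arguments";        // path 2
        } else try {
            Scanner in = new Scanner( new File( args[0] ) );
            while (in.hasNext()) in.next();             // skim all the tokens
            in.close();
            return "";                                  // path 4
        } catch (FileNotFoundException e) {
            // raised for both nonexistent files and unreadable files
            return "Can't open file '" + args[0] + "'"; // path 3
        }
    }

    public static void main( String[] args ) {
        System.out.println( run( new String[] {} ) );
        System.out.println( run( new String[] { "a", "b" } ) );
        System.out.println( run( new String[] { "nothing" } ) );
    }
}
```

Each return statement sits at the end of one path, so a path test needs at least one input that reaches each of them.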

Test Scripts

Before we move onward, note that it is prudent to re-run all of the tests after any change to the source program. Doing this by hand gets tedious very quickly, so it is tempting to just test the things you think you changed, while hoping that everything else still works. Any experienced programmer knows that fixing one bug frequently creates another, so this is a foolish testing philosophy.

It is far better to create a test script that can be run again after each change to the program. Consider a test file something like this:

#!/bin/sh
# testroads -- test script for road network
java RoadNetwork
java RoadNetwork a b
java RoadNetwork nothing
echo "nonsense" > testfile
chmod -r testfile
java RoadNetwork testfile
rm testfile
echo "" > testfile
java RoadNetwork testfile
rm testfile

The first line of this test script, #!/bin/sh, tells the system what shell to use to run your file. You have a choice of shells for running a shell script. If you make the file executable using the chmod +x testroads command, the first line lets you run it by just typing the ./testroads shell command.

The second line is a comment giving the file name the text is intended to be in. It would make sense to add comments taking credit for the file, noting the creation date, and other details. The remaining lines are the test. After you create this file, you can run it like this:

[HawkID@serv15 ~/project]$ sh < testroads
Missing filename argument
Unexpected extra arguments
Can't open file 'nothing'
Can't open file 'testfile'

Once you have a test script, you face the problem of maintaining the script. Some changes to the source code will require corresponding changes to the script. One feature of the extreme programming methodology is the idea that each development step begins with a change to the specifications, usually an enhancement, and then test development before coding. When the change to the specifications alters some part of the input syntax, tests for that part of the syntax must be updated.

The tests suggested above produce quite a jumble of output, and for that matter, the test file itself is a jumble. At the very least, we need to document the output we expect, and the output of the test script should really document the tests being performed and describe how to recognize whether the test was passed or failed. Here's a better test script:

#!/bin/sh
# testroads -- test script for road network

echo "GROUP of path tests for main program"
echo "TEST missing file name argument"
java RoadNetwork

echo "TEST unexpected extra arguments"
java RoadNetwork a b

echo "TEST can't open nonexistent file"
java RoadNetwork nothing

echo "TEST can't open unreadable file"
echo "nonsense" > testfile
chmod -r testfile
java RoadNetwork testfile
rm testfile

echo "TEST reading from an empty file"
echo "" > testfile
java RoadNetwork testfile
rm testfile

The echo shell command simply outputs its command line arguments to standard output, so most of the echo commands above serve both as comments in the source file and to put comments into the output of the tests.

Note that several of the later tests in this test script create temporary test files by using echo to put text into a test file. Each of these tests ends by removing the test file. We could just as easily create a suite of permanent test files, and for larger tests, this would make good sense.
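The script above still leaves it to a human reader to compare the actual output with the expected output. One way to go further is to let the test program do the comparison itself. The following Java sketch shows the idea under stated assumptions: the class TestScript, its methods, and fakeMain() are all invented for illustration, and fakeMain() is only a stand-in that prints the messages seen in the transcripts to standard output.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class TestScript {
    /** Run body and return everything it printed to System.out, trimmed */
    static String captureOut( Runnable body ) {
        PrintStream saved = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        System.setOut( new PrintStream( buffer, true ) );
        try {
            body.run();
        } finally {
            System.setOut( saved ); // always restore the real output
        }
        return buffer.toString().trim();
    }

    /** Run one test, report PASS or FAIL, and return true on PASS */
    static boolean check( String name, String expected, Runnable body ) {
        String actual = captureOut( body );
        boolean ok = expected.equals( actual );
        System.out.println( "TEST " + name + ": " + (ok ? "PASS" : "FAIL") );
        if (!ok) {
            System.out.println( "    expected: " + expected );
            System.out.println( "    actual:   " + actual );
        }
        return ok;
    }

    /** Hypothetical stand-in for the argument checking in the main program */
    static void fakeMain( String[] args ) {
        if (args.length < 1) {
            System.out.println( "Missing filename argument" );
        } else if (args.length > 1) {
            System.out.println( "Unexpected extra arguments" );
        }
    }

    public static void main( String[] args ) {
        System.out.println( "GROUP of path tests for main program" );
        check( "missing file name argument", "Missing filename argument",
            () -> fakeMain( new String[] {} ) );
        check( "unexpected extra arguments", "Unexpected extra arguments",
            () -> fakeMain( new String[] { "a", "b" } ) );
    }
}
```

With this approach, the expected output is documented in the test itself, and a failed test announces itself instead of hiding in a jumble of output.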

Running the test

Unfortunately, our road network program fails the first interesting test, the final test in our script with an empty source file. We get output something like this:

[HawkID@serv15 ~/project]$ java RoadNetwork roads
Exception in thread "main" java.lang.NullPointerException
    at RoadNetwork.printNetwork(RoadNetwork.java:149)
    at RoadNetwork.main(RoadNetwork.java:172)

What went wrong here? The above error message says that line 149 in our program tried to use a null pointer, and that this was called from line 172 of the program. On opening up the program in an editor and looking at these lines in our source code, it turns out that line 172 is the call to printNetwork(), so we have definitely finished our path coverage of the main program.

Line 149 is this:

        for (Intersection i:inters) {

Here, we tried to pick an intersection out of a list of intersections, but there was no list! The list inters was null, a condition quite distinct from being an empty list. In Java, the statements "there is no list" and "the list is empty" are not equivalent.
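The difference between a null list and an empty list is easy to demonstrate in isolation. In this small sketch, the class NullVersusEmpty and its count() method are invented for illustration; the loop inside count() has the same shape as the failing loop on line 149.

```java
import java.util.LinkedList;

class NullVersusEmpty {
    /** count the items in a list using a for-each loop */
    static int count( LinkedList<String> list ) {
        int n = 0;
        for (String s : list) n = n + 1;
        return n;
    }

    public static void main( String[] args ) {
        LinkedList<String> empty = new LinkedList<String>();
        LinkedList<String> none = null;

        System.out.println( count( empty ) ); // 0, the loop body never runs
        try {
            System.out.println( count( none ) );
        } catch (NullPointerException e) {
            // the for-each loop needs a list object to iterate over
            System.out.println( "there is no list" );
        }
    }
}
```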

Why was inters null? The call to initializeNetwork() in the main program was supposed to build roads and inters, but since the input file was empty, it did nothing, leaving these lists as they were initially. The initial values of these lists were determined by their declarations:

    /* the sets of all roads and all intersections */
    static LinkedList <Road> roads;
    static LinkedList <Intersection> inters;

These declarations are, as it turns out, wrong. The default value of any field of a class type in Java (as opposed to a built-in type like int) is null, and that is the source of our null-pointer exception. What we need to do is initialize these two lists to empty lists, not null pointers.

    /* the sets of all roads and all intersections */
    static LinkedList <Road> roads
        = new LinkedList <Road> ();
    static LinkedList <Intersection> inters
        = new LinkedList <Intersection> ();

A question of format: Why wrap both lines when only the second line was too long to fit in an 80 column display? We could have written this:

    /* the sets of all roads and all intersections */
    static LinkedList <Road> roads = new LinkedList <Road> ();
    static LinkedList <Intersection> inters
        = new LinkedList <Intersection> ();

The problem is, these two lines of code are parallel constructions. If we write them so that they wrap identically, the fact that they are parallel is easy to see. If, on the other hand, we wrap them differently or, worse, let the text editor wrap them randomly, the reader has to actually read and understand the text to see the parallel. Attention to this kind of detail makes programs much easier to read.