8. Some Debugging

Part of CS:2820 Object Oriented Software Development Notes, Fall 2017
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

 

Where were we?

The code we produced in the last class was ready to try out, so it's time to start testing. Here's the main program:

    public static void main(String[] args) {
        if (args.length < 1) {
            Errors.fatal( "Missing filename argument" );
        } else if (args.length > 1) {
            Errors.fatal( "Unexpected extra arguments" );
        } else try {
            readNetwork( new Scanner( new File( args[0] ) ) );
        } catch (FileNotFoundException e) {
            Errors.fatal( "Can't open file '" + args[0] + "'" );
        }
    }

Testing

How do you go about testing a program? There are numerous approaches to this problem ranging from using random input to carefully reading the code and then designing tests based on that reading.

Random input is actually useful enough that it has a name: Fuzz testing. In an early study of the quality of various implementations of Unix and Linux, fuzz testing forced a remarkable number of standard utility programs into failure, and showed that reimplementing a utility from the specifications in its manuals generally tended to produce a more solid result than the original code that had evolved from some cruder starting point, sometimes decades ago. If a program reads a text file, a fuzz test will simply fill a text file with random text for testing purposes. Programs that respond to mouse clicks can be fuzz tested with random sequences of mouse clicks. A program is deemed to fail a fuzz test if it goes into an infinite loop or fails with an unhandled exception. A program passes if it responds with appropriate outputs and error messages.
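To make the idea concrete, here is a minimal fuzz-test driver sketched in Java. The FuzzDemo class and its parse() stand-in are inventions for illustration only; a real fuzz test would aim the random text at the actual program under test, for example by writing it to a file and running RoadNetwork on that file:

```java
import java.util.Random;
import java.util.Scanner;

// A minimal fuzz-test driver (a sketch; class and method names are
// illustrative, not part of the road-network code).  It feeds random
// text to a parser and reports failure only if an exception escapes.
public class FuzzDemo {
    static final String CHARS = "abz 019\n\t;-";

    // Stand-in for the code under test: tokenize, reading ints where
    // possible, arbitrary tokens otherwise.
    static void parse(String input) {
        Scanner sc = new Scanner( input );
        while (sc.hasNext()) {
            if (sc.hasNextInt()) {
                sc.nextInt();
            } else {
                sc.next();
            }
        }
    }

    public static void main(String[] args) {
        Random rand = new Random( 42 ); // fixed seed, repeatable runs
        for (int trial = 0; trial < 1000; trial++) {
            // build a random string of up to 40 characters
            StringBuilder sb = new StringBuilder();
            int len = rand.nextInt( 40 );
            for (int i = 0; i < len; i++) {
                sb.append( CHARS.charAt( rand.nextInt( CHARS.length() ) ) );
            }
            try {
                parse( sb.toString() );
            } catch (Exception e) {
                System.out.println( "FAIL on input: " + sb );
                return;
            }
        }
        System.out.println( "1000 random inputs survived" );
    }
}
```

A program "passes" this kind of test merely by surviving; judging whether its outputs were appropriate still takes more careful test design.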

Fuzz testing is an example of black-box testing. Black-box testing works from the outside. The designers of black-box tests work from the external specifications for the behavior of the system, trying out everything the system is documented as being able to do, and trying variations in order to see that the system behaves appropriately when the inputs are outside the specifications. In the case of fuzz testing, the test designers don't even need to read the specifications very closely, since the goal is simply to throw junk at the program to see if it breaks. Black-box testing is unlikely to find backdoors or other undocumented features of software, and is therefore not a very strong tool for security assessment.

One common way to design a test is called path testing. To design a path test, you read the code and try to provide test inputs that force the program to execute every possible alternative path through the code.

Path testing is an example of white-box testing (which really ought to be called transparent-box testing). White-box testing is the opposite of black-box testing. White-box testing uses access to the internal structure of the system in order to design tests. In effect, the test designer looks at the mechanism and then creates tests to assure that the mechanism actually does what it seems intended to do. White-box testing can find undocumented features of a system, but it is more labor intensive than black-box testing.

Path Testing the Main Program

Looking at just the main program above, we see that there are 4 paths through the code, so our path test will have at least 4 test cases: a missing file name argument, unexpected extra arguments, a file that cannot be opened, and a file that can be read.

Here is a transcript of a path-testing session that tests 3 of the above cases:

[HawkID@serv15 ~/project]$ java RoadNetwork
Missing filename argument
[HawkID@serv15 ~/project]$ java RoadNetwork a b
Unexpected extra arguments
[HawkID@serv15 ~/project]$ java RoadNetwork nothing
Can't open file 'nothing'

Two of the above test cases were completely tested, but we did a bad job on the final case because we only tested for nonexistent files, not unreadable files. We really ought to test for files that exist but cannot be read. We can do this as follows:

[HawkID@serv15 ~/project]$ echo "nonsense" > testfile
[HawkID@serv15 ~/project]$ chmod -r testfile
[HawkID@serv15 ~/project]$ java RoadNetwork testfile
Can't open file 'testfile'
[HawkID@serv15 ~/project]$ rm testfile

This series of Unix/Linux shell commands creates a file called testfile containing some nonsense, and then changes the access rights to that file so it is not readable before passing it to the program under test. Finally, after testing the program, the final shell command deletes the test file.

The final test we are missing above is one to test how the main program handles a file that contains a valid description of a road network. Reading the code for readNetwork(), the outer loop is a while loop that terminates when there is no next token in the input file. As a result, an empty input file is a valid (but trivial) road network, so we can finish the path testing of our main program with a test something like this:

[HawkID@serv15 ~/project]$ echo "" > testfile
[HawkID@serv15 ~/project]$ java RoadNetwork testfile
[HawkID@serv15 ~/project]$ rm testfile

Test Scripts

Before we move onward, note that it is prudent to re-run all of the tests after any change to the source program. Doing this by hand gets tedious very quickly, so it is tempting to just test the things you think you changed, while hoping that everything else still works. Any experienced programmer knows that fixing one bug frequently creates another, so this is a foolish testing philosophy.

It is far better to create a test script that can be run again after each change to the program. Consider a test file something like this:

#!/bin/sh
# testroads -- test script for road network
java RoadNetwork
java RoadNetwork a b
java RoadNetwork nothing
echo "nonsense" > testfile
chmod -r testfile
java RoadNetwork testfile
rm testfile
echo "" > testfile
java RoadNetwork testfile
rm testfile

The first line of this test script, #!/bin/sh, tells the system what shell to use to run your script. If you make the file executable using the shell command chmod +x testroads, this first line lets you run the script by just typing the ./testroads shell command.

The second line is a comment giving the name of the file the script is intended to be stored in. It would make sense to add comments taking credit for the file, noting the creation date, and other details. The remaining lines are the tests themselves. After you create this file, you can run it like this:

[HawkID@serv15 ~/project]$ sh < testroads
Missing filename argument
Unexpected extra arguments
Can't open file 'nothing'
Can't open file 'testfile'

That's quite a jumble of output, and for that matter, the test file itself is a jumble. We need to document the output we expect, and the output of the test script should, at the very least, document the tests being performed and describe how to recognize whether the test was passed or failed. Here's a better test script:

#!/bin/sh
# testroads -- test script for road network

echo "GROUP of path tests for main program"
echo "TEST missing file name argument"
java RoadNetwork

echo "TEST unexpected extra arguments"
java RoadNetwork a b

echo "TEST can't open nonexistent file"
java RoadNetwork nothing

echo "TEST can't open unreadable file"
echo "nonsense" > testfile
chmod -r testfile
java RoadNetwork testfile
rm testfile

echo "TEST reading from an empty file"
echo "" > testfile
java RoadNetwork testfile
rm testfile

The echo shell command simply outputs its command line arguments to standard output, so most of the echo commands above serve both as comments in the source file and to put comments into the output of the tests.

Note that several of the later tests in this test script create temporary test files by using echo to put text into a test file. Each of these tests ends by removing the test file. We could just as easily create a suite of permanent test files, and for larger tests, this would make good sense.

Running the test

Unfortunately, our road network program fails the first interesting test: the final test in our script, the one with an empty source file. We get output something like this:

[HawkID@serv15 ~/project]$ java RoadNetwork testfile
Exception in thread "main" java.lang.NullPointerException
    at RoadNetwork.printNetwork(RoadNetwork.java:149)
    at RoadNetwork.main(RoadNetwork.java:172)

What went wrong here? The above error message says that line 149 in our program tried to use a null pointer, and that this was called from line 172 of the program. On opening up the program in an editor and looking at these lines in our source code, it turns out that line 172 is the call to printNetwork(), so we have definitely finished our path coverage of the main program.

Line 149 is this:

        for (Intersection i:inters) {

Here, we tried to pick an intersection out of a list of intersections, but there was no list! The list inters was null, a condition quite distinct from being an empty list. In Java, the statements "there is no list" and "the list is empty" are not equivalent.
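The distinction is easy to demonstrate. This self-contained sketch (the class name NullVsEmpty is invented for this example) shows that a for-each loop over an empty list simply does nothing, while the same loop over a null reference throws a NullPointerException:

```java
import java.util.LinkedList;

// Demonstrates that a null reference and an empty list behave very
// differently when iterated over.
public class NullVsEmpty {
    public static void main(String[] args) {
        LinkedList <String> empty = new LinkedList <String> ();
        for (String s: empty) {
            System.out.println( s ); // never runs, but perfectly legal
        }

        LinkedList <String> missing = null;
        try {
            for (String s: missing) {
                System.out.println( s );
            }
        } catch (NullPointerException e) {
            System.out.println( "iterating over null throws NPE" );
        }
    }
}
```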

Why was inters null? The call to readNetwork() in the main program was supposed to build roads and inters, but since the input file was empty, it did nothing, leaving these lists as they were initially. The initial values of these lists were determined by their declarations:

    /* the sets of all roads and all intersections */
    static LinkedList <Road> roads;
    static LinkedList <Intersection> inters;

These declarations are, as it turns out, wrong. The default value of any variable of object type in Java (as opposed to built-in types like int) is null, and that is the source of our null-pointer exception. What we need to do is initialize these two lists to empty lists, not null pointers.

    /* the sets of all roads and all intersections */
    static LinkedList <Road> roads
        = new LinkedList <Road> ();
    static LinkedList <Intersection> inters
        = new LinkedList <Intersection> ();

A question of format: Why wrap both lines when only the second line was too long to fit in an 80 column display? We could have written this:

    /* the sets of all roads and all intersections */
    static LinkedList <Road> roads = new LinkedList <Road> ();
    static LinkedList <Intersection> inters
        = new LinkedList <Intersection> ();

The problem is, these two lines of code are parallel constructions. If we write them so that they wrap identically, the fact that they are parallel is easy to see. If, on the other hand, we wrap them differently or, worse, let the text editor wrap them arbitrarily, you have to actually read and understand the text to see the parallelism. Attention to this kind of detail makes programs much easier to read.

Path Testing the Input Parser

With this fix, the code compiles and we get the expected result: no output, because the input file was empty. So, we can begin testing. As with the main program, we begin with something very simple, a one-line data file that defines just one intersection. For now, let's try these files without worrying about scripts and automation:

intersection A

The program output is identical to the input, so if it works, we can build on this, adding more intersections and roads, working up to something like this:

intersection A
intersection B
road A B 10
road B A 20

This is not particularly interesting unless we uncover some bugs. The next step is to start making some errors. Consider this input file:

intersection A
intersection B
intersection A
road A B 10
road B A 20

Here, we've deliberately inserted a duplicate intersection definition. When we run the program over this input (stored in the file roads), we get this output:

[HawkID@serv15 ~/project]$ java RoadNetwork roads
Intersection A redefined.
Intersection A
Intersection B
Intersection A
Road A B 10
Road B A 20

This is correct, in as far as it goes, but the output is not very readable. The problem is, the error message is not cleanly distinguished from the output. Our current version of the errors package is at fault, with code something like this:

class Errors {
    static void fatal( String message ) {
        System.err.println( message );
        System.exit( 1 );
    }
    static void warning( String message ) {
        System.err.println( message );
    }
}

What we need is simple, a standard prefix on each error message that distinguishes it from the normal output of the program. Consider this:

class Errors {
    static void fatal( String message ) {
        System.err.println( "Fatal error: " + message );
        System.exit( 1 );
    }
    static void warning( String message ) {
        System.err.println( "Error: " + message );
    }
}

Aside: Standard Error versus Standard Output

In our program, we have output error messages to System.err and normal data output to System.out

By default, when running under the Unix/Linux shell (and under the DOS command line under Windows), output to System.err is mixed in with output to System.out, but they can be separated. Here is a Unix/Linux example to illustrate this:

[HawkID@serv15 ~/project]$ java RoadNetwork roads > t
Intersection A redefined.
[HawkID@serv15 ~/project]$ cat t
Intersection A
Intersection B
Intersection A
Road A B 10
Road B A 20
[HawkID@serv15 ~/project]$ rm t

The added > t at the end of the command running our program diverts System.out (or rather, the Linux/Unix standard output stream) to the file named t. So, when our program runs, the only thing we see on the screen is the error message. Then, we use the command cat t to dump the file t to the screen. We could just as easily have used any text editor to examine the file, and finally, although nothing required us to do so, we deleted the file with the rm command.

Under some Unix/Linux command shells, it is almost as easy to divert standard error (System.err) to a file, but the designers of the Unix shell initially assumed that users always wanted to see the error messages immediately, while they might want to save other output. As a result, shell tools for redirecting standard error are afterthoughts and differ from one shell to the next.

The two most common families of Unix/Linux shells are sh (the Bourne shell) and its open-source replacement bash (the Bourne-again shell), on the one hand, and csh (the C shell) and its open-source replacement tcsh (the TENEX-inspired rewrite of csh). To find out what shell you are using, type echo $SHELL. This will output the file name from which your current shell is being executed.

In sh and bash, typing >f after a shell command redirects standard output to a file named f while leaving standard error directed to the terminal. In contrast, typing 2>f redirects standard error and leaves standard output unchanged. This strange use of the numeral 2 is based on the fact that, in Unix and Linux, all open files are numbered, and by default, file 0 is standard input, file 1 is standard output, and file 2 is standard error. This is a really odd design, but it works. If you want to redirect standard output and standard error to different files, you can write >f 2>g.

In csh and tcsh, typing >f after a shell command works as it did in sh. In csh, typing >&f after a shell command redirects both standard output and standard error to the same file. If you want to split the two into different files, you can use >f >&g. This works because the first redirection takes only standard output, so all that is left for the second redirection is standard error. In effect, the >& really means to take standard output, standard error or both, whichever has not already been redirected.

In all of these shells, typing >f after a shell command will overwrite the contents of file f if that file already exists. In contrast, typing >>f after the command will append that command's output to the existing file.

Path Testing Continued

Another obvious error to explore occurs when a road is defined in terms of an undefined intersection. Consider this input file:

intersection A
intersection B
intersection A
road A B 10
road B A 20
road A C 2000

When we run this, we get the expected error messages, but when the program tries to output the road to the undefined intersection C, we get a null pointer exception.

What is the problem? There are some bug notices in our code that are closely related to this. Specifically, in the initializer for Road, when we output the warning about an undefined intersection, we wrote this:

        if (destination == null) {
            Errors.warning(
                "In road " + sourceName + " " + dstName +
                ", Intersection " + dstName + " undefined"
            );
            // Bug:  Should we prevent creation of this object?
        }

We did not prevent creation of the object when the declaration of that object contained an undefined destination intersection name. Instead, we left the object with a null destination field. This caused no problem until later, when we tried to output the road description using the toString() method:

    public String toString() {
        return (
            "Road " +
            source.name + " " +
            destination.name + " " +
            travelTime
        );
    }

In this code, we blindly reached for the name fields of the source and destination intersections without checking to see if they exist. We need to add this check. Perhaps the ugliest but most compact way to do so is to use Java's embedded conditional operator:

    public String toString() {
        return (
            "Road " +
            (source != null ? source.name : "---" ) +
            " " +
            (destination != null ? destination.name : "---" ) +
            " " +
            travelTime
        );
    }

This code works, substituting --- for any names that were undefined in the input file, but it is maddeningly difficult to format this code so that it is easy to read. C, C++ and Java all share the same basic syntax for the conditional operator (a?b:c), and some critics consider this operator to be so unreadable that they advise never using it. It might be better to add a private method that is easier to read and packages the safe return of either the name or dashes if there is no name. We'll worry about this later.
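As a sketch of that later cleanup, one possibility is a private helper method that packages the null check. The helper name safeName is an invention for this example, and the stripped-down Road and Intersection classes below are stand-ins for the project's real classes, not the actual code:

```java
// Stand-in for the project's Intersection class, reduced to its name.
class Intersection {
    String name;
    Intersection(String name) { this.name = name; }
}

// Stand-in for the project's Road class, showing one way to package
// the "name or dashes" logic in a small, readable helper method.
class Road {
    Intersection source;        // may be null after a bad input line
    Intersection destination;   // may be null after a bad input line
    float travelTime;

    // return the intersection's name, or dashes if there is none
    private static String safeName(Intersection i) {
        return (i != null) ? i.name : "---";
    }

    public String toString() {
        return "Road " + safeName( source ) + " "
                       + safeName( destination ) + " " + travelTime;
    }

    public static void main(String[] args) {
        Road r = new Road();
        r.source = new Intersection( "A" );
        r.destination = null;   // as if the destination was undefined
        r.travelTime = 2000;
        System.out.println( r );
    }
}
```

With the helper in place, toString() reads almost like prose, and the conditional operator is confined to one short, easily checked line.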

Test Frameworks

Testing a large program typically produces a number of tests. During program development, it is a good idea, after each change to the program, to re-run all of the tests in order to make sure that the change did not cause any damage. Doing the tests by hand can be time consuming, so it is a good idea to create a test framework and automate things.

First, it makes sense to keep all of the test files in their own directory.

Second, as we've already suggested, it makes sense to write a shell script to automate the testing. Following the model used in our first script example, we might add tests that look something like this:

echo "GROUP of path tests for readNetwork"
echo "TEST reading from an empty file"
echo "" > testfile
java RoadNetwork testfile
rm testfile

echo "TEST reading from a non empty file"
echo "intersection A" > testfile
echo "intersection B" >> testfile
echo "road A B 10" >> testfile
echo "road B A 10" >> testfile
java RoadNetwork testfile
rm testfile

echo "TEST error when not a command"
cat > testfile << '--end--'
intersection A
road A A
error A
--end--
java RoadNetwork testfile
rm testfile

The above illustrates two ways of creating multiple-line test files. The first is to echo each line of the test into the test file separately, while the second uses a special form of input redirection (a "here document") that takes all input after the command up to the indicated termination string.

The problem with this is that test output gets long and it is up to the person running the test to scroll through all the output and see if it makes sense. It would be better to have the test script pause after each test and tell the user what output to expect in cases where the test title didn't explain it:

echo "GROUP of path tests for readNetwork"
echo "TEST reading from an empty file"
echo "" > testfile
java RoadNetwork testfile
rm testfile
echo "--- The above should produce no output"
read -p "--- Press enter to continue"

echo "TEST reading from a non empty file"
echo "intersection A" > testfile
echo "intersection B" >> testfile
echo "road A B 10" >> testfile
echo "road B A 10" >> testfile
java RoadNetwork
echo "--- The above should output something equivalent to this:"
cat testfile
rm testfile
read -p "--- Press enter to continue"

echo "TEST error when not a command"
cat > testfile << '--end--'
intersection A
road A A
error A
--end--
java RoadNetwork
echo "--- The above should complain: 'error' is not a road or intersection"
rm testfile
read -p "--- Press enter to continue"

The above test framework requires human evaluation of the output of each test. There are tools we can use to automate this. The most important of these tools is the diff shell command. This command compares two source files and outputs all of the differences between the files. If the files are identical, it exits with a success code, while if they are different, it exits with a failure code. This allows you to write a shell script that runs flat out except when tests fail, and only then halts to call attention to the failure.

For the next example, we suppose that the directory testfiles contains test data files and files of the expected output for each test. We could rewrite the first test in the above test script as follows:

echo "TEST reading from an empty file"
java RoadNetwork testfiles/emptyfile > output 2> errors
if ! diff output testfiles/emptyfile
        then read -p "FAILURE, wrong output, press enter."
fi
if ! diff errors testfiles/emptyfile
        then read -p "FAILURE, wrong errors, press enter."
fi
rm output errors

Note that the above script is written for sh or bash; the syntax of conditionals in csh and tcsh is different. The convention for indenting shell scripts is to use a single tab for each indenting level. It shouldn't be difficult to figure out the above. The if ! construct executes the command that follows it on the same line and arranges that the code following the next then will execute only if the tested command fails. You can read this as "if there are differences between the two files, then prompt with a message starting with FAILURE and await input."

The strange command fi ends the if-then-else block, since fi is if spelled backwards. The designer of sh, Stephen Bourne, copied this idea from Algol 68, an innovative language that also ended do blocks with od and case blocks with esac. People joked that Algol 68 comments beginning with comment should have ended with tnemmoc but the language designers relented and made them end with a semicolon.

At this point, it should be clear that developing test scripts is itself a programming job. This was recognized as long ago as the 1960s: Fred Brooks, drawing on the development of OS/360, suggested in his 1975 book The Mythical Man-Month (a classic in the field of software engineering, still in print) that in a software development team, having someone specialize in building tools such as test frameworks makes very good sense, as does having someone specialize in testing and someone else specialize in documentation.

A suggestion: As each new test is developed, do the testing by hand first. Once you are satisfied that the code passes the test, redirect the output to capture the expected output into a file, and then add that test to the script.