5. Starting to Build a Model

Part of CS:2820 Object Oriented Software Development Notes, Fall 2020
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

 

Back to the road-network example

In earlier lectures, we suggested that a model of a road network would have classes for roads and intersections, with each linking to the other to describe the topology of the network. The resulting code might look like this:

import java.util.LinkedList;

/** Roads are one-way streets linking intersections
 *  @see Intersection
 */
class Road {
    float travelTime;         //measured in seconds
    Intersection destination; //where the road goes
    Intersection source;      //where the road comes from
    // textual name of road is source-destination
}

/** Intersections join roads
 *  @see Road
 */
class Intersection {
    String name;
    LinkedList <Road> outgoing = new LinkedList <Road> ();
    LinkedList <Road> incoming = new LinkedList <Road> ();
    // Bug: deal with type of intersections: uncontrolled, stoplight, etc
}

It's time to start working on initializing a road network. In theory we could build the road-network description in many ways. For example, we could build a graphical user interface where users could point and click to drag and drop intersections and roads into place.

GUIs are marvelously fun to use, interesting to design, and worthy of an entire course, but this is not that course, so we will pursue a simpler approach, reading the description of a road network from a text file. Consider, for example, a file structured as follows:

intersection ...
intersection ...
intersection ...
road ...
road ...
road ...
road ...
road ...

Each line of the file begins with a keyword, either road or intersection, followed by the attributes of that road or intersection. We will expand on this description of the text file after we make some progress toward reading it.

Access to the text file

Java provides a very useful class for reading text files, the scanner. To quote the official definition of this class: "A Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace. The resulting tokens may then be converted into values of different types using the various next methods." The setup for calling a scanner is as follows:

import java.io.File;
import java.util.Scanner;

/** RoadNetwork, the main class to build a network of roads and intersections.
 *  @see Road
 *  @see Intersection
 */
public class RoadNetwork {
    public static void main(String[] args) {
        // Bug:  Need code to see if there is a file name
        Scanner sc = new Scanner(new File(args[0]));
        // Bug:  What if the file doesn't exist?
    }
}

The main class of the program should be declared as a public class, and the name of a public class must match the file name. The main method must be declared as a public static void method with the name main, and it must take an un-dimensioned array of strings as an argument. By convention, this parameter is called args because it holds the arguments that were used to launch the program. The conventions for main are inherited from C++, which inherited them from C, a language that was developed in the early 1970s in the context of the Unix system and is strongly coupled to the Unix command line user interface.

Aside: This mix of required text and traditional text is called boilerplate. The concept of boilerplate text comes from legal documents, where boilerplate text is text that has been tested in court for generations and is therefore known to be resistant to challenge (like armor plate or the plate steel used to make steam boilers). Lawyers copy boilerplate from lawbooks instead of writing creatively in order to avoid the risk that they might overlook some detail that creates text that is vulnerable in court. Any text that you copy from a standard form instead of writing creatively has come to be known as boilerplate.

The above code creates a scanner called sc that reads from a file whose name is given by the first command line argument passed when launching the main program.

Consider running your program with this command:

[HawkID@serv15 ~/project]$ java RoadNetwork IowaCity.txt

This launches RoadNetwork.class, the file created by compiling RoadNetwork.java. Inside the main method of this class, you will find that args[0] has been set to the value "IowaCity.txt", so the call to the initializer

        Scanner sc = new Scanner(new File(args[0]));

is equivalent to

        Scanner sc = new Scanner(new File("IowaCity.txt"));

There are a number of different initializers (constructors, in Java terminology) for class Scanner, but the one we are calling here expects a File object as a parameter, and the initializer we are using for class File takes the file name, as a string, as a parameter.

Of course, the skeletal definition given above has some bugs. The code ought to check that there is a command line argument before using it, and it ought to output a sensible error message if the file does not exist or cannot be read. We need to fix the latter to make this file compile at all.
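Among the other initializers for class Scanner is one that takes a String, which makes it easy to experiment without creating a file. The following is only a sketch; the class name ScannerSources and the sample text are assumptions made for illustration and are not part of the road-network program. It suggests that a scanner delivers the same whitespace-delimited tokens no matter where the text comes from:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerSources {
    /** Collect every whitespace-delimited token from the given text. */
    static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        Scanner sc = new Scanner(text);   // scan a string instead of a file
        while (sc.hasNext()) {
            result.add(sc.next());
        }
        return result;
    }

    public static void main(String[] args) {
        // spaces, tabs and newlines all separate tokens by default
        System.out.println(tokens("intersection a\nroad a\tb 30"));
    }
}
```

Replacing the string with new File(args[0]) would give the file-based scanner used in RoadNetwork, at the cost of having to deal with a missing file.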

Where were we?

Putting everything together, we have the following class definitions for building our road network:

import java.io.File;
import java.util.LinkedList;
import java.util.Scanner;

/** Roads are one-way streets linking intersections
 *  @see Intersection
 */
class Road {
    float travelTime;         //measured in seconds
    Intersection destination; //where the road goes
    Intersection source;      //where the road comes from
    // name of road is source-destination
}

/** Intersections join roads
 *  @see Road
 */
class Intersection {
    String name;
    LinkedList <Road> outgoing = new LinkedList <Road> ();
    LinkedList <Road> incoming = new LinkedList <Road> ();
    // Bug: multiple types of intersections (uncontrolled, stoplight)
}

/** RoadNetwork, the main class to build a network of roads and intersections.
 *  @see Road
 *  @see Intersection
 */
public class RoadNetwork {
    public static void main(String[] args) {
        // Bug:  Must add code to see if there is a file name
        Scanner sc = new Scanner( new File( args[0] ) );
        // Bug:  What if the file doesn't exist?
    }
}

Recall also that we have decided to use a file containing a list of intersections and roads to describe the road network. Ignoring all details of how roads and intersections are described, the file might look something like this:

intersection ...
intersection ...
intersection ...
road ...
road ...
road ...
road ...
road ...

The skeletal code given above contains bugs, but it also contains a potential problem for programmers familiar with C or C++: The first command line argument after the program name is args[0]. This will seem strange to programmers accustomed to C or C++, where argv[0] is the program name and argv[1] is the first parameter after the program name.

Another problem for C or C++ programmers is that there is no count of command line arguments in Java. A C or C++ programmer would use an additional parameter to the main program, argc, to learn the count of the number of arguments. Arrays in C and C++ do not have a length attribute, but in Java, we ask the array args to tell us its length by using args.length.
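A small experiment makes the difference concrete. This is a sketch only; the class name ArgsDemo and the method describe are made up for the illustration. It reports how many arguments Java sees and what each one holds:

```java
class ArgsDemo {
    /** Describe the command line arguments the way Java sees them. */
    static String describe(String[] args) {
        StringBuilder sb = new StringBuilder();
        sb.append(args.length).append(" argument(s)");
        for (int i = 0; i < args.length; i++) {
            sb.append("\nargs[").append(i).append("] = ").append(args[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(args));
    }
}
```

Running this with the command java ArgsDemo IowaCity.txt should report one argument, with args[0] holding IowaCity.txt. The program name does not appear in the array at all.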

Knowing the above, we can fix one bug in the program, but only at the cost of introducing another:

/** RoadNetwork, the main class to build a network of roads and intersections.
 *  @see Road
 *  @see Intersection
 */
public class RoadNetwork {
    public static void main(String[] args) {
        if (args.length != 1) {
            // Bug:  Complain about wrong number of arguments
        } else {
            Scanner sc = new Scanner( new File( args[0] ) );
            // Bug:  What if the file doesn't exist?
        }
    }
}

We'll put off the question of what to do if no input file is specified, but whatever it is, it will be very similar to what we do if the input file is specified but doesn't exist. In that case, the attempt to open the file within the scanner will throw an exception, and Java won't allow us to write code that could throw this kind of exception without either providing a handler or declaring that the method throws it. So, the skeleton of our main program code will look like this:

/** RoadNetwork, the main class to build a network of roads and intersections.
 *  @see Road
 *  @see Intersection
 */
public class RoadNetwork {
    public static void main(String[] args) {
        if (args.length < 1) {
            // Bug:  Complain about a missing argument
        } else try {
            Scanner sc = new Scanner( new File( args[0] ) );
            // Bug:  Now we can process the file here
        } catch (FileNotFoundException e) {
            // Bug:  Complain that the file doesn't exist
        }
    }
}
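The comments marked Bug in this skeleton could be filled in many ways. The following sketch shows one possibility, under several assumptions that are not decisions made in these notes: complaints go to System.err, the wording of the messages is invented, and a helper method named run returns an exit status so that main can pass it to System.exit.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class OpenOrComplain {
    /** Try to open the file named by the first argument.
     *  @return 0 if the file could be opened, 1 after a complaint
     */
    static int run(String[] args) {
        if (args.length < 1) {
            System.err.println("Missing argument: input file name");
            return 1;
        } else try {
            Scanner sc = new Scanner(new File(args[0]));
            // the file would be processed here
            sc.close();
            return 0;
        } catch (FileNotFoundException e) {
            System.err.println("Cannot open input file '" + args[0] + "'");
            return 1;
        }
    }

    public static void main(String[] args) {
        System.exit(run(args)); // nonzero status signals failure
    }
}
```

Writing complaints to System.err instead of System.out keeps error messages separate from any normal output the program produces.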

Processing the text file

What does a scanner do? We can ask the scanner whether there is more input with sc.hasNext(). We can ask if the next input is a number with sc.hasNextInt() or sc.hasNextFloat(). We can ask for the next string from the input with sc.next() or the next integer from the input with sc.nextInt().
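To see these questions and requests at work, here is a small sketch. The class name ScanKinds, the method classify and the sample line are assumptions made for illustration. The sketch also fixes the scanner's locale to Locale.US, because what counts as a floating point number depends on the locale; that choice is an assumption as well.

```java
import java.util.Locale;
import java.util.Scanner;

class ScanKinds {
    /** Classify each token of the text as an int, a float, or a string. */
    static String classify(String text) {
        Scanner sc = new Scanner(text);
        sc.useLocale(Locale.US); // assumption: numbers use '.' for decimals
        StringBuilder sb = new StringBuilder();
        while (sc.hasNext()) {
            if (sc.hasNextInt()) {
                sb.append("int:").append(sc.nextInt());
            } else if (sc.hasNextFloat()) {
                sb.append("float:").append(sc.nextFloat());
            } else {
                sb.append("string:").append(sc.next());
            }
            sb.append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // prints: string:road string:a string:b int:30 float:0.5
        System.out.println(classify("road a b 30 0.5"));
    }
}
```

Note that the test for an integer comes first. Every token that can be read as an int can also be read as a float, so testing in the other order would never report an int.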

The outermost loop of the road network initializer is pretty obvious: Read lines from the text file and process them. We could do all the processing in the outer loop, but that means that the outer loop needs to know about every detail of describing roads and intersections. One of the principal ideas behind object oriented programming is that all the aspects of each class should be encapsulated inside that class. How to read the description of a road, for example, is an issue that only matters to class Road.

Encapsulating everything about roads in class Road does have a potential downside. It means that the code to process the input language of our highway simulator will be scattered through the simulator. This makes it easy to modify details of roads, for example, but difficult to find out what the entire input language is. These kinds of design tradeoffs are unavoidable.

If we accept the decision to put all details of how roads are described in class Road, and to handle class Intersection similarly, we get code like this:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.LinkedList;
import java.util.Scanner;

/** RoadNetwork, the main class to build a network of roads and intersections.
 *  @see Road
 *  @see Intersection
 */
public class RoadNetwork {

    /* the sets of all roads and all intersections */
    static LinkedList <Road> roads = new LinkedList <Road> ();
    static LinkedList <Intersection> inters = new LinkedList <Intersection> ();

    public static void main(String[] args) {
        if (args.length < 1) {
            // Bug:  Complain about a missing argument
        } else try {
            Scanner sc = new Scanner( new File( args[0] ) );
            while (sc.hasNext()) {
                // until the input file is finished
                String command = sc.next();
                if (command.equals( "intersection" )) {
                    inters.add( new Intersection( sc ) );
                } else if (command.equals( "road" )) {
                    roads.add( new Road( sc ) );
                } else {
                    // Bug: Complain about unknown command
                }
            }
        } catch (FileNotFoundException e) {
            // Bug:  Complain that the file doesn't exist
        }
    }
}

The central tool used above is the next() method of class Scanner. You should look up class Scanner to see all of its next methods, but the simplest of these is simply called next(). All that next() does is return the next string from the input stream. By default, successive strings in the input are delimited by things like spaces, tabs and newlines. Other next methods get the next integer, the next boolean, the next line, or the next float. We will use some of these later.

The above code assumes that we can use constructors for classes Road and Intersection to create new objects of those classes, where each constructor is responsible for scanning the description of the new object from the source file. The code also assumes that we want to keep a list of all the roads we have scanned and all the intersections. At this point, we are not committing ourselves to do anything with these lists, but when the time comes to connect two intersections with a road, we'll have to look up those intersections somewhere.

Regardless of the number of spaces used for each indenting level, the above code is indented at close to the limit that can be easily understood. Psychologists say that the human mind can only handle about 7 plus or minus 2 different things in short-term memory, so once the number of levels exceeds 5, regardless of the visual presentation, a program will be hard to understand. We can resolve this by putting the loop outside the try block, but that makes it possible that sc could be null. Alternatively, we can move the bulk of the code into a second method:

    /** Initialize this road network by scanning its description
     */
    static void readNetwork( Scanner sc ) {
        while (sc.hasNext()) {
            // until the input file is finished
            String command = sc.next();
            if (command.equals( "intersection" )) {
                inters.add( new Intersection( sc, inters ) );
            } else if (command.equals( "road" )) {
                roads.add( new Road( sc, inters ) );
            } else {
                // Bug: Complain about unknown command
            }
        }
    }

    /** Main program
     * @see #readNetwork
     */
    public static void main(String[] args) {
        if (args.length < 1) {
            // Bug:  Complain about a missing argument
        } else try {
            readNetwork( new Scanner(new File(args[0])) );
        } catch (FileNotFoundException e) {
            // Bug:  Complain that the file doesn't exist
        }
    }
}

Text file design

As we've already indicated, the above code assumes that the constructors for classes Road and Intersection exist. In order to write these constructors, we need to start fleshing out some details of the source file describing the road network. Here, we will assume that the first item after the keyword on each line is the name of the road or intersection. Intersections have arbitrary names, while road names consist of a pair of intersection names, separated by a space or tab. For roads, we'll assume that the next attribute is the travel time.

This is an inadequate way to describe roads. Later, we'll discover that some intersections have stoplights that alternately allow east-west travel and north-south travel. When we get to that point, we'll have to extend our naming convention so that a road can connect, for example, outgoing north from intersection A and incoming east to intersection B. For now, we'll ignore this, but we'll do so with the knowledge that our initial design is inadequate. The initial design gives us something like this:

intersection a
intersection b
intersection c
road a b 30
road b a 30
road a c 12
road c a 12
road b c 22
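With this file format in hand, the constructors can be sketched. What follows is a minimal sketch under stated assumptions, not the final design: the helper method find, the class name NetworkSketch, the decision to link each road into its intersections' lists inside the Road constructor, and the decision to leave a field null when a road names an undefined intersection are all invented for this illustration. Error handling is deliberately omitted.

```java
import java.util.LinkedList;
import java.util.Scanner;

class NetworkSketch {
    static LinkedList<Road> roads = new LinkedList<Road>();
    static LinkedList<Intersection> inters = new LinkedList<Intersection>();

    static void readNetwork(Scanner sc) {
        while (sc.hasNext()) {
            String command = sc.next();
            if (command.equals("intersection")) {
                inters.add(new Intersection(sc, inters));
            } else if (command.equals("road")) {
                roads.add(new Road(sc, inters));
            } else {
                System.err.println("unknown command: " + command);
            }
        }
    }

    public static void main(String[] args) {
        readNetwork(new Scanner(
            "intersection a\nintersection b\nintersection c\n"
            + "road a b 30\nroad b a 30\nroad a c 12\n"
            + "road c a 12\nroad b c 22\n"));
        // prints: 3 intersections, 5 roads
        System.out.println(inters.size() + " intersections, "
            + roads.size() + " roads");
    }
}

/** Intersections join roads */
class Intersection {
    String name;
    LinkedList<Road> outgoing = new LinkedList<Road>();
    LinkedList<Road> incoming = new LinkedList<Road>();

    /** Scan the rest of an intersection line, which is just the name.
     *  The list parameter is unused in this sketch.
     */
    Intersection(Scanner sc, LinkedList<Intersection> inters) {
        name = sc.next();
    }
}

/** Roads are one-way streets linking intersections */
class Road {
    float travelTime;         // measured in seconds
    Intersection destination; // where the road goes
    Intersection source;      // where the road comes from

    /** Scan the rest of a road line: source, destination, travel time. */
    Road(Scanner sc, LinkedList<Intersection> inters) {
        source = find(sc.next(), inters);
        destination = find(sc.next(), inters);
        travelTime = sc.nextFloat();
        // assumption: skip the linkage if an intersection is undefined
        if (source != null) source.outgoing.add(this);
        if (destination != null) destination.incoming.add(this);
    }

    /** Hypothetical helper: linear search for an intersection by name. */
    static Intersection find(String name, LinkedList<Intersection> inters) {
        for (Intersection i : inters) {
            if (i.name.equals(name)) return i;
        }
        return null; // not found
    }
}
```

The linear search is the reason the constructors are handed the list of intersections: a road can only be connected to intersections that were defined earlier in the file.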

A text file design for the epidemic model

The initialization logic for the epidemic model is going to be different, because the epidemic model does not rest on a pre-specified map of the community! Instead, in the epidemic model, the initialization code will read in a list of statistics and then generate a community.

We have already discussed the basic parameters we might want to specify. All we have to do is give them a specific format:

pop 2500;       // population
house 4,3;      // household size average 4, plus or minus 3
jobs 0.25;      // 1/4 of the population has jobs
study 0.5;      // 1/2 of the population are students
school 250,100; // number of students per school, plus or minus
class 25;       // student-teacher ratio

This is still underspecified, but we have something to start with.

Note: Commas and semicolons are included above at the end of each item out of habit. Our programming languages are heavy on these. If we run into difficulty with this idea we can abandon punctuation. The // comment format is also borrowed, but we really don't need comments, so again, if we run into difficulty, we can abandon this idea.

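The biggest problem with this specification is that the probability distributions are odd. Consider the household size distribution given above: an average of 4, plus or minus 3, with nothing to say how household sizes are distributed within that range.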

Perhaps we need to refine the specification so that we can write this:

house 4,3 uniform;
school 250,100 normal;

Or, perhaps we should make the type of probability distribution for each type of item fixed by the item type in order to avoid creating a general purpose mechanism to solve a problem where, in real life, the distribution depends only on the type of real-world data we are specifying.

Executive decision: Prototype software should stay simple! We will not introduce an unnecessary general purpose mechanism here!
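To make the plus-or-minus notation concrete, here is a sketch of what a fixed choice for household size could look like. It assumes, purely for illustration, that household sizes are whole numbers drawn uniformly from the range average minus scatter to average plus scatter. The class name HouseholdSize and the method uniform are invented here, and nothing in these notes has yet committed to this distribution.

```java
import java.util.Random;

class HouseholdSize {
    /** Draw an integer uniformly from [mean - scatter, mean + scatter]. */
    static int uniform(Random rand, int mean, int scatter) {
        // nextInt(n) returns a value from 0 to n-1 inclusive
        return mean - scatter + rand.nextInt(2 * scatter + 1);
    }

    public static void main(String[] args) {
        Random rand = new Random();
        // for "house 4,3" every value printed lies between 1 and 7
        for (int i = 0; i < 5; i++) {
            System.out.println(uniform(rand, 4, 3));
        }
    }
}
```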

Scanners again

When in doubt about tools, experiment! Here is a little program to test class Scanner against the proposed input format for the epidemic model:

import java.util.Scanner;

public class ScanTest {
    static final Scanner in = new Scanner( System.in );

    public static void main(String[] args) {
        System.out.print( in.next() );    // get a string
        System.out.print( in.nextInt() ); // followed by an int
        System.out.print( in.next(";") ); // followed by semicolon
    }
}

Playing with it shows that using scanners with the default definition of an integer requires that there be a space between the integer and the semicolon. We can live with this, at least for the time being. It's also clear that any real application using scanners needs to catch exceptions so that the output isn't cluttered with stack traces every time someone makes a typo!
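A variation on the experiment shows one way such exceptions might be caught. This is a sketch, not a decision about how the epidemic model will report errors: the class name SafeScan, the method scanItem and the wording of the messages are all assumptions. It reads strings instead of System.in so that the results are repeatable.

```java
import java.util.InputMismatchException;
import java.util.NoSuchElementException;
import java.util.Scanner;

class SafeScan {
    /** Scan one item of the form "keyword integer ;" and report the outcome. */
    static String scanItem(Scanner in) {
        try {
            String keyword = in.next(); // get a string
            int value = in.nextInt();   // followed by an int
            in.next(";");               // followed by semicolon
            return keyword + " = " + value;
        } catch (InputMismatchException e) {
            // the next token was there, but it was not what we asked for
            return "error: unexpected input";
        } catch (NoSuchElementException e) {
            // there was no next token at all
            return "error: input ended too soon";
        }
    }

    public static void main(String[] args) {
        System.out.println(scanItem(new Scanner("pop 2500 ;"))); // pop = 2500
        System.out.println(scanItem(new Scanner("pop 2500;")));  // unexpected input
        System.out.println(scanItem(new Scanner("pop")));        // ended too soon
    }
}
```

The order of the handlers matters. InputMismatchException is a subclass of NoSuchElementException, so the more specific handler has to come first.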