5. Using some Built-In Classes

Part of CS:2820 Object Oriented Software Development Notes, Fall 2015
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Some Built-in Classes

Java provides a large library of useful built-in classes, far more than just int and float.

Among the most useful of these are a variety of collection classes allowing objects to be arranged in various types of lists and queues. The Wikipedia page on the Java collections framework gives pretty good documentation of this.

Looking at our growing road network example, we can immediately see one place where we need something from this library. Here is quote from the notes for Lecture 4:

class Intersection {
        /** Intersections join roads
         *  @see Road
         */
        // Bug: multiple outgoing roads
        // Bug: multiple incoming roads
        // Bug: multiple types of intersections (uncontrolled, stoplight)
}

The first two bug notices above refer to collections of incoming and outgoing roads. All Java collections support methods to clear the collection, add elements to the collection, test if an element is in the colleciton, test if a collection is empty, remove an object from a collection and iterate over members of the collection. These are documented in the official Java documentation for Interface Collection <E>.

The Java AbstractSet class is a candidate. Sets do not permit duplcate entries and they have no natural ordering but you can iterate over their members.

The Java LinkedList class is particularly applicable here. Linked lists are collections that have order and permit duplicate entries. While we are not interested in duplication, we do need order, because the rules for intersections controlled by 4-way stop signs have an ordering constraint (priority goes to the rightmost vehicles among those waiting at the intersection).

Considerations in Selecting Classes

A natural rule for selecting the class to use in a particular place is, select the class that comes closest to implementing the semantics you are looking for. What could be more obvious.

The downside of this rule is that it can lead to two significant problems: One is a problem for the programmer. You have to remember all the class interfaces for all the classes you are programming with. The human mind is indeed a remarkable thing, but it is not unlimited. You will work faster if you limit your tool set to a small enough size that you are not constantly looking things up.

The second problem is one of program size and performance. In a large system, if you make independent selections of the classes you will use to implement each system component, the size of the system can grow explosively. The set and list classes offered by Java do most of the same things, but they do not necessarily share much of their implementation (although I suspect that sets are implemented using lists).

Modern computers have very large and inexpensive memories, so explosive growth of code is not obviously a problem. However: Modern computers use cache memory technology. This means that recently used memory locations are faster than memory locations that have not been used recently. As a result, if one software package, such as a class implementation, is used instead of two different packages, the result will probably run faster.

Another reason to try to limit the number of classes on which your code depends comes from the land of security: If you are trying to write a secure applicaiton, each piece of foreign code, that is code you did not write yourself, is a potential attack vector. That is, it is possible that the author of that code could deliberately or accidentally include instructions in that code that create a security problem for your application. Limiting the number of foreign code modules allows you to put more effort into assuring that the foreign code on which you rely is safe.

All of the above considerations suggest that it is a good idea to try to adopt policies, in any large project, to limit the size of the search space. Typically, such a policy might say: To solve this general category of data-structure problems, use this specific class. A secondary policy might specify what process you should follow in order to justify adding a new class to the set of recommended classes.

Continuing the Example

Given that the LinkedList class largely duplicates what the AbstractSet class does, and given that an extra ordering constraint is not a problem for the set of incoming roads at an intersection, we can decide that linked lists will dominate our model. This leads to the following code:

import java.util.LinkedList;

class Intersection {
        /** Intersections join roads
         *  @see Road
         */
        // The roads that connect to this intersection
	LinkedList outgoing = new LinkedList();
	LinkedList incoming = new LinkedList();
        // Bug: can Java know that the above list elements are roads
        // Bug: multiple types of intersections (uncontrolled, stoplight)
}

The import line before the class definition tells the compiler to look for the LinkedList class in the java.util package.

There is a new bug notice above because the list declarations used above allow anything at all to be put in the incoming and outgoing lists. The list could contain, as consecutive elements, a road, an integer, a character string, or just about anything else. This is not really what we want! Java has a mechanism to solve this problem, generic classes: What we ought to declare is:

import java.util.LinkedList;

class Intersection {
        /** Intersections join roads
         *  @see Road
         */
        // The roads that connect to this intersection
        LinkedList<Road> outgoing = new LinkedList<Road>();
        LinkedList<Road> incoming = new LinkedList<Road>();
        // Bug: multiple types of intersections (uncontrolled, stoplight)
}

The above notation explicitly indicates that outgoing and incoming are linked lists of roads and not of anything else.

Continuing the Example

Recall that the basic discrete event simulation algorithm from the notes for Lecture 4:

// for all x from the set of initial events
eventSet.schedule( x )
repeat {
	e = eventSet.getNext
	// simulate event e at time e.time
	// this may involve scheduling new events
}

The Java library has several classes that can represent the event set. Notable among these are the classes in the SortedSet interface collection. Class PriorityQueue is not in this category but is also relevant.

Note that, while the pending event set is, formally speaking, a set, there is no restriction preventing there from being multiple pending events. If you look at the definition of the SortedSet class you will find that it requires that there be a total ordering on the set elements. This is incompatable with our discrete-event simulation model. Sadly, the skip-list implementation that is standard in Java was developed for discrete-event simulation but is not applicable to our problem because of this interface restriction.

In contrast, the PriorityQueue class has no restriction requiring a total ordering; instead, it offers nondeterministic (but not necessarily random) breaking of ties between elements with identical priority. As a result, we will use a priority queue to implement the pending-event set.

The Need for an Input Language

We need an input language to describe our road network. Before doing any computation on the road network, we need to load a set of roadt into the system. An obvious way to do this is to have the program read a text file containing the road network description, but this requires that we have a language (of sorts) for this file. Consider the following description of the roads around the Pentacrest on campus:

intersection Clinton&Washington
intersection Clinton&Iowa
intersection Clinton&Jefferson
intersection Madison&Jefferson
intersection Madison&Iowa
intersection Madison&Washington
intersection Capitol&Washington

road Clinton&Washington Clinton&Iowa
road Clinton&Iowa Clinton&Washington

road Clinton&Iowa Clinton&Jefferson
road Clinton&Jefferson Clinton&Iowa

road Madison&Jefferson Clinton&Jefferson

road Madison&Iowa Madison&Jefferson
road Madison&Jefferson Madison&Iowa

road Madison&Washington Madison&Iowa
road Madison&Iowa Madison&Washington

road Madison&Washington Capitol&Washington
road Capitol&Washington Madison&Washington

road Clinton&Washington Capitol&Washington

The above description of a road network names each intersection by the street names that intersect there, and then describes each segment of street by the intersections it connects. Street segments are one-way, so where there is a two-way street, it is mentioned twice, once for each way. The above description contains two one-way segments, Jefferson Street between Clinton and Madison, and Washington Street between Clinton and Capitol.

We can obviously extend this description by adding additional attributed to each street, for example, the travel time for the street, and we can add attributes to intersections, for example, the type of the intersection.

It is worth asking: How complex a language do we want. It is easy to imagine adding features to the above framework until the street network description language becomes a general purpose programming language. This is probably a bad idea. It won't be a very good programming language. If you look at the world of computing, though, it is common to find that special purpose language gradually become general purpose: The world of the Minecraft game, for example, can be made to behave as a programming language, although building a useful applicaiton in Minecraft is not likely to be a very effective way to build, for example, a spreadsheet.

As a general piece of advice, avoid the temptation to develop your applicaton's input language into a programming language. The general result is rarely very good.

There are two ways to build a program that reads the language suggested above:

You could write a large block of code in your main program's initializer that reads the entire input file and then, after reading each line of input, calls the appropriate code to create roads and intersections. This has the advantage of putting all code involved with the language in one place, but it has a disadvantage: The language-reading code has to understand the complete class interface of each class you are working with.
You could write the initializer of each object the language creates to include a section that reads the part of the input describing that object. This has the advantage that the main program's initializer just has to determine that it is creating a road or an intersection, and then after that, it leaves the details up to the specific class that is being created. The disadvantage, of course, is that if you want to reverse engineer a description of the language from the code, you've got to look all over the place.

Both of the above approaches work, and neither is obviously superior, althoug later, we will find that the second is likely to be better because it divides the responsiblity for the different software components more effectively.

Continuing the Example

The above discussion of the input language leads immediately to two design decision:

Since each intersection has a name, we must add a name field to the class Intersection
Since each road is named by the intersections it connects, each road must know about its source as well as its destination.

The following refinement to our code captures these decisions:

class Road {
        /** Roads are one-way streets linking intersections
         *  @see Intersection
         */
        float travelTime;         //measured in seconds
        Intersection destination; //where the road goes
        Intersection source;      //where the comes from
        // name of road is "source destination"
}

class Intersection {
        /** Intersections join roads
         *  @see Road
         */
        String name;

	...
}