11. Iterators, Input Parsing
Part of
CS:2820 Object Oriented Software Development Notes, Spring 2019
|
As the last class ended, we were just starting to develop code to look up intersections by name. We presented this skeleton:
public class RoadNetwork {
static LinkedList inters
= new LinkedList ();
...
static Intersection findIntersection( String s ) {
// return the intersection named s, or null if no such
// Bug: flesh this out!
}
...
}
The obvious (even rather stupid) implementation we suggested was a linear search of inters for an intersection with the right name. We noted that Java provides much better tools.
Here is the stupid code to flesh out findIntersection
static Intersection findIntersection( String s ) {
// return the intersection named s, or null if no such
for (Intersection i: RoadNetwork.inters) {
if (i == s) {
return i;
);
}
return null;
}
This code is almost inexcusably simple, but it works and it takes very little thought to write it. The fact is, linear searches are awful, and the Java standard library provides some far better tools. The Java Map interface is the gateway to improved performance here. Java maps, when given a key, return the associated value. What we've done above is implement a mapping using a linear search, close to the worst possible implementation.
Why did we opt to do this? There are two reasons:
First: There is a general rule in software development, in three parts:
The point is, you should get the function right first, and only after you are sure that the program actually does something useful and worth optimizing should you worry about optimizing it. Time spent optimizing code that does something that turns out to be wrong is wasted time, and time spent optimizing code that turns out to have little impact on performance is almost as bad.
Second: The time it takes to figure out the complexities of the Map interface and figure out how to apply that interface to this problem is significantly longer than the time it takes to cobble together a quick and dirty solution such as the above. This strongly argues that the quick and dirty solution is the right way to get started.
If you try the above code, it won't work! In fact, it won't even come close to working. No matter how carefully we format the input file, all we get are error messages saying things like:
'intersection' not a road or intersection 'road' not a road or intersection
The problem is, we read a command such as the text road into the variable command in the main program, and then, when we tried the comparison command == "road" we faced a failure. Everywhere we try to compare strings, we have written things like the following:
if ((command == "intersection")
What's wrong with this? The problem is, Java offers several tools for comparison. The == operator is the simplest of these. In detail, the == operator has two distinct uses:
In the case of class String, each string constant such as "road" is created as a new object by the Java compiler before the time the program containing that constant begins running. When a program calls an instance of class Scanner to read a string from some file, the scanner creates a new object of class String for each string it reads, without making any effort to see if a String object already exists that contains the same textual value. As a result, there may be a number of strings holding the same text that the == operator considers to be different strings.
Java didn't have to work this way. Class String could have maintained a dictionary of all strings encountered, so that whenever a new string is constructed, if that string was already in the dictionary, the existing dictionary entry would be used instead of creating a new object. The designers of the String class opted to make string construction fast instead of eliminating duplicates.
There are several alternative ways of comparing two strings a and b:
Note that there are two useful ways to compare string variables with string constants:
We'd better rewrite readNetwork() to take this into account:
static void readNetwork( Scanner sc ) {
while (sc.hasNext()) {
// until the input file is finished
String command = sc.next();
if ("intersection".equals( command )) {
inters.add( new Intersection( sc, inters ) );
} else if ("road".equals( command )) {
roads.add( new Road( sc, inters ) );
} else {
// Bug: Should probably support comments
Errors.warning(
"'"
+ command
+ "' not a road or intersection"
);
// Bug: Should skip to the next line
}
}
}
Of course, we also need to rewrite findIntersection().
There were two places in our code with bug notices that we can now attack, in the initializers for the road and intersection classes. Consider class Intersection:
public Intersection( Scanner sc, LinkedList inters ) {
// scan and process one intersection
name = sc.next();
// Bug: look up name!
String skip = sc.nextLine();
// Bug: What if more junk at end of line (skip not empty)
}
We need to look up the name of the intersection because our source file format only allows one definition of each named intersection. If an intersection has already been defined with this name, we ought to complain. Here is an initial suggestion for how to do this:
public Intersection( Scanner sc, LinkedList inters ) {
// code here must scan & process the intersection definition
name = sc.next();
// Bug: What if at end of file? What does sc.next do?
if (RoadNetwork.findIntersection( name ) != null) {
Errors.warning( "Intersection "
+ name
+ " redefined."
);
// Bug: Can we prevent the creation of this object?
}
String skip = sc.nextLine();
// Bug: What if more junk at end of line (skip not empty)
}
We now have an initializer that complains with a useful error message when an attempt is made to declare a second (or third) intersection with the same name as a pre-existing intersection, but all may not be well. The new intersection is still created and added to the list of all intersections. Should we prevent this? Also, what if there is no next symbol for the scanner to scan? Does it return null? Does it return an empty string? Our code is not prepared for either possible outcome, so we have inserted more bug notices.
The second initializer we have to worry about is for roads. In this case, we want to guarantee that both intersections that the road connects are well defined. Here is the old code:
public Road( Scanner sc, LinkedList inters ) {
// code here must scan & process the road definition
String sourceName = sc.next();
String dstName = sc.next();
// Bug: Must look up sourceName to find source
// Bug: Must look up dstName to find destination
// Bug: What if the next isn't a float?
travelTime = sc.nextFloat();
String skip = sc.nextLine();
// Bug: What if more junk at end of line (skip not empty)
}
What we need to do is look up both the source and destination roads and complain if either are undefined. Reasonable error messages should identify the road that caused the problem and also explain the error, so our improved code gets a bit long-winded:
String sourceName = sc.next();
String dstName = sc.next();
// Bug: What if at end of file? What does sc.next do?
source = RoadNetwork.findIntersection( sourceName );
destination = RoadNetwork.findIntersection( dstName );
if (source == null) {
Errors.warning(
"In road " + sourceName + " " + dstName +
", Intersection " + sourceName + " undefined"
);
// Bug: Should we prevent creation of this object?
}
if (destination == null) {
Errors.warning(
"In road " + sourceName + " " + dstName +
", Intersection " + dstName + " undefined"
);
// Bug: Should we prevent creation of this object?
}
As with the previous example, we have not addressed what should be done if the object declaration contains an error, we have merely detected and reported the error. And again, we have not dealt with the possiblity that there might not be a source or destination intersection name.
The duplicated code above suggests that we might want to package the test for undefined names in a subroutine of some kind, perhaps a utility method for syntax checking that gets a name and checks to see if it is defined.
We've also inserted a new bug notice because we just noticed that there may be no sc.next() when we are trying to scan an intersection name. This also suggests we might want to package a more elaborate bit of code into a subroutine of some kind that not only scans an intersection name but checks to see if there is one.