Assignment 7

Done as an in-class quiz

Part of the homework for CS:2820, Fall 2020
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

This quiz was offered in class as a timed test, 2 minutes per question. The different sections may have seen the questions in different orders and the answers also in different orders:

Simple multiple choice questions, 1 point each:

Here is a Java regular expression: "([1-9][0-9]*)|(0[0-7]*)"
Which of the following does this expression not match?
a) 0123456789 — correct answer
b) 5678901234
c) 9876543210
d) 4321098765
e) it matches all of them!
The correct answer (the string that does not match) begins with zero and contains digits outside the range from 0 to 7. The others do not start with zero, so they match the first half of the regular expression, which permits all digits.
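The reasoning above can be checked with a short sketch. It assumes that "match" means the whole string must match, which is what Pattern.matches tests; the class and method names are only illustrative.

```java
import java.util.regex.Pattern;

class OctalOrDecimalDemo {
    // The regular expression from the question, as a Java string constant.
    static final String NUMBER = "([1-9][0-9]*)|(0[0-7]*)";

    // True only if the entire candidate string matches the expression.
    static boolean matchesWhole(String candidate) {
        return Pattern.matches(NUMBER, candidate);
    }

    public static void main(String[] args) {
        String[] candidates = {
            "0123456789", "5678901234", "9876543210", "4321098765"
        };
        for (String c : candidates) {
            System.out.println(c + " " + matchesWhole(c));
        }
    }
}
```

Only the first candidate should report false, because after the leading zero the second alternative stops at the digit 8.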
When you write regular expressions as Java string constants you have to double many of the backslashes. So, instead of writing "\d" to mean a digit, you write "\\d". Why does "\\n" mean the same thing as "\n"? Because ...
a) The pattern \n matches the new-line character which can also be included directly in the string.
— correct answer.
b) The patternCompile method has a special case to catch this.
c) Control characters such as new-line cannot be directly included in strings.
d) A double backslash is always equivalent to a single backslash in a string literal of a pattern.
e) none of the above.
Answers b, c and d are simply false. In more detail, the Java string "\\n" represents the regular expression \n and the rules for Java regular expressions allow \n to stand for newline. In contrast, the Java string "\n" contains an actual newline character, and in a Java regular expression, newline is just a character literal that matches itself.
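A minimal sketch of the same point: the two string constants have different lengths, yet both are regular expressions that match a single newline. The class and method names are made up for this illustration.

```java
import java.util.regex.Pattern;

class NewlinePatternDemo {
    // "\\n" is the two-character regular expression \n (backslash, n).
    static final String ESCAPED = "\\n";
    // "\n" is a one-character string holding an actual newline.
    static final String LITERAL = "\n";

    // True if the given regular expression matches a lone newline.
    static boolean matchesNewline(String regex) {
        return Pattern.matches(regex, "\n");
    }

    public static void main(String[] args) {
        System.out.println(ESCAPED.length() + " " + matchesNewline(ESCAPED));
        System.out.println(LITERAL.length() + " " + matchesNewline(LITERAL));
    }
}
```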
Given that the pattern \s (lower case) means any whitespace character, and the pattern \S (upper case) means the opposite, any character that is not whitespace, here is a proposed implementation of the next method in class Scanner:
```
    public String next() {
        this.skip( A );
        this.skip( B );
        return this.match().group();
    }
```
To make this sort of work, A and B have to be as follows:

a) A = "\\s*" and B = "\\S+" — correct answer
b) A = "\\S*" and B = "\\s+"
c) A = "\\s+" and B = "\\S*"
d) A = "\\s" and B = "\\S"
e) A = "\\S+" and B = "\\s+"
The correct answer says "first skip zero or more delimiters then skip one or more non-delimiters, up to the next delimiter." We need the possibility of zero delimiters before a token to allow that token to be the first thing in the file, with nothing before it. Answers b and e are wrong because they make next() return the delimiter, not the token. Answer c allows next() to return an empty token. Answer d allows only one character per token.
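Here is a small sketch of answer a in action, using java.util.Scanner directly rather than the class from the question. The static helper and the sample input are assumptions made for the illustration.

```java
import java.util.Scanner;

class NextTokenDemo {
    // Mimics the proposed next() with A = "\\s*" and B = "\\S+".
    static String next(Scanner sc) {
        sc.skip("\\s*");            // zero or more delimiters
        sc.skip("\\S+");            // one or more non-delimiters
        return sc.match().group();  // the text matched by the last skip
    }

    public static void main(String[] args) {
        // The first token has nothing before it, so A must allow zero spaces.
        Scanner sc = new Scanner("pop   10 ;");
        System.out.println(next(sc));
        System.out.println(next(sc));
        System.out.println(next(sc));
    }
}
```

Note that skip throws NoSuchElementException when B finds no token at the end of the input, which is one reason this only "sort of" works.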
These questions about patterns were asked when your memory of patterns should have been very fresh from working with them to solve MP4.
A wrapper or interface class ...
a) typically has just one object as an instance variable. — correct
b) always has static methods.
c) must contain pass-through methods for all methods of the class it wraps.
d) must prevent use of the underlying object.
e) is essential to use in a well-structured program.
Answers b, d and e are utter nonsense. Answer c is wrong because you do not need pass-through methods for any method that is in the underlying class and not needed by the application. Answer d is wrong because the whole point of a wrapper is to control use of the underlying object, not prevent it.
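As an illustration only, here is a hypothetical wrapper around java.util.Scanner. The class name TokenSource and the choice to lower-case tokens are invented for this sketch; the point is the single instance variable and the small set of pass-through methods.

```java
import java.util.Scanner;

// Hypothetical wrapper: its only instance variable is the wrapped object.
class TokenSource {
    private final Scanner sc;

    TokenSource(Scanner sc) {
        this.sc = sc;
    }

    // Pass-through for a Scanner method this application needs.
    boolean hasNext() {
        return sc.hasNext();
    }

    // Controlled use of the underlying object: tokens come back lower-cased.
    String next() {
        return sc.next().toLowerCase();
    }
}
```

Scanner has many other methods, but none of them need pass-throughs here because this application does not use them.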
Java provides both Float.isNaN(f) and f.isNaN() methods. These do the same thing and either can be applied when f is declared as either Float or float.
a) Float.isNaN(f) is faster when f is a float. — correct
b) f.isNaN() is faster when f is a float.
c) Float.isNaN(f) is faster when f is a Float.
d) they are always the same speed, regardless of f being Float or float.
e) Float.isNaN() is an interface or wrapper around f.isNaN().
Recall that the Float class is a wrapper class around float variables. Answer a does not need to do any boxing or unboxing. Answer b is slowed down by autoboxing. Answer c requires that f be unboxed to pass it as a parameter. Answer d is wrong because it ignores the cost of boxing and unboxing. Answer e misuses terminology: We have spoken about wrapper classes, not wrapper methods.
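The two call paths can be sketched as follows. The boxing step is written out explicitly with Float.valueOf so that the extra work is visible; the class and method names are illustrative, and the sketch shows only where the conversion happens, not how much time it costs.

```java
class IsNaNDemo {
    // Static method: takes the primitive directly, no Float object involved.
    static boolean viaStatic(float f) {
        return Float.isNaN(f);
    }

    // Instance method: needs a Float object, so the primitive is boxed first.
    static boolean viaInstance(float f) {
        Float boxed = Float.valueOf(f);  // the boxing step, written out
        return boxed.isNaN();
    }

    public static void main(String[] args) {
        System.out.println(viaStatic(Float.NaN) + " " + viaInstance(Float.NaN));
        System.out.println(viaStatic(1.5f) + " " + viaInstance(1.5f));
    }
}
```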

Machine Problem 4 Comments

Many of the solutions people submitted to MP4 showed evidence of really limited testing, if any. Here are some example tests. The first keeps all the spaces that were required in MP3 and is designed to look at the correlation between workplace and home, since it has exactly two of each:

pop 10 ;
infected 0 ;
house 5 , 0 ;
employed 1 ;
workplace 5 , 0 ;

Here is one to test exactly the spacing changes required by the MP4 assignment, plus a few comments:

pop 1;
infected 1 ; // comment line
employed 0;
house 2, 2.1   // missing semicolon should be detected
workplace 2 1; // missing comma should be detected
// comment here too

Finally, here is a test that deals with inputs that go just a little above and beyond what was required, things like a blank line and no spaces around comment marks. For MP4, failure to handle these wouldn't be penalized, but unhandled exceptions are clearly unacceptable, and if a solution handles them, that is worth a bit of extra credit.


pop 1;
house             // comment between arguments?
      2 , 2.1    ;// odd semicolon spacing
employed 0 ;
workplace 2 , 1 ;
//no space after slashes
infected 1 ;

Tests like the above are being used to grade MP4. The solution distributed in class does handle all the above.

Machine Problem 5 -- due Monday, Oct 12

Write version 5 of the epidemic model. You are still permitted to use any solution to MP4, but use one that works for a variety of reasonable tests. The solution distributed to the class works with all of the tests suggested above (or at least, it worked when I tested it).

First, make all the computations involved in concatenating strings that might be needed in error messages lazy, that is, no string concatenation should take place until it is determined that there is indeed an error. And, lazy concatenation should be done with λ expressions. In the example solution to MP4, this involves all of the error message parameters to the get methods of class MyScanner.
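A minimal sketch of lazy concatenation, assuming the message is passed as a java.util.function.Supplier. The method getPositive, the counter, and the message text are hypothetical stand-ins and are not taken from the distributed MyScanner.

```java
import java.util.function.Supplier;

class LazyMessageDemo {
    static int concatenations = 0;   // counts how often a message was built

    // Hypothetical stand-in for a get method: the message is a Supplier,
    // so it is evaluated only on the error path.
    static int getPositive(int value, Supplier<String> message) {
        if (value <= 0) {
            System.err.println(message.get());
            return 1;   // arbitrary default chosen for this sketch
        }
        return value;
    }

    // Builds the message and records that the concatenation happened.
    static String describe(String keyword, int value) {
        concatenations++;
        return keyword + " " + value + ": must be positive";
    }

    public static void main(String[] args) {
        getPositive(10, () -> describe("pop", 10));
        System.out.println(concatenations);   // no error, nothing built
        getPositive(-3, () -> describe("pop", -3));
        System.out.println(concatenations);   // built once, on the error
    }
}
```

The caller writes the concatenation inside the λ expression, so the cost is paid only when message.get() is called.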

Second, make the input part of the code accept the full range of real number representations allowed by Java. The solution distributed for MP4 does not handle all of these. That includes things like:

10
-11.
11.5
-.75
1.75
1E5
-2.0E-5
3.0E+20
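One possible pattern for the forms listed above is sketched here. It is an assumption, not the pattern from any distributed solution, and it covers only these decimal forms, not everything Java accepts (it ignores hexadecimal floats and type suffixes, for example).

```java
import java.util.regex.Pattern;

class RealNumberDemo {
    // Assumed pattern: optional sign, digits with an optional point (or a
    // leading point followed by digits), then an optional exponent.
    static final Pattern REAL = Pattern.compile(
        "[-+]?(\\d+\\.?\\d*|\\.\\d+)([eE][-+]?\\d+)?");

    // True if the whole text has the shape of a real number.
    static boolean isReal(String text) {
        return REAL.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "10", "-11.", "11.5", "-.75", "1.75", "1E5", "-2.0E-5", "3.0E+20"
        };
        for (String s : samples) {
            // Double.parseDouble does the actual conversion.
            System.out.println(s + " " + isReal(s) + " " + Double.parseDouble(s));
        }
    }
}
```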

Your program must permit commas, semicolons and comments to follow numbers without any intervening space, and commas are required:

pop 25;
house 3.2, 1;        // median and scatter of household sizes
infected 12;
employed 0.5;        // probability of employment
workplace 10.5, 12;  // median and scatter of workplace size (syntax error?)
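One way to see that punctuation need not be separated from numbers by spaces is to split a line with a single token pattern. This sketch is an assumption about how it could be done, not the approach of the distributed solution, and it silently skips anything it does not recognize where a real scanner would report an error.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TightPunctuationDemo {
    // Assumed token shapes: a comment, a number, a keyword, or punctuation.
    static final Pattern TOKEN = Pattern.compile(
        "//.*"
        + "|[-+]?(?:\\d+\\.?\\d*|\\.\\d+)(?:[eE][-+]?\\d+)?"
        + "|[A-Za-z]+"
        + "|[,;]");

    // Splits one input line into tokens, dropping comments and whitespace.
    static List<String> tokens(String line) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            if (!m.group().startsWith("//")) {
                result.add(m.group());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("house 3.2, 1;        // median and scatter"));
    }
}
```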

Your solution should allow negative integers and floats to be properly handled in the input. That is, they should be recognized as legitimate numbers, and only rejected (when appropriate) as invalid values after they are scanned from the input.

In posing this requirement, we assume a well structured program where the problem of numeric input is solved separately from the part of the program that needs the numbers. Negative numbers are perfectly permissible as numbers, so they should cause no problems in the numeric input processing part of the program. That part of the program might object to other things like the integer 9999999999 (too big).

On the other hand, the part of the program that needs the numbers should not permit negative values where those are not part of the problem specification. Note that the sanity constraints on numeric values depend on the customer. Populations less than one and median home sizes less than one don't make sense. The spread on home sizes should not be negative, but a spread of zero does make sense — that simply means that all the homes are the exact same size.
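The separation described in the last two paragraphs might look like the following sketch. The method names and the use of OptionalInt are assumptions for illustration; the constraint that a population must be at least one comes from the text above.

```java
import java.util.OptionalInt;

class PopulationCheckDemo {
    // Numeric input layer: accepts any well-formed int, negative or not.
    // Returns empty if the text is not an int (for example, too big).
    static OptionalInt scanInt(String text) {
        try {
            return OptionalInt.of(Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    // Application layer: applies the sanity constraint for populations.
    static boolean isValidPopulation(int pop) {
        return pop >= 1;
    }

    public static void main(String[] args) {
        for (String text : new String[] { "25", "-5", "9999999999" }) {
            OptionalInt n = scanInt(text);
            if (!n.isPresent()) {
                System.out.println(text + ": not a valid integer");
            } else if (!isValidPopulation(n.getAsInt())) {
                System.out.println(text + ": population must be at least 1");
            } else {
                System.out.println(text + ": ok");
            }
        }
    }
}
```

Here -5 gets past the input layer as a perfectly good number and is rejected only by the population check, while 9999999999 is rejected by the input layer itself.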

Your changes must not break any of the things that worked properly in earlier versions of the code.

A student asked: Should we allow decimal points in integers?

No. You do need to allow negative integers, but integers should not include decimal points or exponent fields. A negative integer should not cause error messages about invalid number formats or the like. Higher level code in the application may well object that populations below 1 are illegal.

A student asked: Should we allow two commands on the same line? What about line breaks within a command, for example, between the command and the semicolon that terminates it? What about leaving out all whitespace except the whitespace between a command and the following number? What about using a semicolon where a comma was expected?

None of the examples discussed in class, neither the solutions distributed for previous versions of the epidemic simulator nor any of the highway simulator code, was sensitive to line breaks, except that comments were defined as running from // to the end of the line. So, for every version of the code we've discussed, all the input can be on one line and semicolons can be in odd places. All of the solutions distributed only accepted commas where commas were required and only accepted semicolons where semicolons were required. Part of the reason for this is that the solutions all ignore line breaks, so a semicolon means move on to the next command. All of this is just like Java, by the way.

A student asked: Should we run the simulation if there are errors in the input? Can we abort the program early if the number of warnings hits some threshold?

We aren't running a simulation yet, but when we do, which will be soon, we will not run the simulation if there have been any warnings. During the debugging phase, when we are just dumping the population, it would not be wrong to have it quit without dumping the population if there were any input errors.

There's been a bug notice in class Error for some time saying it would be nice to count the warnings and give up if the count hit some threshold. You're welcome to fix this bug by doing so. If you poke into the example road-network code, you'll see that this bug is fixed there with a warning limit of 10.
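A rough sketch of counting warnings against a limit is shown below. It is not the code from class Error or from the road-network example; the class name is invented, and it throws an exception where a real program might simply exit.

```java
class WarningCounter {
    private final int limit;   // give up once this many warnings are reported
    private int count = 0;

    WarningCounter(int limit) {
        this.limit = limit;
    }

    // Reports one warning. Throws once the limit is reached; a real
    // program might call System.exit here instead (an assumption).
    void warn(String message) {
        System.err.println("Warning: " + message);
        count++;
        if (count >= limit) {
            throw new IllegalStateException("too many warnings: " + count);
        }
    }

    int getCount() {
        return count;
    }

    public static void main(String[] args) {
        WarningCounter warnings = new WarningCounter(10);
        warnings.warn("pop 0: population must be at least 1");
        System.out.println(warnings.getCount());
    }
}
```

The same count can later be used to decide whether to run the simulation at all: run it only if getCount() is still zero after the input has been read.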

A student asked: What if someone inputs pop 5, 6, 7; with extra parameters? Is that an error?

It certainly seems like an error to me.

A student asked: When the user input is 2E2 as a float or double, is that 2×2², 2×10², or an error?

The number 2E2, in FORTRAN, C, C++, Python and Java always means 2×10². The E notation is the common approximation of scientific notation that has been used on computers since the late 1950s.

But wait, there's more hiding in this question! You are not responsible for doing any of the arithmetic needed to convert from the textual format to the internal floating-point format of the machine. Java's Double.parseDouble and Float.parseFloat methods already do that. The programming problem you face has to do with splitting the number from preceding or trailing punctuation.
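A short sketch of both halves of this answer: the library does the conversion, but only once the number has been separated from its punctuation. The class and helper names are illustrative.

```java
class ENotationDemo {
    // True if Double.parseDouble accepts the text as it stands.
    static boolean parses(String text) {
        try {
            Double.parseDouble(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(Double.parseDouble("2E2"));  // 2 times 10 squared
        System.out.println(Float.parseFloat("2E2"));
        System.out.println(parses("2E2"));
        System.out.println(parses("2E2;"));   // trailing punctuation is rejected
    }
}
```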