Machine Problem 5

Due Mar 8 Mar 15, on line

Part of the homework for CS:2820, Spring 2021
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Note: The one-week extension on this assignment is due to issues that came up in solving this problem that I believe were not adequately covered in lecture or prerequesite material. In exchange for the extra week to do the assignment, this changes from being a 5-point assignment to a 10-point assignment.


The specifications for MP3 and MP4 allowed this input:

population 10 ;
place home 4.0 0.0 ;
place work 2.0 1.0 ;
place school 1.0 0.0 ;
role homemaker 0.3
   home ;
role worker 0.4
   work ;
role student 0.3
   school ;

This has several shortcomings. First, there is the simple matter of syntax. The space before the semicolon is a nuisance. Things will get worse when the punctuation gets more interesting, as it will shortly. The syntax we want will be like this (with a very much smaller example):

population 10;
place home 4.0 0.0;
role homemaker 0.3

The second problem is one of time. People move from place to place, and to model that, for each role, people will need not only a list of places, but a list of placess annotated with the times they arrive and depart. Note that the times are an attribute of the interaction between the role and the place. For students, the school day may start at 8:30 and end at 3, but teachers need time to set up in the morning and time to clean up, grade, and plan tomorrow's lessons in the afternoon, so they come at 8 and leave at 5. Similar differences apply to restaurants, where the work day for the staff begins before opening time and extends after closing time.

Time is complicated by additional problems: Times like work and school times will be part of a Monday through Friday schedule, while others, for example, church times, will be Sunday only. Furthermore, some times are rock solid: I always go to work on weekdays (except when sick), while others will be probabilistic: I sometimes go to the store after work. So, our goal is to work towards something like this:

role student 0.3
   school (8:30AM-3:00PM weekdays, 0.95);

That would mean that those in the role called student go to a place called school at 8:30AM and return home at 3:00PM on weekdays, 95% of the time. The other 5% of the time, they stay home.

This is a bit too much for now, so we'll simplify it and say that, for each role, the list of places is augmented by, for each, an optional pair of times enclosed in parentheses, where a time is (for now) a time in hours measured from midnight, so 15.5 means 3:30PM. The first time in each pair is the start time, while the second is the end time. So, we will plan on allowing things like this:

role student 0.3
   school (8.5-15)
   soccer (15.5-17.5);


Extend any solution to MP4 so that:

Also in writing your code, think ahead. In future versions:

Your program must produce a somewhat more elaborate debug output to show that is successfully met these requirements. The record for each person will now occupy multiple lines:

Hint, the above material was written without saying "another class" but you might find that you need one or more new classes.


On or before the end of Monday, Mar. 8, you must submit your program using the following shell command on the fastx machine:

~dwjones/submit xxxx

Substitute your section number for xxxx above, but do not alter the text dwjones.

The program will confirm that you successfully submitted your solution with the message "submiter: succesfull submission of xxxx/HawkID" with xxxx replaced with your section number and HawkID replaced with your HawkID.

You may submit as many times as you want. The last successful submission is the one that will count. As such, it is a good idea to submit as soon as you have code that sort-of works, submit again when you have code that works the way you like, and perhaps submit again when you bring your code up to the highest standard of readability.


Code that executes correctly is worth half credit (2.5 out of 5 points). Expect demerits for the usual style issues in addition to the specific requirements given above.

Student questions

A student asked: Please give an example input and the output you expect:

Here is a very simple example of legal input:

population 2;
place home 2 .0;
place school 2. 0.0 ;
role student 1
role teacher 1 home school ( 8 - 17 ) ;

The output expected might look like this:

Person@28d93b30 student
  home Place@6bc7c054
  school Place@6bd505bd (8.5-15.0)
Person@28e087a4 teacher
  home Place@6bc7c054
  school Place@6bd505bd (8.0-17.0)

A student asked: Your example includes bothschool(8.5-15); without spaces and school ( 8 - 17 ) ; with spaces. Should I grab the entire text between parentheses with one pattern, then work on disassembling the string into two floating-point numbers, disregarding the dash and parens, or should I work through it sequentially, picking off the paren, then a number, then a dash, then a paren?

It's tempting to say "whatevere works for you," and ultimately, that may be the route you have to take. However ...

The "normal" way that people do this in compiler writing and other text processing applications is to avoid all but the minimal number of copies. So, the usual solution is to pick off consecutive tokens from the original stream instead of trying to snarf up blocks of input as strings and then dissect those blocks. If you can write a "getNext" method that handles semicolon without a leading or trailing space, the same logic should work for parens and for dashes.

A student asked: When I try to match begin parens, I get a java.util.regex.PatternSyntaxException.

This is because these characteres have special meanings in patterns:

. | ) ( [ ] \

If you were try Pattern.compile( "(" ) to make a pattern that matches just begin paren, you will get this error.

You should use Pattern.compile( "\\(" ) to make that pattern. Note that the double backslash is because the string literal "\\" in Java is used to create a string containing just the backslash character. Pattern.compile.

A student asked: How should this program process input that was legal for MP4 but does not contain any schedule for any of the places.

The schedule information for each place is optional, so any legal input for MP4 should also be legal for MP5, although now, the output would be on more lines because of the listing of all of a person's places on lines after that person is listed.

In addition, input for MP4 without whitespace around semicolons should be legal.

A student asked: Do we have to fix getNextLiteral() so it supports literals that aren't surrounded by whitespace, or can we invent a different interface to myScanner that solves this problem in some other way?

My suspicion is that the public interface to getNextLiteral() may need to change to solve this problem. Writing a hasNext() method that returns a boolean to tell you if there is a punctuation mark without skipping it may be very difficult. Therefore, you will need combine testing for literals with skipping them. This clearly involves thinking about both the code that calls whatevere method is involved and designing the behavior of that method.

A student asked: Can the pair of times be an attribute of the role?

No. Consider what happens when you combine several of the examples given above:

role student 0.3
   school (8.5-15)
   soccer (15.5-17.5);
role teacher 1
   school ( 8 - 17 ) ;

Here, you see the student has one pair of times for school and another pair of times for soccer, the two roles have different times for school. This clearly means that the pair of times is neither an attribute of the role nor an attribute of the kind of place, but rather, an attribute of the linkage from role to place.

A student asked: I'm trying to use hasNext(Pattern p) to see if there is a begin paren in the upcoming input, and I'm having problems making it work when there are no spaces.

All versions of next() and hasNext() require that the next token run up to and be terminated by a delimiter. While it may be possible to do something with dynamically changing the delimiters in order to recognize whether the next thing on the input is a begin paren, for example, or a semicolon, for another example, it is not going to be easy.

It is much easier to use skip(). There are two basic ways to use it. Either let it throw an exception to tell you "sorry, there was no begin paren" or "there is no semicolon," or alternatively, use a pattern that includes the empty string and let the empty string serve as a repor that the character you were looking for wasn't there.

Using skip() this way has consequences. Skip will eat the pattern you were looking for if it finds it. In contrast, hasNext() does not. It should be practical to engineer something like this into a replacement or supplement for getNextLiteral(), but the specification of that method will probably need to change.

A student asked: What about semicolons and newlines? Is this legal?

role student 0.3 home school (8.5-15); soccer (15.5-17.5); role
teacher 1 home school (8-17);

There are two issues in the above:

First, the semicolon after school before soccer. That is not allowed, and was there in an early version of this assignment. It was a typo.

Second, while the example uses whitespace in the way most typists would be tempted to use it, you can squeeze out even more whitespace. The following is legal:

role student 0.3 home school(8.5-15)soccer(15.5-17.5);role
teacher 1 home school(8 - 17);

A student asked: How does the relationship between roles and kinds of places relate to the relationship between individual people and the specific roles they have? And how do schedules fit into this?

The input data format implies a logical data structure relating roles to kinds of places. For example:

role student 0.3
   school (8.5-15)
   soccer (15.5-17.5);

This strongly suggests that each object of class Role has a list of associated kinds of places, or rather, a list of tuples, where each tuple consists of a kind of place and a schedule for places of that kind.

The output data format is similarly suggestive. One of the Person objects created from the above input might produce this output:

Person@28d93b30 student
  home Place@6bc7c054
  school Place@6bd505bd (8.5-15.0)
  soccer Place@5e834B5E (15.5-17.5)
home Place@6bc7c054
  school Place@6bd505bd (8.0-17.0)

As with the above example, this too suggests that each object of class Person has a list of associated places, or rather, a list of tuples, where each tuple consists of a specific place and the schedule for that place.

How does the first list above generate the second list? It is not as simple as cloning a data structure! The tuples in the first list refer to categories of places. The tuples in the second list refer to specific places. Each tuple also holds a schedule, and that component is the only component that can be copied from one to the other.

How do you store these tuples? The natural way to do this is with class for each distinct kind of tuple. For example, the list associated with each Person object might be constructed from these list elements:

class PlaceSchedule {
    Place place;
    Schedule schedule;

Of course there are alterenatives. First, this and any other classes used to make tuples can be declared as private inner classes to minimize their visibility. Some students have used parallel arrays or parallel lists,

Place places[];
Schedule schedules[];

With parallel lists, the tuples aren't immediately obvious. The idea is that tuple i would consist of places[i] and schedule[i]. This programming practice has been widely used since the 1950s, but it is strongly discouraged in programming languages that provide any kind of record or object structure.

Other students have solved this problem with things like HashMap data structures, using the hash function to connect the two members of each tuple. I suppose this can be done, but it substitutes computation for data structure, a very odd thing to do.