3. Everything is an Object, Simulation

Part of CS:2820 Object Oriented Software Development Notes, Fall 2017
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science


Everything is an instance of a class

The textbook has a chapter titled Everything is an Object, and in the world of object-oriented programming, that is true. When you look at a large programming problem and think about how to create code, every noun you find in the problem description is a very good candidate for the name of either an object or a class of objects in the program that solves that problem.

For example, when you look at the screen of a computer, you typically see windows, icons and a cursor on that screen. An object-oriented implementation of the window manager for that computer will almost certainly be built on classes with names like Window, Icon and Cursor. If the window manager only supports one screen, there will probably be an object called screen. If the window manager supports multiple screens, there may well be a class, Screen, with one object of this class per screen, and possibly an object named currentScreen that names the screen that the user is currently focused on because the cursor is there. By convention, Java class names are almost always capitalized, while object names usually begin with a lower-case letter. This is only a convention! Nothing requires this. Other conventions are used in some settings, but we will try to conform to the Java convention here.

In our road-network example, there will be classes like Road, Intersection and Vehicle. In our logic-circuit example, there will be classes like Gate and Wire. In our neural-network example, there will be classes like Neuron and Synapse.

The important thing to note in all of these examples is that we can actually construct a huge amount of the framework of a program by analyzing the classes that make up the problem and their relationship to each other. Significant parts of this work can be done long before we know what algorithms are involved, before we know what output the program is supposed to produce, and before we know what input the program will take.

Class Definition in Java

We spoke in the abstract about classes of objects in our discussion of modelling a road network, a digital logic circuit and a neural network. Now, let's talk about implementing these classes in Java. Initially, we'll talk about these classes as pure data. We'll add behavior later.

If we are modelling a road network, we might begin with the following classes:

class Road {
    // indent to here between braces
}

class Intersection {
}

An aside on code format

Note in the above that the closing brace for each block is aligned under the keyword that opened the block, while the opening brace is at the end of the line (except perhaps for a comment). This indenting style is preferred both by our textbook's author and by me.

There is a matter of style here. Java is perfectly happy if we write this without newlines, without comments, and with a minimum of spaces like this:

class Road{}class Intersection{}

That is not very readable, and the only reason to write code this way is to prevent it from being read. Unreadable code will not be tolerated in this class.

We could also write it as follows, keeping the opening and closing braces vertically aligned with everything between indented. This style tends to push code onto more lines, pushing code off the bottom of the editing window.

class Road
{
        // indent to here between braces
}

class Intersection
{
}

The style I use (and that the book uses) makes balancing brackets easy enough without pushing text off the bottom of the window.


Another question about code format is, how much should you indent? Short indents, for example, 4 spaces, allow deeper nesting than deep indents without adding pressure for longer lines. Tab stops in plain text files such as are used to store programs usually default to every 8 characters, a default dating back to the early Unix system from around 1970. This is also the default supported by web browsers when displaying .txt files and when displaying text between <pre> and </pre> tags. While most text editors allow you to set the tabs to other spacings, this causes trouble if you change editors, e-mail the code to someone else, or print the file using the default printer settings. So the best practice is to leave the tab setting at its default. If you want to use shorter indents, use the space bar, not the tab key.

But the human mind can only digest a certain amount of complexity before it is overwhelmed. If your program really needs more than 4 or 5 levels of indenting, it may be too complex to understand and perhaps it should be broken up into digestible components. This suggests that indenting using one tab per indenting level is reasonable. This was the standard indenting convention in C for the first 15 years of use of that language. The Sun/Oracle formatting standards for Java suggest that 4 spaces is reasonable, while allowing 8 spaces. They emphasize uniformity over the exact value of the indenting step. Generally, it is very bad form to mix code that uses 4-space indenting with code that uses 8-space indenting.

Line Length

Similar arguments suggest that long lines are not a good idea. Yes, the 80 column default for terminal windows is directly descended from the fact that punched cards have 80 characters each (a standard IBM introduced in 1928). This is an archaic reason for the default length of a line, but it is not a bad length. The 80 column standard is based on the length of a line of text on a typical page of typing paper. That, in turn, is based on experience with easy readability.

When pages get wider than on the order of 80 characters, it gets difficult for the reader to track from the end of one line to the start of the next. If you're reading this on the web with your web browser window maximized to take up the full screen, your reading speed will be significantly reduced compared to your reading speed with the window set to somewhere from 50 to 100 characters.

When faced with wide pages, people have long opted for multi-column text. This goes back to the days when hand-written ink on parchment was the standard, and it continues today in contexts such as large-format books and newspapers. Keep this in mind when you are tempted to simply widen your editing window and write really long lines of code.

Back to the road network

Regardless of how you indent it, the code given above is only a framework, but we can store it in a file and start testing immediately. Consider using the file RoadNetwork.java to hold the code for a road-network simulation.

Of course, we ought to document this file with appropriate commentary, so right up front, before starting to write any code, we'll add some notes:

// RoadNetwork.java - Classes needed to describe a road network

/** Roads are one-way connections between intersections
 *  @author Douglas Jones
 *  @version -1?
 *  @see Intersection
 */
class Road {
    // Bug: Lots of details are missing
}

/** Intersections join roads
 *  @see Road
 */
class Intersection {
    // Bug: Lots of details are missing
}

// Bug: Java demands that this file contain class RoadNetwork

The above commentary uses the Javadoc style of comments so that, later, when the program grows huge, we can use the javadoc tool to generate a documentation file from these comments. Note that Javadoc expects the comment documenting any specific class, field or method to be placed directly before the definition of that class, field or method.

In short, the special comment marker /** opens a Javadoc comment, and the marker */ closes the comment. Between these two markers, you can put arbitrary text, but the @ symbol causes the following text to be processed specially. Look up Javadoc in Wikipedia; that's not a bad introduction.

Test this! Save the above Java code and use the javac command to make sure you have not messed up, then come back and start thinking about the next step. Let's start fleshing out the first class:

class Road {
    float travelTime;         // measured in seconds
    Intersection destination; // where the road goes
    // Bug: do we need to know where this road comes from?
}

One attribute of each road is its length, but (at least for this class) we aren't as worried about the physical length of the road as how long it takes to travel down the road. So, we'll measure length in terms of the travel time for a vehicle going at the speed limit. The decision to measure travel time in seconds is arbitrary.

Once a vehicle enters a road, it must end up somewhere, so we also added a field that indicates what intersection we get to if we get on this road. This, in turn, implies that each road is a one-way connection from some source intersection to some destination intersection. If you want to model two-way roads, you do it with a pair of one-way roads, one for each direction. If you want to permit U turns at some point along a two-way road, you do it with a special intersection.
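As a minimal sketch of the pair-of-one-way-roads idea, the following code builds both directions of a two-way road. The class TwoWayDemo and the helper twoWay are hypothetical names invented for this illustration; the fields of Road are the ones given above, and Intersection is left empty.

```java
class Intersection {
    // Bug: Lots of details are missing
}

class Road {
    float travelTime;         // measured in seconds
    Intersection destination; // where the road goes
}

// hypothetical demonstration class, not part of the road-network code
class TwoWayDemo {
    /** Build the two one-way roads that model one two-way road
     *  @param a the intersection at one end
     *  @param b the intersection at the other end
     *  @param travelTime the travel time in each direction, in seconds
     *  @return the road from a to b, followed by the road from b to a
     */
    static Road[] twoWay( Intersection a, Intersection b, float travelTime ) {
        Road ab = new Road();
        ab.travelTime = travelTime;
        ab.destination = b;

        Road ba = new Road();
        ba.travelTime = travelTime;
        ba.destination = a;

        return new Road[] { ab, ba };
    }

    public static void main( String[] args ) {
        Intersection a = new Intersection();
        Intersection b = new Intersection();
        Road[] pair = twoWay( a, b, 30.0f );
        System.out.println( pair[0].destination == b ); // the a-to-b road ends at b
        System.out.println( pair[1].destination == a ); // the b-to-a road ends at a
    }
}
```

Note that nothing in either Road object records that the two roads are related. Whether that matters is another question that can wait until we think about algorithms.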

We also added a comment indicating a currently unanswered question: When looking at a road, do we ever need to know where that road came from? We will probably learn this much later, when we start thinking about algorithms.

It is a really good idea to adopt a convention of writing comments in your code to document bugs and other things you don't understand. If you consistently use a word like Bug to mark such comments, you'll have a very easy time finding places you marked earlier as needing work. Do not put off writing comments until the end. Time spent thinking about how to comment your code is usually time well spent because it forces you to think about the code you have and recognize bugs early in the design process.

An aside on capitalization

The above code fragment illustrates two issues: The first has to do with multiple-word variable names. It might have been nice to call one variable travel time and the other intersection destination, but in Java, you cannot put spaces inside an identifier. Other languages differ. In FORTRAN (the oldest high-level programming language), spaces are allowed in identifiers. In fact, in FORTRAN, all spaces are ignored, so they can be added at random.

An alternative to the style used in the code here (and in the textbook) is to use underscore as a space character in identifiers, for example, travel_time. This is a very popular style.

The style used in the text, and here, has been called StudlyCaps as if there is something masculine about squeezing out the spaces and capitalizing the first letter of each word, and also BiCapitalization.

The second issue surrounding the use of capital letters is a matter of convention: Here, the first letter of each class name is capitalized, while this is not done for variable names. When you define a new symbol, you can capitalize it any way you want, but conventions can improve readability.

So why aren't the names of built-in classes like int and float capitalized? There are two explanations:

First: We could claim that this is to emphasize the fact that int and float are not quite first-class classes. If a Java object is from a first-class class, it inherits a large number of attributes from the superclass of all classes. This has a high cost. Objects of built-in classes like int and float don't inherit these attributes. They have much more limited semantics in order to allow very efficient execution.

There is a full-scale class, Integer, that does everything that class int does, but more slowly. Each Integer has a single field of type int. Similarly, there is a class Float. These classes are useful because they contain a number of attributes and methods supporting the built-in classes.
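The following sketch illustrates the relationship between int and Integer. The class name IntegerDemo and the method sum are hypothetical; Integer.parseInt, Integer.MAX_VALUE and the automatic conversions between int and Integer are standard Java.

```java
import java.util.ArrayList;
import java.util.List;

// hypothetical demonstration class
class IntegerDemo {
    /** Add up a list of Integer objects
     *  @param values the objects to add
     *  @return the sum, as a plain int
     */
    static int sum( List<Integer> values ) {
        int total = 0;            // a plain int, with no object overhead
        for (Integer v : values) {
            total = total + v;    // v is converted to an int automatically
        }
        return total;
    }

    public static void main( String[] args ) {
        // List<int> is not legal Java, so collections must hold Integer objects
        List<Integer> values = new ArrayList<Integer>();
        values.add( 3 );          // the int 3 is converted to an Integer
        values.add( 4 );
        System.out.println( sum( values ) );

        System.out.println( Integer.parseInt( "42" ) + 1 ); // a method supporting int
        System.out.println( Integer.MAX_VALUE );            // an attribute supporting int
    }
}
```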

Second: We could give the actual explanation. The type names int and float come from C and C++. Java didn't change things that worked just fine in those older languages.

Back to the road network

We can continue fleshing out our road network by adding comments to the definition of an intersection. We have some problems to solve here: How does one include a set of outgoing roads in a class? How does one create a class that comes in several types -- uncontrolled intersections, intersections where some road has a stop sign, intersections where all incoming roads have stop signs? Does the intersection even need to know the identities of its incoming roads?

/** Intersections join roads
 *  @see Road
 */
class Intersection {
    // Bug: multiple outgoing roads
    // Bug: multiple incoming roads
    // Bug: multiple types of intersections (uncontrolled, stoplight)
}
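One possible way to attack the first and third of these bugs is sketched below. This is only a guess at a solution, not a decision: it assumes that a list from java.util is good enough to hold the set of outgoing roads, and that each type of intersection can be a subclass of Intersection. The names outgoing, Uncontrolled, StopLight, cycleTime and IntersectionDemo are all hypothetical.

```java
import java.util.LinkedList;
import java.util.List;

class Road {
    float travelTime;         // measured in seconds
    Intersection destination; // where the road goes
}

abstract class Intersection {
    // assumption: a list is good enough to hold the set of outgoing roads
    final List<Road> outgoing = new LinkedList<Road>();
    // Bug: multiple incoming roads
}

// hypothetical subclasses, one for each type of intersection
class Uncontrolled extends Intersection {
}

class StopLight extends Intersection {
    float cycleTime; // assumed attribute, seconds per red-green cycle
}

// hypothetical demonstration class
class IntersectionDemo {
    public static void main( String[] args ) {
        Intersection a = new Uncontrolled();
        Intersection b = new StopLight();

        Road r = new Road();       // a one-way road from a to b
        r.travelTime = 45.0f;
        r.destination = b;
        a.outgoing.add( r );

        System.out.println( a.outgoing.size() ); // roads leaving a
        System.out.println( b.outgoing.size() ); // roads leaving b
    }
}
```

Making Intersection abstract means that every intersection object must belong to one of the specific types. Whether that restriction is appropriate depends on how the model will be used.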

Class Vehicle has the potential to have attributes like cargo capacity and passenger capacity, but those depend on why we are building the model. Initially, our biggest question about vehicles is, does the vehicle need to know its current location? The answers to these questions depend on how we use the model, but we need to go quite some distance before that matters.

/** Vehicles travel on roads through intersections
 *  @see Intersection
 *  @see Road
 */
class Vehicle {
    // Bug: what are the relevant attributes of a vehicle?
    // Bug: do vehicles need to know their current location?
}

Finally, as mentioned in the previous lecture, we will eventually need to worry about events. We will put off that issue until we dive into discrete event simulation in considerably more detail.

Discrete Event Simulation

In a discrete-event simulation model, the state of the system being simulated is described by a set of state variables, for example, variables describing where the vehicles are in a highway network, or variables describing the state of each wire in a digital computer as either true or false. Any changes to these variables are described as events. For example, a vehicle arriving at an intersection is an event, and a logic gate changing its output from true to false is an event.

In a discrete-event model, events are instantaneous and nothing at all happens between events. This is obviously a matter of abstraction or simplification. In the real world, all changes above the quantum level take time.

The basic discrete-event simulation algorithm has been known since the mid 1960s, and the algorithm applies to just about any discrete event model. The central feature of the model is a data structure, the pending event set. This is, as its name suggests, a set of pending events, that is, events that have been caused by events in the past but have not yet occurred.

The basic operations on the pending event set are:

schedule e
The event e is inserted into the pending event set. The attribute e.time is the time at which the event is scheduled to occur.

e = getNext
An event e is extracted from the pending event set, where e.time is less than or equal to the times of all other events in the set. If it happens that other events in the set are scheduled to occur at exactly the same time, the model may be nondeterministic.
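These two operations can be sketched in Java using the standard class java.util.PriorityQueue. This is one of many possible implementations, not necessarily the one we will use. The class names Event, PendingEventSet and EventSetDemo, the name field and the isEmpty method are assumptions made for this illustration.

```java
import java.util.PriorityQueue;

class Event {
    final float time;  // the time at which the event is scheduled to occur
    final String name; // hypothetical label, used only for demonstration

    Event( float time, String name ) {
        this.time = time;
        this.name = name;
    }
}

class PendingEventSet {
    // the queue is ordered by time, so the earliest event comes out first
    private final PriorityQueue<Event> queue = new PriorityQueue<Event>(
        (Event a, Event b) -> Float.compare( a.time, b.time )
    );

    /** Insert event e into the pending event set */
    void schedule( Event e ) {
        queue.add( e );
    }

    /** Extract an event with the minimum time
     *  @return that event, or null if the set is empty
     */
    Event getNext() {
        return queue.poll();
    }

    boolean isEmpty() {
        return queue.isEmpty();
    }
}

// hypothetical demonstration class
class EventSetDemo {
    public static void main( String[] args ) {
        PendingEventSet eventSet = new PendingEventSet();
        eventSet.schedule( new Event( 30.0f, "light changes" ) );
        eventSet.schedule( new Event( 5.0f, "vehicle arrives" ) );
        eventSet.schedule( new Event( 12.5f, "vehicle departs" ) );
        while (!eventSet.isEmpty()) {
            Event e = eventSet.getNext(); // events emerge in order of time
            System.out.println( e.time + " " + e.name );
        }
    }
}
```

PriorityQueue makes no promise about the order in which events with equal times are extracted, so this sketch does nothing to resolve simultaneous events.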

Some discrete-event models allow previously scheduled events to be deleted. Some higher level models consider events that extend over time, but we will ignore these. Events that take time can be modeled by a sequence of instantaneous events, for example, one marking the start of a long event and another to mark its end, and if an event may need to be cancelled, this can be indicated by adding a state variable to indicate whether that event should do its usual work or do nothing.
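The idea of cancelling an event with a state variable can be sketched as follows. The class names Timeout and CancelDemo, and the counter fired that stands in for the usual work of the event, are hypothetical.

```java
class Timeout {
    final float time;                  // when the event is scheduled to occur
    private boolean cancelled = false; // the added state variable
    int fired = 0;                     // counts how often the usual work was done

    Timeout( float time ) {
        this.time = time;
    }

    /** Mark this event so that it does nothing when it occurs */
    void cancel() {
        cancelled = true;
    }

    /** Called by the simulation when the event occurs */
    void trigger() {
        if (cancelled) return; // a cancelled event does nothing
        fired = fired + 1;     // stands in for the usual work
    }
}

// hypothetical demonstration class
class CancelDemo {
    public static void main( String[] args ) {
        Timeout kept = new Timeout( 10.0f );
        Timeout dropped = new Timeout( 20.0f );
        dropped.cancel();      // the event itself would stay in the pending event set
        kept.trigger();
        dropped.trigger();
        System.out.println( kept.fired + " " + dropped.fired );
    }
}
```

The cancelled event is never deleted. It is still extracted from the pending event set at its scheduled time, so the pending event set needs no delete operation.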

Before the simulation starts, initial conditions are set up. These include the initial state of the model, and they include initial events. Without the initial events, nothing would happen ever. Initial events might, for example, mark the arrival of vehicles from the outside world and, for intersections with stop lights, start those lights on their red-green cycles.

The basic algorithm for discrete-event simulation can be stated in a Java-like language as follows:

// initialize the model
eventSet.schedule( x ) // for all x in the set of initial events

// run the simulation
repeat {
    e = eventSet.getNext
    // simulate event e at time e.time
    // this may involve scheduling new events
}

There are two ways to terminate such a simulation: Either one of the initially scheduled events is a "terminate" event marking the end of time, or the model runs until the event set is empty.

In either case, within any iteration of the loop, e.time is the current time, and time lurches forward from event to event in an irregular manner. This is quite different from simulation models in which time advances in uniform-sized ticks; that type of model is more commonly used to deal with systems that are described by differential equations.

Note however that within the simulation of any event, the model may use arbitrarily complex code to predict the times of future events. In extreme cases, this prediction might involve running other simulation models, for example, involving differential equations.
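To make the basic algorithm concrete, here is a small runnable sketch in real Java. It assumes that each event carries an action to perform when the event occurs, and it terminates when the event set is empty. The names Simulator, Action, trigger, LightDemo, change and log are hypothetical, and the model, a single stop light that changes every 30 seconds until time 100, is invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class Simulator {
    /** Hypothetical interface, what to do when an event occurs */
    interface Action {
        void trigger( float time );
    }

    private static class Event {
        final float time; // when the event occurs
        final Action act; // what to do at that time
        Event( float time, Action act ) {
            this.time = time;
            this.act = act;
        }
    }

    private static final PriorityQueue<Event> eventSet = new PriorityQueue<Event>(
        (Event a, Event b) -> Float.compare( a.time, b.time )
    );

    static void schedule( float time, Action act ) {
        eventSet.add( new Event( time, act ) );
    }

    /** Run the simulation until the event set is empty */
    static void run() {
        while (!eventSet.isEmpty()) {
            Event e = eventSet.remove();
            e.act.trigger( e.time ); // this may involve scheduling new events
        }
    }
}

class LightDemo {
    static final List<String> log = new ArrayList<String>();

    /** The light changes, then schedules its next change */
    static void change( float time, boolean green ) {
        log.add( time + " " + (green ? "green" : "red") );
        if (time + 30.0f <= 100.0f) {
            Simulator.schedule( time + 30.0f, (float t) -> change( t, !green ) );
        }
    }

    public static void main( String[] args ) {
        // initialize the model with one initial event
        Simulator.schedule( 0.0f, (float t) -> change( t, true ) );
        Simulator.run();
        for (String line : log) {
            System.out.println( line );
        }
    }
}
```

Here, each event schedules at most one successor, and the event at time 90 schedules none, so the event set eventually empties and the loop ends. A model that used a "terminate" event instead would have that event's action halt the program.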

A Classic Example

Consider the problem of modelling a bank. We begin with an informal description of one customer's visit to the bank: You arrive at the bank, wait in line for an available teller, interact with that teller for a while, and then leave. Tellers may need to do some paperwork or computer work after a customer departs before they are ready for the next customer. From this description, we can determine that our model will have the following objects and classes:

A little more thought allows us to refine this by identifying events: