5. Everything is an instance of a class
Part of
CS:2820 Object Oriented Software Development Notes, Spring 2021
The textbook has a chapter titled Everything is an Object, and in the world of object-oriented programming, that is true. When you look at a large programming problem and think about how to create code, every noun you find in the problem description is a very good candidate for the name of either an object or a class of objects in the program that solves that problem.
For example, when you look at the screen of a computer, you typically see windows, icons and a cursor on that screen. An object-oriented implementation of the window manager for that computer will almost certainly be built on classes with names like Window, Icon and Cursor. If the window manager only supports one screen, there will probably be an object called screen. If the window manager supports multiple screens, there may well be a class, Screen, with one object of this class per screen, and possibly an object named currentScreen that names the screen the user is currently focused on because the cursor is there. By convention, Java class names are almost always capitalized, while object names usually begin with a lower-case letter. This is only a convention! Nothing requires this. Other conventions are used in some settings, but we will try to conform to the Java convention here.
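A minimal sketch of this naming convention, assuming a window manager that supports multiple screens. The fields width and height, the class ScreenDemo and the method pickCurrent are hypothetical names invented for this illustration.

```java
// Hypothetical sketch: class names capitalized, object names not.
class Screen {
    int width;   // assumed attribute, in pixels
    int height;  // assumed attribute, in pixels
}

class ScreenDemo {
    /** Build two screens and return the one the user is focused on. */
    static Screen pickCurrent() {
        Screen leftScreen = new Screen();   // one object per screen
        Screen rightScreen = new Screen();
        leftScreen.width = 1920;
        rightScreen.width = 1280;
        // currentScreen is a second name for an existing object, not a copy
        Screen currentScreen = rightScreen;
        return currentScreen;
    }

    public static void main(String[] args) {
        System.out.println(pickCurrent().width);
    }
}
```

Here Screen names the class, while leftScreen, rightScreen and currentScreen name objects of that class.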
In our road-network example, there will be classes like Road, Intersection and Vehicle. In our logic-circuit example, there will be classes like Gate and Wire. In our neural-network example, there will be classes like Neuron and Synapse.
The important thing to note in all of these examples is that we can actually construct a huge amount of the framework of a program by analyzing the classes that make up the problem and their relationship to each other. Significant parts of this work can be done long before we know what algorithms are involved, before we know what output the program is supposed to produce, and before we know what input the program will take.
We spoke in the abstract about classes of objects in our discussion of modelling a road network, a digital logic circuit and a neural network. Now, let's talk about implementing these classes in Java. Initially, we'll talk about these classes as pure data. We'll add behavior later.
If we are modelling a road network, we might begin with the following classes:
class Road {
    // indent to here between braces
}
class Intersection {
}
Note in the above that the closing brace for each block is aligned under the keyword that opened the block, while the opening brace is at the end of the line (except perhaps for a comment). This indenting style is preferred both by our textbook's author and by me.
There is a matter of style here. Java is perfectly happy if we write this without newlines, without comments, and with a minimum of spaces like this:
class Road{}class Intersection{}
That is not very readable, and the only reason to write code this way is to prevent it from being read. Unreadable code will not be tolerated in this class. Languages such as Python force you to use newlines and indenting, but Algol 60, Simula 67, C, Pascal, C++ and Java, among many others, leave indenting and newlines entirely to the programmer.
We could also write it as follows, keeping the opening and closing braces vertically aligned with everything between indented. This style tends to push code onto more lines, pushing code off the bottom of the editing window.
class Road
{
    // indent to here between braces
}
class Intersection
{
}
The style I use (and that the book uses) makes balancing brackets easy enough without pushing text off the bottom of your editing window.
Another question about code format is how much you should indent. Short indents, for example, 4 spaces, allow deeper nesting than deep indents without adding pressure for longer lines. Tab stops in plain text files such as are used to store programs usually default to every 8 characters, a default dating back to the early Unix system from around 1970. This is also the default supported by web browsers when displaying .txt files and when displaying text between <pre> and </pre> tags in HTML. While most text editors allow you to set the tabs to other spacings, this causes trouble if you change editors, e-mail the code to someone else, or print the file using the default printer settings. So the best practice is to leave the tab setting at its default. If you want to indent by less than a full tab stop, use the space bar, not the tab key.
The human mind can only digest a certain amount of complexity before it is overwhelmed. If your program really needs more than 4 or 5 levels of indenting, it may be too complex to understand and perhaps it should be broken up into digestible components. This suggests that indenting using one tab per indenting level is reasonable. This was the standard indenting convention in C for the first 15 years of use of that language. The Sun/Oracle formatting standards for Java suggest that 4 spaces is reasonable, while allowing 8 spaces. They emphasize uniformity over the exact value of the indenting step. Generally, it is very bad form to mix code that uses 4-space indenting with code that uses 8-space indenting.
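To make the point about breaking up deeply nested code concrete, here is a small sketch. The class NestingDemo and its methods are hypothetical, invented only for this illustration. The first method needs six levels of indenting, while the split version never needs more than four.

```java
// Hypothetical example: the same computation, nested and split up.
class NestingDemo {
    /** Deeply nested version: count the positive entries in a grid. */
    static int countNested(int[][] grid) {
        int count = 0;
        if (grid != null) {
            for (int[] row : grid) {
                if (row != null) {
                    for (int value : row) {
                        if (value > 0) {
                            count = count + 1;
                        }
                    }
                }
            }
        }
        return count;
    }

    /** Helper holding the inner levels: count positives in one row. */
    static int countRow(int[] row) {
        int count = 0;
        if (row != null) {
            for (int value : row) {
                if (value > 0) {
                    count = count + 1;
                }
            }
        }
        return count;
    }

    /** Same result as countNested, with the nesting split in two. */
    static int countSplit(int[][] grid) {
        int count = 0;
        if (grid != null) {
            for (int[] row : grid) {
                count = count + countRow(row);
            }
        }
        return count;
    }
}
```

Each of the two smaller methods can be read and tested on its own, which is the practical benefit of the split.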
Similar arguments suggest that long lines are not a good idea. Yes, the 80 column default for terminal windows is directly descended from the fact that punched cards have 80 characters each (a standard IBM introduced in 1928). This is an archaic reason for the default length of a line, but it is not a bad length. The 80 column standard is based on the length of a line of text on a typical page of typing paper. That, in turn, is based on experience with easy readability.
When pages get wider than on the order of 80 characters, it gets difficult for the reader to track from the end of one line to the start of the next. If you're reading this on the web with your web browser window maximized to take up the full screen, your reading speed will be significantly reduced compared to your reading speed with the window width set somewhere in the range from 50 to 100 characters.
When faced with wide pages, people have long opted for multi-column text. This goes back to the days when hand-written ink on parchment was the standard, and it continues today in contexts such as large-format books and newspapers. Keep this in mind when you are tempted to simply widen your editing window and write really long lines of code. Oracle's standard for Java formatting requires lines to be no more than 80 characters. We will enforce this standard.
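As a sketch of how to stay within 80 columns, a long parameter list or expression can be broken across lines. The class LineLengthDemo, the method describeTrip and its parameter names are hypothetical, and the particular way the continuation lines are indented is one reasonable choice, not a rule taken from these notes.

```java
// Hypothetical example of wrapping long lines to fit in 80 columns.
class LineLengthDemo {
    /** Describe a trip between two named intersections. */
    static String describeTrip(String sourceIntersection,
                               String destinationIntersection,
                               float travelTimeInSeconds) {
        // the long concatenation is split before each + operator, and
        // continuation lines are indented so they do not look like
        // the start of a new statement
        return "from " + sourceIntersection
            + " to " + destinationIntersection
            + " takes " + travelTimeInSeconds + " seconds";
    }

    public static void main(String[] args) {
        System.out.println(describeTrip("A", "B", 30.0f));
    }
}
```

Breaking at an operator or after a comma keeps each line short while leaving the structure of the statement visible.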
Regardless of how you indent it, the code given above is only a framework, but we can store it in a file and start testing immediately. Consider using the file RoadNetwork.java to hold the code for a road-network simulation.
Of course, we ought to document this file with appropriate commentary, so right up front, before starting to write any code, we'll add some notes:
// RoadNetwork.java - Classes needed to describe a road network

/** Roads are one-way connections between intersections
 *  @author Douglas Jones
 *  @version -1?
 *  @see Intersection
 */
class Road {
    // Bug: Lots of details are missing
}

/** Intersections join roads
 *  @see Road
 */
class Intersection {
    // Bug: Lots of details are missing
}

// Bug: Java demands that this file contain class RoadNetwork
The above commentary uses the Javadoc style of comments so that, later, when the program grows huge, we can use the javadoc tool to generate a documentation file from these comments. Note that Javadoc insists that the comment documenting any specific class, field or method be placed directly before the definition it documents.
In short, the special comment marker /** opens a Javadoc comment, and the marker */ closes the comment. Between these two markers, you can put arbitrary text, but the @ symbol causes the following text to be processed specially. Look up Javadoc in Wikipedia; that's not a bad introduction.
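As a hedged illustration of how these markers and @ tags look on a field and a method as well as on a class, here is a small sketch. The class Pavement, its field and method, and the author and version values are hypothetical placeholders. The tags @author, @version, @param and @return are standard Javadoc tags.

```java
/** A stretch of pavement; a hypothetical class used only to show tags
 *  @author A. Student
 *  @version 2021-02-01
 */
class Pavement {
    /** length of the pavement, in meters */
    float length;

    /** Compute how long it takes to travel the length of this pavement
     *  @param speed the speed of travel, in meters per second
     *  @return the travel time, in seconds
     */
    float timeAt(float speed) {
        return length / speed;
    }
}
```

Each Javadoc comment sits directly before the class, field or method it documents, and the text following each @ tag is what the javadoc tool treats specially.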
Test this! Save the above Java code and use the javac command to make sure you have not messed up, then come back and start thinking about the next step. Let's start fleshing out the first class:
class Road {
    float travelTime;         //measured in seconds
    Intersection destination; //where the road goes
    // Bug: do we need to know where this road comes from?
}
One attribute of each road is its length, but (at least for this class) we aren't as worried about the physical length of the road as how long it takes to travel down the road. So, we'll measure length in terms of the travel time for a vehicle going at the speed limit. The decision to measure travel time in seconds is arbitrary.
Once a vehicle enters a road, it must end up somewhere, so we also added a field that indicates what intersection we get to if we get on this road. This, in turn, implies that each road is a one-way connection from some source intersection to some destination intersection. If you want to model two-way roads, you do it with a pair of one-way roads, one for each direction. If you want to permit U turns at some point along a two-way road, you do it by adding an intersection.
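A minimal sketch of modelling a two-way road as a pair of one-way roads, using the Road fields introduced above. The class TwoWayDemo and the helper method oneWay are hypothetical, invented for this illustration, and the 30-second travel time is an arbitrary value.

```java
class Intersection {
    // details omitted, as in the notes so far
}

class Road {
    float travelTime;          // measured in seconds
    Intersection destination;  // where the road goes
}

// Hypothetical demonstration class, not part of the road-network code.
class TwoWayDemo {
    /** Build a one-way road leading to the given intersection. */
    static Road oneWay(Intersection to, float time) {
        Road r = new Road();
        r.travelTime = time;
        r.destination = to;
        return r;
    }

    public static void main(String[] args) {
        Intersection a = new Intersection();
        Intersection b = new Intersection();
        // a two-way road between a and b is two one-way roads
        Road aToB = oneWay(b, 30.0f);
        Road bToA = oneWay(a, 30.0f);
        System.out.println(aToB.destination == b);
        System.out.println(bToA.destination == a);
    }
}
```

Nothing in either Road object records that the two roads are paired; in this sketch that relationship exists only in the names of the variables.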
We also added a comment indicating a currently unanswered question: When looking at a road, do we ever need to know where that road came from? We don't need the answer immediately, but if we need this information, we left a comment, a bug notice, indicating where the information should be stored if we do need it. Later, after we start thinking about simulation algorithms, we'll find the answer.
It is a really good idea to adopt a convention of writing comments in your code to document bugs and other things you don't understand. If you consistently use a word like Bug to mark such comments, you'll have a very easy time finding places you marked earlier as needing work. Do not put off writing comments until the end. Time spent thinking about how to comment your code is usually time well spent because it forces you to think about the code you have and recognize bugs early in the design process.
As an incentive to think about comments early, if you need help with code and we see that it does not have comments, we'll ask you to fix that before we look at the code.
The above code fragment illustrates two issues: The first has to do with multiple-word variable names. It might have been nice to call one variable travel time and the other intersection destination, but in Java, you cannot put spaces inside an identifier. Other languages differ. In FORTRAN (the oldest high-level programming language), spaces are allowed in identifiers. In fact, in FORTRAN, all spaces are ignored, so they can be added at random.
An alternative to the style used in the code here (and in the textbook) is to use underscore as a space character in identifiers, for example, travel_time. This is a very popular style.
The style used in the text, and here, has been called StudlyCaps (as if there is something masculine about squeezing out the spaces and capitalizing the first letter of each word) and also BiCapitalization.
The second issue surrounding the use of capital letters is a matter of convention: Here, the first letter of each class name is capitalized, while this is not done for variable names. When you define a new symbol, you can capitalize it any way you want, but conventions can improve readability.
So why aren't the names of built-in classes like int and float capitalized? There are two explanations:
First: We could claim that this is to emphasize the fact that int and float are not quite first-class classes. If a Java object is from a first-class class, it inherits a large number of attributes from the superclass of all classes. This has a high cost. Objects of built-in classes like int and float don't inherit these attributes. They have much more limited semantics in order to allow very efficient execution.
There is a full-scale class, Integer, that does everything that class int does, but more slowly. Each Integer has a single field of type int. Similarly, there is a class Float. These classes are useful because they contain a number of attributes and methods supporting the built-in classes.
Second: We could give the actual explanation. The type names int and float come from C and C++. Java didn't change things that worked just fine in those older languages.
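The relationship between int and Integer described above can be seen in a few lines. This is only a sketch; the class BoxingDemo and the method parse are hypothetical names, while Integer.parseInt, Integer.MAX_VALUE and toString are part of the standard Java library.

```java
// Hypothetical demonstration of int versus the full-scale class Integer.
class BoxingDemo {
    /** Convert decimal text to an int using a method of class Integer. */
    static int parse(String text) {
        return Integer.parseInt(text);
    }

    public static void main(String[] args) {
        int small = 42;         // built-in type, no methods of its own
        Integer boxed = small;  // an object holding the same value
        // methods inherited from the superclass of all classes are
        // available on the Integer object but not on the int
        System.out.println(boxed.toString());
        // attributes supporting int live in class Integer
        System.out.println(Integer.MAX_VALUE);
        System.out.println(parse("2820") + 1);
    }
}
```

Writing small.toString() instead of boxed.toString() would be rejected by the compiler, because an int is not an object.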
We can continue fleshing out our road network by adding comments to the definition of an intersection. We have some problems to solve here: How does one include a set of outgoing roads in a class? How does one create a class that comes in several types: uncontrolled intersections, intersections where some road has a stop sign, intersections where all incoming roads have stop signs? Does the intersection even need to know the identities of its incoming roads?
/** Intersections join roads
 *  @see Road
 */
class Intersection {
    // Bug: multiple outgoing roads
    // Bug: multiple incoming roads
    // Bug: multiple types of intersections (uncontrolled, stoplight)
}
Class Vehicle has the potential to have attributes like cargo capacity and passenger capacity, but those depend on why we are building the model. Initially, our biggest question about vehicles is, does the vehicle need to know its current location? The answers to these questions depend on how we use the model, but we need to go quite some distance before that matters.
/** Vehicles travel on roads through intersections
 *  @see Intersection
 *  @see Road
 */
class Vehicle {
    // Bug: what are the relevant attributes of a vehicle?
    // Bug: do vehicles need to know their current location?
}
Finally, as mentioned in the previous lecture, we will eventually need to worry about events. We will put off that issue until we dive into discrete event simulation in considerably more detail.