26. Variations, Static and Cookies

Part of CS:2820 Object Oriented Software Development Notes, Fall 2015
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

 

What if we eliminate Static?

Some object-oriented languages have no static fields of classes. Suppose we have something like this:

class Simulator {
	private static PriorityQueue  eventSet

	static void schedule( Event e ) {
		/** Call schedule to make e happen at its time.
		 */
		eventSet.add( e );
	}

	static void run() {
		while (!eventSet.isEmpty()) {
			Event e = eventSet.remove();
			e.trigger();
		}
	}
}

How do we rewrite this if we can't use static fields? The first step is obvious, delete the word static, but this breaks many things. The main program is going to have to declare an object of class Simulator and pass that as a parameter each time a call is made to some code that needs to schedule an event. Thus, we begin with this code in the main program (this code is from the road network example):


		try {
			Simulator s = new Simulator();
		        readNetwork( s, new Scanner( new File( args[0] )));
			checkNetwork();
			writeNetwork();
			s.run();
		} catch (FileNotFoundException e) {
			Errors.fatal( "Can't open file '" + args[0] + "'" );
		}

Note that we had to pass the simulator to readNetwork() so that readNetwork() could pass it to new Source() so that the initializer for Source() can add a new ExitIntersectionEvent to the pending event set.

This, in turn, requires that class Simulator.event know the identity of the simulator, either taking advantage of the knowledge each inner class has of the identity of its outer class, or by having the simulator passed as an explicit parameter. We take the latter path in the following code:

class Simulator {

	public static abstract class Event {
		public float time;       // the time of this event
		abstract void trigger( Simulator s ); // the action to take
	}

	private static PriorityQueue  eventSet
	= new PriorityQueue  (
		(Event e1, Event e2) -> Float.compare( e1.time, e2.time )
	);

	static void schedule( Event e ) {
		/** Call schedule to make e happen at its time.
		 */
		eventSet.add( e );
	}

	static void run() {
		/** Call run after scheduling some initial events
		 *  to run the simulation.
		 */
		while (!eventSet.isEmpty()) {
			Event e = eventSet.remove();
			e.trigger( this );
		}
	}
}

Of course, we must now rewrite the trigger methods of each class of event.

Bad Consequences

The lack of static fields leads to bad consequences. Chief among these is the temptation to create multiple instances of classes that should logically have only one instance. For example, the basic discrete-event simulation algorithm only admits to having one pending event set. If the designers of a large application accidentally create multiple pending event sets, the application will almost certainly break because the simulated time will no-longer advance monotonically. A simulated world in which time can move backward is simply not compatable with reality as we know it.

This temptation to create multiple instances has its greatest risks when an application contains a pseudo-random number generator (PRNG). For practical purposes, PRNGs appear to produce sequences of random numbers, but there are two differences between a PRNG and a real random number generator.

  1. PRNGs can be "seeded" to start the random number stream with a specific value. If the initializer for a PRNG begins by setting a constant seed, the PRNG will produce the exact same stream of values each time it is called. This is extremely important for debugging, because it means that the program will produce exactly the same outputs whenever it is presented with the same inputs, so you can re-run it to see if you have fixed the bug. With genuinely random numbers, you may never get the same behavior twice when you re-run a program, so if you do find a bug, you may never be able to reproduce it.

  2. PRNGs produce a finite number of values and then repeat the same sequence. So long as the PRNG is used to produce fewer values than the length of one cycle, a good PRNG will pass every test for randomness, but as soon as it reaches the end of the cycle and starts over, it will be obvious that it is a repeating cycle. So long as the cycle length is longer than the number of random numbers you need during the life of your application, this is not a problem, and good PRNGs have cycle lengths measured in the billions or more.

Now, what happens if you have a PRNG class and you create two instances with different seeds:

	a = new PRNG( 14 );
	b = new PRNG( 27584 );

Streams a and b have been seeded above with wildly unrealted numbers, but nothing about the relationship between 14 and 27584 tells you anything about the relative offsets of the two streams. If the cycle length of the PRNG you are using is c, the numbers 14 and 27584 may be c/2 apart in the sequence, in which case you can generate c/2 pseudo-random values from either stream before the two streams begin to show any correlation. On the other hand, it could just as easily be that streams a and b end up displaced by one from each other, so that you start seeing correlations as soon as you draw the second number from either stream. If you picked truly random seeds, the expected value of the offset between the two streams is c/4.

Better, just use one pseudo-random stream and know you can draw c numbers from it for any purpose before you start seeing repeats.

So why not use two streams that use different generators? The biggest reason is that good pseudo-random number generators are really hard to develop. There are many bad ones, several of which have seen many decades of use and have been republished so many times that they are very difficult to stamp out. Merely finding a published PRNG and using it does not guarantee its quality because so many of the bad ones have been published so many times.

So, how do you set the seed if you want a PRNG to actually behave as if it was random? Use a random value from the environment. The time of day at which your application is started, to the millisecond, the time interval between the last two mouse clicks, keypresses or screen touches and other similar values can be harvested from the environment. Some operating systems actually include services that harvest randomness using this kind of approach to provide genuinely random data, at the rate of a few bits of randomness per event on each source of randomness. Use these kinds of things to set your seed to create the appearance of genuinely random behavior.

What if we only have Static fields?

Some systems have mechanisms that strongly resemble classes except that all fields are static. This is not so common in programming languages, but rather, in operating systems and network programming environments. Consider the following class:

class C {
	private Field f;
	public C() { // the initializer
		f = ??
	}
	public Field M() { // an example method
		return f;
	}
}

How do we rewrite this if all fields are static? The key is to use cookies.

While most of us think of cookies as a phenomonon of the World-Wide Web, where web sites leave cookies on your machine, forcing you to carry information about yourself on behalf of the web site (such as tracking information or personal browsing histories), the term appears to have originated in the documentaiton for the standard library of the C programming language. Specifically, the library includes services called called ftell() and fseek() that can be used inquire about the current position of an input/output stream and to set the position in that stream.

In the original Unix implementation of the C stream file model, positions in the stream were simply the integer number of bytes from the start of the stream, but when C was ported other operating systems, the developers quickly learned that some file systems made it difficult to count bytes from the start of a file. Here is how they documented file positions in the Unix Programmer's Manual, Vol I, part 3, published by Bell Laboratories in 1979, revised 1983:

Ftell returns the current value of the offset relative to the beginning of the file associated with the named stream. It is measured in bytes on UNIX; on some other systems, it is a magic cookie, and the only foolproof way to obtain an offset for fseek.

The general definition of a magic cookie you can infer from this brief mention is that it is a value returned to the user by some system where the system understands the construction and use of that value but the user is not expected to be able to interpret or manipulate it. The only useful thing the user can do with a cookie is give it back to the system so that the system can interpret it.

So, how do we use cookies to represent objects in a world where we still have something akin to classes, but all fields and methods are static? We use cookies as object handles, and we have each class maintain a collection of all objects of that class. Here is a rewrite of the example class given above under these constraints, using integer cookies:

static class C {
	static private int alloc = -1;   // used to allocate instances
	static private Field f[maxSize]; // all of the instances
	static public int C() { // the initializer
		alloc = alloc + 1;
		f[alloc] = ??
		return alloc;   // return a handle for the new instance
	}
	public Field M( int handle ) { // an example method
		return f[handle];
	}
}

The above implementation assumes that the constant maxSize gives the maximum number of members of class C that will ever be needed. More sophisticated implementations will allow users to deallocate objects as well as allocate them, and will include (in the allocator) a search for a free object so that storage can be reused.

In the late 1960s and early 1970s, when object-oriented programming was just being invented, people like David Parnas developed methods of modularizing programs, creating software architectures that were almost object-oriented while writing code in languages like Fortran IV. Fortran IV does not support classes, but it is straightforward to write compilation units (separately compiled pieces of a large program) where each unit contains one callable subroutine per method of a logical class and a common block holding the representations of all instances of that class.

This method remains in use today in distributed systems where code is divided between clients and servers. In such a system, it is natural to have one server to implement each class, where the server holds all values of that class in its internal memory and offers clients the ability to perform actions on members of the class.

The biggest weakness of the basic cookie idea is that it has no security. Handles are integers, and if a program accidentally (or intentionally) uses the wrong handle, it will access the wrong object. Nothing in the type checking of a typical programming language prevents the interchange of an integer handle for an object of class C with the handle for an object of some other class, so errors (or security violations) will be very difficult to catch.

Andrew Tannenbaum invented a very nice solution to this problem. In effect, his solution creates a range of values for handles that is huge compared to the number of valid handles. As a result, if a user accidentally or intentionally uses the wrong value of a handle, the likelihood of that handle referencing a valid object is minimal.

His solution involves adding salt to the integer handle we have used above (regardless of whether an improved allocation scheme is used). The allocator adds the salt to the handle, and whenever the handle is used, the class removes the salt, but only after checking to see that it is the right salt for this object. Here is the code:

static class C {
	static private int alloc = -1;    // used to allocate instances
	static private int salt[maxSize]; // the salt for each object
	static private Field f[maxSize];  // all of the instances
	static public int C() { // the initializer
		alloc = alloc + 1;
		f[alloc] = ??
		salt[alloc] = random();
		return alloc + salt[alloc] * maxSize; // return salted handle
	}
	public Field M( int handle ) { // an example method
		// first break the handle into the salt and the real handle
		int mySalt = handle / maxSize;
		int realHandle = handle % maxSize;
		// check that the salt is right
		if (mySalt != salt[realHandle]) complain loudly
		// finally, do the job
		return f[realHandle];
	}
}

Tannenbaum built an operating system called Amoeba where object handles are implemented in essentially this way. The Amoeba system was designed (with funding from the European Space Agency) to create supercomputers from clusters of inexpesive small computers. Amoeba uses this framework extensively, and since it implements the basic "salted cookie" mechanism in standard "boilerplate" code that is distributed with the system, users can think in entirely object-oriented terms when developing applications on Amoeba, without ever worrying about building their own "salted cookies".

Amoeba is more complex than indicated here because it adds, to each object handle, a set of access rights. The salting scheme is further modified to prevent users with a limited-access handle from increasing their access rights, for example, given a read-only file handle, a user cannot convert it into a read-write handle without knowing the correct value of the salt for that handle, and this is cryptographically hidden in a very clever way.