15. Simulation and Frameworks
Part of CS:2820 Object Oriented Software Development Notes, Fall 2020
Up to this point, we have developed code that can read a model into memory and write it back out again, detecting a variety of errors in the model. The only point of writing the model out is to verify that the model has indeed been read correctly.
This highway network model could be used for many purposes:
The logic circuit model we have discussed could also serve many purposes:
A neuron network model could also serve many purposes:
A logic network model could also serve many purposes:
The model we have been referring to as an epidemic model is actually just a model of a community.
Our goal for the projects in this course is to build simulations, and the time has come to discuss this in more detail.
An old bumper sticker I picked up at a simulation conference said: "Simulationists do it continuously and discretely." The sticker was a joke because, while members of the general public reading the sticker might guess one subject (sex), the actual statement is entirely true when you read "it" as a reference to computer simulation. There are two broad categories of simulation:
Continuous simulation models are common in fields as distinct as analog electronics, weather forecasting and macroeconomics. Here at the University of Iowa, the Hydraulics Institute is largely devoted to continuous simulation of fluid flow. Their building was built in the era when their research was largely done using actual tanks of water and even water from the Iowa River, but today, much of their work is done on computers, simulating not only water flow but also atmospheric flow.
Discrete event simulations are common in fields as distinct as logistics and digital logic simulation.
Almost all simulation models are based on simplifying assumptions. Most physics models assume that air is a vacuum and that the earth is flat. You can build bridges with these assumptions, although for medium and large bridges, it is worth checking how the bridge responds to the wind. (The Tacoma Narrows Bridge disaster of 1940 shows what can happen if you forget the wind when you design a large bridge -- it's in Wikipedia, watch the film clip.)
Our distinction between continuous and discrete models is also oversimplified. There are mixed models, for example, where the set of differential equations that describe the behavior of a system changes at discrete events. At each such event, you need to do continuous simulations to predict the times of the next events.
In a highway network, the events we are concerned with are:
Of course, the model can vary considerably in complexity. A simple model might have a fixed travel time on each road segment, while a more complex simulation might model congestion by having the travel time get longer if the population of a road segment exceeds some threshold, where that threshold may itself depend on the unpopulated travel time.
In a crude model, each car might make random navigation decisions at each intersection. A more complex model might have each car follow a fixed route through the road network, while a really complex model might include cars with adaptive navigation algorithms so that drivers can take alternate routes when congestion slows them on the path they originally planned.
In a network of digital logic gates connected by wires, we might have the following events:
The key element in the above that needs extra discussion is that if the output of a gate is changed and then changed back very quickly, no output change actually occurs. That is, there is a shortest pulse that the gate can generate on its output.
In a neural network model, with neurons connected by synapses, we might have the following events:
The key element in the above that needs extra discussion is how the voltage on a neuron changes with time. Between events, the voltage on a neuron decays exponentially, slowly leaking away unless it is pumped up by a synapse firing. So, for each neuron, we record:
Now, if we want to know the voltage at a later time t', we use this formula:
Of course, once you compute the voltage at the new time t', you record the new voltage and the new time so you can work forward from that the next time you need to do this computation. The constant k determines the decay rate of the neuron (it must be negative).
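The bookkeeping described above can be sketched in Java. This is only a sketch under stated assumptions: it assumes the decay formula has the usual exponential form v' = v · e^(k(t' − t)) with k negative, and the class name LeakyNeuron, its fields, and the kick method (a synapse firing simply adds its strength) are hypothetical names invented for illustration.

```java
/** Hypothetical sketch of a leaky neuron; all names here are illustrative. */
class LeakyNeuron {
    private double voltage;   // the voltage as of lastTime
    private double lastTime;  // the time at which voltage was last computed
    private final double k;   // the decay constant, must be negative

    LeakyNeuron(double initialVoltage, double startTime, double k) {
        if (k >= 0) throw new IllegalArgumentException("k must be negative");
        this.voltage = initialVoltage;
        this.lastTime = startTime;
        this.k = k;
    }

    /** Bring the voltage up to date at time t (assumes t >= lastTime)
     *  using the assumed formula v' = v * e^(k * (t' - t)),
     *  then record the new voltage and time for the next computation.
     */
    double voltageAt(double t) {
        voltage = voltage * Math.exp(k * (t - lastTime));
        lastTime = t;
        return voltage;
    }

    /** Assumption: a synapse firing at time t adds strength to the voltage. */
    double kick(double t, double strength) {
        voltage = voltageAt(t) + strength;
        return voltage;
    }

    public static void main(String[] args) {
        // with k = -ln 2, the voltage halves with each unit of time
        LeakyNeuron n = new LeakyNeuron(1.0, 0.0, -Math.log(2.0));
        System.out.println(n.voltageAt(1.0)); // about half the initial voltage
        System.out.println(n.kick(2.0, 1.0)); // decays again, then adds 1.0
    }
}
```

Because each call records the new voltage and time, the neuron only needs to be touched when an event involves it; nothing has to be recomputed between events.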
In a simple model, all the neurons might have the same threshold and the same decay rate, and all synapses might have the same strength. More complex models allow these to be varied.
In simpler models, the voltage on a neuron goes to zero when that neuron fires. In more complex models, the threshold has its own decay rate and the threshold goes up when the neuron fires.
In complex models, the strength of a synapse weakens each time it fires because the chemical neurotransmitter is used up by firing. During the time before the next firing, the neurotransmitter can build up toward its resting level. This allows the neural network to get tired if it is too active. (You can actually see this effect at work in your visual pathways. Look at a bright light and then look away, and you will see a negative afterimage that fades as the synapses that were overexcited recharge.)
In an epidemic model, where people move between places and their contact patterns spread some disease, we have the following basic events:
Any time a person arrives at or departs from a place, the number of infected people in that place during the interval since the last arrival or departure, times the length of that interval, times the probability of infection per unit time for the place, gives the probability that any person in that place during that time will be infected.
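The product described above might be computed as in the following sketch. The class and method names are hypothetical, and two details are assumptions not stated in these notes: the product is clamped to 1.0 (for a long interval or a crowded place it could otherwise exceed 1), and a person's fate is decided by one draw from a random number generator.

```java
import java.util.Random;

/** Hypothetical sketch of the infection computation; names are illustrative. */
class InfectionRisk {
    /** Probability that a person present for the whole interval is infected:
     *  infected count, times interval length, times the per-unit-time rate.
     *  Assumption: clamp at 1.0 so the result is always a valid probability.
     */
    static double infectionProbability(
        int infected, double interval, double ratePerUnitTime
    ) {
        double p = infected * interval * ratePerUnitTime;
        return Math.min(1.0, p);
    }

    /** Assumption: decide each person's fate with one uniform random draw. */
    static boolean becomesInfected(double p, Random rand) {
        return rand.nextDouble() < p;
    }

    public static void main(String[] args) {
        // 3 infected people, 2 time units, rate 0.01 per unit time
        double p = infectionProbability(3, 2.0, 0.01);
        System.out.println(p); // roughly 0.06
        System.out.println(becomesInfected(p, new Random()));
    }
}
```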
The above outline is appropriate for a "slow disease," that is, one where, when you begin to feel bad while at work or school, you stay until the end of the day before going home and staying home until better. The model would need modification for a "fast disease," with a sudden onset that sends people home in midday.
The key to discrete-event simulation is a data structure called the pending-event set. This holds the set of all future events, events that have not yet been simulated, but that have been predicted to occur at future times as a result of events that have already been simulated.
The simulator operates by picking events out of the pending-event set in chronological order. Simulating any particular event can involve any combination of changing variables that represent the state of the simulation model and scheduling future events by adding them to the pending-event set.
Some events may only change state variables without scheduling any future events. Other events may schedule future events without making any change to state variables. We can summarize the basic discrete-event simulation algorithm with the following pseudo-code:
    // Initialize
    PendingEventSet eventSet = new PendingEventSet();
    for each event e that is known in advance to happen at future time e.time {
        eventSet.add( e );
    }

    // Simulate
    while (!eventSet.isEmpty()) {
        Event e = eventSet.remove();
        // e is the new current event at e.time
        simulate e by {
            // update simulation state variables to account for e at e.time
            for each event f that is a consequence of e {
                eventSet.add( f );
            }
            // if event requires, force simulation to terminate
        }
    }
    // simulation terminated because we ran out of events
Note that the above code gives two different ways to terminate the simulation, one by running out of events, and the other involving a specific event, perhaps scheduled as part of the initialization to mark the end of time.
Either approach to termination works equally well. If we have a global constant endOfTime, we can make the event set become empty at that time by checking, as each event is scheduled, to see if it happens before or after the end of time. If it happens before, schedule it. If it happens after, simply discard the event notice.
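The end-of-time check just described can be sketched as follows. This is a toy, not the framework developed in these notes: events are represented only by their times, the names BoundedSchedule and END_OF_TIME are hypothetical, and the choice to keep an event that falls exactly at the end of time is an assumption.

```java
import java.util.PriorityQueue;

/** Hypothetical sketch: discard events scheduled after the end of time. */
class BoundedSchedule {
    static final double END_OF_TIME = 10.0; // illustrative value

    // for this sketch, an event is nothing more than its time
    static final PriorityQueue<Double> eventSet = new PriorityQueue<>();

    /** Schedule an event, or discard it if it falls after the end of time.
     *  Assumption: an event exactly at END_OF_TIME is still simulated.
     */
    static boolean schedule(double time) {
        if (time > END_OF_TIME) return false; // discard the event notice
        eventSet.add(time);
        return true;
    }

    /** Each event schedules a successor 3 time units later, so without
     *  the check in schedule(), this loop would never terminate.
     */
    static int run() {
        int count = 0;
        while (!eventSet.isEmpty()) {
            double now = eventSet.remove();
            count++;
            schedule(now + 3.0);
        }
        return count;
    }

    public static void main(String[] args) {
        schedule(0.0);
        System.out.println(run()); // prints 4: events at times 0, 3, 6 and 9
    }
}
```

With this approach, no special terminating event is needed; the event set simply runs dry once every remaining consequence lies beyond the end of time.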
So what is the event set? It is a priority queue sorted by the time that events are scheduled to happen. The order in which events are scheduled to happen has little to do with the times at which it can be accurately predicted that they will happen, so this is not a first-in, first-out queue.
Java provides several classes that can be made to do this job, not all of which are even implementations of the Queue interface. This means that the different candidates for the pending event set in the Java library aren't interchangeable. Java's class PriorityQueue is based on a heap implementation. ConcurrentSkipListSet and TreeSet are other options. Sadly, the Java library has PriorityQueue as a concrete class and not an interface, since there are actually a wide range of algorithms for priority queues that all accomplish the same thing with different tradeoffs between different performance parameters.
We'll use PriorityQueue here. One of the big differences between the Java alternatives that may concern you is whether the ordering is stable. Stable priority queues guarantee that two elements inserted with equal priority will be dequeued in the order in which they were enqueued. For well formulated simulation models, stable priority queues are not necessary because the order in which simultaneous events are simulated should not matter. In the real world, if two outcomes are possible from simultaneous events, it is highly likely that either outcome is correct. Stability may be useful to ensure repeatability and easy debugging, but it may also be misleading in cases where the real world behavior is not repeatable because both outcomes are possible.
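If repeatability is wanted for debugging, one common way to get stable behavior out of PriorityQueue is to break ties with an insertion sequence number. The sketch below is not part of the framework developed in these notes; the class StableEvents, the seq field and the name field are all hypothetical.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical sketch: make simultaneous events come out in FIFO order. */
class StableEvents {
    static class Event {
        final double time; // the time of this event
        final long seq;    // insertion order, used only to break ties
        final String name; // a label, just for the demonstration
        Event(double time, long seq, String name) {
            this.time = time;
            this.seq = seq;
            this.name = name;
        }
    }

    static long nextSeq = 0;

    // order by time first, then by insertion order among equal times
    static final Comparator<Event> BY_TIME_THEN_SEQ =
        Comparator.comparingDouble((Event e) -> e.time)
                  .thenComparingLong(e -> e.seq);

    static final PriorityQueue<Event> eventSet =
        new PriorityQueue<>(BY_TIME_THEN_SEQ);

    static void schedule(double time, String name) {
        eventSet.add(new Event(time, nextSeq++, name));
    }

    /** Remove all events in order, returning their names. */
    static String drain() {
        StringBuilder sb = new StringBuilder();
        while (!eventSet.isEmpty()) {
            sb.append(eventSet.remove().name).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        schedule(5.0, "a");
        schedule(1.0, "b");
        schedule(5.0, "c");
        schedule(5.0, "d");
        System.out.println(drain()); // prints "b a c d"
    }
}
```

Keep the caution above in mind: a result that depends on this tie-breaking order may be an artifact of the simulator rather than a property of the model.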
If you look up class PriorityQueue in the on-line Java reference material, you'll find that the most elementary constructor for priority queues sorts the elements by their natural ordering. That is, all of the items in the queue must be comparable. There is, however, a constructor described as follows:
PriorityQueue(int initialCapacity, Comparator<? super E> comparator)
Creates a PriorityQueue with the specified initial capacity that orders its elements according to the specified comparator.
There are two obvious questions here: What is the initial capacity, and what is a comparator? Java will automatically grow the queue to any size (until your computer runs out of memory), but by specifying the initial capacity of the queue, you can avoid unnecessary startup costs. For example, if a simulation of a population guarantees that all people will have one pending event (the time of their next scheduled move from here to there), you can use this to give the queue a hint that it will need to be at least that big.
The comparator is a more interesting problem. This must be an object with a compare method that can be used to compare two elements of the queue. Every time the queue implementation needs to compare two items, it will use the compare method of the object you pass. As we have seen, λ notation is a way to pass an object that carries a function, and this works for passing a comparator. Therefore, we can create the priority queue we need like this, assuming that event times are of type double:
    PriorityQueue<Event> eventSet = new PriorityQueue<>(
        initialCapacity,
        (Event e1, Event e2)-> Double.compare( e1.time, e2.time )
    );
This illustrates a common design pattern: When one generic class is designed to operate on elements of some other class and it needs an operator to fiddle with them, you pass the operator to the constructor using a λ expression, or using a new object of an implementation of the required interface. In Java, λ expressions are merely shorthand for this.
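To make the equivalence concrete, here is a sketch that builds the same queue twice, once with a λ expression and once with an anonymous implementation of Comparator. The class ComparatorForms and its stripped-down Event are hypothetical; the initial capacity of 11 is simply the default that the Java library documents for PriorityQueue.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical sketch: two equivalent ways to pass a comparator. */
class ComparatorForms {
    static class Event {
        final double time;
        Event(double time) { this.time = time; }
    }

    /** The comparator is passed as a lambda expression. */
    static PriorityQueue<Event> withLambda() {
        return new PriorityQueue<>(
            11, (Event e1, Event e2) -> Double.compare(e1.time, e2.time)
        );
    }

    /** The same comparator, written out as an anonymous class. */
    static PriorityQueue<Event> withAnonymousClass() {
        return new PriorityQueue<>(11, new Comparator<Event>() {
            @Override
            public int compare(Event e1, Event e2) {
                return Double.compare(e1.time, e2.time);
            }
        });
    }

    /** Add three events out of order and return the earliest time. */
    static double firstTime(PriorityQueue<Event> q) {
        q.add(new Event(3.0));
        q.add(new Event(1.0));
        q.add(new Event(2.0));
        return q.remove().time;
    }

    public static void main(String[] args) {
        System.out.println(firstTime(withLambda()));         // prints 1.0
        System.out.println(firstTime(withAnonymousClass())); // prints 1.0
    }
}
```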
This pattern shows up in Java's priority queue, tree-set and sorting mechanisms, allowing you to sort things into order on just about any basis. It can be used in a variety of other contexts — wherever there is a generic algorithm that can be made to operate on a variety of different data types.
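As a sketch of the same pattern outside the priority queue, the following passes one comparator to both List.sort and a TreeSet. The class and event names are hypothetical. It also shows a caveat worth knowing if you consider TreeSet for the pending event set: a TreeSet treats two elements that compare as equal as duplicates, so with a comparator that looks only at the time, simultaneous events are silently dropped.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch: one comparator reused by sort and by TreeSet. */
class ComparatorReuse {
    static class Event {
        final double time;
        final String name;
        Event(double time, String name) {
            this.time = time;
            this.name = name;
        }
    }

    static final Comparator<Event> BY_TIME =
        (e1, e2) -> Double.compare(e1.time, e2.time);

    /** Three events, two of which happen at the same time. */
    static List<Event> sample() {
        List<Event> events = new ArrayList<>();
        events.add(new Event(2.0, "late"));
        events.add(new Event(1.0, "early"));
        events.add(new Event(2.0, "alsoLate"));
        return events;
    }

    public static void main(String[] args) {
        List<Event> sorted = sample();
        sorted.sort(BY_TIME);
        System.out.println(sorted.get(0).name); // prints "early"

        TreeSet<Event> set = new TreeSet<>(BY_TIME);
        set.addAll(sample());
        System.out.println(set.size()); // prints 2: equal times collapse
    }
}
```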
There are many different ways of using discrete event simulation. We can describe these as simulation frameworks. They change the way we write the code to schedule events, but they do not change the underlying simulation model. Here is an initial (and poorly thought out) framework:
    import java.util.PriorityQueue;

    /** Framework for discrete event simulation. */
    public class Simulator {

        public static abstract class Event {
            public double time;       // the time of this event
            abstract void trigger();  // the action to take
        }

        private static PriorityQueue<Event> eventSet =
            new PriorityQueue<Event> (
                (Event e1, Event e2) -> Double.compare( e1.time, e2.time )
            );

        /** Call schedule(e) to make e happen at its (future) time */
        static void schedule( Event e ) {
            eventSet.add( e );
        }

        /** Call run() after scheduling some initial events
         *  to run the simulation.
         */
        static void run() {
            while (!eventSet.isEmpty()) {
                Event e = eventSet.remove();
                e.trigger();
            }
        }
    }
The problem with the above framework is that it requires the user to create large numbers of subclasses of events, where each subclass includes a trigger method that does the required computation. Scheduling an event is actually an example of a delayed computation, and as we've seen, Java provides a tool that allows delayed computation and implicit creation of anonymous subclasses, the lambda expression. The above framework isn't set up to use these!
The following simulation framework uses λ expressions. We will use this framework as we develop our simulation:
    import java.util.PriorityQueue;

    /** Framework for discrete event simulation. */
    public class Simulator {

        public interface Action {
            // actions contain the specific code of each event
            void trigger( double time );
        }

        private static class Event {
            public double time; // the time of this event
            public Action act;  // what to do at that time
        }

        private static PriorityQueue<Event> eventSet =
            new PriorityQueue<Event> (
                (Event e1, Event e2)-> Double.compare( e1.time, e2.time )
            );

        /** Call schedule to make act happen at time.
         *  Users typically pass the action as a lambda expression:
         *  <PRE>
         *  Simulator.schedule( t, ( double time )-> method( ... time ... ) )
         *  </PRE>
         */
        static void schedule( double time, Action act ) {
            Event e = new Event();
            e.time = time;
            e.act = act;
            eventSet.add( e );
        }

        /** run the simulation.
         *  Call run() after scheduling some initial events to run the simulation.
         */
        static void run() {
            while (!eventSet.isEmpty()) {
                Event e = eventSet.remove();
                e.act.trigger( e.time );
            }
        }
    }
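Here is a sketch of how user code might drive this framework. The class Ticker, its tick method and its log are hypothetical, invented only for this example. So that the example can be compiled on its own, it includes a condensed copy of the Simulator class, with the public modifier dropped so that both classes can share one source file.

```java
import java.util.PriorityQueue;

/** Hypothetical user code: a clock that ticks every 1.5 time units. */
class Ticker {
    static final StringBuilder log = new StringBuilder();

    /** One tick; schedules the next tick while any remain. */
    static void tick(double time, int remaining) {
        log.append(time).append(' ');
        if (remaining > 0) {
            Simulator.schedule(time + 1.5, (double t) -> tick(t, remaining - 1));
        }
    }

    public static void main(String[] args) {
        Simulator.schedule(0.0, (double t) -> tick(t, 2));
        Simulator.run();
        System.out.println(log.toString().trim()); // prints "0.0 1.5 3.0"
    }
}

/** Condensed copy of the framework, repeated to keep this self-contained. */
class Simulator {
    public interface Action {
        void trigger(double time);
    }

    private static class Event {
        public double time; // the time of this event
        public Action act;  // what to do at that time
    }

    private static PriorityQueue<Event> eventSet = new PriorityQueue<Event>(
        (Event e1, Event e2) -> Double.compare(e1.time, e2.time)
    );

    static void schedule(double time, Action act) {
        Event e = new Event();
        e.time = time;
        e.act = act;
        eventSet.add(e);
    }

    static void run() {
        while (!eventSet.isEmpty()) {
            Event e = eventSet.remove();
            e.act.trigger(e.time);
        }
    }
}
```

Notice that the user code declares no event subclasses at all. Each λ expression captures whatever parameters the delayed call needs, here the count of remaining ticks.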
When writing a simulation, it is important to begin by settling on a framework, because that determines the structure of the simulation code. Changing the framework after you have begun writing code can be messy, but it is not impossible.