6. Discrete Event Simulation

Part of CS:2820 Object Oriented Software Development Notes, Spring 2021
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Discrete Event Simulation

In a discrete-event simulation model, the state of the system being simulated is described by a set of state variables. For example, variables describing where the vehicles are in a highway network, or variables describing the state of each wire in a digital computer as either true or false, or variables describing where each person is and whether they are infected in epidemic model. Any changes to these variables are described as events. For example, when a vehicle arrives at an intersection, or when a logic gate changes its output or when a person gets infected, these are events.

In a discrete-event model, events are instantaneous and nothing at all happens between events. This is obviously a matter of abstraction or simplification. In the real world, all changes above the quantum level take time.

The basic discrete-event simulation algorithm has been known since the mid 1960s, and the algorithm applies to just about any discrete event model. The central feature of the model is a data structure, the pending event set or sometimes just the event set. This is, as its name suggests, a set of pending events, that is, events that are predicted at the simulated time of previous events but have not yet been simulated.

The basic operations on the pending event are:

schedule e: The event e is inserted into the pending event set. The attribute e.time is the time at which the event is scheduled to occur.
e = getNext: An event e is extracted from the pending event set, where e.time is less than or equal to the times of all other events in the set. If it happens that other events in the set are scheduled to occur at exactly the same time, the model may be nondeterministic.

Some discrete-event models allow previously scheduled events to be deleted. Some higher level models consider events that extend over time, but we will ignore these. Events that take time can be modeled by a sequence of instantaneous events, for example, one marking the start of a long event and another to mark its end, and if an event may need to be cancelled, this can be indicated by adding a state variable to indicate whether that event should do its usual work or do nothing.

Before the simulation starts, initial conditions are set up. These include the initial state of the model, and they include initial events. Without the initial events, nothing would happen ever. Initial events might, for example, mark the arrival of vehicles from the outside world and, for intersections with stop lights, start those lights on their red-green cycles.

The basic algorithm for discrete-event simulation can be stated in a Java-like language as follows:

// initialize the model
eventSet.schedule( x ); // for all x in the set of initial events

// run the simulation
repeat {
    e = eventSet.getNext();
    // simulate event e at time e.time
    // this may involve scheduling new events
}

There are two ways to terminate such a simulation: We can either schedule a special event to mark the "end of time" as one of the initial events, or we can let the model run until the event set is empty.

In either case, within any iteration of the loop, e.time is the current time, and time lurches forward from event to event in an irregular manner. This is quite different from simulation models in which time advances in uniform-sized ticks; that type of model is more commonly used to deal with systems that are described by differential equations.

Note however that within the simulation of any event, the model may use arbitrarily complex code to predict the times of future events. In extreme cases, this prediction might involve running other simulation models, for example, involving differential equations.

A Classic Example

Consider the problem of modelling a bank. We begin with an informal description of one customer's visit to the bank: You arrive at the bank, wait in line for an available teller, interact with that teller for a while, and then leave. Tellers may need to do some paperwork or computer work after a customer departs before they are ready for the next customer. From this description, we can determine that our model will have the following objects and classes:

Customers — a class. Each customer is a member of this class. Customers in the bank may be waiting or engaged in a transaction with a teller.
A queue of waiting customers; an object.
Tellers — a class. Each teller is a member of this class. Tellers may be idle, serving a customer, or doing paperwork.
A collection of tellers; an object.

A little more thought allows us to refine this by identifying events:

Customer arrival events enqueue customers in the queue. If the queue is empty and a teller is idle, the customer bypasses the queue and begins a transaction.
End transaction events release a customer and cause the teller to begin paperwork.
End paperwork events cause a teller to try to dequeue a customer and begin a transaction. If the queue is empty, the teller becomes idle.

Epidemics

We need a subject to use for the large-scale simulation exercise that will motivate the bulk of the machine problems assigned this semester. While, in the lectures, we will focus on the simulation of a road network, the machine problems will focus on something else.

This semester, we will focus on modelling the spread of an epidemic through a population, because that seems to be of current interest. To make such a model feasible, we will make a large number of simplifying assumptions; later refinements to the model could eliminate these assumptions, but such models can be very instructive even with such assumptions.

All solutions to programming assignments from my prior offerings of this course are available on line. Feel free to study them. Notably, we used the same problem last fall. In that class, we built a working simulation, and in the process, we learned many things, including that we can do a better job of it!

Places: We assume that the community includes a number of places; each place is occupied at any instance by a set of people. The likelihood that some person in a place will be infected is the product of the time they are there times the number of others who are infected times an infection risk multiplier. We call this multiplier the infectivity of the place. You might imagine the risk multiplier being increased by bad ventilation or tight packing of occupants. There are several types of places we might consider:

Households: Where people spend their "off" time.
Workplaces: Where employees go. Workplaces would have starting and stopping times on a daily and weekly cycle that determines when employes come and go.
Schools: A kind of workplace, where there is a second daily and weekly cycle that determines when students come and go.
Stores: A kind of workplace, where people on their "off" time may visit as customers. Different stores would have different typical visit durations. To avoid complex economic models, customers shop at randomly selected stores at random times, but each store may have a likelihood attached to it, so that big stores attract more customers.

In fact, the variety of places is unbounded, as is the association of people with places. We can, however, distinguish between residences and non-residential places. In last fall's models, we did this, but we had ony one kind of non-residential place, the workplace. Extending the model to handle schools, restaurants and the like proved to be too much because we failed to provide a general mechanism.

To keep the model simple, we completely avoid geometry and replace it with the concept that travel between places takes a random amount of time.

People: Each person is in a place except when traveling between places. Travel times could depend on geometry in a more refined model. People have an infection state, and once infected, the changes in infection state follow a pattern, moving through a period of latency (no symptoms, not infectious) to asymptomatic (infections) to symptomatic (infectious, but still feeling good enough to go to work) and bedridden (infectious, but staying home). We might also add hospitalized as a state, and an immune state for those who have recovered.

The disease state has several forks. Some fraction of those who are infected recover without ever becoming symptomatic. Some fraction recover without ever becoming bedridden. Some bedridden people die instead of recovering.

Public health policies can include closing some class of places when the infection rate exceeds some threshold, an it can include people deciding to stay home when they are symptomatic instead of going to various classes of places. We did not explore building any public health measures into our models last fall.

In any case, people come in several types. Consider these:

Homemakers stay home most of the time but go shopping more than others.
Students are characterized by a link to some specific school and arrive at school at the opening time and depart at the closing time. Students may shop in their off time. Last fall, we never got as far as including students in our models.
Employees: are characterized by a link to a workplace and arrive there at opening time and depart at closing time. Employees may shop in their off time.

We could build such a model by manually working out the details of each workplace and person, but that would be horribly tedious. Instead, we can use census figures and economic arguments so that we might specify a community to model with something like this:

Population
Average household size (and probability distribution)
Likelihood each person is employed
Likelihood each person is a student
Number of students per school (and probability distribution)
Student teacher ratio (determines how many teachers in each school)
Number of stores per household
Customer worker ratio (determines how many workers in each store)
Number of employees per non-store workplace (and distribuiton)

This is vague on purpose. The customer is telling us that they want something like this, not that they want exactly this. We will have to refine things before we actually get an input specification.

We also need to specify the disease progression as an input to the model, but we can defer that until we have managed to actually construct a community from a top-level description of that community.

Finally, we need a plan for moving foward! Where do we begin? We can't just start writing random code, we need some initial goal. For example, we might make an initial goal of reading a top-level description of a community and then writing out a list of the places generated for that community and a list of the people. These lists would be of no value to the customer in the end, but they provide a target far short of a working simulation so that we can see if we have part of the program working before we write the rest.