22C:116, Lecture Notes, Jan. 30, 1995

Douglas W. Jones
University of Iowa Department of Computer Science
Interprocess Communication -- Producers and Consumers
This lecture concludes our review of interprocess communication and ends with some abbreviated notes on scheduler variations.
In the previous lecture, we discussed mutual exclusion. This low-level control over the sharing of data is essential but insufficient to allow productive parallel programming.
Consider a situation where one process produces data to be consumed by another:
```
    Producer --> Consumer
```
The producer can easily fill a shared variable, then let the consumer take the data, but controlled sharing is not enough. There must be some way for the producer to signal to the consumer that data is available.
Focus on Abstractions -- Signalling
One way to view this signalling problem is to think of the fork and join primitives.
   Process A       Process B
       |               |
    Produce            |
   some data           |
       |               |
       V               |
     ----- fork        |
       | \___________  |
       |             \ |
       |              ||
       |            __VV__ join
       |               |
       |            Consume
       |            the data
       |               |
The control path between the two processes can be considered to be a
short-lived process that serves to synchronize the producer and consumer,
or it can be considered to be a synchronizing signal of some type.

In operation, if process A produces data before process B is ready to
consume it, the signal reaches the join and waits for process B to
arrive at that point.  If Process B is ready to consume before process
A has produced any data, Process B waits at the join for the signal to
arrive.
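
The signal in the diagram above can be sketched with a semaphore that is
initialized to zero.  This is only a minimal illustration in Python, not
part of the original notes; the names signal, shared, producer and consumer
are assumptions chosen for the example.

```python
import threading

signal = threading.Semaphore(0)   # no signal is pending initially
shared = {}                       # the shared variable (hypothetical name)
result = []

def producer():                   # process A
    shared["item"] = 42           # produce some data
    signal.release()              # the "fork": send the signal

def consumer():                   # process B
    signal.acquire()              # the "join": wait for the signal
    result.append(shared["item"]) # consume the data

b = threading.Thread(target=consumer)
a = threading.Thread(target=producer)
b.start()
a.start()
a.join()
b.join()
```

Whichever thread reaches the semaphore first, the consumer reads the shared
variable only after the producer has filled it.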

 Focus on Queues

What kind of data structure should be used to communicate between
the producer and consumer?  There are many options, but all boil down
to variations on the theme of bounded buffers.

A bounded buffer has a data capacity that may vary from 1 item (where
it becomes equivalent to a simple variable) to some large number.
If more than one item is allowed, the buffer may have either a linked
list or an array implementation.  Here, we'll look at the array
implementation:
   ________________________________
  |  |  |\\|\\|\\|\\|\\|\\|  |  |  |
  |__|__|\\|\\|\\|\\|\\|\\|__|__|__|
   Empty |     Data        | Empty
         |                 |
       Head               Tail

The operations on this structure are simple, if we ignore problems with
preventing enqueueing data in a full queue or dequeueing data from an
empty queue:
  enqueue(d)
     buffer[tail] = d
     tail = tail + 1 (modulo queue size)

  dequeue(d)
     d = buffer[head]
     head = head + 1 (modulo queue size)

We can detect empty and full queues fairly easily:
  empty := (head = tail)
  full := (head = tail)

But, these conditions are the same, which poses problems!  Despite these
problems, this approach to detecting the buffer full and buffer empty
conditions is actually used, on occasion, primarily in low-level applications
such as communication between processes and interrupt handlers.
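
One common way around the ambiguity, which these notes do not describe, is
to leave one slot unused, so that the buffer is declared full when advancing
the tail would make it equal to the head.  A minimal single-threaded sketch
in Python follows; the class and method names are hypothetical.

```python
class RingBuffer:
    def __init__(self, size):
        self.buffer = [None] * size
        self.size = size
        self.head = 0   # index of the next item to dequeue
        self.tail = 0   # index of the next free slot

    def empty(self):
        return self.head == self.tail

    def full(self):
        # One slot is sacrificed so that full and empty differ.
        return (self.tail + 1) % self.size == self.head

    def enqueue(self, d):
        if self.full():
            raise OverflowError("queue full")
        self.buffer[self.tail] = d
        self.tail = (self.tail + 1) % self.size

    def dequeue(self):
        if self.empty():
            raise IndexError("queue empty")
        d = self.buffer[self.head]
        self.head = (self.head + 1) % self.size
        return d
```

The cost of this choice is that a buffer with n slots holds at most n-1
items.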

 Shared Queues

When a queue is shared between multiple processes, there are two different
problems.  One is assurance of mutual exclusion, so that, for instance, no
two processes try to enqueue an item of data in the same queue slot, and the
other is keeping track of the number of available data items and free spaces.
All of these can be accomplished by semaphores:
  Semaphores used to provide mutual exclusion:
     headlock -- protects the head pointer
     taillock -- protects the tail pointer

  Semaphores used to count available resources:
     free -- counts number of free spaces
     data -- counts number of enqueued data items

Given these semaphores, with all mutual exclusion semaphores initialized
to one and the counting semaphores initialized to reflect the initial
state of the queue, the operations on a shared queue can be implemented
as follows:
  enqueue(d)
     P(free)     -- wait for free space to be available
     P(taillock) -- wait for exclusive use of the tail
     buffer[tail] = d
     tail = tail + 1 (modulo queue size)
     V(taillock) -- release the tail
     V(data)     -- signal that data is available

  dequeue(d)
     P(data)     -- wait for data to be available
     P(headlock) -- wait for exclusive use of the head
     d = buffer[head]
     head = head + 1 (modulo queue size)
     V(headlock) -- release the head
     V(free)     -- signal that free space is available

This implementation of shared FIFO queues is universally applicable,
but there are some special characteristics worth noting:

First, the mutual exclusion locks are only needed if there are multiple
producers or consumers.  When there is only one producer process, that
is the only process that will inspect and manipulate the tail pointer,
so no mutual exclusion semaphore is needed to protect the tail pointer.

Second, the mutual exclusion operations shown above protect very brief
critical sections, so it is unlikely that any process will wait long
for entry.  As a result, the best implementations of headlock and
taillock involve busy waiting at a spin-lock, not semaphores implemented
in the process manager.

Third, if the queues are very large, so there is a large probability that
the queue will be neither full nor empty, the counting semaphores will rarely
block the producer or consumer; thus, context switching will be very rare.
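
The shared queue operations given earlier in this section can be sketched
in Python using threading.Semaphore, where acquire plays the role of P and
release plays the role of V.  This is an illustrative sketch under those
assumptions, not a tuned implementation; the class name SharedQueue and the
small driver at the end are hypothetical.

```python
import threading

class SharedQueue:
    def __init__(self, size):
        self.buffer = [None] * size
        self.size = size
        self.head = 0
        self.tail = 0
        self.headlock = threading.Semaphore(1)  # protects the head pointer
        self.taillock = threading.Semaphore(1)  # protects the tail pointer
        self.free = threading.Semaphore(size)   # all slots start out free
        self.data = threading.Semaphore(0)      # no data enqueued yet

    def enqueue(self, d):
        self.free.acquire()        # P(free): wait for free space
        self.taillock.acquire()    # P(taillock)
        self.buffer[self.tail] = d
        self.tail = (self.tail + 1) % self.size
        self.taillock.release()    # V(taillock)
        self.data.release()        # V(data): signal that data is available

    def dequeue(self):
        self.data.acquire()        # P(data): wait for data
        self.headlock.acquire()    # P(headlock)
        d = self.buffer[self.head]
        self.head = (self.head + 1) % self.size
        self.headlock.release()    # V(headlock)
        self.free.release()        # V(free): signal that space is available
        return d

# Hypothetical driver: one producer and one consumer sharing a small queue.
q = SharedQueue(4)
received = []

def producer():
    for i in range(100):
        q.enqueue(i)

def consumer():
    for _ in range(100):
        received.append(q.dequeue())

threads = [threading.Thread(target=producer),
           threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the counting semaphores track the numbers of free slots and data
items, the head = tail ambiguity between full and empty never has to be
resolved by inspecting the pointers.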

 Variations

If the buffer has a capacity of one, the head and tail pointers are
superfluous, as are the locks protecting them.  In this case, the
producer and consumer processes would be expected to operate in lock-step,
producing one data item, then consuming it before the next is produced.

Lock-step operation of the producer and consumer is frequently inefficient
because unless the computational requirements of the two processes are
identical, either the producer will spend time waiting for the consumer,
or vice versa.

From elementary queueing theory, we can predict the following:  If the average
time taken to produce an item is the same as the average time taken to
consume an item, and the times vary randomly, the performance of the system
will be maximized with an infinite queue capacity and an infinite average
number of items in the queue.  Furthermore, as the size of the bounded
buffer is decreased, the percentage of the time that the producer spends
waiting for space or the consumer spends waiting for data will increase
smoothly.  The conclusion is that, where memory allows, large buffers
are preferable.
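
The trend described above can be explored with a small simulation.  The
sketch below is an assumption-laden illustration, not a result from the
notes: it assumes exponentially distributed production and consumption
times with mean 1, a fixed random seed, and a hypothetical function name
simulate.  It returns the fractions of the elapsed time that the producer
spends waiting for space and the consumer spends waiting for data.

```python
import random

def simulate(capacity, items=20000, seed=1):
    rng = random.Random(seed)
    produce = [rng.expovariate(1.0) for _ in range(items)]
    consume = [rng.expovariate(1.0) for _ in range(items)]
    deposited = [0.0] * items   # time item i enters the buffer
    removed = [0.0] * items     # time item i leaves the buffer
    producer_wait = 0.0
    consumer_wait = 0.0
    for i in range(items):
        # The producer starts item i once item i-1 has been deposited.
        ready = (deposited[i - 1] if i else 0.0) + produce[i]
        # A slot is free once item i-capacity has been removed.
        slot_free = removed[i - capacity] if i >= capacity else 0.0
        deposited[i] = max(ready, slot_free)
        producer_wait += deposited[i] - ready
        # The consumer is idle once it has finished consuming item i-1.
        idle = (removed[i - 1] + consume[i - 1]) if i else 0.0
        removed[i] = max(deposited[i], idle)
        consumer_wait += removed[i] - idle
    elapsed = removed[-1] + consume[-1]
    return producer_wait / elapsed, consumer_wait / elapsed

if __name__ == "__main__":
    for capacity in (1, 2, 4, 16, 64):
        print(capacity, simulate(capacity))
```

Running the loop at the end for several capacities shows how the waiting
fractions change as the buffer grows.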

The bounded buffer with a capacity of two is a commonly used special case,
particularly in 1960's vintage input/output subsystems.  In that context,
it is frequently described as "Double Buffered I/O".

When multiple FIFO queues are implemented using linked lists instead of
arrays, it is common to have a single shared pool of free list items
instead of a bounded supply of list items for each queue.  For a given
amount of memory, this can markedly improve performance, but it adds slightly
to the complexity of the code.  Specifically, instead of having a separate
space semaphore to count the amount of free space in each queue, there is
a single free space semaphore.
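
A minimal sketch of this arrangement in Python follows.  It is only an
illustration: collections.deque stands in for the linked lists, each queue
has a single lock rather than separate head and tail locks, and the names
QueuePool, free, data and locks are assumptions.

```python
import threading
from collections import deque

class QueuePool:
    """Several FIFO queues drawing list items from one shared pool."""

    def __init__(self, queues, pool_size):
        self.free = threading.Semaphore(pool_size)  # one count for all queues
        self.items = [deque() for _ in range(queues)]
        self.data = [threading.Semaphore(0) for _ in range(queues)]
        self.locks = [threading.Lock() for _ in range(queues)]

    def enqueue(self, q, d):
        self.free.acquire()          # take a list item from the shared pool
        with self.locks[q]:
            self.items[q].append(d)
        self.data[q].release()       # signal data available in queue q

    def dequeue(self, q):
        self.data[q].acquire()       # wait for data in queue q
        with self.locks[q]:
            d = self.items[q].popleft()
        self.free.release()          # return the list item to the pool
        return d
```

Note that with a shared pool, one busy queue can consume all of the free
items, so a producer for a different queue may block even though its own
queue is empty.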

 Other problems

Do not be misled into thinking that the mutual exclusion and producer-consumer
problems are the only problems in parallel programming.  They are the most
important and most common problems, but there are many others:

The readers-writers problem is a common problem in file-system design.  A
file may either be closed, open for read access by any number of users, or
open for write access by exactly one user.  Implementing the read-open,
write-open and close operations using semaphores and shared variables is
an interesting exercise.
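
One classic approach to this exercise can be sketched as follows.  This is
a hedged illustration, not the solution intended by the notes: it splits the
close operation into close_read and close_write for simplicity, and the
names SharedFile, mutex and writable are assumptions.

```python
import threading

class SharedFile:
    def __init__(self):
        self.readers = 0                        # number of current read-opens
        self.mutex = threading.Semaphore(1)     # protects self.readers
        self.writable = threading.Semaphore(1)  # held while the file is open

    def open_read(self):
        self.mutex.acquire()
        self.readers += 1
        if self.readers == 1:
            self.writable.acquire()   # the first reader locks out writers
        self.mutex.release()

    def close_read(self):
        self.mutex.acquire()
        self.readers -= 1
        if self.readers == 0:
            self.writable.release()   # the last reader lets writers in
        self.mutex.release()

    def open_write(self):
        self.writable.acquire()       # wait until the file is closed

    def close_write(self):
        self.writable.release()
```

This version favors readers: as long as new readers keep arriving before
the last one closes the file, a waiting writer can be delayed indefinitely.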

Deadlock, the situation that arises when multiple processes are blocked,
each awaiting some action by another blocked process, is a common failing
in parallel programs.  There are complex algorithms for both deadlock avoidance
and deadlock detection.
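
In the simplest case, where each blocked process awaits specific other
processes, detection amounts to finding a cycle in the graph of who waits
for whom.  The sketch below illustrates only that simple case; the function
name find_deadlock and the dictionary representation are assumptions, and
detection algorithms for more general resource models are more involved.

```python
def find_deadlock(waits_for):
    """Return True if the wait-for graph contains a cycle.

    waits_for maps each blocked process to the processes it awaits.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = {}

    def visit(p):
        color[p] = GREY
        for q in waits_for.get(p, ()):
            c = color.get(q, WHITE)
            if c == GREY:          # q is already on the current path
                return True
            if c == WHITE and visit(q):
                return True
        color[p] = BLACK
        return False

    return any(color.get(p, WHITE) == WHITE and visit(p)
               for p in list(waits_for))
```

For example, find_deadlock({"A": ["B"], "B": ["A"]}) reports a deadlock,
while a chain in which A waits for B and B waits for a running process C
does not.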

 Schedulers, a footnote

There are several basic kinds of low-level schedulers.  In the lecture on
schedulers, the ready queue was described as merely a memory with the
enqueue and dequeue operations, and no description of the scheduling
discipline.  In fact, there are many alternatives:

If the ready queue is a FIFO queue (for example, a linked list of process
descriptions with head and tail pointers), the scheduler is said to implement
the round-robin algorithm, where all processes have equal priority for access
to the CPU.
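
A toy model of round-robin scheduling is sketched below.  It assumes a
fixed time quantum after which the running process is preempted, which the
notes do not specify; the function name round_robin and the representation
of processes as a dictionary of remaining CPU times are also assumptions.

```python
from collections import deque

def round_robin(processes, quantum):
    """processes maps each name to its remaining CPU time.

    Returns the order in which processes are given the CPU.
    """
    ready = deque(processes.items())   # FIFO ready queue
    trace = []
    while ready:
        name, remaining = ready.popleft()     # dequeue from the head
        trace.append(name)
        remaining -= quantum
        if remaining > 0:
            ready.append((name, remaining))   # enqueue at the tail
    return trace
```

For example, round_robin({"A": 3, "B": 1, "C": 2}, 1) gives the order
A, B, C, A, C, A.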

If the ready queue is a priority queue, where priority is a fixed attribute
of each process, the scheduler is a simple priority scheduler.  Such schedulers
are commonly used in real-time operating systems.  Typically, high priorities
are given to processes that must meet frequent deadlines, while low priorities
are given to processes with infrequent deadlines (this is called the rate
monotonic priority assignment).
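
As a small illustration, the rate monotonic assignment can be computed by
sorting processes by period, and the ready queue can be kept as a heap.  The
process names and periods below are invented for the example, and smaller
numbers are assumed to mean higher priority.

```python
import heapq

def rate_monotonic_priorities(periods):
    """periods maps each process name to the period between its deadlines.

    The shortest period gets priority 0, the highest priority.
    """
    ordered = sorted(periods, key=lambda name: periods[name])
    return {name: rank for rank, name in enumerate(ordered)}

# Hypothetical processes with their periods in milliseconds.
priorities = rate_monotonic_priorities(
    {"logger": 100, "sensor": 5, "control": 20})

# The ready queue as a priority queue ordered by fixed priority.
ready = [(prio, name) for name, prio in priorities.items()]
heapq.heapify(ready)
order = [heapq.heappop(ready)[1] for _ in range(len(ready))]
```

Here order lists the processes from highest to lowest priority: sensor,
control, logger.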

If the ready queue is a priority queue, where priority is adjusted in
response to process behavior, we can develop an adaptive scheduler.
For example, the classic UNIX scheduler would raise the priority of processes
each time they voluntarily relinquished the CPU, perhaps to wait for I/O
completion, while periodically lowering the priorities of all processes.
The result is that interactive processes, which do little computing but
frequently await input, would naturally rise to a higher priority, while
compute bound processes would fall in priority.  This provides excellent
service to interactive users without the need for users to be aware that
there is any kind of priority structure.

Another category of scheduler attaches, as the priority of each process,
the deadline by which it must finish its operation.  The process with the
nearer deadline gets priority, and when a process finishes its operation,
it either terminates or resets its deadline to the time at which it must
finish its next operation.  Deadline based schedulers are provably optimal
for preemptive real-time scheduling on a single processor, but they are
rarely used in practice.
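
A toy deadline scheduler is sketched below.  It is a simplification made
for illustration: it handles one-shot jobs with integer release times,
deadlines and work amounts, rather than processes that reset their
deadlines, and it allows preemption at every unit time step.  The function
name edf and the tuple layout are assumptions.

```python
import heapq

def edf(jobs):
    """jobs is a list of (name, release, deadline, work) tuples.

    Returns the name run in each unit time slot (None when idle).
    """
    pending = sorted(jobs, key=lambda j: j[1])   # ordered by release time
    ready = []                                   # heap keyed on deadline
    timeline = []
    t = 0
    i = 0
    while i < len(pending) or ready:
        # Admit every job that has been released by time t.
        while i < len(pending) and pending[i][1] <= t:
            name, _, deadline, work = pending[i]
            heapq.heappush(ready, (deadline, name, work))
            i += 1
        if ready:
            # The job with the nearest deadline gets the CPU.
            deadline, name, work = heapq.heappop(ready)
            timeline.append(name)
            if work > 1:
                heapq.heappush(ready, (deadline, name, work - 1))
        else:
            timeline.append(None)
        t += 1
    return timeline
```

For example, with job A released at time 0 (deadline 10, work 3) and job B
released at time 1 (deadline 4, work 2), B preempts A as soon as it arrives,
and the timeline is A, B, B, A, A.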