This lecture concludes our review of interprocess communication, followed by some abbreviated notes on scheduler variations.
In the previous lecture, we discussed mutual exclusion. This low-level control over the sharing of data is essential but insufficient to allow productive parallel programming.
Consider a situation where one process produces data to be consumed by another:
                 Producer --> Consumer

The producer can easily fill a shared variable, then let the consumer take the data, but controlled sharing is not enough. There must be some way for the producer to signal to the consumer that data is available.
One way to view this signalling problem is to think of the fork and join primitives.
    Process A       Process B
        |               |
     Produce            |
     some data          |
        |               |
        V               |
      ----- fork        |
        |  \___________ |
        |               \
        |               ||
        |             __VV__  join
        |               |
        |            Consume
        |            the data
        |               |

The control path between the two processes can be considered to be a short-lived process that serves to synchronize the producer and consumer, or it can be considered to be a synchronizing signal of some type.

In operation, if process A produces data before process B is ready to consume it, the signal reaches the join and waits for process B to arrive at that point. If process B is ready to consume before process A has produced any data, process B waits at the join for the signal to arrive.
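One way to realize this signal is with a counting semaphore: the producer does a V operation when data is ready, and the consumer does a P operation before consuming, blocking until the signal arrives. The following minimal C sketch uses POSIX threads and semaphores; the lecture names no particular primitives, so this pairing is an assumption:

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    sem_t data_ready;   /* the synchronizing signal; initially 0 */
    int shared;         /* the shared variable holding one item  */

    void *producer(void *arg) {
        shared = 42;                /* produce some data           */
        sem_post(&data_ready);      /* V: signal data is available */
        return NULL;
    }

    void *consumer(void *arg) {
        sem_wait(&data_ready);      /* P: wait at the "join"       */
        printf("consumed %d\n", shared);
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        sem_init(&data_ready, 0, 0); /* no data available yet      */
        pthread_create(&a, NULL, producer, NULL);
        pthread_create(&b, NULL, consumer, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
    }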
What kind of data structure should be used to communicate between the producer and consumer? There are many options, but all boil down to variations on the theme of bounded buffers.
A bounded buffer has a data capacity that may vary from 1 item (where it becomes equivalent to a simple variable) to some large number. If more than one item is allowed, the buffer may have either a linked list or an array implementation. Here, we'll look at the array implementation:
     ________________________________
    |  |  |\\|\\|\\|\\|\\|\\|  |  |  |
    |__|__|\\|\\|\\|\\|\\|\\|__|__|__|
     Empty |       Data      | Empty
           |                 |
         Head              Tail

The operations on this structure are simple, if we ignore problems with preventing enqueueing data in a full queue or dequeueing data from an empty queue:
    enque(d)
       buffer[tail] = d
       tail = tail + 1 (modulo queue size)

    dequeue(d)
       d = buffer[head]
       head = head + 1 (modulo queue size)

We can detect empty and full queues fairly easily:
    empty := (head = tail)
    full  := (head = tail)

But these conditions are the same, which poses problems! Despite these problems, this approach to detecting the buffer full and buffer empty conditions is actually used, on occasion, primarily in low-level applications such as communication between processes and interrupt handlers.
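One common way to make the two conditions distinguishable (an aside; the lecture does not discuss it) is to sacrifice one slot, so the buffer is declared full while one free space still remains. A C sketch of this convention:

    #include <stdbool.h>

    #define SIZE 8            /* holds at most SIZE - 1 items     */

    int buffer[SIZE];
    int head = 0;             /* index of the oldest data item    */
    int tail = 0;             /* index of the first free slot     */

    bool empty(void) { return head == tail; }
    bool full(void)  { return (tail + 1) % SIZE == head; }

    bool enque(int d) {       /* fails if the queue is full       */
        if (full()) return false;
        buffer[tail] = d;
        tail = (tail + 1) % SIZE;
        return true;
    }

    bool dequeue(int *d) {    /* fails if the queue is empty      */
        if (empty()) return false;
        *d = buffer[head];
        head = (head + 1) % SIZE;
        return true;
    }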
When a queue is shared between multiple processes, there are two different problems. One is assurance of mutual exclusion, so that, for instance, no two processes try to enqueue an item of data in the same queue slot; the other is keeping track of the number of available data items and free spaces. Both can be accomplished with semaphores:
Semaphores used to provide mutual exclusion:

   headlock -- protects the head pointer
   taillock -- protects the tail pointer

Semaphores used to count available resources:

   free -- counts number of free spaces
   data -- counts number of enqueued data items

Given these semaphores, with all mutual exclusion semaphores initialized to one and the counting semaphores initialized to reflect the initial state of the queue, the operations on a shared queue can be implemented as follows:
    enque(d)
       P(free)      -- wait for free space to be available
       P(taillock)  -- wait for exclusive use of the tail
       buffer[tail] = d
       tail = tail + 1 (modulo queue size)
       V(taillock)  -- release the tail
       V(data)      -- signal that data is available

    dequeue(d)
       P(data)      -- wait for data to be available
       P(headlock)  -- wait for exclusive use of the head
       d = buffer[head]
       head = head + 1 (modulo queue size)
       V(headlock)  -- release the head
       V(free)      -- signal that free space is available

This implementation of shared FIFO queues is universally applicable, but there are some special characteristics worth noting (a concrete sketch in C follows these notes):
First, the mutual exclusion locks are only needed if there are multiple producers or consumers. When there is only one producer process, that is the only process that will inspect and manipulate the tail pointer, so no mutual exclusion semaphore is needed to protect the tail pointer.
Second, the mutual exclusion operations shown above protect very brief critical sections, so it is unlikely that any process will wait long for entry. As a result, the best implementations of headlock and taillock involve busy waiting at a spin-lock, not semaphores implemented in the process manager.
Third, if the queues are very large, so there is a large probability that the queue will be neither full nor empty, the counting semaphores will rarely block the producer or consumer; thus, context switching will be very rare.
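To make the pseudocode above concrete, here is a sketch in C using POSIX semaphores, with P mapped to sem_wait and V to sem_post. Plain pthread mutexes stand in for headlock and taillock; as noted above, a production version might use spin locks instead, and could drop either lock when there is a single producer or a single consumer:

    #include <pthread.h>
    #include <semaphore.h>

    #define SIZE 16

    static int buffer[SIZE];
    static int head = 0, tail = 0;

    static sem_t free_slots;   /* counts free spaces; starts at SIZE */
    static sem_t data_items;   /* counts enqueued items; starts at 0 */
    static pthread_mutex_t headlock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t taillock = PTHREAD_MUTEX_INITIALIZER;

    void queue_init(void) {
        sem_init(&free_slots, 0, SIZE);   /* all slots free */
        sem_init(&data_items, 0, 0);      /* no data yet    */
    }

    void enque(int d) {
        sem_wait(&free_slots);            /* P(free)     */
        pthread_mutex_lock(&taillock);    /* P(taillock) */
        buffer[tail] = d;
        tail = (tail + 1) % SIZE;
        pthread_mutex_unlock(&taillock);  /* V(taillock) */
        sem_post(&data_items);            /* V(data)     */
    }

    int dequeue(void) {
        int d;
        sem_wait(&data_items);            /* P(data)     */
        pthread_mutex_lock(&headlock);    /* P(headlock) */
        d = buffer[head];
        head = (head + 1) % SIZE;
        pthread_mutex_unlock(&headlock);  /* V(headlock) */
        sem_post(&free_slots);            /* V(free)     */
        return d;
    }

Because the counting semaphores guarantee that a slot is free before enque touches the buffer, and that data is present before dequeue does, the explicit full and empty tests of the unsynchronized version are no longer needed.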
If the buffer has a capacity of one, the head and tail pointers are superfluous, as are the locks protecting them. In this case, the producer and consumer processes would be expected to operate in lock-step, producing one data item, then consuming it before the next is produced.
Lock-step operation of the producer and consumer is frequently inefficient because, unless the computational requirements of the two processes are identical, either the producer will spend time waiting for the consumer, or vice versa.
From elementary queueing theory, we can predict the following: If the average time taken to produce an item is the same as the average time taken to consume an item, and the times vary randomly, the performance of the system will be maximized with an infinite queue capacity and an infinite average number of items in the queue. Furthermore, as the size of the bounded buffer is decreased, the percentage of the time that the producer spends waiting for space or the consumer spends waiting for data will increase smoothly. The conclusion is that, where memory allows, large buffers are preferable.
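To attach a number to this claim: under the standard M/M/1/N queueing model (exponentially distributed production and consumption times; the model is an assumption here, not something given in the lecture), the probability that the producer finds a buffer of capacity N full is

    P_full = (1 - \rho) \rho^N / (1 - \rho^{N+1})    for \rho \neq 1
    P_full = 1 / (N + 1)                             for \rho = 1

where \rho is the ratio of the average production rate to the average consumption rate. With equal average times (\rho = 1), the producer blocks a fraction 1/(N+1) of the time, which falls smoothly toward zero only as N grows without bound, in agreement with the prediction above.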
The bounded buffer with a capacity of two is a commonly used special case, particularly in 1960s-vintage input/output subsystems. In that context, it is frequently described as "Double Buffered I/O".
When multiple FIFO queues are implemented using linked lists instead of arrays, it is common to have a single shared pool of free list items instead of a bounded supply of list items for each queue. For a given amount of memory, this can markedly improve performance, but it adds slightly to the complexity of the code. Specifically, instead of having a separate space semaphore to count the amount of free space in each queue, there is a single free space semaphore.
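A sketch of this arrangement follows; the names are hypothetical, since the lecture gives no code. One semaphore counts the shared pool of free list items, while each queue keeps its own data semaphore and lock:

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdlib.h>

    struct item { int value; struct item *next; };

    /* the single shared pool of free list items */
    static struct item *free_list;
    static sem_t free_items;              /* counts items in the pool */
    static pthread_mutex_t poollock = PTHREAD_MUTEX_INITIALIZER;

    struct queue {
        struct item *head, *tail;
        sem_t data;                       /* counts enqueued items */
        pthread_mutex_t lock;
    };

    void pool_init(int n) {               /* stock the pool with n items */
        sem_init(&free_items, 0, n);
        for (int k = 0; k < n; k++) {
            struct item *i = malloc(sizeof *i);
            i->next = free_list;
            free_list = i;
        }
    }

    void queue_init(struct queue *q) {
        q->head = q->tail = NULL;
        sem_init(&q->data, 0, 0);
        pthread_mutex_init(&q->lock, NULL);
    }

    void enque(struct queue *q, int v) {
        sem_wait(&free_items);            /* P on the shared pool */
        pthread_mutex_lock(&poollock);
        struct item *i = free_list;       /* grab a free item */
        free_list = i->next;
        pthread_mutex_unlock(&poollock);

        i->value = v;
        i->next = NULL;
        pthread_mutex_lock(&q->lock);     /* append to this queue */
        if (q->tail) q->tail->next = i; else q->head = i;
        q->tail = i;
        pthread_mutex_unlock(&q->lock);
        sem_post(&q->data);
    }

    int dequeue(struct queue *q) {
        sem_wait(&q->data);               /* wait for data in this queue */
        pthread_mutex_lock(&q->lock);
        struct item *i = q->head;
        q->head = i->next;
        if (q->head == NULL) q->tail = NULL;
        pthread_mutex_unlock(&q->lock);
        int v = i->value;

        pthread_mutex_lock(&poollock);    /* return the item to the pool */
        i->next = free_list;
        free_list = i;
        pthread_mutex_unlock(&poollock);
        sem_post(&free_items);            /* V on the shared pool */
        return v;
    }

Note that a single slow consumer can now exhaust the shared pool and block producers on every queue; this is the price paid for the improved average-case memory utilization.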
Do not be misled into thinking that the mutual exclusion and producer-consumer problems are the only problems in parallel programming. They are the most important and most common problems, but there are many others:
The readers-writers problem is a common problem in file-system design. A file may either be closed, open for read access by any number of users, or open for write access by exactly one user. Implementing the read-open, write-open and close operations using semaphores and shared variables is an interesting exercise.
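One classic solution, the standard "first readers-writers" construction (offered as a sketch under that assumption, not necessarily the solution the exercise intends), uses a counter of active readers protected by one semaphore, plus a second semaphore held exclusively by the single writer or, collectively, by the group of readers:

    #include <semaphore.h>

    static sem_t mutex;        /* protects readcount; initialize to 1    */
    static sem_t writing;      /* exclusion on the file; initialize to 1 */
    static int readcount = 0;  /* number of processes reading            */

    void read_open(void) {
        sem_wait(&mutex);
        readcount++;
        if (readcount == 1)    /* first reader locks out writers */
            sem_wait(&writing);
        sem_post(&mutex);
    }

    void read_close(void) {
        sem_wait(&mutex);
        readcount--;
        if (readcount == 0)    /* last reader readmits writers */
            sem_post(&writing);
        sem_post(&mutex);
    }

    void write_open(void)  { sem_wait(&writing); }
    void write_close(void) { sem_post(&writing); }

This variant allows a steady stream of readers to starve a waiting writer; preventing that starvation is part of what makes the exercise interesting.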
Deadlock, the situation that arises when multiple processes are blocked, each awaiting some action by another blocked process, is a common failing in parallel programs. There are complex algorithms for both deadlock avoidance and deadlock detection.
We close with some notes on variations of the low-level scheduler. In the lecture on schedulers, the ready queue was described as merely a memory with enqueue and dequeue operations, with no scheduling discipline specified. In fact, there are many alternatives:
If the ready queue is a FIFO queue (for example, a linked list of process descriptions with head and tail pointers), the scheduler is said to implement the round-robin algorithm, where all processes have equal priority for access to the CPU.
If the ready queue is a priority queue, where priority is a fixed attribute of each process, the scheduler is a simple priority scheduler. Such schedulers are commonly used in real-time operating systems. Typically, high priorities are given to processes that must meet frequent deadlines, while low priorities are given to processes with infrequent deadlines (this is called the rate monotonic priority assignment).
If the ready queue is a priority queue, where priority is adjusted in response to process behavior, we can develop an adaptive scheduler. For example, the classic UNIX scheduler would raise the priority of processes each time they voluntarily relinquished the CPU, perhaps to wait for I/O completion, while periodically lowering the priorities of all processes. The result is that interactive processes, which do little computing but frequently await input, would naturally rise to a higher priority, while compute-bound processes would fall in priority. This provides excellent service to interactive users without the need for users to be aware that there is any kind of priority structure.
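This scheme can be realized in several ways; the decay-based sketch below is one common variant, with hypothetical field names and constants loosely in the style of classic BSD UNIX, not a quote of any real kernel. Processes that yield quickly accumulate little CPU charge and keep good priorities, while compute-bound processes accumulate charge and sink:

    #define NPROC 64

    struct proc {
        int base;      /* fixed base priority                    */
        int cpu;       /* recent CPU usage, decayed periodically */
        int priority;  /* effective priority; lower is better    */
    };

    struct proc ptable[NPROC];

    /* called on each clock tick, charging the running process */
    void charge_tick(struct proc *p) {
        p->cpu++;
    }

    /* called once per second for all processes */
    void recompute_priorities(void) {
        for (int i = 0; i < NPROC; i++) {
            ptable[i].cpu /= 2;     /* decay: old usage is forgiven */
            ptable[i].priority = ptable[i].base + ptable[i].cpu;
        }
    }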
Another category of scheduler attaches, as the priority of each process, the deadline by which it must finish its operation. The process with the nearest deadline gets priority, and when a process finishes its operation, it either terminates or resets its deadline to the time at which it must finish its next operation. Deadline-based schedulers are provably optimal for real-time systems, but they are rarely used in practice.
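Under this discipline, the dequeue operation on the ready queue reduces to "pick the runnable process with the nearest deadline." A minimal earliest-deadline-first sketch, with hypothetical names:

    #define NTASK 32

    struct task {
        long deadline;    /* absolute time by which it must finish */
        int  ready;       /* nonzero if runnable                   */
    };

    struct task tasks[NTASK];

    /* return the index of the ready task with the nearest
       deadline, or -1 if no task is ready to run */
    int pick_next(void) {
        int best = -1;
        for (int i = 0; i < NTASK; i++) {
            if (tasks[i].ready &&
                (best < 0 || tasks[i].deadline < tasks[best].deadline))
                best = i;
        }
        return best;
    }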