23. Interprocess Communication

Part of the 22C:116 Lecture Notes for Fall 2002
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Introduction

Many interprocess communication ideas from the realm of uniprocessor systems apply equally well in the context of multiprocessor systems and networks! We will discuss these examples here to illustrate this:

Unix pipes
Multics shared segments
Demos style message passing

Whatever the primitives may be, and independent of the underlying system on which they are implemented, there are problems that are common to all interprocess communication mechanism: establishing the communication channel, distributing the rights to that channel to the processes that wish to communicate, and use of that channel. These problems are essentially independent; we will discuss each of them in the context of the 3 example systems in the following sections.

Classical Unix Interprocess Communication

The classical interprocess communication mechanism of Unix is the pipe. Later versions of Unix added other mechanisms such as shared memory, and shared files could always be used for communication, but here, we will focus on pipes.

Channel creation:: The pipe() system call creates a new bounded FIFO channel; both ends of this channel are held by the process that creates it. Each end of the channel is represented by an open file descriptor, largely indistinguishable from a descriptor returned by the open service.
Rights distribution:: The fork() system call creates a new process and gives it every open file descriptor the original process had, including descriptors for open pipes. Thus, under classic Unix, access to pipes is distributed at the time of process creation, and pipes can only be shared by processes have a common ancestor, where that ancestor created the pipe.
Channel use:: Once created, a UNIX pipe may be used with the read() and write() system calls. Write() appends one buffer full of information to the bounded FIFO channel, while read() fills a buffer from the channel; these are the conventional services used for reading and writing files.
Under certain circumstances, read() and write() will operate atomically, but under other circumstances they will not be atomic. The rules are complex; basically, atomic behavior is only guaranteed if the operation does not block on a full or empty pipe midway through copying the buffer. Blocking will only occur before copying the first byte if all readers and writers of a pipe use the same buffer size, and if that size evenly divides the size of the bounded buffer itself (this was 512 bytes in the original Unix, and is always a multiple of this in more modern systems.

Classic Unix pipes are not fully general purpose, in that they may only be shared with processes that were created by processes that already had access to the pipe. Typically, pipes are shared by a producer and a consumer. To set up such a relationship, the original process first creates the pipe, then forks into two processes. One process closes the read end of the pipe, becoming a producer because it can only write to the pipe. The other process closes the write end of the pipe, becoming a consumer because it can only read from the pipe.

Much later in the development of Unix, with Unix System V, named pipes were added; this required a new creation mechanism, mkfifo(), and it uses the Unix directory tree and the open() command for rights distribution. Once open, a named pipe behaves just like a classical pipe. This illustrates the fact that it is quite easy to change one aspect of communications channel creation, distribution or use without changing the other aspects.

It is worth noting that the Unix pipe, as an abstraction, is typical of the kinds of services you might expect from the session layer in the ISO OSI prottocol hierarchy. It is not at a higher level, because it clearly has none of the cosmetic packaging you would expect at the presentation level, and it is certainly not application dependent. It is not at a lower level because it maintains a logical connection from sender to receiver for as long as both have the pipe open, and it offers a data stream with no hint of any underlying message traffic.

Multics Interprocess Communication

Under Multics, interprocess communication was through shared memory segments. Under Multics, when a process opens a disk file, the file is inserted into the address space of the process as a segment. All processes opening the same file will share that segment. This is implemented using the Multics demand paged virtual memory mechanisms. The sectors of the open disk file are used as the backing store for that segment.

Channel creation:: A potential interprocess communications channel is created every time a file is created.
Rights distribution:: When a file is created, it is given a name that is universally meaningful. Thus, once a channel has been created, any process can attempt to open the file. The right to open the file, and therefore, the right to use the channel it represents may be controlled by the access control list on the file.
Channel use:: A process wishing to use an interprocess communication channel must open the associated file; this inserts the associated segment into the address space of the process. From that point until the file is closed, the process may communicate with any other processes that have the same file open. Of course, a file is only an interprocess communications channel if it is used in that way; read-only files are not interprocess communications channels, nor are files that are not shared between processes.

Note that classical Multics did not provide synchronization mechanisms for use on shared segments. This was left to the application. An obvious solution is to set aside some words of the shared segment for use as locks, perhaps using test-and-set instructions or some other mutual exclusion algorithm. The omission of good mutual exclusion tools from the original Multics user interface is no surprise, given that Multics development had been under way for over 5 years by the time Dijkstra published the papers that established the groundwork for our modern view of synchronization.

It is worth noting that the Multics shared segment model does not fit cleanly in the ISO OSI model, but in fact, shared segments have been implemented on networks, using very clever page fault handlers. Such a shared segment clearly represents some kind of connection between the processes sharing the segment, so we are talking about something analogous to a session. The actual presentation of the channel as a segment in address space suggests that we may be talking about the presentation layer here!

Demos Interprocess Communication

The Demos system was originally implemented on the Cray I computer by a group at Los Alamos (F. Baskett, et al, `Task Communication in DEMOS', ACM SIGOPS, Vol 11, No 5, Nov 1977, page 23-31). While Cray computers were, in their day, the fastest on earth, the operating systems they were delivered with were pretty bad. As a result, Baskett's group at Los Alamos set out to design a new system that made the best use they could manage of the Cray's eccentric architecture.

Demos (named for the Greek word for "the people") used message passing between processes as its fundamental interprocess communication mechanism. Shared memory was not allowed because the Cray I memory protection mechanism was too limited to support shared memory! All it had was a base register that was added to all user mode memory addresses and a bound register to limit the maximum user mode address. Even by the standards of the 1970's, this was seen by most system programmers as pathetic.

The designers of Demos chose some slightly odd terminology; communication channels in Demos were referred to as links, and each process had a link table to hold all of the links that process was allowed to use. Links in the link-table of a process were referenced by number, so a process could send a message to the destination referenced by the link in slot 5 of its link table, or it could send a message holding the link in slot 10, or it could read the return link from a message into slot 12. Conceptually, the link table of a process can be thought of as a capability list, but it is a list of capabilities that allow sending of messages, not capabilities that allow reading or writing of data stored in memory segments. Each Demos link allowed messages to be sent to one mailbox of a specific destination process. Each process could have any number of mailboxes.

Channel creation:: A process wishing to create a channel uses the create system service to create a link allowing messages to be sent to itself. This link is useless to the process that created it, since it may only be used to send messages to that process! Every process has multiple incoming message queues. To create a link, the creating process specifies which of its incoming message queues will be used for any messages sent over that link. All of the links held by a process are stored in that process's link-table, and referenced by link-table index numbers.
Rights distribution:: Messages on DEMOS are sent over links and may contain contain a block of data plus a single link. Thus, if process A has a link it wants to give process B, it must have a link allowing it to send messages to process B; using this link, it can distribute any other links it has to process B.
Channel use:: The fundamental primitives for sending and receiving messages are send which composes an optional link with a message buffer and transmits it over a link, and receive which awaits a message on one or more incoming message queues of a process and returns the message buffer, any link contained in the message, and the queue identity, when a message becomes available.

Since the only way a process can get a link is to have some process send it in a message, there is a bootstrapping problem: How does a process get the links it needs to start operation?

The answer lies in the way DEMOS processes are created. The DEMOS process manager is itself a process, and the right to use the process manager for process creation is conferred to a process by giving that process a link to the process manager's create-process incoming message queue.

To create a process, the creator sends the process manager a message containing the size of the desired memory region for the created process and a link to be used by the process manager to send a reply message. The process manager's reply to this request contains a newly created link to the process manager that uniquely identifies the new process. Messages to this link can be used to load or inspect the memory image of the new process, to load or inspect the link table of the new process, and to start or stop the process.

Once a process is created, the process that created it must use this new link to "download" code and links into the new process image before starting that process. The DEMOS system imposed no rules about how this is done, but as in UNIX, conventions were established: By convention, the first few slots in each process's link table were setup containing entries such as the following:

Link to an open file for standard input
Link to an open file for standard output
Link to the file-system root
Link to the Process manager for process creation

Frequently, a parent process would simply pass a copy of its own link table to the child, but DEMOS processes were free to customize the environment of any process they created, substituting their own links for system links.

Open files under DEMOS are represented by links to the file manager, another process, with a unique link for each file opened. Links to files that are directories allow operations such as opening files in that directory; links to data files allow operations such as read and write. Thus, link table entries may refer to many kinds of objects and it is up to the application program to remember what kind of thing each link represents.

(the latter is not a new problem -- we are completely comfortable with similar constraints on main memory, where it is up to the application program to make appropriate interpretations of each byte).

Problems: Channel Symmetry and Location

Interprocess communication channels may be symmetrically named by both sender and receiver in the same way, as with Unix pipes and Multics segments. In this case, each communication channel exists somewhat independantly of its users.

Alternately, channels may be bound to one or the other processes involved, as with Demos links. We send messages to a link which identifies a receiving process. Ownership of a link may be transferrable, as in Demos, or it may not be transferrable. Bindings of links to recipient processes may be transferrable, or it may not, as in Demos.

Where is an interprocess communication channel located, physically? This question is of secondary importance with conventional systems, but when a system is distributed, it becomes a significant issue!

    ____________                   ___________
                |   ___________   |
                |  |           |  |       
         P1 ----|--|--Channel--|--|---- P2
                |  |           |  | 
    ____________|  |           |  |___________
                   |           |

With an asymmetric channel, it is obvious that the storage associated with the channel should be located with the process to which the channel is bound.

With a symmetrical channel, it is not obvious where the storage should be located! For a file, as in Multics, the problem is not so bad -- the storage is physically shared between all users, and when there are no users of the file, the storage is all on disk under the management of the file system.

For a Unix pipe, on the other hand, nothing is obvious. The pipe may have many readers and many writers, and on a distributed system, the reasonable location for the buffer associated with the pipe may vary continuously as readers and writers are created and destroyed during the life of the pipe.

The example of multiple readers and writers on a UNIX pipe is not entirely artificial. Under Unix, pipes may be used to implement both shared memory and semaphores. Writing a byte to a pipe (and using fsync to make sure that the receiver gets the byte) can be used as a Signal operation, and reading a byte from a pipe can be used as a Wait operation, thus implementing semaphores. If this is done, all processes using the pipe must be able to both read and write.

To implement shared memory using a pipe, consider the pipe to be a single shared variable containing up to 512 bytes of data. Reading the contents of the variable is done with a read operation, which is destructive -- therefore, every user always follows each read with a write restoring the value of the variable. Write is also troublesome -- the variable must be empited of its old value prior to the recording of a new value, so again, changing the variable is always done by first reading out the old value and then writing in the new one.

This implementation of shared variables and semaphores using UNIX pipes is stupid, but it is part of the proof of a folk theorem that any system supporting an adequate interprocess communication mechanism can be used to implement any other interprocess comunication mechanism.

Style of Interprocess Communication

The communication services offered by a system tend to dictate the style of distrubuted applications. For example, Unix pipes encourage structuring applications as pipelines:

    A|B|C (in Unix shell notation)

In contrast, the message passing facilities of Demos encourage client-server relationships.

    P1 <---> Server <---> P2

It is fair to ask what models of interprocess communication are really useful? The answer lies not in theory, but in what real programmers have successfully found uses for. The semantic structures of programming languages also give strong hints, since communication structures that are widely used within one program are also likely to be useful between programs in a distributed system.

Clearly, both pipelines and client-server relationships have proven themselves to be useful. Pipelines are fairly simple, conceptually, but client-server relationships require deeper study.

Client Server Models

All of the following have significant similarities:

Dijkstra's notion of a secretary process
Hoare's monitors
Ada tasks and the rendezvous concept
Remote procedure calls
Client-server models

Dijkstra developed the notion of a secretary in the context of his THE operating system in the late 1960's (THE is an acronym for Techniche Hochschule Eidenhoven). Processes requesting a service would communicate with the secretary process (using semaphores). When the secretary was done, it would issue a signal saying it had done the requested service.

Hoare formalized this with his idea of a monitor (more about this later). The Ada rendezvous (literally, a get-together between tasks) allows the construction of monitor-like processes.

Remote procedure calls are one way of implementing monitors and Ada rendezvous entries on a distributed system.

All of the above are examples of client-server interactions:

          ________   request    ________
         | Client |----------->|        |
         |________|<-----------|        |
                     reply     |        |
                               | server |
          ________   request   |        |
         | Client |----------->|        |
         |________|<-----------|________|
                     reply

Clients are processes. The server may be a process providing multiple services, a process providing a single service, or in the case of object oriented systems, not a process.

Client server models are quite old. For example, the classic disk driver, with its request queue, is an example of a server, where the processes requesting disk I/O are clients.

In all cases, the server may serve many clients, all of whom make requests that contain enough information to send a reply when the service is complete.

If the server is not a process, it must execute on the same machine as the client, and each entry to the server is essentially a procedure call. Any synchronization between entries to the server must be provided by conventional interprocess synchronization, for example, using semaphores.

If the server is a process, it waits for a request for service, then takes the requested action, then sends a reply indicating that the service is done.

If clients immediately await a reply after sending a request, and if the server does no computation when it is not acting on behalf of a client, then the server is imitating a procedure call model as it might be implemented on a uniprocessor, except that no two clients may enter the server at the same time -- the limitation of the server to a single process ensures that there is mutual exclusion between different service requests.

If the server provides multiple services, as is common when the overhead of large nubmers of small processes is high, this just adds a degree of unneeded mutual exclusion.

Demos, A Client-Server Example

The Demos system was completely structured around client-server interactions. For example, the file system was a server. To open a file, a process would send a message to the file server's open link (the link to the file server's message queue from which it expects to get open messages) containing the name of the file to be opened. The file server would reply with a message containing a link designating an open file.

When the file server opens a file, it allocates one of its message queues to that file and creates a link allowing users of that file to send messages to that queue. Messages to this queue are interpreted by the file server as requests to read, write, seek or close that particular file.

Read, write, seek and close messages are sent to the file server by clients that have open files, and in each case, the server is expected to send a reply after it completes the requested service.

Client-server interactions under Demos almost always follow the following scheme:

Client allocates a reply queue and creates a reply link using that queue.
Client sends message to server containing
a) Identity of service requested
b) Data needed to perform that service
c) The reply link
Client awaits a message on the reply queue.
Server receives the message and does the work requested;
Server sends reply message containing the results of requested operation, and then deletes the reply link.
Client receives the reply on the reply queue.

This sequence is so common that the designers of Demos extended the set of system services to include call, a service that combines the creation of a reply port and link with the sending of a request and waiting for the reply.

The designers of Demos also added an attribute to links, delete. Links with this attribute are automatically deleted from the holder's link table when they are used, and reply links are usually of this type. Thus once a process receives a message on a reply port, it may assume that it will never receive another message there.

Finally, the designers of Demos added a send-delete service that automatically deletes the link it uses as a side effect of sending a message. This is generally used by servers to send replies.

Note that a typical server, for example, the Demos file server, must always be willing to accept an incoming message on any of a number of ports! In the case of the file server, incoming messages are expected on the open port and on the ports for each open file. This is why the Demos service for receiving a message allows a set of ports to be specified.