22C:116, Lecture 23, Spring 1999

Douglas W. Jones
University of Iowa Department of Computer Science

  1. Interprocess Communication

    Many old ideas from uniprocessor systems apply to multiprocessor systems and networks!

    There are many other examples, but we will focus on these for the moment. Whatever the primitives may be, there are problems that are common to all interprocess communication mechanisms.

  2. Establishing a Communication Channel

    To establish an interprocess communications channel, it must be created, the rights to use the channel (or knowledge about it) must be distributed to the processes that wish to communicate, and then the channel must be used. These three aspects of channel establishment are conceptually independant.

    Here is an example from Classic UNIX, where the commonly used interprocess communication channel is the pipe.

    Channel Creation
    The pipe system call creates a new bounded FIFO channel; both ends of this channel are held by the process that creates it. Each end of the channel is represented by an open file descriptor, largely indistinguishable from a descriptor returned by the open service.

    Rights Distribution
    The fork system call creates a new process and gives it every open file descriptor the original process had, including descriptors for open pipes. Thus, under classic UNIX, access to interprocess communications channels is distributed only at the time of process creation.

    Channel Use
    Once created, a UNIX pipe may be used with the read and write system calls. Write appends one buffer full of information to the bounded FIFO channel, while read fills a buffer from the channel; these are the conventional services used for reading and writing files.

    Classic UNIX pipes are not fully general purpose, in that they may only be shared with processes that were created by processes that already had access to the pipe. Typically, pipes are shared by a producer and a consumer. To set up such a relationship, the original process first creates the pipe, then forks into two processes. One process closes the read end of the pipe, becoming a producer because it can only write to the pipe. The other process closes the write end of the pipe, becoming a consumer because it can only read from the pipe.

    UNIX pipes are not, strictly speaking, FIFO queues of characters. Most older UNIX implementations buffer 512 bytes on behalf of the sending process before delivering anything to the receiver, and the sender is not re-awakened until the receiver has finished consuming all 512 bytes. This default behavior could be modified by the sender -- if the sender issues a fsync(p) system command on pipe p, the receiver gets whatever was written in p prior to the flush operation. This is the same operation that a user can apply to a disk file to force all cached data to disk.

    Note that it is quite easy to change one aspect of communications channel creation, distribution or use without changing another aspect. For example, when AT&T brought out System V UNIX, named pipes were introduced. These could be placed in a directory, just like a file, thus changing the way channels are distributed and created, but leaving the semantics of their use unchanged.

  3. Communication Channel Examples

    Under MULTICS, processes may communicate through shared memory. Files, under multics are named segments, and when a file is opened, that segment is inserted in the address space of the process. If two processes open the same file, they share the segment and may use it to communicate.

    Channel Creation
    A potential interprocess communications channel is created every time a file is created.

    Rights Distribution
    When a file is created, it is given a name that is universally meaningful. Thus, once a channel has been created, any process can mention its name. The right to use the channel may be controlled by the access control list on the file.

    channel use
    A process wishing to use an interprocess communication channel must open the associated file; this inserts the associated segment into the address space of the process. From that point until the file is closed, the process may communicate with any other processes that have the same file open. (Of course, the file is only a communications channel if some process writes data on it that others read; read-only files cannot be usefully viewed as communications channels.)

    The DEMOS system, originally implemented on the CRAY I computer by a group at Los Alamos (F. Baskett, et al, `Task Communication in DEMOS', ACM SIGOPS, Vol 11, No 5, Nov 1977, page 23-31) used message passing between processes as its fundamental interprocess communication mechanism. Messages were passed over links, where a link confers the right to pass messages to a particular "port" (incoming message queue) of a particular process. The set of links each process may use to send messages was held in a "link table", really a capability list holding links.

    Channel Creation
    A process wishing to create a channel uses the create system service to create a link allowing messages to be sent to itself. This link is useless to the process that created it, since it may only be used to send messages to that process! Every process has multiple incoming message queues. To create a link, the creating process specifies which of its incoming message queues will be used for any messages sent over that link. All of the links held by a process are stored in that process's link-table, and referenced by link-table index numbers.

    Rights Distribution
    Messages on DEMOS are sent over links and may contain contain a block of data plus a single link. Thus, if process A has a link it wants to give process B, it must have a link allowing it to send messages to process B; using this link, it can distribute any other links it has to process B.

    Channel Use
    The fundamental primitives for sending and receiving messages are send which composes an optional link with a message buffer and transmits it over a link, and receive which awaits a message on one or more incoming message queues of a process and returns the message buffer, any link contained in the message, and the queue identity, when a message becomes available.

    Since the only way a process can get a link is to have some process send it in a message, there is a bootstrapping problem: How does a process get the links it needs to start operation?

    The answer lies in the way DEMOS processes are created. The DEMOS process manager is itself a process, and the right to use the process manager for process creation is conferred on a process by giving that process a link to the process manager's create-process incoming message queue.

    To create a process, the creator sends the process manager a message containing the size of the desired memory region for the created process and a link to be used by the process manager to send a reply message. The process manager's reply to this request contains a newly created link to the process manager that uniquely identifies the new process. Messages to this link can be used to load or inspect the memory image of the new process, to load or inspect the link table of the new process, and to start or stop the process.

    Once a process is created, the process that created it must use this new link to "download" code and links into the new process image before starting that process. The DEMOS system imposed no rules about how this is done, but as in UNIX, conventions were established: By convention, the first few slots in each process's link table were setup containing entries such as the following:

    Frequently, a parent process would simply pass a copy of its own link table to the child, but DEMOS processes were free to customize the environment of any process they created, substituting their own links for system links.

    Open files under DEMOS are represented by links to the file manager, another process, with a unique link for each file opened. Links to files that are directories allow operations such as opening files in that directory; links to data files allow operations such as read and write. Thus, link table entries may refer to many kinds of objects and it is up to the application program to remember what kind of thing each link represents.

    (the latter is not a new problem -- we are completely comfortable with similar constraints on main memory, where it is up to the application program to make appropriate interpretations of each byte).

  4. Problems: Symmetry and Location

    Interprocess communication channels may be symmetrically named by both sender and receiver in the same way, as with UNIX pipes and MULTICS segments. In this case, the channel exists somewhat independantly of its users.

    Alternately, channels may be bound to one or the other processes involved, as with DEMOS links. We send messages to a link which identifies a receiving process. Ownership of a link may be transferrable, as in DEMOS, or it may not be transferrable. Bindings of links to recipient processes may be transferrable, or it may not, as in DEMOS.

    Where is an interprocess communication channel located, physically? This question is of secondary importance with conventional systems, but when a system is distributed, it becomes a significant issue!

        ____________                   ___________
                    |   ___________   |
                    |  |           |  |       
             P1 ----|--|--Channel--|--|---- P2
                    |  |           |  | 
        ____________|  |           |  |___________
                       |           |    
    
    With an asymmetric channel, it is obvious that the storage associated with the channel should be located with the process to which the channel is bound.

    With a symmetrical channel, it is not obvious where the storage should be located! For a file, as in MULTICS, the problem is not so bad -- the storage is physically shared between all users, and when there are no users of the file, the storage is all on disk under the management of the file system.

    For a UNIX pipe, on the other hand, nothing is obvious. The pipe may have many readers and many writers, and on a distributed system, the reasonable location for the buffer associated with the pipe may vary continuously as readers and writers are created and destroyed during the life of the pipe.

    The example of multiple readers and writers on a UNIX pipe is not entirely artificial. Under UNIX, pipes may be used to implement both shared memory and semaphores. Writing a byte to a pipe (and using fsync to make sure that the receiver gets the byte) can be used as a Signal operation, and reading a byte from a pipe can be used as a Wait operation, thus implementing semaphores. If this is done, all processes using the pipe must be able to both read and write.

    To implement shared memory using a pipe, consider the pipe to be a single shared variable containing up to 512 bytes of data. Reading the contents of the variable is done with a READ command, which is destructive -- therefore, every user always follows each read with a write restoring the value of the variable. Write is also troublesome -- the variable must be empited of its old value prior to the recording of a new value, so again, changing the variable is always done by first reading out the old value and then writing in the new one.

    This implementation of shared variables and semaphores using UNIX pipes is stupid, but it is part of the proof of a folk theorem that any system supporting an adequate interprocess communication mechanism can be used to implement any other interprocess comunication mechanism.

  5. Style of Interprocess Communication

    The communication services offered by a system tend to dictate the style of distrubuted applications. For example, UNIX pipes encourage structuring applications as pipelines:

        A|B|C (in UNIX shell notation)
    
    In contrast, the message passing facilities of DEMOS encourage client-server relationships.
        P1 <---> Server <---> P2
    
    It is fair to ask what models of interprocess communication are really useful? The answer lies not in theory, but in what real programmers have successfully found uses for. The semantic structures of programming languages also give strong hints, since communication structures that are widely used within one program are also likely to be useful between programs in a distributed system.

    Clearly, both pipelines and client-server relationships have proven themselves to be useful. Pipelines are fairly simple, conceptually, but client-server relationships require deeper study.

  6. Client Server Models

    All of the following have significant similarities:

        Dijkstra's notion of a secretary process
        Hoare's monitors
        Ada tasks and the rendezvous concept
        Remote procedure calls
        Client-server models
    
    Dijkstra developed the notion of a secretary in the context of his THE operating system in the late 1960's (THE is an acronym for Techniche Hochschule Eidenhoven). Processes requesting a service would communicate with the secretary process (using semaphores). When the secretary was done, it would issue a signal saying it had done the requested service.

    Hoare formalized this with his idea of a monitor (more about this later). The Ada rendezvous (literally, a get-together between tasks) allows the construction of monitor-like processes.

    Remote procedure calls are one way of implementing monitors and Ada rendezvous entries on a distributed system.

    All of the above are examples of client-server interactions:

              ________   request    ________
             | Client |----------->|        |
             |________|<-----------|        |
                         reply     |        |
                                   | server |
              ________   request   |        |
             | Client |----------->|        |
             |________|<-----------|________|
                         reply  
    
    Clients are processes. The server may be a process providing multiple services, a process providing a single service, or in the case of object oriented systems, not a process.

    Client server models are quite old. For example, the classic disk driver, with its request queue, is an example of a server, where the processes requesting disk I/O are clients.

    In all cases, the server may serve many clients, all of whom make requests that contain enough information to send a reply when the service is complete.

    If the server is not a process, it must execute on the same machine as the client, and each entry to the server is essentially a procedure call. Any synchronization between entries to the server must be provided by conventional interprocess synchronization, for example, using semaphores.

    If the server is a process, it waits for a request for service, then takes the requested action, then sends a reply indicating that the service is done.

    If clients immediately await a reply after sending a request, and if the server does no computation when it is not acting on behalf of a client, then the server is imitating a procedure call model as it might be implemented on a uniprocessor, except that no two clients may enter the server at the same time -- the limitation of the server to a single process ensures that there is mutual exclusion between different service requests.

    If the server provides multiple services, as is common when the overhead of large nubmers of small processes is high, this just adds a degree of unneeded mutual exclusion.

  7. DEMOS, A Client-Server Example

    The DEMOS system was completely structured around client-server interactions. For example, the file system was a server. To open a file, a process would send a message to the file server's open link (the link to the file server's port from which it expects to get open messages) containing the name of the file to be opened. The file server would reply with a message containing a link designating an open file.

    When the file server opens a file, it allocates one of its ports to that file and creates a link allowing users of that file to send messages to that port. Messages to this port are interpreted by the file server as requests to read, write, seek or close that particular file.

    Read, write, seek and close messages are sent to the file server by clients that have open files, and in each case, the server is expected to send a reply after it completes the requested service.

    Client-server interactions almost always follow the following scheme:

    1. Client allocates a reply port and creates a reply link using that port.

    2. Client sends message to server containing
      a) Identity of service requested
      b) Data needed to perform that service
      c) The reply link

    3. Client awaits a message on the reply port.

    4. Server receives the message and does the work requested;

    5. Server sends reply message containing the results of requested operation.

    6. Server sends reply message containing the results of requested operation, and then deletes the reply link.

    7. 6) Client receives the reply. the reply port.

    This sequence is so common that the designers of DEMOS extended the set of system services to include call, a service that combines the creation of a reply port and link with the sending of a request and waiting for the reply.

    The designers of DEMOS also added an attribute to links, delete. Links with this attribute are automatically deleted from the holder's link table when they are used, and reply links are always of this type. Thus once a process receives a message on a reply port, it may assume that it will never receive another message there.

    Finally, the designers of DEMOS added a send-delete service that automatically deletes the link it uses as a side effect of sending a message. This is generally used by servers to send replies.

    Note that a typical server, for example, the DEMOS file server, must always be willing to accept an incoming message on any of a number of ports! In the case of the file server, incoming messages are expected on the open port and on the ports for each open file. This is why the DEMOS service for receiving a message allows a set of ports to be specified.