22C:116, Lecture Notes, Apr. 24, 1995

Douglas W. Jones
University of Iowa Department of Computer Science

  1. Server Implementation

    Consider a distributed system, with many machines scattered around a net. Some machines have disks, some have none. What are the alternative approaches for implementing file service on this system? Here, we are not so concerned with what service is offered by each server, but rather, with the internal architecture of the servers and the impact of the particular communications primitives used on how the servers function.

  2. Servers with One Incoming Queue

    The simplest form of message addressing we will consider involves addressing messages directly to a process. This implies that all messages to a process end up in the same incoming message queue; if the process is a server, for example, a file server, this implies a structure like the following:

            repeat
               await request
               process request
               send reply
            forever
    
    This simple model is sufficient for servers that provide very simple services, specifically, if it isn't possible for the processing step to block. On the other hand, if the server must send messages to a subsidiary server, for example, when a file server communicates with a disk I/O server, problems arise:
            repeat
             * await request from client
               process step 1
               send request to sub server
             * await reply from sub server
               process step 2
               send reply to client
            forever
    
    This structure fails if there is only one incoming FIFO queue of messages per process! In this case, the two starred lines (await) have no way of distinguishing between incoming requests from clients and replies from servers.

  3. Event Processing in Single Queue Servers

    One solution to the problem outlined above is to rewrite the server so that it classifies incoming messages as either requests or replies. This can get quite complex:

            repeat
               await message
               case message-type of
                 request:
                   allocate transaction record (TR)
                   process step 1 relative to TR
                   send request + TR to sub server
                 reply:
                   get TR from reply
                   process step 2 relative to TR
                   send reply to client recorded in TR
                   deallocate TR
               end case
            forever
    
    This scheme, although complex, offers one important advantate! It allows multiple transactions to be processed at the same time. Consider the situation where two transactions arrive in quick succession: The server will process step 1 of both before it awaits answers from the subsidiary server on behalf of either. If there are multiple independant subsidiary servers, or if the subsidiary server can optimize the order in which it processes requests, it is possible for the resulting system to have faster throughput than a naive system that finished one transaction before starting another.

    Unfortunately, this "event processing loop" involves complex code, and most programmers would prefer to be able to write something simpler. This can be done, as follows:

            repeat
             | if deferred-queue nonempty
             |    dequeue request from deferred-queue
             | else
             |    await message
             |    (assert it is a request from a client)
             | endif
               process step 1
               send request to sub server
             | repeat
             |    await message
             |    if it is a request from a client
             |       enqueue it in the deferred-queue
             |    endif
             | until it is a reply from the subserver
               process step 2
               send reply to client
            forever
    
    This bit of code is also ugly, but the uglyness is confined to the marked sections of code, each of which substitutes for a starred line in the naive code. The resulting code does not offer the possibility of high throughput, as in the previous example, but it is significantly easier to debug, and will serve very well for simple low performance servers.

    There are alternatives to the above based on "selective wait" primitives, that convert the incoming message queue into a non-FIFO queue. On some systems, these allow newly received messages to be inspected and put back on the queue if they don't fit some criteria, or they allow a wait for messages from a particular correspondant.

  4. Multi Thread, Multi Queue Servers

    The problems with the above code are largely caused by the fact that there is only one queue per server. Since we have associated one queue of incoming messages with each process, a natural way to create more queues is to create processes (or threads). This leads to the following server design:

            repeat
               await request from client
               fork a new process to do the following
                  process step 1
                  send request to sub server
                  await reply from sub server
                  process step 2
                  send reply to client
                  exit
               end new process
            forever
    
    This piece of code has some interesting properties. First, note that the process that receives requests from clients is not the same process as the one that eventually sends replies to clients! The process that receives requests creates one short lived process per request, and all further work resulting from that request is done by that new process.

    The second useful property of this code is that it allows for parallelism in exactly the same way as the event processing code. The processing of different requests can be carried on in parallel, but most of the benefit of this is likely to come from overlapping interactions with subservers.

    Note that the only processes that know about the short-lived request processing processes are the main process of the server, which creates them, and the sub-server, which learns their identity as part of the reply address that is provided with the request messages. As a result, the short-lived processes never need to worry about receiving any messages other than the expected replies.

    This code has a problem! If requests arrive faster than they can be processed, large numbers of short-lived processes may be created. In most systems, there is a limit on the number of processes that may be created, and to avoid exceeding this limit, the server must generally avoid creating too many processes, for example, by using a general semaphore to count processes:

            repeat V(pop) to initialize the semaphore
            repeat
               P(pop)
               await request from client
               fork a new process to do the following
                  process step 1
                  send request to sub server
                  await reply from sub server
                  process step 2
                  send reply to client
                  V(pop)
                  exit
               end new process
            forever
    
    On a UNIX based server, the main server process can use process termination signals to await the termination of a short-lived processes in the event that too many short-lived processes have been created.

  5. Single Thread, Multi Queue Servers

    On systems where process creation is expensive, a multithreaded server is impractical. An alternative is to introduce multiple queues. This was done on DEMOS, and the following simple server structure results:

            repeat
               await message in main mailbox
               (assert it is a request from a client)
               process step 1
               create new mailbox
               send request+newbox to sub server
               await reply in new mailbox
               destroy new mailbox
               process step 2
               send reply to client
            forever
    
    This code is amost identical to the naive (and incorrect) code give at the beginning of this section; the big change is the use of a new (shortlived) mailbox for each transaction with a subsidiary server.

    The above code does not exploit the opportunities for parallelism that were exploited by the event processing code or by the multithreaded code, but this can be done if there is a "multiwait" operation, as on DEMOS. In this case, the control structure is that of the event processing code:

            boxset = { main mailbox }
            repeat
               await message in boxset
               case box id of
                 main mailbox:
                   create new mailbox B
                   boxset := boxset + { B }
                   client[B] := client
                   process step 1 relative to B
                   send request+B to sub server
                 other:
                   B := box id
                   process step 2 relative to B
                   send reply to client[B]
                   destroy B
               end case
            forever
    
    The advantage of this scheme over the pure event processing scheme is that the simple code can be used for simple servers, and only if these begin to limit system performance need the code be rewritten as above.