Consider a distributed system with many machines scattered around a network. Some machines have disks, some have none. What are the alternative approaches for implementing file service on such a system? Here, we are not so concerned with what service is offered by each server, but rather with the internal architecture of the servers and with the impact of the particular communications primitives on how the servers function.
The simplest form of message addressing we will consider involves addressing messages directly to a process. This implies that all messages to a process end up in the same incoming message queue; if the process is a server, for example, a file server, this implies a structure like the following:
repeat
await request
process request
send reply
forever
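To make this concrete, here is a minimal sketch of the loop in Go, where a channel stands in for the incoming message queue; the request and reply types, and the idea of carrying a reply channel in each request, are assumptions of the sketch rather than details of any particular system:

package main

import "fmt"

// request and reply are hypothetical message formats; a real file
// server would carry operation codes, file names, and data instead.
type request struct {
    payload string
    replyTo chan reply // the reply address carried with each request
}

type reply struct {
    result string
}

// serve is the simple loop: await request, process it, send reply.
func serve(incoming chan request) {
    for req := range incoming { // await request
        res := reply{result: "processed " + req.payload} // process request
        req.replyTo <- res // send reply
    }
}

func main() {
    incoming := make(chan request)
    go serve(incoming)

    replyTo := make(chan reply)
    incoming <- request{payload: "read block 5", replyTo: replyTo}
    fmt.Println((<-replyTo).result)
}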
This simple model is sufficient for servers that provide very simple services; specifically, it may be the right model to use if the server never blocks during the processing phase. On the other hand, if the server must send messages to a subsidiary server, for example, when a file server communicates with a disk I/O server, problems arise:
repeat
* await request from client
process step 1
send request to sub server
* await reply from sub server
process step 2
send reply to client
forever
As examples, consider a directory server that stores each directory as a file, using a low-level file server to do the work, or a low-level file server that uses a disk server to store user files. Both examples suggest the above structure.
This structure fails if there is only one incoming FIFO queue of messages per process! In this case, the two starred lines (await) have no way of distinguishing between incoming requests from clients and replies from subsidiary servers.
One solution to the problem outlined above is to rewrite the server so that it classifies incoming messages as either requests or replies. Each message must therefore have a prefix identifying the message type. The software structure of such a server can get quite complex:
repeat
await message
case message-type of
request:
allocate transaction record (TR)
process step 1 relative to TR
send request + TR to sub server
reply:
get TR from reply
process step 2 relative to TR
send reply to client recorded in TR
deallocate TR
end case
forever
This scheme, although complex, offers one important advantage! It allows
multiple transactions to be processed at the same time. Consider the
situation where two transactions arrive in quick succession: The server
will process step 1 of both before it awaits answers from the subsidiary
server on behalf of either. If there are multiple independent subsidiary
servers, or if the subsidiary server can optimize the order in which it
processes requests, it is possible for the resulting system to have faster
throughput than a naive system that finishes one transaction before starting
another.
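As a sketch of how such an event processing loop might look in Go, consider the following; the single incoming channel carries both kinds of messages, and the map of transaction records (and the integer transaction IDs) are assumptions of the sketch:

package main

import "fmt"

type msgKind int

const (
    kindRequest msgKind = iota
    kindReply
)

type msg struct {
    kind    msgKind
    txnID   int         // names the transaction record on replies
    payload string
    replyTo chan string // the client's reply channel, set on requests
}

// eventLoop multiplexes client requests and sub-server replies
// arriving on one queue, using transaction records to link them.
func eventLoop(incoming chan msg, subServer chan msg) {
    txns := make(map[int]msg) // transaction records, keyed by ID
    nextID := 0
    for m := range incoming {
        switch m.kind {
        case kindRequest:
            nextID++
            txns[nextID] = m // allocate transaction record
            // process step 1, then forward to the subsidiary server
            subServer <- msg{kind: kindRequest, txnID: nextID, payload: m.payload}
        case kindReply:
            tr := txns[m.txnID] // get TR from reply
            // process step 2, then answer the client recorded in the TR
            tr.replyTo <- "done: " + m.payload
            delete(txns, m.txnID) // deallocate TR
        }
    }
}

func main() {
    incoming := make(chan msg)
    subServer := make(chan msg)
    go func() { // mock subsidiary server: echo requests back as replies
        for m := range subServer {
            incoming <- msg{kind: kindReply, txnID: m.txnID, payload: m.payload}
        }
    }()
    go eventLoop(incoming, subServer)

    replyTo := make(chan string)
    incoming <- msg{kind: kindRequest, payload: "read", replyTo: replyTo}
    fmt.Println(<-replyTo)
}

With a subsidiary server that answers asynchronously or out of order, two requests arriving in quick succession both pass through step 1 before either reply comes back, which is where the throughput gain comes from.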
Unfortunately, this "event processing loop" involves complex code, and most programmers would prefer to be able to write something simpler. This can be done, as follows:
repeat
----
| if deferred-queue empty
| await message
| (assert it is a request from a client)
| else
| dequeue request from deferred-queue
| endif
----
process step 1
send request to sub server
----
| repeat
| await message
| if it is a request from a client
| enqueue it in the deferred-queue
| endif
| until it is a reply from the sub server
----
process step 2
send reply to client
forever
This bit of code is also ugly, but the ugliness is confined to
the marked sections of code, each of which substitutes for a starred
line in the naive code. The resulting code does not offer the possibility of
overlapping the processing of multiple transactions, so the resulting
throughput may not be as good as in the previous example. On the other
hand, it may be significantly easier to debug, and it should serve well
for simple, low-performance servers.
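The same structure can be sketched in Go, with a slice serving as the deferred-queue and a boolean tag distinguishing replies from requests; as before, the message format is an assumption of the sketch:

package main

import "fmt"

type msg struct {
    isReply bool        // true for sub-server replies, false for requests
    payload string
    replyTo chan string // set on client requests
}

func serve(incoming chan msg, subServer chan msg) {
    var deferred []msg // client requests that arrived at the wrong time
    for {
        var req msg
        if len(deferred) == 0 {
            req = <-incoming // await message (must be a client request)
        } else {
            req, deferred = deferred[0], deferred[1:] // dequeue deferred request
        }
        // process step 1, then ask the subsidiary server
        subServer <- msg{payload: req.payload}
        // await the reply, deferring any client requests that arrive first
        var rep msg
        for {
            m := <-incoming
            if m.isReply {
                rep = m
                break
            }
            deferred = append(deferred, m) // enqueue in deferred-queue
        }
        // process step 2, then send the reply to the client
        req.replyTo <- "done: " + rep.payload
    }
}

func main() {
    incoming := make(chan msg)
    subServer := make(chan msg)
    go func() { // mock subsidiary server
        for m := range subServer {
            incoming <- msg{isReply: true, payload: m.payload}
        }
    }()
    go serve(incoming, subServer)

    replyTo := make(chan string)
    incoming <- msg{payload: "read", replyTo: replyTo}
    fmt.Println(<-replyTo)
}

The if/else at the top of the loop and the inner for loop correspond to the two marked sections of the pseudocode.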
In fact, the "uglyness" in the above code is a special case of a more general solution, in which the prefix on each message identifies the logical queue to which that message is directed. Thus, we might write a general wait routine as:
typed_wait( msg-type )
if empty( queue[ msg-type ] )
repeat
await an incoming message
t = type of received message
if not( t = msg-type )
enqueue message on queue[ t ]
endif
until t = msg-type
else
dequeue message from queue[ msg-type ]
endif
This is, in effect, how systems like DEMOS implement multiple mailboxes
for each process. Furthermore, it is not difficult to generalize the above
to allow a selective wait for any of a set of message types.
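A fairly direct transcription of typed_wait into Go might look as follows; the mailbox type, with one channel feeding a map of per-type logical queues, is an assumption about how such a routine could be packaged:

package main

import "fmt"

type message struct {
    kind    string
    payload string
}

// mailbox pairs a single incoming channel with per-type logical
// queues, in the style of typed_wait above.
type mailbox struct {
    incoming chan message
    queues   map[string][]message
}

func newMailbox() *mailbox {
    return &mailbox{incoming: make(chan message), queues: map[string][]message{}}
}

// typedWait returns the next message of the requested kind, queueing
// any messages of other kinds that arrive in the meantime.
func (mb *mailbox) typedWait(kind string) message {
    if q := mb.queues[kind]; len(q) > 0 {
        mb.queues[kind] = q[1:] // dequeue from queue[msg-type]
        return q[0]
    }
    for {
        m := <-mb.incoming // await an incoming message
        if m.kind == kind {
            return m
        }
        mb.queues[m.kind] = append(mb.queues[m.kind], m) // enqueue on queue[t]
    }
}

func main() {
    mb := newMailbox()
    go func() {
        mb.incoming <- message{kind: "reply", payload: "first"}
        mb.incoming <- message{kind: "request", payload: "second"}
    }()
    fmt.Println(mb.typedWait("request").payload) // second; the reply is set aside
    fmt.Println(mb.typedWait("reply").payload)   // first, taken from its queue
}

Generalizing typedWait to wait for any of a set of kinds requires only replacing the equality test with a membership test.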
There are alternatives to this model; for example, on some systems, there
are primitives that allow newly received messages to be inspected and put
back on the queue if they don't fit some criteria, or they allow a wait for
messages from a particular sender.
Given only one queue per thread instead of one queue per process, we can solve the problems outlined above in another way, using multiple threads per server (or equivalently, multiple processes sharing an address space, with one queue per process). This leads to the following server design:
repeat
await request from client
create a new process to do the following
process step 1
send request to sub server
await reply from sub server
process step 2
send reply to client
exit
end new process
forever
This piece of code has some interesting properties. First, note that
the process that receives requests from clients is not the same process
as the one that eventually sends replies to clients! The process that
receives requests creates one short-lived process per request, and all
further work resulting from that request is done by that new process.
The second useful property of this code is that it allows for parallelism in exactly the same way as the event processing code does. The processing of different requests can be carried on in parallel, but most of the benefit is likely to come from overlapping interactions with subsidiary servers.
Note that the only processes that know about the short-lived request processing processes are the main process of the server, which creates them, and the sub-server, which learns their identity as part of the reply address that is provided with the request messages. As a result, the short-lived processes never need to worry about receiving any messages other than the expected replies.
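In Go, goroutines are cheap enough to play the role of the short-lived processes; here is a sketch, in which the private reply channel created by each goroutine serves as the reply address that the sub-server learns:

package main

import "fmt"

type request struct {
    payload string
    replyTo chan string
}

// serve spawns one short-lived goroutine per request; the goroutine,
// not the main loop, awaits the sub-server reply and answers the client.
func serve(incoming chan request, subServer chan request) {
    for req := range incoming { // await request from client
        req := req
        go func() {
            // process step 1, then ask the subsidiary server; the
            // private channel is this goroutine's own reply address
            subReply := make(chan string)
            subServer <- request{payload: req.payload, replyTo: subReply}
            rep := <-subReply // await reply from sub server
            // process step 2, then send the reply to the client
            req.replyTo <- "done: " + rep
        }()
    }
}

func main() {
    incoming := make(chan request)
    subServer := make(chan request)
    go func() { // mock subsidiary server
        for m := range subServer {
            m.replyTo <- m.payload
        }
    }()
    go serve(incoming, subServer)

    replyTo := make(chan string)
    incoming <- request{payload: "read", replyTo: replyTo}
    fmt.Println(<-replyTo)
}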
This code has a problem! If requests arrive faster than they can be processed, large numbers of short-lived processes may be created. In most systems, there is a limit on the number of processes that may be created; to stay under this limit, the server can use a general semaphore to count the processes it creates:
repeat V(pop) n times to initialize the semaphore to the process limit n
repeat
P(pop)
await request from client
fork a new process to do the following
process step 1
send request to sub server
await reply from sub server
process step 2
send reply to client
V(pop)
exit
end new process
forever
On a UNIX based server, the main server process can use process termination signals to await the termination of a short-lived process in the event that too many short-lived processes have been created.
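In Go, a buffered channel makes a natural counting semaphore: filling it to capacity corresponds to the repeated initial V(pop), receiving from it is P(pop), and sending to it is V(pop). The limit of 8 workers below is an arbitrary assumption:

package main

import "fmt"

const maxWorkers = 8 // assumed limit on concurrent short-lived workers

type request struct {
    payload string
    replyTo chan string
}

func serve(incoming chan request, subServer chan request) {
    pop := make(chan struct{}, maxWorkers)
    for i := 0; i < maxWorkers; i++ {
        pop <- struct{}{} // repeated V(pop) to initialize the semaphore
    }
    for {
        <-pop // P(pop): block if too many workers already exist
        req := <-incoming // await request from client
        go func() {
            subReply := make(chan string)
            subServer <- request{payload: req.payload, replyTo: subReply}
            req.replyTo <- "done: " + <-subReply
            pop <- struct{}{} // V(pop): release the slot on exit
        }()
    }
}

func main() {
    incoming := make(chan request)
    subServer := make(chan request)
    go func() { // mock subsidiary server
        for m := range subServer {
            m.replyTo <- m.payload
        }
    }()
    go serve(incoming, subServer)

    replyTo := make(chan string)
    incoming <- request{payload: "read", replyTo: replyTo}
    fmt.Println(<-replyTo)
}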
On systems where process creation is expensive, a multithreaded server is impractical. An alternative is to introduce multiple queues. This was done on DEMOS, and the following simple server structure results:
repeat
await message in main mailbox
(assert it is a request from a client)
process step 1
create new mailbox
send request+newbox to sub server
await reply in new mailbox
destroy new mailbox
process step 2
send reply to client
forever
This code is almost identical to the naive (and incorrect) code given at the beginning of this section; the big change is the use of a new (short-lived) mailbox for each transaction with a subsidiary server.
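In Go this corresponds to creating a fresh channel per transaction; the channel is the short-lived mailbox, and letting it be garbage collected stands in for the explicit destroy step:

package main

import "fmt"

type request struct {
    payload string
    replyTo chan string
}

func serve(incoming chan request, subServer chan request) {
    for req := range incoming { // await message in main mailbox
        // process step 1, then create the new mailbox
        newBox := make(chan string)
        // send request+newbox to the subsidiary server
        subServer <- request{payload: req.payload, replyTo: newBox}
        // await reply in the new mailbox; client requests keep
        // arriving on incoming, so no confusion is possible
        rep := <-newBox
        // process step 2, then send the reply to the client
        // (dropping newBox here stands in for destroying the mailbox)
        req.replyTo <- "done: " + rep
    }
}

func main() {
    incoming := make(chan request)
    subServer := make(chan request)
    go func() { // mock subsidiary server
        for m := range subServer {
            m.replyTo <- m.payload
        }
    }()
    go serve(incoming, subServer)

    replyTo := make(chan string)
    incoming <- request{payload: "read", replyTo: replyTo}
    fmt.Println(<-replyTo)
}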
The above code does not exploit the opportunities for parallelism that were exploited by the event processing code or by the multithreaded code, but this can be done if there is a "multiwait" operation, as on DEMOS. In this case, the control structure is that of the event processing code:
boxset = { main mailbox }
repeat
await message in boxset
case box id of
main mailbox:
create new mailbox B
boxset := boxset + { B }
client[B] := client
process step 1 relative to B
send request+B to sub server
other:
B := box id
process step 2 relative to B
send reply to client[B]
destroy B
end case
forever
The advantage of this scheme over the pure event processing scheme is that the simple code can be used for simple servers, and the code need only be rewritten as above if those servers begin to limit system performance.
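A multiwait over a set of mailboxes that grows and shrinks at run time can be sketched in Go with reflect.Select, which waits on a dynamic slice of channels; the pairing of each mailbox with its client through parallel slices is an assumption of the sketch:

package main

import (
    "fmt"
    "reflect"
)

type request struct {
    payload string
    replyTo chan string
}

// serve waits on a dynamic set of mailboxes with reflect.Select,
// which plays the role of the DEMOS multiwait operation.
func serve(incoming chan request, subServer chan request) {
    boxset := []reflect.SelectCase{{
        Dir:  reflect.SelectRecv,
        Chan: reflect.ValueOf(incoming), // boxset = { main mailbox }
    }}
    clients := []chan string{nil} // clients[i] pairs with boxset[i]
    for {
        i, v, _ := reflect.Select(boxset) // await message in boxset
        if i == 0 { // main mailbox: a client request
            req := v.Interface().(request)
            b := make(chan string) // create new mailbox B
            boxset = append(boxset, reflect.SelectCase{
                Dir:  reflect.SelectRecv,
                Chan: reflect.ValueOf(b),
            })
            clients = append(clients, req.replyTo) // client[B] := client
            // process step 1, then pass B as the reply address
            subServer <- request{payload: req.payload, replyTo: b}
        } else { // other: a sub-server reply on mailbox B = boxset[i]
            // process step 2, then reply to client[B]
            clients[i] <- "done: " + v.Interface().(string)
            // removing entry i destroys mailbox B
            boxset = append(boxset[:i], boxset[i+1:]...)
            clients = append(clients[:i], clients[i+1:]...)
        }
    }
}

func main() {
    incoming := make(chan request)
    subServer := make(chan request)
    go func() { // mock subsidiary server
        for m := range subServer {
            m.replyTo <- m.payload
        }
    }()
    go serve(incoming, subServer)

    replyTo := make(chan string)
    incoming <- request{payload: "read", replyTo: replyTo}
    fmt.Println(<-replyTo)
}

Here reflect.Select is the price of a mailbox set whose size is unknown at compile time; with a fixed set of mailboxes, an ordinary select statement would do.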