22C:116, Lecture 28, Spring 2002

Douglas W. Jones
University of Iowa Department of Computer Science

  1. Clients, Servers and Remote Procedures

    When a client calls on a server to perform some operation, it is very tempting to make the analogy with a procedure call. Conceptually, the server offers the client one or more procedures; when the client calls one of these procedures, the server performs the desired action.

    In fact, this is a form of data abstraction. In languages like C++ or Ada, the representation of an object (in Ada, a package) may be hidden from users of that object. If a user wishes to operate on the object, the user must call on one of the procedures that provide the public interface to the object (in object oriented languages such as C++, these are usually called methods, and to add to the confusion, calling a method is sometimes described as sending a message to the object).

    If an object is implemented locally, calls to the procedures making up the public interface to the object may be made using simple procedure call instructions. If an object is implemented on a remote machine, it is desirable to view the operations on the object as being performed by procedure calls to that remote machine. Obviously, such calls must be implemented using messages passed over the network that interconnects the machines in question! Nonetheless, we can go a long way towards hiding this from users of the procedures in question.

    Consider the following view of the system, as the user prefers to see it:

    User                    Remote System
                 |     |
      :          |     |
      :          |     |      Procedure P(in A; out B)
    Call P(X,Y)  |     |        :
      :          |     |      Return
      :          |     |
                 | Net |
                 |     |
      :          |     |    Object X
      :          |     |      Method P(in A)
    X.P(Y)       |     |        :
      :          |     |      Return
      :          |     |
    
    Users typically want to ignore the presence of the network and just write code that calls P. Initially, we will ignore the possibility that P is a method of an object and focus on the issue of control transfer for remote procedure calls; keeping in mind that calls to methods of objects are usually implemented as procedure calls, it is not difficult to generalize this to calling methods of remote objects.

    Our goal is to allow the author of the procedure P on the remote machine and the author of the code that calls P on the local machine to ignore the presence of the network, as much as is possible, and write their code as if P and its caller ran on the same machine.

    Assuming that the parameter X is being sent to P, and that the parameter Y is to be changed by P, clearly there must be at least two messages sent over the network. The first message, from the user to the remote system, carries the parameter X and indicates a call to P. The second message, from the remote system back to the user, carries the result of the call to P, including the new value of Y.

    These messages constitute the core of a remote procedure call communication protocol.

  2. Hiding the Network from the Users

    It is common to package the remote procedure call protocol with two pieces of stub code, a client stub that hides the network from the user, and the server stub that hides the network from the remote procedure. Thus, instead of calling a remote procedure P, the user calls a local procedure P that acts as the local agent for the remote procedure. This is the client stub. At the server end, the server stub is a main program that substitutes for the remote code and calls the real procedure P in response to messages from the client stub. Thus, the following picture emerges:

    User's Machine                    Server Machine
    
    User         Stub          |   |  Stub          Server
                               |   |
      :          Proc P(X,Y)   |   |  Repeat        Proc P(X,Y)
      :        /   Send(X) ----|->-|--- Receive(X) /  :
    Call P(X,Y)    and         |Net|    Call P(X,Y)   code
      :        \   Receive(Y) -|-<-|--- Send(Y)    \  :
      :          Return        |   |  Forever       Return
    
    Note that the minimal client stub sends one message to the server stub and awaits one reply message before returning control to the user's code. The minimal server stub repeatedly awaits service requests from various clients, calls the actual code on the client's behalf, and sends the results.

    Here, we have completely ignored how the messages to and from the server are addressed; in a real network, of course, we must take care to address the messages properly, for example, including a return address in the message with X so that the server can send Y back to the right calling process.

    The network protocols typically used to connect client and server are usually at or above the session or transport layers! Thus, messages are addressed not to a particular machine, but to a particular process, or perhaps an incoming message port of a particular process, as on DEMOS. In fact, the protocol used by a DEMOS client to communicate with a DEMOS server is an excellent example of a very simple client server protocol.

    At the opposite extreme, the commonly used protocol to communicate between world wide web clients and servers (called HTTP, the Hypertext Transfer Protocol) involves opening a session from client to server, where the session is implemented using the Terminal Connection Protocol, part of the Internet's TCP/IP protocol suite. Once the bidirectional communication channel that implements the session is established, the client sends the outgoing parameters (the name of a web page) to the server, and then awaits the results (the text of the web page) on the same session's communications link.

    Exercise: Write code using the DEMOS message passing primitives (see Lecture 27) for the client and server stubs for a remote procedure call. Assume that the client has an appropriate link to the server, and make sure that the client creates a new link for each remote procedure call. One feature of DEMOS was that links had a special access right, multiple use; if this right was not present, the link would self-destruct after a single use. Your solution should make effective use of this behavior to prevent servers from reusing old links to clients after a remote procedure call has been completed. Discuss how your answer must be extended if the parameter list of the remote procedure includes multiple links that must be passed in or out.

    Exercise: What parameter passing modes make sense for parameters to remote procedures? Call by reference? Call by value? Call by value/result? Call by name? If you allow thunks to be implemented by remote procedures, does this change your answer?

  3. Lost Messages

    In a centralized system, most failures cause the system to stop working correctly. Such a system is called a fail-hard system, or in the extreme case, where all failures cause a system to stop entirely, a fail-stop system. In centralized systems, failures that cause only occasional mild difficulty are so rare that when they do occur, they become the subject of folklore. The infamous floating point error in the first generation of Intel's Pentium chips is a good example. In contrast, with distributed systems and networks, partial failures are very common and must be dealt with routinely. What we want are fail-soft or fault tolerant systems, where failures in one component have only limited effects on the entire system.

    The biggest problem distributed systems may encounter is the loss or corruption of messages. We can deal with corruption by adding a checksum to each message, and if the message is delivered with an incorrect checksum, we can treat it as if it had been lost. Therefore, we will focus on lost messages.

    In the basic remote procedure call protocol, there are two messages, an outgoing message indicating a call to a remote procedure and carrying the parameters, and a reply message indicating a return from a remote procedure and carrying the results. Either of these messages may be lost, and the first question this raises is how to detect message loss. Perhaps the only realistic alternative for this purpose is to start a timer when the call message is transmitted and then wait a reasonable length of time. What is a reasonable delay clearly depends on the application! If the delay expires before a reply is received, the message is considered to be lost.

    This use of timers to detect failures is technically an example of fault-tolerant real-time programming. This technique is quite general and is applicable to fault detection in a wide variety of centralized as well as distributed systems, but it only applies where the time taken by a computation can be bounded.
    The second question that arises is what to do if a message is lost. Only the client stub knows that a loss occurred, and there is only one reasonable action to take on loss detection, retransmission of the request message. (If there are multiple equivalent servers, under some circumstances, it may be reasonable to address a different server on the retry.) This results in the following basic outline for the client stub:
    Procedure P(in X; out Y) -- Client Stub
      RA = return network address
      Repeat
        Send (X,RA) to RPC server for P
        Await either reply(Y) or timeout
      Until not timeout
    Return
    
    If the lost message was the request, this will correctly send a duplicate request to the server, and if this gets through, the server will correctly perform the service and send a reply.

    If the lost message was a reply, though, the second request will cause the server to repeat the service. For a read-only service, this will not cause problems, but for some read-write services, repeating the service is a mistake. The operation of writing a particular sector of a paricular file can be repeated without problems, but the operation of appending a buffer to a file, if repeated, causes the file to be corrupted. Operations which are safe to repeat are called idempotent operations because the sequence PPP has the same effect as P alone. (Recall that the mathematical function f is idempotent if f(x) and f(f(x)) are identical for all x).

    A more complex protocol is required for non idempotent operations! We must provide the server with sufficient information that it can determine whether a request is a new one or a duplicate of a request previously acted on. To assure this, the request must include a new field, for example, a unique request identifier. The client stub using this will typically be something like the following:

    Procedure P(in X; out Y) -- Client Stub
      RA = return network address
      Compute a unique request ID
      Repeat
        Send (X,RA,ID) to RPC server
        Repeat
          Await reply (Y,ID') or timeout
        Until timeout or (ID = ID')
      Until not timeout
      Assert that ID = ID'
    Return
    
    New complexity was added here because of the possiblity that the reply received was a duplicate reply to a previous request. Duplicate replies will be received when no messages are lost, but network delays cause the timer to expire before the first reply is received. In this case, after the client transmits its second request, it will receive the reply to its first request. At this point, the client stub will return to its caller, and the second reply may be received by the next RPC call.

    The duplicate must be discarded, so we must be able to detect duplication. If we include the request ID in each reply, we can do this by comparing the ID from the reply with the ID we expect, discarding any replies that contain the wrong ID.

    In order to guarantee that no request is processed more than once, the server must maintain a cache of transaction IDs and the replies sent in response to them. If a new request is received with an ID that matches a cached ID, the corresponding reply is retransmitted. If a request is received with a new ID, the associated action is taken and, in addition to returning the reply, a copy is cached.

    Repeat -- Server Stub for P
      Receive(X,RA,ID) from some RPC client
      if there is a server cache entry (Y,ID)
        -- this is a duplicate request (the client had a timeout)
        Send reply (Y,ID) to RA
      else
        -- this is the first occurance of this request
        Call P(X,Y)
        Put (Y,ID) in the server cache
        Send reply (Y,ID) to RA
      endif
    Forever
    

    How do we limit the memory requirements of the server's cache? With this protocol, it is never necessary to cache more than one reply on behalf of any particular remote client process, so if the transaction ID includes a process ID field, this allows the cache to be cleaned. In a large distributed system, such as the World-Wide-Web on the Internet, this is still insufficient.

    The solution is to find the upper bound on the total time a clients will wait for a reply. If client i never makes more than Ni retries, giving up and declaring the server to be unreachable after that number of attempts, and if the interval between retries is Ti, what you want is max[over all i](NiTi). The server never needs to cache any replies for longer than this interval! How long should clients wait before they give up? This may depend on the application, but it should not be longer than the expected worst case time to repair the server in the event of a hard failure in the server's machine. Therefore, the server can always retire cache entries after the expected worst case repair time. Conveniently, for critical applications, this repair time is usually much shorter than it is for noncritical applications, where sloppy error recovery is typically acceptable.

    Exercise: Write the fault tolerant server stubs that go with each of the client stubs presented above.

    Exercise: How can the final server-side stub be simplified if the function P is idempotent.

    Exercise: Assume an object O with methods P, Q and R. How would you generalize the final server-side logic to handle this?

  4. Parameters

    The above discussion focused on remote procedure calls with relatively simple parameters. If the in and out parameters are each just a few integers, it is perfectly reasonable to make remote procedure calls where the messages used for control flow are identical to the messages used for parameter transfer.

    On the other hand, what if the parameters are huge arrays?

    In C, the usual way to pass an array is to pass a pointer to the array, so that no copy is made. In C++, passing an object as a parameter does not involve copying the object, only the handle is copied, and the handle is a pointer.

    In our discussion above, the procedure P had parameters passed by value (in parameters) and by result (out parameters). It is easy to generalize to parameters passed by value-result (in-out parameters), but it is not easy to pass pointers across the network! The problem is simple. If the object the pointer refers to is in the memory of one machine and the called procedure is running elsewhere on the network, the pointer has no obvious use to the called procedure.

    There are two options, we could pass something more complex than a pointer, or we could pass a copy of the data structure into which the pointer refers, and if the called function makes any changes to that structure, pass it back.

    The classic solution to the problem of passing pointers across the network is to pass an RPC thunk. A thunk, in general, is a function passed as a parameter for the purpose of allowing access to some object through function calls. If, on a single machine, we would pass a pointer to an object, on a distributed system, we can pass an RPC thunk that supports two operations, load and store.

    Thus, when the called remote procedure wishes to follow a pointer that was passed to it, it makes an RPC to the thunk that was used to pass that parameter, and the thunk takes care of the rest. As a result, the RPC client machine -- the machine running the calling code, must also serve as an RPC server, running the process that serves calls to the thunk.

    What if we wish to simply pass a huge parameter to the RPC? In that case, the problem is where do we put the parameter while the message is awaiting receipt by the RPC server. If we demand that the server hold the parameter in its queues while it is serving some other client, this would allow large numbers of clients to shut down the server by filling its memory with parameters.

    The solution is a more complex RPC protocol, with the following messages:

    1. from client: Request RPC entry
    2. from server: Acknowledge entry request, please send parameters
    3. from client: Pass the parameters
    4. from server: Here are the results
    This allows the client to hold the parameters until the RPC server is ready for them. There are other modifications. As given, the protocols requrie the client to assume failure if the server does not respond in a timely way. We can add an "are you done yet with ID" message from client to server, and replies "no, I'm still working on ID" and "what, I've never heard of ID". On timeout, the client sends the former instead of assuming the worst. Only if it gets no reply or if the server claims ignorance does the client redo the request.