When a client calls on a server to perform some operation, it is very tempting to make the analogy with a procedure call. Conceptually, the server offers the client one or more procedures; when the client calls one of these procedures, the server performs the desired action.
In fact, this is a form of data abstraction. In languages like C++ or Ada, the representation of an object (in Ada, a package) may be hidden from users of that object. If a user wishes to operate on the object, the user must call on one of the procedures that provide the public interface to the object (in C++, these are called methods).
If an object is implemented locally, calls to the procedures making up the public interface to the object may be made using simple procedure calls. If an object is implemented on a remote machine, it is desirable to view the operations on the object as being performed by procedure calls to that remote machine. Obviously, such calls must be implemented using messages passed over the network that interconnects the machines in question! Nonetheless, we can go a long way towards hiding this from users of the procedures in question.
Consider the following view of the system, as the user prefers to see it:
       User                 |   |    Remote System
                            |   |
         :                  |   |
         :                  |Net|    Procedure P(in A; out B)
       Call P(X,Y) -------->|   |-------->  :
         :                  |   |           :
         : <----------------|   |<------- Return
         :                  |   |

Users typically want to ignore the presence of the network and just write code that calls P. Furthermore, the writer of the procedure P on the remote machine is equally likely to want to ignore the network and just write P as if it was called locally.
Assuming that the parameter X is being sent to P, and that the parameter Y is to be changed by P, clearly there must be at least two messages sent over the network. The first message, from the user to the remote system, carries the parameter X and indicates a call to P. The second message, from the remote system back to the user, carries the result of the call to P.
These messages constitute the core of a remote procedure call communication protocol.
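The two messages can be sketched concretely; here is a minimal Python encoding, assuming a JSON wire format. The field names (proc, args, result) are illustrative assumptions of this sketch, not any standard encoding.

```python
import json

# A minimal sketch of the two messages of the basic RPC protocol, using
# an assumed JSON wire format with illustrative field names.

def encode_call(proc, x):
    # Message 1: user -> remote system; carries the parameter X
    # and indicates a call to procedure P.
    return json.dumps({"proc": proc, "args": x}).encode()

def decode_call(raw):
    msg = json.loads(raw.decode())
    return msg["proc"], msg["args"]

def encode_reply(y):
    # Message 2: remote system -> user; carries the result Y.
    return json.dumps({"result": y}).encode()

def decode_reply(raw):
    return json.loads(raw.decode())["result"]
```

In a real protocol these bytes would travel over the network; here they simply round-trip through the encoder and decoder.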
It is common to package the remote procedure call protocol with two pieces of stub code, a client stub that hides the network from the user, and the server stub that hides the network from the remote procedure. Thus, instead of calling a remote procedure P, the user calls a local procedure P that acts as the local agent for the remote procedure. This is called the client stub. At the server end, the server stub is a main program that substitutes for the remote code and calls the real procedure P in response to messages from the client stub. Thus, the following picture emerges:
          User's Machine                       Server Machine

   User            Stub          |   |     Stub              Server

    :          Proc P(X,Y)       |   |    Repeat           Proc P(X,Y)
    :         /  Send(X)         |Net|      Receive(X)   \     :
   Call P(X,Y)   and             |   |      Call P(X,Y)     and code
    :         \  Receive(Y)      |   |      Send(Y)      /     :
    :            Return          |   |    Forever            Return

Note that the minimal client stub sends one message to the server stub and awaits one reply message before returning control to the user's code. The minimal server stub repeatedly awaits service requests from various clients, calls the actual code on the client's behalf, and sends the results.
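The stub structure can be sketched in runnable form. This sketch models the network as two in-process queues, and the doubling procedure stands in for the real remote P; a real implementation would use sockets, but the division of labor between user, client stub, server stub, and server code is the same.

```python
import queue
import threading

# Sketch of the client-stub / server-stub structure, assuming an
# in-process "network" modeled by two queues (real code would use
# sockets); real_p stands in for the actual remote procedure P.

net_to_server = queue.Queue()   # carries call messages (X)
net_to_client = queue.Queue()   # carries reply messages (Y)

def real_p(x):
    # The actual procedure P, as written on the server machine.
    return 2 * x

def server_stub():
    # Repeat: Receive(X); Call P(X,Y); Send(Y); Forever.
    while True:
        x = net_to_server.get()
        net_to_client.put(real_p(x))

def p(x):
    # Client stub: the user's local agent for the remote P.
    net_to_server.put(x)          # Send(X)
    return net_to_client.get()    # Receive(Y)

threading.Thread(target=server_stub, daemon=True).start()
```

The user simply calls p(x) and never sees the queues, just as the text describes.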
The network protocols typically used to connect client and server are at or above the session layer! Thus, messages are addressed not to a particular machine, but to a particular process, or perhaps an incoming message port of a particular process, as on DEMOS. In fact, the protocol used by a DEMOS client to communicate with a DEMOS server is an excellent example of a very simple client server protocol.
At the opposite extreme, the protocol commonly used to communicate between World Wide Web clients and servers (called HTTP, the Hypertext Transfer Protocol) involves opening a session from client to server, where the session is implemented using the Transmission Control Protocol (TCP), part of the Internet's TCP/IP protocol suite. Once the bidirectional communication channel that implements the session is established, the client sends the outgoing parameters (the name of a web page) to the server, and then awaits the results (the text of the web page).
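The HTTP exchange described above can be written out as the raw protocol text; the host name and page name here are illustrative. The request carries the outgoing parameter (the page name) and the reply carries the result (the page text), just like the two messages of the basic RPC protocol.

```python
# The HTTP request/reply pair as raw protocol text; the host and the
# page name are illustrative, and HTTP/1.0 is used for simplicity.

request = (      # client -> server: the outgoing parameter (page name)
    "GET /index.html HTTP/1.0\r\n"
    "Host: www.example.com\r\n"
    "\r\n"
)

reply = (        # server -> client: the result (the page text)
    "HTTP/1.0 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html>...</html>"
)
```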
In a centralized system, most failures cause the entire system to stop working. Such a system is called a fail-hard system or, in the extreme case where all failures cause the system to stop, a fail-stop system. When a rare failure causes only some parts of the system to stop, it is sufficiently uncommon to become the subject of folklore; the infamous Intel Pentium floating point division error is a good example. In contrast, in distributed systems and networks, partial failures are common and must be dealt with. Such systems are fail-soft.
The biggest problem an RPC protocol may encounter is lost messages. In the basic protocol, there are two messages, an outgoing message indicating a call to a remote procedure, and a reply message indicating a return from a remote procedure.
Either of these messages may be lost, and the first question this raises is how to detect message loss. Perhaps the only realistic alternative for this purpose is to start a timer when the call message is transmitted and then wait a reasonable length of time. What is a reasonable delay clearly depends on the application! If the delay expires before a reply is received, the message is considered to be lost.
The second question that arises is what to do if a message is lost. Only the client stub knows a loss occurred, and there is little choice. It must retransmit the request. (If there are multiple equivalent servers, it may address a different server on the retry.) This results in the following basic outline for the client stub:
   Procedure P(in X; out Y)  -- Client Stub
      RA = return network address
      Repeat
         Send (X,RA) to RPC server
         Await reply(Y) or timeout
      Until not timeout
      Return

If the lost message was the request, this will correctly send a duplicate request to the server, and if this gets through, the server will correctly perform the service and send a reply.
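The retrying client stub can be sketched in runnable form; here the "network" is modeled by a reply queue, and the timeout and retry-count values are illustrative parameters, not values from the text.

```python
import queue

# Sketch of the retrying client stub, assuming a lossy network modeled
# by a reply queue; TIMEOUT and MAX_RETRIES are illustrative values.

TIMEOUT = 0.01     # seconds to wait for a reply before retransmitting
MAX_RETRIES = 5

def call_with_retry(send, reply_queue, x):
    for _ in range(MAX_RETRIES):
        send(x)                                     # Send (X,RA) to server
        try:
            return reply_queue.get(timeout=TIMEOUT) # Await reply(Y)
        except queue.Empty:
            continue                                # timeout: retransmit
    raise TimeoutError("no reply after %d tries" % MAX_RETRIES)
```

A bounded retry count is an addition over the pseudocode's unbounded loop: eventually the stub must give up and report failure to its caller.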
If the lost message was a reply, though, the second request will cause the server to repeat the service. For a read-only service, this causes no problems, but for some read-write services, repeating the service is a mistake. The operation of writing a particular sector of a particular file can be repeated without problems, but the operation of appending a buffer to a file, if repeated, causes the file to be corrupted. Operations which are safe to repeat are called idempotent operations because the sequence PPP has the same effect as P alone.
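The contrast between the two file operations can be made concrete; here a file is modeled as a Python list of sectors, which is an assumption of this sketch.

```python
# Idempotent vs. non-idempotent operations, with a file modeled as a
# Python list of sectors (an assumption of this sketch).

def write_sector(f, sector, data):
    # Writing a given sector: repeating it leaves the same final state.
    f[sector] = data

def append_buffer(f, data):
    # Appending a buffer: repeating it grows (corrupts) the file.
    f.append(data)

file1 = ["old", "old"]
write_sector(file1, 1, "new")
write_sector(file1, 1, "new")   # duplicate request: no harm done

file2 = ["old"]
append_buffer(file2, "new")
append_buffer(file2, "new")     # duplicate request: extra copy appended
```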
A more complex protocol is required for non-idempotent operations! The key is that the server must have sufficient information to determine whether a request is a new one or a duplicate of a request previously acted on. To assure this, the request must include a new field, for example, a unique transaction identifier. A client stub using this scheme will typically look something like the following:
   Procedure P(in X; out Y)  -- Client Stub
      RA = return network address
      Compute transaction ID
      Repeat
         Send (X,RA,ID) to RPC server
         Repeat
            Await reply(Y,ID') or timeout
         Until timeout or (ID = ID')
      Until not timeout
      Assert that ID = ID'
      Return

New complexity was added here because of the possibility that the reply received was a duplicate reply to a previous request. Such a duplicate reply will carry the wrong transaction ID, which allows it to be ignored. A duplicate reply can be created, for example, by a reply that was sufficiently delayed in transit that a duplicate request was sent.
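This transaction-ID stub can also be sketched in runnable form. The global counter standing in for a unique ID generator, and the timeout and retry values, are assumptions of this sketch; real systems need IDs that stay unique across client restarts.

```python
import itertools
import queue

# Sketch of the transaction-ID client stub; the counter standing in
# for a unique ID generator, and the timeout/retry values, are
# assumptions of this sketch.

_next_id = itertools.count(1)

def call_with_id(send, reply_queue, x, timeout=0.01, retries=5):
    ident = next(_next_id)                 # Compute transaction ID
    for _ in range(retries):
        send(x, ident)                     # Send (X,RA,ID) to RPC server
        try:
            while True:                    # inner Repeat
                y, ident2 = reply_queue.get(timeout=timeout)
                if ident2 == ident:        # Until ID = ID'
                    return y
                # wrong ID: a duplicate reply to an old request; ignore it
        except queue.Empty:
            continue                       # outer Repeat: retransmit
    raise TimeoutError("no matching reply")
```

Note how the inner loop silently discards stale replies instead of treating them as answers to the current call.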
The server must maintain a cache of transaction IDs and the replies sent in response to them. If a new request is received with an ID that matches a cached ID, the corresponding reply is retransmitted. If a request is received with a new ID, the associated action is taken and, in addition to returning the reply, a copy is cached.
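The server side of this scheme can be sketched as a cache keyed by transaction ID; the in-memory dict here is illustrative, and cache retirement is omitted for brevity.

```python
# Sketch of the server's reply cache for duplicate suppression; an
# in-memory dict keyed by transaction ID is an illustrative choice,
# and cache retirement is omitted.

class ReplyCache:
    def __init__(self, action):
        self.action = action   # the real (possibly non-idempotent) P
        self.cache = {}        # transaction ID -> reply already sent

    def handle(self, ident, x):
        if ident in self.cache:
            return self.cache[ident]   # duplicate: retransmit cached reply
        y = self.action(x)             # new ID: take the associated action
        self.cache[ident] = y          # ...and cache a copy of the reply
        return y
```

With this in place, a duplicate request replays the old reply rather than repeating the (possibly non-idempotent) action.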
This raises the problem of how to limit the memory requirements of the server's cache. With this protocol, it is never necessary to cache more than one reply on behalf of any particular remote client process, so if the transaction ID includes a process ID field, this allows the cache to be cleaned. In a large distributed system, such as the World Wide Web on the Internet, this is still insufficient.
The solution is to find an upper bound on the total time any client will wait for a reply. If client i never makes more than Ni retries, and if the interval between retries is Ti, the bound is max[i](Ni * Ti), the maximum over all clients i of the product Ni * Ti. The server never needs to cache any replies for longer than this interval! How long should clients wait before they give up? This may depend on the application, but it should not be longer than the expected worst-case time to repair the server in the event of a hard failure on the server's machine. Therefore, the server can always retire cache entries after the expected worst-case repair time. Conveniently, for critical applications, this repair time is usually much shorter than it is for noncritical applications, where sloppy error recovery is typically acceptable.
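The bound max[i](Ni * Ti) is simple arithmetic; here is a worked example with made-up per-client retry counts Ni and retry intervals Ti.

```python
# Worked example of the cache-retirement bound max[i](Ni * Ti), using
# made-up retry counts Ni and retry intervals Ti (in seconds).

clients = {
    "a": (5, 0.5),    # Ni = 5 retries, Ti = 0.5 s between retries
    "b": (3, 2.0),
    "c": (10, 0.1),
}

bound = max(n * t for (n, t) in clients.values())
# Client "b" dominates: 3 * 2.0 = 6.0 s. The server may safely retire
# any cached reply older than this bound.
```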