22C:116, Lecture 37, Fall 1999

Douglas W. Jones
University of Iowa Department of Computer Science

  1. Amoeba Capabilities

    Tannenbaum's Amoeba system supports capabilities formatted as follows:

              | server id | resource id | rights |  check   |
              |    48     |      24     |    8   |   48     |
    Amoeba uses server-side authentication; thus, the kernel is only concerned with the server ID field; this is used to send messages to the appropriate server. The interpretation of the other fields is entirely up to the server!

    Amoeba provides a standard set of interpretations for these fields, but it is not enforced. In the standard interpretation, these fields are used as follows:

    The resource ID field is used by the server to identify which of many objects it is being asked to operate on. Amoeba assumes that each server process will serve many different objects. This, in turn, implies large servers, which implies an assumption that process creation will be a heavyweight operation and should be done infrequently.

    (Contrast this design with a hypothetical system where process creation is very inexpensive, and therefore, it would be reasonable to represent objects by processes, so the server-id field would identify the object itself and the resource-id field could therefore be either very small or entirely absent.)

    The rights field confers access rights to the object that the server has extended to the client. Each bit is assumed to confer one right; if the rights field is all ones, the user has all rights. If the object is a file, for example, the rights might refer to such operations as read and write.

    The check field is used by the server to assure that the capability is valid and that the rights field has not been corrupted. For each object allocated by a server, it will typically assign a random check number that must match the check number provided with a capability by the client. This prevents the accidental forgery of capabilities.

    The problem of preventing clients from asserting rights they do not have is solved by an ingeneous cryptographic trick. The check field given to the client is not an exact copy of the check number held by the server; instead if C is the server's check number for the object, and the user's rights are R, the user's check field must be f(C xor R). When the user presents a capability to the server, it may easily compute f(C xor R) and compare it with the check field. The standard function f is a trapdoor function, that is, a function which is very easy to compute in the forward direction but with a very difficult to compute inverse.

    (For an ideal trapdoor function operating on an n-bit argument, the time to compute f would be O(nk), while the time-space product for computing f-1 would be O(2n). (Note that an add can be done in O(log n) time using O(n log n) gates in parallel hardware, or it can be done in O(n) time using O(n) or O(1) gates. Note multiplication can also be done in O(log n) time using O(n2 log n) gates. In contrast, table-lookup evaluation for an n-bit function takes takes a table of size O(2n) and requires at least O(log n) time for address decoding.)

    The use of a trapdoor function to verify capabilities means that only the server may restrict the rights on a capability. This, in turn, means that every standard Amoeba server includes a "restrict" operation that takes a new rights mask and a capability as parameters, and returns a restricted version of that capability. To avoid the expense of this operation on capabilities with a rights mask of all ones, the server's check number C is included literally in such capabilities and the one-way function f is not used. Because there is a standard one-way function available to all users, a user of a capability with a rights mask of all ones may restrict it locally. This introduces no security loophole precisely because that user already had all rights to the object, and so, that user gained nothing by being given the "secret key" used by the server to manipulate that object.

  2. Amoeba Message Delivery

    Amoeba allows resource migration, and it allows users to hold capabilities anywhere in their address space. As a result of the latter, the DEMOS approach to migration cannot be used because the kernel cannot rewrite a user's capabilities for an object when that object moves. Therefore, the server ID field in a capability cannot be relied on to include the network address of the server! As a result, the kernel on every Amoeba machine must support a translation table between logical server addresses and current addresses.

    No kernel is expected to have complete knowledge of all resource locations; instead, each kernel holds only a subset of this in its local address translation cache; this subset consists of the most recently used network addresses. If a local client sends a message to a remote server and the server is listed in the cache, the message can be sent directly to the server. If there is a cache miss, additional network traffic will be required to locate the server.

    The local cache can be, for example, some kind of hash table, perhaps with some approximation of LRU replacement to assure that old entries are deleted when they are no-longer needed.

    It is important to note that, in a distributed system, the system itself may stay up and running for far longer than any individual machine supporting it. Furthermore, we desire distributed systems that scale well. These two constraints pose significant problems for message delivery systems.

    One naive approach to handling the problem of network address translation cache misses is to have the kernel that can't find a process's address broadcast a message: "Does anyone have server X". On receiving such a message, each kernel must scan the list of servers that have registered locally to see if any have registered as X. This scheme is perfectly suited to a small system, but it scales very poorly -- eventually, as the system grows, you would expect the broadcast traffic from address translation cache misses to saturate the interconnection network bandwidth and computing power of the entire system.

    Alternately, a name server could be installed. Whenever a server begins to offer some service, it registers that service with the name server; whenever a kernel has a network address translation cache miss, it asks the name server to resolve the problem. This solution also scales poorly. As the system ages, the storage capacity of the name server is consumed, and the rate of resource consumption grows as the system grows. Furthermore, as the system grows, it is easy to imagine the computational demand on the name server becoming a bottleneck in the entire system's performance.

    The Amoeba system solves this in a two-fold way. First, name servers are potentially arranged in a hierarchy, with local, regional and perhaps global servers. Secondly, name servers themselves are only extensions to the address translation cache, and a broadcast protocol is available if no name server knows of the server being addressed.

  3. Amoeba Threads

    Just by inspecting the Amoeba capability representation, it is easy to see that Amoeba servers are intended to be heavyweight entities. The capability format allows each server to implement as many as 16 million distinct objects, and this requires a table of as many as 16 million check codes, one per object. Thus, each Amoeba server process is expected to manage a huge numbers of objects on behalf of its clients. Huge servers imply heavyweight processes.

    Amoeba is based on a blocking model of interprocess communication using remote procedure calls. Thus, the primary services offered to user processes are get_request -- await an RPC from a client, put_reply -- send a reply to a client, and trans -- send an RPC request and await a reply. Both get_request and trans are blocking primitives.

    The use of blocking primitives limits the opportunities for parallelism in Amoeba; they make it difficult for a program to perform other useful computing while awaiting a response, whether the program is a client or server. To solve this problem, Amoeba supports multiple threads per process.

    Amoeba threads are implemented by the kernel. An Amoeba process is a heavyweight entity with a list of segment descriptors; an Amoeba thread is an attribute of a process with a stack pointer, program counter and register set. The threads of a program may interact using semaphores, binary semaphores (mutexes), and asynchronous interrupts.

    If an Amoeba process needs to perform a nonblocking operation, it must spawn a thread to perform that operation. Threads are sufficiently lightweight that this can be done at reasonable cost.

  4. Amoeba Migration

    The Amoeba kernel does not initiate migration, but servers may migrate. To allow this, Amoeba allows a process to take over the network address previously served by some other process. The right to use this service is protected by a one-way hash function -- that is, the public name of a service is derived from its private name by such a function, so knowledge of the public name does not disclose the private name of the service, and the private name is the only one that can be used to take over the service.

    For a service to migrate from one server to another, the servers must agree on a migration protocol. Typically, the old server must cease serving client requests, then transfer its internal state to the new server, then let the new server take over the service and begin serving requests.

    The Amoeba group communications primitives allow a group of processes to keep their internal states synchronized in order to offer a reliable service, with one process actually serving user requests while the others monitor it and prepare to elect a successor in the event that the primary server fails.