22C:116, Lecture Notes, Apr. 28, 1995

Douglas W. Jones
University of Iowa Department of Computer Science

  1. Amoeba Capabilities

    Tannenbaum's Amoeba system supports capabilities formatted as follows:

              | server id | resource id | rights |  check   |
              |    48     |      24     |    8   |   48     |
              |___________|_____________|________|__________|
              |___________|_____________|________|__________|
    
    The kernel is only concerned with the server ID field; this is used to send messages to the appropriate server. The interpretation of the other fields is entirely up to the server!

    Amoeba provides a standard set of interpretations for these fields, but it is not enforced. In the standard interpretation, these fields are used as follows:

    The resource ID field is used by the server to identify which of many objects it is being asked to operate on. Amoeba assumes that each server process will serve many different objects. This, in turn, implies large servers, which implies an assumption that process creation will be a heavyweight operation.

    The rights field confers access rights to the object that the server has extended to the client. Each bit is assumed to confer one right; if the rights field is all ones, the user has all rights. If the object is a file, for example, the rights might refer to such operations as read and write.

    The check field is used by the server to assure that the capability is valid and that the rights field has not been corrupted. For each object allocated by a server, it will typically assign a random check number that must match the check number provided with a capability by the client. This prevents the accidental forgery of capabilities.

    The problem of preventing clients from asserting rights they do not have is solved by an ingeneous cryptographic trick. The check field given to the client is not an exact copy of the check number held by the server; instead if C is the server's check number for the object, and the user's rights are R, the user's check field must be f(C xor R). When the user presents a capability to the server, it may easily compute f(C xor R) and compare it with the check field. The standard function f is a trapdoor function, that is, a function which is very easy to compute in the forward direction but with a very difficult to compute inverse.

    This approach to verifying capabilities means that only the server may restrict the rights on a capability. This means that every standard Amoeba server includes a "restrict" operation that takes a new rights mask, a capability, and returns a restricted version of that capability. To avoid the expense of this operation on capabilities with a rights mask of all ones, the server's check number C is included literally in such capabilities and the one-way function f is not used. Because there is a standard one-way function available to all users, a user of a capability with a rights mask of all ones may restrict it locally.

  2. Amoeba Message Delivery

    Amoeba allows resource migration, and it allows users to hold capabilities anywhere in their address space. As such, the server ID field in a capability cannot be relied on to include the network address of the server! As a result, the kernel on every Amoeba machine must support a translation table between logical server addresses and current addresses.

    A complete table is impossible, so each Amoeba kernel keeps a local cache of recently used addresses. If a local client sends a message to a remote server and the server is listed in the cache, the message can be sent directly to the server. If there is a cache miss, additional network traffic will be required to locate the server.

    The local cache can be, for example, some kind of hash table, perhaps with some approximation of LRU replacement to assure that old entries are deleted when they are no-longer needed.

    It is important to note that, in a distributed system, the system itself may stay up and running for far longer than any individual machine supporting it. Furthermore, we desire distributed systems that scale well. These two constraints pose significant problems for message delivery systems.

    One naive approach to handling the problem of network address translation cache misses is to have the kernel that can't find a process's address broadcast a message: "Does anyone have process X". On receiving such a message, each kernel must scan the list of local processes to see if any of them are X. This scheme is perfectly suited to a small system, but it scales very poorly -- eventually, as the system grows, you would expect the broadcast traffic from address translation cache misses to saturate the interconnection network bandwidth and computing power of the entire system.

    Alternately, a name server could be installed. Whenever a particular kernel begins executing a particular process, it registers it with the name server; whenever a kernel has a network address translation cache miss, it asks the name server to resolve the problem. This solution also scales poorly. As the system ages, the storage capacity of the name server is consumed, and the rate of resource consumption grows as the system grows. Furthermore, as the system grows, it is easy to imagine the computational demand on the name server becoming a bottleneck in the entire system's performance.

    The Amoeba system solves this in a two-fold way. First, name servers are potentially arranged in a hierarchy, with local, regional and perhaps global servers. Secondly, name servers themselves are only extensions to the address translation cache, and a broadcast protocol is available if no name server knows of the process being addressed.

  3. Threads

    Just by inspecting the Amoeba capability representation, it is easy to see that Amoeba processes are intended to be heavyweight entities. The capability format allows each process to implement as many as 16 million distinct objects, and this requires a table of as many as 16 million check codes, one per object. Thus, each Amoeba server process is expected to manage a huge numbers of objects on behalf of its clients. Huge processes imply heavyweight processes.

    Amoeba is based on a blocking model of interprocess communication using remote procedure calls. Thus, the primary services offered to user processes are get_request -- await an RPC from a client, put_reply -- send a reply to a client, and trans -- send an RPC request and await a reply. Both get_request and trans are blocking primitives.

    The use of blocking primitives limits the opportunities for parallelism in Amoeba; they make it difficult for a program to perform other useful computing while awaiting a response, whether the program is a client or server. To solve this problem, Amoeba supports multiple threads per process.

    Amoeba threads are implemented by the kernel. An Amoeba process is a heavyweight entity with a list of segment descriptors; an Amoeba thread is an attribute of a process with a stack pointer, program counter and register set. The threads of a program may interact using semaphores, binary semaphores or mutexes, and asynchronous interrupts.

    If an Amoeba process needs to perform a nonblocking operation, it must spawn a thread to perform that operation. Threads are sufficiently lightweight that this can be done at reasonable cost.