Trapdoors, Cookies and Amoeba

Part of 22C:169, Computer Security Notes
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Server-Side Protection

The server-side model of protection was the first successful application of cryptography to the protection of capabilities in distributed systems. The best example of the use of this model lies in Andrew Tannenbaum's Amoeba system.

With server-side protection, the kernel no-longer tries to guarantee that capabilities are valid. It merely helps detect the most blatant violations, and since most programmer errors are blatant, this model detects most of those very quickly. The server-side model puts the final responsibility for protection on the server processes themselves!

the basic idea is simple: On receipt of a message, a process is responsible for determining, using the content of that message, that the message was sent using an intact valid capability, that is, one that has not been corrupted or forged. The tool we use for this purpose has come to be called a cookie.

The term cookie was apparently coined by the developers of the C input/output stream package. Specifically, the first known use of the term occurs in the documentation for the fseek() function, which was defined as returning a cookie, an opaque value that only had meaning when passed back to fseek() to position the same stream back to the same position. Since that time, the term cookie has come to refer to any opaque value or token given to a user that should be held by the user and passed back to its originator when the user wishes a certain function.

When a process creates a capability to be used for sending messages back to that process, it includes, in the capability, a cookie that is used to authenticate the capability. When the capability is used to send a message to the originating process, the recipient must check the cookie. Posession of the cookie itself guarantees that the sender had the right to send the message!

Servier Side Authentication for Capabilities

If we try to apply this idea for server-side authentication to a secure capability for message transmission in a distributed system, we get the following structure:

      link = < process-id, queue-number, cookie >

We refer to a capability for sending a message to some other process as a link because it is a communications link. For the sake of this example, each process has a queue table containing incoming message queues, and each process has a cookie table holding the cookie associated with each incoming message queue. As each process begins, it initializes its queue table with random numbers:

      for each q, cookies[ q ] = random();

The make-link operator is now trivial:

      make_link( q ):
          return < caller's Id, q, cookies[ q ] >

The send operation over a link must pass, to the receiving process not only the queue number but also the cookie from the link used to send the message; after receipt of a message, the recipient must check that the cookie is the correct one, for example, as follows:

      < bytecount, cookie > = receive( q, buf, len )
      if cookie != cookies[q]
          sender used an invalid link
      else if bytecount > len
          receiver's buffer was too small
      else
          we received all of a genuine message

At any time, a process using this scheme may invalidate all links referring to some particular incoming message queue by assigning a new cookie for that queue!

With this scheme, the kernel need not keep a link table on behalf of each process. So long as the processes can trust the memory protection mechanisms that prevent other processes from peeking at their data structures, processes may store their links and cookie arrays anywhere within their own data structures.

If a process tries to forge a link to an incoming message queue it is not authorized to use, it must successfully guess the cookie that the owner of that queue has assigned to that queue. If it cannot guess this, the most damage it can do is to send masses of easily ignored invalid messages to that particular recipient.

Of course, there is a chance that some process will pass one of its cookies, stripped from the link it came with, to some other process, where that process will then use that cookie to construct a forged link. This is not a serious security threat, since the process that passed just the cookie could have passed the entire link. Therefore, this kind of abuse does not involve any new security risks.

Note that this model (borrowed from an operating system called Demos that did not use Cookies) does not include any access rights associated with links! The solution to including access rights within capabilities protected by server-side authentication requires a more complex model.

Probabilistic Versus Absolute Protection

Note that all server-side authentication rests on a probabilistic model of security. When we segregated capabilities from data using capability lists maintained by the kernel, or when we used tag bits to prevent programs from corrupting the representations of capabilities, we had an absolute guarantee that a user program could not corrupt or forge capabilities. When we protect the capability by including a cookie within its representation, there is always a chance that a user will accidentally create the correct value.

Therefore, the strongest guarantee we can make with server-side authentication is that a user will have, on any particular trial, only a certain probability of success with a forgery. If we know that, given the speed of the server, the user can only make t trials per second, and if we want our system to stand up to an attack for s seconds, then we must use:

bits-per-cookie > log2ts

Thus, for example, if our server can handle 1,000,000 trials per second, where each trial involves receipt of and discarding one illegal message, and if we want the system to survive an attack lasting 1,000,000 seconds (over 11 days), we need at least 40 bit cookies. Seen from another perspective, 40 bit cookies on this system will be more than adequate to protect objects with lifetimes shorter than 11 days, but they may be inadequate to protect longer-lived objects.

Of course, in a real system, if some client started saturating some server with trials, we would probably notice that the service offered by this server was being degraded, and we might start investigating and find the attacker. A real attacker seriously intent on cracking a particular server's security could not generally afford to saturate that server with trials for an extended period. Therefore, in reality, our 40 bit cookie is probabily good for a significantly longer period than 11 days.

A clever server that wanted to prevent such random trials might defend itself in several ways. For example, it might track the return addresses sent with recent rejected messages. If it receives more than some number of invalid messages from the same process, it could report a probable attack. This would force the attacker to create multiple processes for the attack, increasing the total load placed by the attack on the system and slowing the attacker by the time taken to create these new processes.

Another serious threat is that an adversary process could guess the cookies on one link from known cookies on other links. The key to this attack is the fact that genuine random numbers are hard to find, but pseudorandom numbers, generated using an algorithm, are easy to use.

If the sender knows what pseudorandom number generator was used by a server to assign cookies to its links, and the sender knows the cookie on one link, it may be able to infer the seed used with that pseudorandom number generator, and from that seed, it may be able to infer the cookies assigned to the other links serviced by that server. This danger is very real if a simple pseudorandom number generator is used by the server (something like the old Unix rand() function).

Therefore, the server should use what is known as a cryptographically secure pseudorandom number generator to generate the cookies. Cryptographically secure generators have the property that, where Pi is the ith pseudorandom number returned in sequence by the generator, knowing Pi and the algorithm used by the generator does not help an observer guess Pi+1. With a classical multiplicative congruence pseudorandom number generator such as the old Unix rand() function, exactly the opposite was true! Each Pi uniquely determined the next Pi+1.

One easy way to increase the security of a pseudorandom number generator is to make sure that the generator generates many more bits than required. Thus, if we wanted 40 bits, we might use a generator that maintained an 80 bit seed, and so had a period of about 280. This generates a very predictable sequence of 80 bit pseudorandom numbers, but if we return only the least significant n bits, then each number has 280-n possible successors. This is not enough, though! Some widely used pseudorandom number generators have very predictable patterns in their least significant bits, and the old Unix rand() was one of them. A full treatment of pseudorandom number generation algorithms is far beyond what we can give here!

Amoeba

Andrew Tannenbaum's Amoeba operating system illustrates a complete server-side implementation of capability-based security in a distributed system. Unlike the example presented above, access rights are included in each capability, and the integrity of the access rights field of each capability is guaranteed by the server-side checking mechanism.

Amoeba Capabilities

Tannenbaum's Amoeba system supports capabilities formatted as follows:

          | server id | resource id | rights |  check   |
          |    48     |      24     |    8   |   48     |
          |___________|_____________|________|__________|
          |___________|_____________|________|__________|
Amoeba uses server-side authentication; thus, the kernel is only concerned with the server ID field; this is used to send messages to the appropriate server. The interpretation of the other fields is entirely up to the server! The resource ID field is similar to the queue-number field of a Demos link, and the check field serves essentially the same role of the cookies used for the server-side Demos like model presented in the previous lecture.

Notice that this capability is too short to use really secure cryptographic protection. It is only 128 bits in size, so it cannot contain any 400 bit components, the minimum size needed for really secure trapdoor functions.

The server ID used in Amoeba is not the machine ID; rather, it is a "free floating" number with no direct connection to any particular machine or process. Servers may register themselves with the local kernel as serving some particular server ID; if the registration is legitimate, the process will receive requests for service at that ID. The model Amoeba uses to determine the legitimacy of these registrations is also cryptographic, but we will ignore it for the time being.

Amoeba provides a standard set of interpretations for the fields described above, but this interpretation is not enforced. Processes may depart from the standard, although there is no good reason to do so. In the standard interpretation, these fields are used as follows:

The resource ID field is used by the server to identify which of many objects it is being asked to operate on. Amoeba assumes that each server process will manage many different objects. This, in turn, implies large servers, which implies an assumption that process creation will be a heavyweight operation and should be done infrequently.

Contrast this design with a hypothetical system where process creation is very inexpensive, and therefore, it would be reasonable to represent objects by processes, so the server-id field would identify the object itself and the resource-id field could therefore be either very small or entirely absent. Here, we obviously assume that each server will support operations on a large number of objects. In effect, an Amoeba server is expected to represent an entire class and not just one object.

The rights field confers access rights to the object that the server has extended to the client. Each bit is assumed to confer one right; if the rights field is all ones, the user has all rights. If the object is a file, for example, the rights might refer to such operations as read and write.

The check field is used by the server to assure that the capability is valid and that the rights field has not been corrupted. For each object allocated by a server, it will typically assign a random check number that must match the check number provided with a capability by the client. This prevents the accidental forgery of capabilities. The server must store the check numbers for all objects it creates so that it can check the legitimacy of capabilities presented to it.

The problem of preventing clients from asserting rights they do not have is solved by an ingeneous cryptographic trick. The check field given to the client is not an exact copy of the check number held by the server; instead if C is the server's private check number for the object, and the user's rights are R, the user's check field must be f(C xor R). When the user presents a capability to the server, it may easily compute f(C xor R) and compare it with the check field. The standard function f is a trapdoor function, that is, a function which is very easy to compute in the forward direction but with a very difficult to compute inverse.

For an ideal trapdoor function operating on an n-bit argument, the time-space product to compute f would be O(nk), while the time-space product for computing f-1 would be O(2n). Note for comparison that an add can be done in O(log n) time using O(n log n) gates in parallel hardware, or it can be done serially in O(n) time using O(1) gates. Similarly, multiplication can also be done in parallel in O(log n) time using O(n2 log n) gates, or using a shift and add algorithm in O(n log n) time using O(n log n) gates or using the same algorithm with bit-serial addition in O(n2) time using O(1) gates. In contrast, table-lookup evaluation for an n-bit function takes takes a table of size O(2n) and requires at least O(log n) time for address decoding.

The use of a trapdoor function to verify capabilities means that only the server may restrict the rights on a capability. This, in turn, means that every standard Amoeba server includes a "restrict" operation that takes a new rights mask and a capability as parameters, and returns a restricted version of that capability. To avoid the expense of this operation on capabilities with a rights mask of all ones, the server's check number C is included literally in such capabilities and the one-way function f is not used. Because there is a standard one-way function available to all users, a user of a capability with a rights mask of all ones may restrict it locally. This introduces no security loophole precisely because that user already had all rights to the object, and so, that user gained nothing by being given the "secret key" used by the server to manipulate that object.

Amoeba includes a clever trick to allow the owner of an object, that is, to allow a process with all rights to an object to perform restriction without consulting the server. In the special case where the rights field is 11111111, the Amoeba system includes the server's private check number as the check field, literally. This gives nothing away, because the owner of an object has all rights to that object, so could not use this information to forge additional rights. The trick with the trapdoor function is only needed to protect capabilities that confer some subset of the rights.

Amoeba Message Delivery

The Amoeba system does not use the actual ID of a process for message delivery. Instead, the server ID field of an Amoeba capability is simply an integer, with no fixed connection to either the recipient process or to any particular machine.

Each machine in an Amoeba system runs a local kernel, a large part of which is concerned with message delivery. This kernel maintains a table, local to each machine, showing what local processes are currently registered to receive messages addressed to particular server IDs.

The correctness of Amoeba's message delivery system rests on the fact that, when a machine wishes to send a message using an unknown server ID, the machine can broadcast a request to all kernels asking "what machine is currently running a server for this ID?" If one of the kernels has that ID in its local server table, it replies with its network address, allowing the sending machine to deliver the message.

For efficiency, this broadcast protocol must be used only rarely. The first step in ensuring that it is only rarely used is a local cache maintained by each kernel. This cache lists recently encountered server IDs and the network addresses where the servers reside. These cache entries are created by replies to the kinds of broadcasts mentioned above, and they are retained using something like an LRU replacement policy.

If an Amoeba system was enlarged to the point where this cache scheme was insufficient to limit the use of this broadcast protocol traffic to an acceptable level, the system could easily be modified to use a low-level name server, with broadcasts being used only when both the local cache and the name server had no record of some server. If the system was enlarged even more, so that the name server became a bottleneck, it could be broken down into a server hierarchy, with local, regional and global servers.

Amoeba Server Registration

Amoeba servers must register their server ID with the local kernel. This registration would open a security loophole if it were done the obvious way, by allowing any process to register an ID it wanted. The Amoeba system avoids this potential problem by using mechanisms very similar to those used to create the check field of a capability.

In the Amoeba system, a distinction is made between the private server ID, typically a value that is a closely held secret of the server, and the public server ID that is used in all capabilities that address resources held by that server. The Amoeba kernel service allowing a server to register a new ID takes the private ID as a parameter and computes, inside the kernel, the public ID that is being registered. As a result, the only way to register a particular public ID is to know the corresponding private ID.

The public ID is derived from the private ID by a trapdoor function, the same function that is used to protect capabilities. Therefore, even if a user knows the trapdoor function and the public ID of a server, the user cannot easily compute the private ID of the server.

Amoeba Fault Tolerance

The reason Amoeba does not have a hard connection between the server ID and the actual process that serves the requests is because a fault tolerant system must allow for server failure. The Demos system contained no such provisions!

In Amoeba, a fault tolerant service would typically be delivered by a cluster of server processes, each running on a different machine, with all processes monitoring each other. If these processes detect that a failure has occured, the survivors would conduct an election to determine which survivor registers to offer the service.

Each transaction between a client and the current public representative of such a fault tolerant server would typically be conducted as a distributed atomic transaction, with updates to parallel data structures in all of the member processes of the server cluster. As a result, after any failure, the elected representative of the survivors would have all of the data necessary to continue offering the service.

Amoeba Process Migration

Process migration under Amoeba rests on the same registration mechanism. If an Amoeba process wishes to migrate, it communicates with a process management server to launch a new copy of itself, then communicates with that new copy, installing all necessary state information in the new copy before telling the new copy to begin execution.

Immediately after starting the new copy, the old copy of the process terminates itself. Immediately after it is started, the new copy registers itself to receive all requests that would otherwise have gone to the old copy. This is possible because the old copy gave the new copy its private ID along with all the other state information it passed along.

Once the old copy terminates, any messages in its incoming queues may be simply deleted, leaving it to the fault tolerance mechanisms of the Amoeba remote procedure call model to recover from this. After migration, the message delivery cache entries in other machines referring to the old copy of the process are invalid, and this will be discovered when an attempt is made to use them. When a cache entry is found to be invalid, the sender falls back on a broadcast attempting to locate the server that was lost.

Note that, in the spirit of server-side authentication, this migration mechanism is entirely a matter of user-level code. Applications may be written that notice when they are running on overloade machines and automatically migrate away, but the kernel itself remains simple, offering only the necessary tools and doing none of the work.

Amoeba Server Discovery

Note that the Amoeba mechanisms do pose one problem: How does a user application discover the public ID of a server. There are two solutions to this problem. First, a process may register with a name server. In the case of Amoeba, the directory server is a high-level name server, associating textual names with capabilities. While these are usually capabilities for files or directories, a process wishing to register, for example, a random-number service, could enter a capability for itself under the name /servers/random (assuming, of course, that everyone had agreed to use the directory /servers to hold capabilities for the public request ports of various servers.

Another solution common on Amoeba is to simply publish the public ID of a server. Typically, these IDs are encoded as textual constants in the include files that are used to compile programs that use those servers, so if the server is ever moved to a new private ID, user code relying on that server can be simply recompiled to make use of the new public ID.

The risk of publishing the public ID of a server in an include file is that it prevents the ID from being changed more frequently than applications are recompiled. This means that the ID remains exposed to long-term attack by potential adversaries. In contrast, IDs that are registered in directory servers may be changed routinely, and need only remain valid for as long as the longest expected lifetime of a capability referencing a resource managed by that server.

Because Amoeba was never intended for critical applications, it was not designed to stand up to protracted attacks from outside, and therefore, the level of security described above was considered sufficient. The customer was, after all, the European Space Agency, with an interest in inexpensive supercomputers built from clusters of commodity machines for use in analysis of scientific data. If the customer had been the defense department of any large country, the design would verly likely have been quite different!

Amoeba Threads

Just by inspecting the Amoeba capability representation, it is easy to see that Amoeba servers are intended to be heavyweight entities. The capability format allows each server to support as many as 16 million distinct objects, and this requires a table of as many as 16 million private check codes, one per object. Thus, each Amoeba server process is expected to manage a huge numbers of objects on behalf of its clients. Huge servers imply heavyweight processes.

Amoeba is based on a blocking model of interprocess communication using remote procedure calls. Thus, the primary services offered to user processes are get_request -- await an RPC from a client, put_reply -- send a reply to a client, and trans -- send an RPC request and await a reply (a function that really ought to have been named call). Both get_request and trans are blocking primitives because all data is passed using synchronous non-buffered message passing.

In Amoeba, the thread that does a get_request must follow (after any number of trans operations) with a put_reply. It may not get other requests, and only the thread that got a request may put a reply. Reply transmission is always to a particular process on a particular machine, with none of the complexity of abstract resource IDs that characterized client to server communication. This efficiency is possible because Amoeba uses a custom protocol stack and not a general purpose protocol stack such as TCP/IP.

The use of blocking primitives limits the opportunities for parallelism in Amoeba; these primitives make it difficult for a client to perform other useful computing while awaiting a response from a server, and they make it difficult for a server to perform useful computation while awaiting a request from a client. If applications wish to do this, they must do it with threads.

Amoeba threads are implemented by the kernel. An Amoeba process is a heavyweight entity with a list of segment descriptors; an Amoeba thread is an attribute of a process with a stack pointer, program counter and register set. The threads of a program may interact using semaphores, binary semaphores (mutexes), and asynchronous interrupts.

If an Amoeba process needs to perform a nonblocking operation, it must spawn a thread that will be blocked by that operation while the other threads of the process continue working. Threads are sufficiently lightweight that this can be done at reasonable cost.