16. Protection Problems

Part of the 22C:116 Lecture Notes for Fall 2002
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Introduction

While the Multics system solved a great number of problems in resource protection, it was not perfect! The Multics combination of access control lists controlling the right to access files, capability based addressing of segments (open files), and gate crossing to move from user to subsystem to system was remarkably flexible, but it was not universal.

Multics solved the basic problems of resource sharing -- preventing unauthorized access and controlled sharing of resources, but it failed to solve a problem known as the mutual suspicion problem. This problem arises whenever two users must share information but neither user trusts the other.

For an example of mutual suspicion, suppose user A writes a proprietary database management system which user B wishes to use to manage confidential data. User A wants an assurance that user B will not pirate the database management system, so user A demands that B's access to the system be through protected gateways. User B wants an assurance that user A will not use the database management system to access or manipulate any of user B's data that is not in the system. Thus, user B demands that the database management system access user B's data through protected gateways.

The Multics system creates a hierarchy. A protection gateway between system components A and B can protect A from B or it can protect B from A, but not both.

Strong Typing and Data Abstraction

The mutual syspicion problem is easily illustrated in the context of programming languages, and the solutions used in programming languages are a useful model for the problem as seen from the operating system perspective:

Consider the classic data abstraction problem, as illustrated by the problem of implementing a stack. Every programmer knows that there are two natural implementations of a stack, the linked list and the array implementation.

     ____       ____       ____
    |Top |     |\\\\|     |Top |      ____
    |____|--   |free|     |____|---->| A  |
            |  |\\\\|                |____|--
             ->| A  |                 ____   |
               |____|                | B  |<- 
               | B  |                |____|--
               |____|                 ____   |
               | C  |                | C  |<- 
               |____|                |____|----//

As a programmer, it is useful to hide the implementation from the user, and to protect the user from the implementation. We do this using, for example, the abstract data type facilities of a programming language:

Consider an Ada package implementation of a stack abstract data type:

   package STACK   -- the interface
     entry PUSH      -- public "methods"
     entry POP       -- of the abstraction
   end

   package body STACK -- the implementation
     entry PUSH         -- details private
       implementation   -- and entirely hidden
     entry POP          -- from users
       implementation
   end

   some user code

We rely on the scope rules of the language to ensure that the implementation is unable to interfere with the user of the abstraction and that the user is unable to interfere with the implementation.

Within the package body, the user code is outside the current scope, so the implementation of PUSH and POP cannot manipulate any of the user's variables except when they are passed as explicit parameters to the routines that operate on the stack. Within the user code, the package body is outside the current scope, so the user cannot manipulate the implementation of the stack. The public interface to the stack is shared -- that is, it is within the scope of both user and package body, so it is possible to call the PUSH and POP procedures from the user program.

The call to an entry of the package is an example of switching protection domains -- gate crossing -- in a manner that solves the mutual suspicion problem.

Solutions to the mutual suspicion problem at the programming language level generally impose only minimal run-time costs, if any. Scope rules are largely enforced by the compile-time semantics of the programming language, with very lightweight run-time implementations, so the cost of using a language that offers strong support for data abstraction is minimal.

However, not all abstraction problems can be solved at compile-time. An open file is an example of an abstraction that must be implemented by run-time mechanisms, both because we expect operating systems to support different languages with differing levels of support for data abstraction, and because we dynamically bind files to programs -- it is common to find devices that were manufactured after the program that uses them was compiled, and similarly, device drivers may be written after the program that eventually uses them was compiled. This happens, for example, when you buy a new disk and its driver and add them to an older system as an upgrade.

Protection of such an abstraction by programming language semantics is a difficult problem, at best! The world of Java applets takes this requirement for dynamic binding and dynamic resource protection to an extreme.

Fortunately, there are general purpose solutions to the mutual suspicion problem that have been known for years. Unfortunately, few of these are supported by modern operating systems, so most of the protected abstractions with which operating systems must deal are protected by special purpose mechanisms.

The Unix Setuid Mechanism

The Unix setuid mechanism provides an example of a domain switching mechanism that solved the mutual suspicion problem and achieved significant market penetration. Under Unix, there is one bit associated with each executable file indicating what user ID that file is to be run under. Recall that the user ID is needed to determine what files a running program may open.

Under Unix, the default user ID used when a new program is executed (by some version of the exec family of system calls) is the user ID at the time of the call to exec. If the setuid bit on an executable file is set, however, executing that file will cause it to run under the user ID of the file's owner.

The Setuid bit completely solves the mutual suspicion problem; It was patented by Ken Thompson, the originator of Unix. The developers of Unix have said that the Setuid bit is the only genuine innovation in their system; all of the other aspects of that system were based on ideas that were first developd elsewhere.

To illustrate the use of the Setuid mechanism to solve the mutual suspicion problem, consider the following scenario: Joe has developed a proprietary a database manager that Fred needs to manage his database, but Fred has files that Joe must not be able to see, and Joe does not want to allow Fred to steal the data base manager's code.

If Fred were to run the database manager under his own ID, the database manager would be able to open any of Fred's files, perhaps stealing data and sending it to Joe. Fred could, of course, demand the right to read the code of the program so he could verify that it was safe, but this would allow him to steal the code.

The classic solution to this problem under Unix is for Joe to release the program with the setuid bit set and the access rights set so that Fred has execute-only access to the code. When Fred runs the program, it will run in Joe's domain, able to open and close files as if it was Joe. Since Joe wrote the code, he has complete control over the threat it poses in his own domain. Fred cannot inspect the code, only execute it, so there is no threat that Fred will steal the code.

Of course, there remains the problem of how the database package is to gain access to Fred's database! The key to this is that Unix allows open files to be passed to a program. When you execute a program under Unix, all files open at the time of the call to exec will remain open in new program, unless they have been marked close on exec. Normally, standard input, standard output and standard error are passed to new programs in this way, but the same mechanism may be used for any file. What Fred must do is open his database prior to the call to exec that is used to launch Joe's program.

The Unix setuid bit is considered dangerous, and it is. Users who need to solve the mutual suspicion problem must use it, but they should not do so until they understand exactly what they are doing. Careless use of this bit has created serious problems in the past, including major security holes in Emacs.

Unfortunately, all recent implementations of Unix have largely defeated some of the functionality of the Setuid bit by distinguishing between a user's effective user ID (the ID used for opening files) and the user's real user ID. This distinction could be considered useful, since the real user is the one who should be billed for the CPU time. Unfortunately, the system call setuid() allows a user to change the effective user ID to the real user ID. In the context of our example, this allows Joe's database package to revert its user ID to that of Fred, allowing it free access to all files in Fred's domain.

The setuid() call can also be used to set the real user ID to the effective user ID. If Joe's database manager does this immediately on startup, Fred can trust it, but this means that Joe must make this startup code open for inspection by Fred. This is feasible, but awkward.

Exercise: How can Joe allow Fred to audit the gateway code to Joe's database manager so that Fred can be sure that Joe's code makes correct use of the setuid feature to solve the mutual suspicion problem. Hint: Joe must divide his database manager into two programs, one that is used to launch the other. What access rights does Fred need to each of these? Give complete C code for the smaller of these programs.

The CAP Protection Mechanism

The Cambridge CAP system, designed by Wilkes and Needham and described in a book by the same name, is one of the best available examples of a system using capability based addressing to solve the mutual suspeicion problem. This system was designed during the outburst of creative operating system design in the mid 1970's, before the Unix model and the demand for Unix compatability put an end to a large body of operating systems research.

In the CAP system, each protection domain was represented by a capability list, and all addressing, whether addressing of code, data or other resources, was accomplished through this capability list. Physically, the machine was a standard minicomputer with a very nonstandard memory management unit. The memory management unit directly interpreted not only memory references to segments, but it also implemented capability manipulation operations.

In the CAP system, there were three object types relevant to this discussion, data segments containing data, C-lists containing capabilities, and sealed objects containing protected implementations of data abstractions. The following access rights relevant to this discussion were associated with each capability:

data segments -- read, write, execute.
C-lists -- read-capability, write-capability, enter.
sealed-objects -- unseal.

Capabilities could refer to any of these object types! A C-list containing only references to segments appeared, to a CAP programmer, as a conventional memory address space in a system supporting capability based addressing. The inclusion of C-lists containing other objects added new possibilities.

When a process had a capability for a C-list that allowed read or write access to the capabilities in it, the process could move capabilities from the C-list to its own address space; this allowed a process to follow chains of capabilities from one C-list to another.

When a process had a capability for a C-list with only enter rights, it could not inspect the capabilities in that list, but it could transfer control from its own domain to that new domain. Parameter passing mechanisms were provided allowing a process to pass data and capabilities to the new domain as it transferred control, and when control entered a new domain, the program counter was appropriately set to the first executable instruction in that domain. This was the fundamental gate-crossing mechanism in the CAP system, and it was equally applicable to crossing between user and system domains or between one user's domain and another.

The most interesting feature of the CAP system was the sealed object. Sealed objects were created by the seal operation as follows:

     sealed_object = seal( capability, key )

The only legal operation on a sealed object under the CAP system was unseal:

     capability = unseal( sealed_object, key )

On unsealing a sealed object, if the key used is the same as the key used to seal that object, the capability that was sealed inside the object is returned; otherwise, the unseal operation fails.

The central use of sealed objects is to hide and protect the implementations of abstract data types. Thus, the file manager would export the data structures representing open files as sealed objects. If a user wanted to read or write an open file, the user would pass the open file to the read or write routine, and that routine would unseal the object to gain access to the data needed to actually perform the transfer. The key used to seal and unseal file objects would be held by the open routine and shared with the read and write routine, but this key would never be passed to users.

Keys for sealing objects in the CAP system were capabilities -- any capability could be used as a key, but the useful keys were those capabilities that were held closely within the implementation of an abstract data type.