22C:116, Lecture 16, Fall 1999

Douglas W. Jones
University of Iowa Department of Computer Science

  1. Protection Problems

    While the MULTICS system solved a great number of problems in resource protection, it was not perfect! The MULTICS combination of access control lists for files, capability based addressing of segments (open files), and gate crossing to move from user to subsystem to system was remarkably flexible, but it was not universal.

    MULTICS solved the basic problems of resource sharing -- preventing unauthorized access while permitting controlled sharing of resources -- but it failed to solve a problem known as the mutual suspicion problem:

    Suppose there are two users who must share information but neither user trusts the other. For example, suppose user A writes a proprietary database management system which user B wishes to use to manage confidential data. User A wants an assurance that user B will not pirate the database management system, so user A demands that B's access to the system be through protected gateways. User B wants an assurance that user A will not use the database management system to access or manipulate any of user B's data that is not in the system. Thus, user B demands that the database management system access user B's data through protected gateways.

    The MULTICS protection mechanism is hierarchical, and this is the root of the difficulty: a protection gateway between system components A and B can protect A from B, or it can protect B from A, but not both at once.

  2. Strong Typing and Data Abstraction

    The mutual suspicion problem is easily illustrated in the context of programming languages, and the solutions used in programming languages are a useful model for the problem as seen from the operating system perspective:

    Consider the classic data abstraction problem, as illustrated by the problem of implementing a stack. Every programmer knows that there are two natural implementations of a stack, the linked list and the array implementation.

     ____       ____       ____
    |Top |     |\\\\|     |Top |      ____
    |____|--   |free|     |____|---->| A  |
            |  |\\\\|                |____|--
             ->| A  |                 ____   |
               |____|                | B  |<- 
               | B  |                |____|--
               |____|                 ____   |
               | C  |                | C  |<- 
               |____|                |____|----//
    
    As a programmer, it is useful to hide the implementation from the user, and to protect the user from the implementation. We do this using, for example, the abstract data type facilities of a programming language:

    Consider an Ada package implementation of a stack abstract data type:

   package STACK is                       -- the interface
     procedure PUSH( ITEM: in INTEGER );  -- public "methods"
     procedure POP( ITEM: out INTEGER );  -- of the abstraction
   end STACK;

   package body STACK is                  -- the implementation:
                                          -- details are private and
                                          -- entirely hidden from users
     procedure PUSH( ITEM: in INTEGER ) is
     begin
       null;  -- implementation goes here
     end PUSH;

     procedure POP( ITEM: out INTEGER ) is
     begin
       null;  -- implementation goes here
     end POP;
   end STACK;

   -- some user code calling STACK.PUSH and STACK.POP
    
    We rely on the scope rules of the language to ensure that the implementation is unable to interfere with the user of the abstraction and that the user is unable to interfere with the implementation.

    Within the package body, the user code is outside the current scope, so the implementation of PUSH and POP cannot manipulate any of the user's variables. Within the user code, the package body is outside the current scope, so the user cannot manipulate the implementation of the stack. The public interface to the stack is shared -- that is, it is within the scope of both user and package body, so it is possible to call the PUSH and POP procedures from the user program.

    The call to an entry of the package is an example of switching protection domains -- gate crossing -- in a manner that solves the mutual suspicion problem.

    Solutions to the mutual suspicion problem at the programming language level generally impose only minimal run-time costs, if any. Scope rules are largely enforced by the compile-time semantics of the programming language, with very lightweight run-time implementations, so the cost of using a language that offers strong support for data abstraction is minimal.

    However, not all abstraction problems can be solved at compile time. An open file is an example of an abstraction that must be implemented by run-time mechanisms, both because we expect operating systems to support different languages with differing levels of support for data abstraction, and because we dynamically bind files to programs. It is common to find devices and device drivers that were manufactured and written after the program that uses them was compiled -- for example, when you buy a new disk and a new operating system that is compatible with a software package you've used for years. Protecting such an abstraction by programming language semantics is a difficult problem, at best! The world of Java applets takes this requirement for dynamic binding and dynamic resource protection to an extreme.

    Fortunately, there are general purpose solutions to the mutual suspicion problem that have been known for years. Unfortunately, few of these are supported by modern operating systems, so most of the protected abstractions with which operating systems must deal are protected by special purpose mechanisms.

  3. The UNIX Setuid Mechanism

    The UNIX setuid mechanism provides an example of a domain switching mechanism that solved the mutual suspicion problem and achieved significant market penetration. Under UNIX, there is one bit associated with each executable file, the setuid bit, indicating whether programs executed from that file should run under the user ID of the file's owner. Recall that the user ID is what determines which files a running program may open.

    Under UNIX, the default user ID used when a new program is executed (by the exec system call) is the user ID of the process that requested the exec. If the setuid bit on an executable file is set, however, executing that file will cause it to run under the user ID of the file's owner.
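
    The effect is easy to observe. The following minimal C sketch (the file name whoami.c is invented for illustration) prints the two user IDs of the current process; if the executable is owned by one user, marked with "chmod u+s", and then run by another user, the two IDs it prints will differ:

      /* whoami.c -- print the two user IDs of the current process.
         If the executable's owner sets the setuid bit (chmod u+s),
         then when another user runs it the real and effective user
         IDs will differ; the effective ID governs file access. */
      #include <stdio.h>
      #include <unistd.h>

      int main( void )
      {
          printf( "real uid      = %d\n", (int) getuid() );
          printf( "effective uid = %d\n", (int) geteuid() );
          return 0;
      }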

    The setuid bit completely solves the mutual suspicion problem; it was patented by Ken Thompson, the originator of UNIX.

    To illustrate this, consider the following scenario: Joe has a program that Fred needs to run, but Fred doesn't trust Joe and Joe doesn't trust Fred. If Joe releases the program so Fred has execute-only access to the code and the setuid bit is turned on, then Fred cannot steal Joe's secrets by examining the code, and when Fred runs the program, the program cannot access any data that Joe could not access directly.

    Furthermore, if the program requires data from some file of Fred's, Fred can prevent Joe from directly reading that file while still allowing the program to access it: Fred opens the file prior to executing the program and then lets the program read the file as, for example, standard input.
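
    A minimal C sketch of how Fred might do this follows (the file and program names are invented for illustration). Fred's own process, running under Fred's user ID, opens the file and makes it standard input before the exec; open file descriptors survive an exec, so Joe's program can read the data even though, running under Joe's user ID, it could not have opened the file itself.

      /* run_joe.c -- Fred opens his private file, then execs Joe's
         setuid program; the open descriptor survives the exec, so
         the program can read data it has no right to open. */
      #include <stdio.h>
      #include <fcntl.h>
      #include <unistd.h>

      int main( void )
      {
          int fd = open( "freds_private_data", O_RDONLY );
          if (fd < 0) { perror( "open" ); return 1; }
          dup2( fd, 0 );              /* make it standard input */
          close( fd );
          execl( "./joes_program", "joes_program", (char *) 0 );
          perror( "execl" );          /* reached only if exec fails */
          return 1;
      }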

    The result is that the only files of Fred's that Joe's program can damage are the ones Fred left open when he called the program (of course, it can also access any of Fred's data that is publicly accessible). Even though Fred ran it, Joe's program runs under Joe's user ID and has free access to Joe's data.

    The UNIX setuid bit is considered dangerous, and it is. Users who need to solve the mutual suspicion problem must use it, but they should not do so until they understand exactly what they are doing. Careless use of this bit has created serious problems in the past, including major security holes in Emacs.

    Unfortunately, all recent implementations of UNIX have actually defeated some of the functionality of setuid by distinguishing between a process's effective user ID (the ID used for opening files) and its real user ID. This distinction is useful, since the real user is the one who should be billed for CPU time. But then, probably without fully understanding the consequences, the setuid system service was extended to allow a process to set its effective user ID to its real user ID, thus defeating the solution to the mutual suspicion problem.
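
    On such a system, the defeat takes only one line of C inside the setuid program. The following minimal sketch (the surrounding program is invented for illustration) prints the effective user ID before and after reverting to the real user ID:

      /* drop.c -- a setuid program reverting to its real user ID.
         After setuid(getuid()), files are opened with the caller's
         rights rather than the owner's, so the protection described
         above no longer holds. */
      #include <stdio.h>
      #include <unistd.h>

      int main( void )
      {
          printf( "before: effective uid = %d\n", (int) geteuid() );
          setuid( getuid() );    /* effective uid := real uid */
          printf( "after:  effective uid = %d\n", (int) geteuid() );
          return 0;
      }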

  4. The CAP Protection Mechanism

    The Cambridge CAP system, designed by Wilkes and Needham and described in a book by the same name, is one of the best available examples of a system using capability based addressing to solve the mutual suspicion problem. This system was designed during the outburst of creative operating system design in the mid 1970's, before the UNIX model and the demand for UNIX compatibility put an end to a large body of operating systems research.

    In the CAP system, each protection domain was represented by a capability list, and all addressing, whether of code, data or other resources, was accomplished through this capability list. Physically, the machine was a standard minicomputer with a very nonstandard memory management unit. The memory management unit not only interpreted memory references to segments directly, but also implemented the capability manipulation operations.

    In the CAP system, there were three object types relevant to this discussion: data segments, containing data; C-lists, containing capabilities; and sealed objects, containing protected implementations of data abstractions. The access rights relevant to this discussion, associated with each capability, were read, write and enter rights, described in what follows.

    Capabilities could refer to any of these object types! A C-list containing only references to segments appeared, to a CAP programmer, as a conventional memory address space in a system supporting capability based addressing. The inclusion of C-lists containing other objects added new possibilities.

    When a process had a capability for a C-list that allowed read or write access to the capabilities in it, the process could move capabilities from the C-list to its own address space; this allowed a process to follow chains of capabilities from one C-list to another.

    When a process had a capability for a C-list with only enter rights, it could not inspect the capabilities in that list, but it could transfer control from its own domain to that new domain. Parameter passing mechanisms were provided allowing a process to pass data and capabilities to the new domain as it transferred control, and when control entered a new domain, the program counter was appropriately set to the first executable instruction in that domain. This was the fundamental gate-crossing mechanism in the CAP system, and it was equally applicable to crossing between user and system domains or between one user's domain and another.

    The most interesting feature of the CAP system was the sealed object. Sealed objects were created by the seal operation as follows:

    sealed_object = seal( capability, key )
    
    Once a sealed object was created, it could not be manipulated. Capabilities for that object could, however, be stored by the program that created it. Given the correct key, any program could unseal a sealed object:

    capability = unseal( sealed_object, key )
    
    Unsealing a sealed object returned the capability that was sealed inside the object.

    The central use of sealed objects is to hide and protect the implementations of abstract data types. Thus, the file manager would export the data structures representing open files as sealed objects. If a user wanted to read or write an open file, the user would pass the open file to the read or write routine, and that routine would unseal the object to gain access to the data needed to actually perform the transfer. The key used to seal and unseal file objects would be held by the open routine and shared with the read and write routines, but this key would never be passed to users.
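
    The CAP hardware enforced this directly, but the shape of the mechanism can be modeled in software. The following C sketch is only an analogy -- all of the names are invented, and an ordinary pointer comparison stands in for the hardware's capability check -- but it shows how a file manager could hand out sealed handles while keeping the key, and thus the real open-file data, to itself:

      /* A loose software model of CAP-style sealing: a sealed object
         hides a "capability" (here just a pointer) and gives it back
         only to a caller presenting the key that sealed it. */
      #include <stdio.h>
      #include <stdlib.h>

      typedef struct { const void *key; void *capability; } sealed_object;

      sealed_object *seal( void *capability, const void *key )
      {
          sealed_object *s = malloc( sizeof *s );
          s->key = key;
          s->capability = capability;
          return s;
      }

      void *unseal( sealed_object *s, const void *key )
      {
          return (key == s->key) ? s->capability : NULL;
      }

      /* The file manager holds file_key privately; users only ever
         see sealed handles, never the open-file structure itself.  */
      static const int file_key = 0;       /* its address is the key */

      int main( void )
      {
          char open_file[] = "internal open-file data";
          sealed_object *handle = seal( open_file, &file_key );

          /* a user holding the handle but not the key gets nothing  */
          printf( "user's attempt:  %p\n", unseal( handle, "guess" ) );

          /* read and write routines, sharing file_key, recover it   */
          printf( "file manager:    %s\n",
                  (char *) unseal( handle, &file_key ) );
          return 0;
      }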

    Keys for sealing objects in the CAP system were capabilities -- any capability could be used as a key, but the useful keys were those capabilities that were held closely within the implementation of an abstract data type.