Memory protection is a key element in protecting the resources of a single-CPU or shared memory machine. It is essential that users not be able to arbitrarily manipulate the variables or code of the system; if I/O is memory-mapped, memory protection can even be used to control access to I/O devices.
If access to the memory management unit is protected so that only the system can change the memory management registers, then the memory management registers can be considered to be a protected part of the user's process state.
This trivially allows each user to be given access to a disjoint region of the systems virtual memory, and this trivial policy can be supported with almost any MMU hardware. Here's an illustration, using a paged system:
Trivial policy USER 1 USER 2 ____ ____ MMU mapping | o-|----- | o-|----- registers | o-|--- | | o-|--- | | o-|- | | | o-|- | | |____| | | | |____| | | | | | | | | | _______ | | | _______ | | | Pages on | |<- | | | |<- | | disk or in |_______| | | |_______| | | main memory _______ | | _______ | | | |<- | | |<- | |_______| | |_______| | _______ | _______ | | |<- | |<- |_______| |_______|To switch between the two users shown here, the system must not only change their CPU registers (Accumulator, Program counter, etc), but it must change the contents of the MMU registers.
If the user address space doesn't contain any system code, then the MMU must have a second set of mapping registers that apply in system state, or alternately, the MMU must be turned off in system state. The latter is simple, on machines that allow it, but it eliminates the possibility of having parts of the system run in virtual memory. The former (with two sets of mapping registers) is easy -- but it may make it difficult for system code to address arguments to system calls, since they're in the wrong address space. The alternative of allowing system and user code to both address the same memory requires that there be some mechanism that changes the addressability of system pages when system state is entered -- for example, the MMU could declare that access to system pages is legal only when the program is in system state.
A paged memory management unit can do much more than provide each user with a distinct address space. For example, consider the following:
USER 1 USER 2 ____ ____ MMU mapping | o-|----- | o-|----- registers | o-|--- | | o-|--- | | o-|- | | | o-|- | | |____| | | | |____| | | | | | | | | | _______ | | | _______ | | | Pages on | |<- | ->|Shared!|<- | | disk or in |_______| | |_______| | | main memory _______ | _______ | | | |<- | |<- | |_______| |_______| | _______ | | |<- |_______|How do users arrange to share pages (or segments)? What if shared pages are not at the same virtual address? This simple picture does not answer these questions!
In most UNIX systems, read-only code segments are always shared between all processes that are executing the same program. Shared read-write segments are allowed under System V UNIX, using a scheme borrowed from the 1966 vintage Berkeley Timesharing System. This required the following two system calls:
Create shared segment ( pages, name, security ) Insert shared segment ( pages, name, security, address )When a shared segment is created, it is given a name. Unfortunately, this name is in a flat name-space (it is unstructured) and there are no tools provided to prevent users from selecting names that conflict.
Sharable segments have a size, an identifying name, and some security attributes (on the Berkeley system, a password; on UNIX, a short access control list with the same format as that used for files).
When a process needs to address a shared segment, the user specifies the segment, the number of pages desired (which may be less than the size), and the virtual address where the segment should appear in the user's address space.
If two different processes insert the same segment at different virtual addresses, then pointers that refer to part of the segment from within the address space of one process will not refer to the same place from within the virtual address of the other process.
On the Encore Multimax under the UMAX operating system, a variant of UNIX, there was only one system service used for memory sharing:
Share( page ) marks the page as shared, but has no immediate effect.Marking a page as shared had no effect until a process executed a fork system call. At that point, all non-shared pages were copied (at least, that was the user's impression of what happened), while pages marked as shared remained accessable by both the parent and child.
This scheme allowed pages to be shared between processes only if the common ancestor of those processes established the sharing. This was sufficient to allow multiple processes to cooperate in solving some problem on behalf of a single user, but it didn't allow sharing between users.
This scheme guarantees that a shared page will occur in the same place in all address spaces that share it.
On demand paged UNIX systems, the trivial implementation of Fork is expensive -- copying all non-shared pages of a process requires that all pages be pulled into memory and copied, a job that may require many disk accesses in a row. To avoid this, most virtual memory UNIX systems use what is called lazy fork or copy-on-write semantics. After a fork, all pages are physically shared but marked as read-only. When a user tries to write on a such a page, the trap service replicates the page on behalf of the process that tried to change it, making a private read-write copy for that process. Thus, the expense of the fork operation is distributed over some number of page faults, and the only pages that get copied are those that are actually modified.
The MULTICS system had only one operation that supported sharing:
OPEN( file, address ) Maps pages of the indicated file into the process's address space at the indicated address.This relies on the file system to provide the necessary protection of files to determine who can open what.
The MULTICS system was built on the GE 600, a 36 bit machine, by a consortium of MIT and GE. Honeywell bought out GE's computer division and continued selling MULTICS systems through the 1970's. There may still be a few MULTICS systems running, and other manufacturers, notably Prime, have copied most of the good ideas from MULTICS.
The name UNIX is a pun on MULTICS. UNIX was written at Bell Labs after Bell dropped out of the MULTICS consortium.
MULTICS used a paged-segmented model of virtual memory, and each segment could address one file. Mapping a file into the virtual address space of a process was the only way to read or write that file. (Sequential I/O to terminals or mag-tape was done through different mechanisms.)
If two users opened the same file at different virtual addresses (different segment numbers), then major problems would arise with data structures containing pointers in the segment they shared.
Shared memory under Windows NT is based on the Multics model, in the sense that all memory sharing under Windows NT is done by mapping shared files into the virtual memory address spaces of the processes that wish to share memory.
From a theoretical point of view, we can talk about any kind of object and any kind of user. It is worth asking if users are themselves a kind of object, but what matters is that users act on objects using operations, where the type of each object may determine the set of applicable operations.
Here, we are not concerned with the nature of objects, other than the fact that objects may be operated on by users using operations selected from a finite set. Files fit this model, with the operations read and write. Directories fit this model, with the operations make-directory-entry and delete-directory-entry. Memory pages or segments also fit this model, with the operations load and store.
We are also not concerned with the nature of users. From the the point of view of the theory, users may be humans, programs, processes or anything else that may envoke operations on objects.
For example purposes, it is useful to think of files as the objects and humans as the users, but it is equally useful to think of pages or segments as the objects and processes as the users.
The access matrix model is a universal model of the instantaneous state of a protected system. It shows, at one instant, all users, all objects, and the set of operations each user is allowed to apply to each object in the system. Pictorially, access matrices are generally presented as follows:
| Cake | Tea | Objects ------+------+-------| Fred | Eat | Drink | ------+------+-------| Lucy | Bake | Drink | | | Brew | --------------------- Allowed Users OperationsThe above example illustrates the (rather steriotypical) interaction of two users, Fred and Lucy, centered around two objects, Cake and Tea. The operations Eat and Bake apply to cake; Fred can Eat Cake, but not Bake it, and Lucy can Bake Cake, but not Eat it. The operations Drink and Brew apply to Tea. From the point of view of the access matrix, neither the names of users, the names of objects, nor the names of operations have any meaning other than the explicit relations between users, objects and operations detailed in the matrix.
Access matrices may be used to control access to any resource, by any kind of user. Their use is not limited to computers; for example, they have been used in the Russian Gulag to control the access of prisoner/laborers to rooms in a prison/factory.
The access matrix describes allowed actions, but it does not describe an enforcement mechanism. In the case of the Russian prison, the rules of the matrix were enforced by guards with guns.
On a computer, one could store the access matrix inside the operating system, and have the system check the matrix whenever a user tried to access a resource. The problem with this is that the matrix is usually quite sparse -- that is, on most systems, most matrix entries will say "no access".
It is also important to note that the access matrix is a static description of who can do what operations on which resources. The model must be extended to describe how access rights are changed, and these extensions frequently differ markedly depending on how the access matrix is implemented.
There are two practical ways to implement the access matrix, access control lists and capabilty lists. Each of these allows efficient storage of the information in the access matrix.
The access control list implementation of the access matrix given above is:
Access Control Lists -------- -------- Objects | Cake | | Tea | -------- -------- | | -------- -------- Access | Fred | | Fred | Control | Eat | | Drink | Lists |--------| |--------| | Lucy | | Lucy | | Bake | | Drink | -------- | Brew | --------An access control list or ACL is what results when the access matrix is stored colum-wise. An access control list is a column of the matrix is stored with a protected object. Each entry in an access control list specifies a user and the operations that user may perform on that object.
One problem with access control lists is that they may be quite large. It is easy to imagine the access control lists for some widely used files occupying more storage than the data in the file. The usual solution to this is to introduce named sets of users -- for example, OTHERS to specify the rights given to all users not explicitly mentioned in the list.
Under the classic MULTICS system, which was also used by the PRIMOS operating system, each file has an ACL, with access such as the following:
The traverse right is used in the context of hierarchical directory structures. A user who has no right to read or write a directory can mention that directory in the path name of a file as long as the user has traverse rights to that directory -- the user is just passing through, but may not open the directory to list its contents.
The right to edit an access control list illustrates one approach to allowing the contents of the access control matrix to be modified. In a sense, all users with the right to edit the access control list of an object can be thought of as co-owners of the object. If one user (the legitimate owner of an object) grants the right to edit that object's ACL to another user, it is like granting the other user a power of attourny over the object.
Under UNIX, each file has an ACL, but the ACL is strictly limited in structure. Each UNIX ACL has 3 fixed entries:
The simple to implement ACL scheme of UNIX is an example of this. Unlike the MULTICS ACL, it is not general, but it meets the needs of most users of small computer systems. As a result, every UNIX ACL fits exactly in 9 bits of the I-node of the file it describes!
MULTICS was intended to meet the needs of a general information utility, and as such, it had to deal with the problems of large numbers of users, including users from competing organizations. The generality of the MULTICS access control list is well justified by this environment, and the success of PRIMOS in meeting the needs of compaines like Compuserve or The Source (modern examples of computer utilities) amply illustrates the appropriateness of the MULTICS design.
When UNIX systems are extended beyond single departments, the short ACL of UNIX becomes a problem. Distributed versions of UNIX such as the Appolo Domain operating system have been forced to add general ACL's back onto the UNIX file model in order to meet the demands of the large user base that such a system can serve.
As mentioned above, capability are the alternative to access control lists. The capability list implementation of the sample access matrix given above is as follows:
------ -------------------- | Fred |-| Tea | Cake | ------ | Drink | Eat | -------------------- ------ -------------------- | Lucy |-| Tea | Cake | ------ | Drink | Bake | | Brew | | -------------------- Users C-listsA Capability List is the dual (in the topological sense) of an access control list. Instead of storing one column of the access matrix with a protected object, a row of the access matrix is stored with the user.
A capability list entry, known as a capability, consists of the name of an object and the rights one user has to that object.
This does not change the rights a user has to a resource, and it does not change the set of policies that can be implemented, but it does change the economics of certain operations.
With a capability-list based protection mechanism, it is clear that users should not be granted the right arbitrarily edit their own capability lists! Instead, it is common to consider the capability list itself to be an object to which rights can be granted. Thus, Fred in the above example might have the right to edit Lucy's capability list. Typically, users might have the right to move capabilities from any list they can read to any list they can write, but users would not be able to modify a capability, except by removing access rights from it.
The most important application of capability lists in modern computer systems is to addressing. With capability-based addressing, each address is formed from two parts, as follows:
ADDRESS __________________ |_______|__________| | | | Index into an addressed object | Index into a capability list used to select an objectCapability based addressing, as such, was invented for the Chicago Magic Number computer, a paper machine that was described by people at the University of Chicago in the late 1960's. Capability lists were invented by people involved with the MULTICS project, although MULTICS was not itself a purely capability based machine.
Capability based addressing solves some interesting problems. For example, on UNIX, users can frequently name objects that they may not access, and users may find themself in the odd situation where they own a file (and pay for the storage it occupies) yet they can't address it because they have no rights to the directory where that file is listed.
Capability based addressing solves this problem by associating the right to access an object with the ability to name it. User's names of objects are always relative to some capability list, and any global absolute name for an object is hidden inside the capabilty and is never visible to the user.
The most common use of capability based addressing is in paged or segmented virtual memory address translation. Each entry in a page table clearly indicates both the object to which that entry grants access and the access rights granted, and the page table associated with each user is the page table for that user.
VIRTUAL ADDRESS __________________ |_______|__________| | | | Word in page | Index into the process's C-list for pages the objects referenced by capabilities are pagesNote that it is difficult to imagine an efficient implementation of paged virtual address translation using one access control list per page. This would require either an associative memory or a search of some kind with each memory reference. In contract, a simple table lookup suffices to check the access rights in a conventional (capability list based) virtual address translation system.
Another common example of the use of a capability list is in accessing open files. Even when access control lists are used to control the right to open a file, once the file is open, it is common to simply enter the file description into a list of open files attached to a process. Files in this list are referenced not by name, but by their position in the list, and the entries in this list indicate not only what file is involved but how it was opened -- that is, what access rights apply. Clearly, the list of open files is a capability list.
Read( f, buffer, address ) | | | address of a block in the file | index into the processes open file C-list the objects referenced by capabilities are filesThese two examples illustrate a problem with the way capabilities are used in modern systems. Instead of a single capability list per user, where that list describes all the resources the user is allowed to access, we tend to give users a collection of lists, one list for each type of resource. This is awkward.
Some systems have been built that solve this problem, notably the IBM AS/400 (and System 38). Another commercial system in this class is the Plessy System 250. Experimental systems that have solved this problem include the Cambridge CAP system and the Carnegie-Mellon Hydra and C.mmp systems. These will be discussed later.
It is worth noting that, on Multics, when a file is opened, the sectors of that file become visible as a segment in the virtual address space of the process that requested the open operation. Thus, for Multics, there is no distinction between the capability list describing the segments a process may access and the capability list describing the files the process may access.
The domain of a process is the set of all objects that process may manipulate. In a simple capability based system,
Domain = C-listIn the abstract, domain switching involves a change of the current working C-list. Domain switching is very important! It is very common, and systems must control it.
For example, when a trap occurs, the system typically changes from a user domain to the system kernel's domain, and on return from trap, the system switches back to the user's domain.
The entire reason for introducing two states in order to build a simple protected system is to create two domains, a system domain, that includes all objects, and a user domain, that includes only some objects.
One of the most complex domain switching mechanisms was introduced on the Multics system. On this system, each process has a level between 0 and 16. In addition, each segment has a level, and when a process attempts a memory access, the hardware checks the process level against the level of the accessed segment.
A Multics memory access is legal if the process has level higher than (or the same as) the segment, and illegal if the process level is below the segment level.
The Multics hierarchy of domains is a generalization of the two domain model of the primitive systems discussed previously. Multics allowed the operating system itself to be subdivided, so that only the process manager operated at the highest level, level 0. The virtual memory manager was at level 1, accounting was at a more restricted level, things like line-printer spoolers were even more restricted, leaving a variety of levels were available to users.
A user might run undebugged code at level 15, so that bugs in the code wouldn't damage other parts of the user's program.
To further complicate the Multics model, some pages can be marked as gateways. A procedure call to address zero in a gateway page is the only legal memory reference from a process running at a numerically higher level.
This causes the current level of the process to be pushed on the stack along with the return address. The called procedure then runs at the level of the segment containing it, and the level is reset to the original level when the procedure returns.
A MULTICS gateway is the generalization of a trap in a two-level security system. In addition to allowing system calls to run at the system level, it allows users to divide their code into more secure and less secure segments.
For example, consider a system where Joe, one of the users, owns a private database (say Chem Abstracts), and wishes to sell access to that database on a per-use basis. This means that Joe can't allow direct access to the database, he only wants other users to access it through his access code (which produces an appropriate billing record every time someone calls it).
Joe would simply arrange for his access code to run at a low numbered level (also called an inner ring), and he would require that users wishing to run his code do so from a higher numbered level (or outer ring), crossing a gateway as they call his code.
The above kind of consideration was a natural outcome of thinking about a computer utility -- the motivation behind MULTICS.
Note, however, that the strictly hierarchic scheme of MULTICS does not solve some problems. For example, it fails to solve the problems of two mutually suspicious users. Either one or the other must be given greater access rights if code from both users must share an address space.