22C:116, Lecture 38, Spring 1999

Douglas W. Jones
University of Iowa Department of Computer Science

  1. Amoeba Memory

    The design of Tanenbaum's Amoeba system is based on the assumption that memory will be sufficiently inexpensive that paged access to memory and sector-by-sector access to files are unreasonable. Thus, the kernel's model of memory is in terms of contiguous memory segments, and the Bullet file server stores files that are usually read or written by single indivisible operations. Parts of files may be read and parts of non-committed files may be written, but no sector structure is imposed and the assumption is that such partial reads and writes will be rare.

    As a result, if virtual memory ideas are to be applied to Amoeba, each segment would be stored as a single file, and the only demand transfers performed would involve reading or writing entire segments!
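
    As a rough illustration (my own sketch, not Amoeba's actual kernel or Bullet server interface), a demand transfer under this model amounts to reading a segment's entire backing file into one contiguous region in a single operation, rather than faulting it in page by page; the function name load_segment below is invented for the example.

        /* Hypothetical sketch: demand-load a whole segment by reading its
         * entire backing file at once.  Partial-read handling is omitted
         * for brevity. */
        #include <fcntl.h>
        #include <stdlib.h>
        #include <sys/stat.h>
        #include <unistd.h>

        void *load_segment(const char *path, size_t *size)
        {
            int fd = open(path, O_RDONLY);
            if (fd < 0)
                return NULL;

            struct stat st;
            void *seg = NULL;
            if (fstat(fd, &st) == 0 && (seg = malloc(st.st_size)) != NULL) {
                if (read(fd, seg, st.st_size) != st.st_size) {
                    free(seg);              /* the one-shot read failed */
                    seg = NULL;
                } else {
                    *size = (size_t) st.st_size;
                }
            }
            close(fd);
            return seg;
        }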

    Given that there is an optimal sector size (or page size), determined by the data transfer rate, the access time, and the CPU speed, and that increasing the CPU speed relative to the access time increases that optimal size, it is clear that the optimal sector size has trended upward over time, as the figure below suggests; a rough numerical sketch follows the figure.

            |                          32K
        ?   |
    Optimal |                   4K
     Sector |   . . . . . . . . . . . . . . . typical file?
       Size |          512                    (based on DEMOS
        ?   |   128                            research)
            |
          --+--------------------------------
            |  1960    1970    1980    1990
             
                  Time
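
    One way to make the trend concrete is a back-of-the-envelope model (my own, with invented parameter values, not from the lecture): charge each sector a fixed access time plus per-byte transfer and CPU handling costs, and assume half a sector of each transfer is wasted on average. Minimizing the resulting total time gives an optimal sector size that grows as transfer rates and CPU speeds improve relative to the access time.

        /* Back-of-the-envelope model; all parameter values are invented.
         * Reading a file of F bytes with sector size s costs roughly
         *
         *     T(s) = (F/s + 1/2) * t_access + (F + s/2) * (1/rate + t_cpu)
         *
         * (one access per sector, per-byte transfer and CPU work, and on
         * average half a sector of wasted transfer).  Minimizing T gives
         *
         *     s_opt = sqrt(2 * F * t_access / (1/rate + t_cpu))
         *
         * so faster transfer and faster CPUs, relative to the access
         * time, push the optimum upward. */
        #include <math.h>
        #include <stdio.h>

        int main(void)
        {
            const double F        = 16 * 1024.0; /* typical file, bytes   */
            const double t_access = 20e-3;       /* seek + rotation, s    */
            const double rate     = 250e3;       /* transfer, bytes/s     */
            const double t_cpu    = 2e-6;        /* CPU handling, s/byte  */

            double per_byte = 1.0 / rate + t_cpu;
            printf("closed-form optimum: about %.0f bytes\n",
                   sqrt(2.0 * F * t_access / per_byte));

            for (double s = 128; s <= 64 * 1024; s *= 2) {
                double t = (F / s + 0.5) * t_access
                         + (F + s / 2.0) * per_byte;
                printf("%7.0f-byte sectors: %.3f s per file\n", s, t);
            }
            return 0;
        }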
    
    Tanenbaum's design of Amoeba is based on the assumption that the optimal sector size is now, or soon will be, equal to the typical file or segment size, and thus that designing a file system or memory architecture to operate in terms of sectors or pages is probably inappropriate.

    My counter argument to this is that the file size we use is strongly tied to the speed of the available system. In the days when systems were small and slow, the common advice was to use a separate file for each procedure or function in a program; this allowed separate compilation (minimizing the compile time when making small incremental changes to a program), and it reduced the time taken to enter and exit the cumbersome editors of the day. Now, with relatively fast editors and compilers, there is no such reason for maintaining each routine in a separate source file. (Note, however, that poor programming language design may still encourage it, and that there are other good reasons to modularize programs.) Furthermore, as systems have gotten faster, we have used them to store larger and larger data objects, for example, images, animations, simulation models, and databases.

    In 1970, an author of a novel using the crude text editors of the era would have maintained each chapter as a separate text file. Today, there are many editors that have no difficulty working on the entire text of a novel as a single source file approaching 1 Megabyte in size. In 1970, very few computers manipulated high resolution images. Today, image files frequently involve 1 million pixels, and use of computer systems to store and edit video image streams is growing in popularity, despite the fact that the files involved can easily grow to sizes of well over a gigabyte.

    The result of these changes is that the files we are interested in manipulating seem to have grown at a rate comparable to or even exceeding the rate of growth of memory, and as a result, the arguments for doing file I/O in sectors or other units smaller than an entire file seem to remain as valid today as they were in 1960.

  2. Amoeba's Directory Servers

    The directory servers provided with Amoeba map textual names to capabilities. Given a capability for a directory and a textual name to look up, the server returns one or more capabilities associated with that name. Typically, the entries in a directory will either be file capabilities or capabilities for other directories, but this is not necessarily the case. The directory server can, in principle, map names to any type of capability, including capabilities for new application-dependent object classes managed by user-created servers.

    The Amoeba directory server has the interesting property that it maps names not to single capabilities but to sets of capabilities. This is done in order to support replicated resources. Thus, if some file is popular, copies of it could be included on a number of servers, and the directory entry could list all the copies. Crude users would typically pick one of the copies at random; sophisticated users could select the copy on the nearest server (measuring distance in terms of communication delay).
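
    A hypothetical sketch of this arrangement (the structure layouts and function names below are invented for illustration, not the real Amoeba interfaces): a directory entry carries a set of capabilities, one per replica, and a client may pick among them at random or by smallest measured delay.

        /* Invented illustration of a directory entry holding a set of
         * capabilities, one per replica of the named object. */
        #include <stdlib.h>

        typedef struct {
            unsigned char port[6];     /* identifies the managing server  */
            unsigned int  object;      /* object number within the server */
            unsigned int  rights;      /* rights bits                     */
            unsigned char check[6];    /* check field protecting rights   */
        } capability;                  /* layout is only illustrative     */

        typedef struct {
            char       name[64];       /* textual name in the directory   */
            int        ncaps;          /* number of replicas listed       */
            capability caps[8];        /* one capability per replica      */
        } dir_entry;

        /* Stand-in for a measured round-trip time to a replica's server. */
        static double delay(const capability *c)
        {
            (void) c;
            return (double) (rand() % 100);
        }

        /* Crude client: pick any replica at random. */
        capability pick_random(const dir_entry *e)
        {
            return e->caps[rand() % e->ncaps];
        }

        /* Sophisticated client: pick the replica with the smallest
         * measured communication delay. */
        capability pick_nearest(const dir_entry *e)
        {
            int best = 0;
            for (int i = 1; i < e->ncaps; i++)
                if (delay(&e->caps[i]) < delay(&e->caps[best]))
                    best = i;
            return e->caps[best];
        }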

    The Amoeba directory server includes Unix style access-control lists because it is intended to support a Unix model of files, but much of the protection in the Amoeba scheme comes from the fact that each user is given a home directory that lists the resources that that user may access. Thus, two different users might have different "/etc" directories and different "/bin" directories, while sharing the same "/public" directory.

  3. Storage Reclamation

    Amoeba's directory structure doesn't allow the simple approach to storage reclamation used in UNIX. Recall that, in UNIX, directories contain pointers to files or directories, but these are not allowed to form an arbitrary graph. Instead, UNIX directories are constrained to be arranged into tree structures, with a stereotyped system of self-pointers and back-pointers. There may, however, be multiple pointers to files, which are otherwise thought of as leaves in the tree.

    Reference counts are maintained for each UNIX file and directory: creating a link or opening the file increments the count, and deleting the link or closing the file decrements it. The space occupied by the file is reclaimed when the count reaches zero. This scheme cannot reclaim space occupied by circularly connected groups of directories, so the creation of such groups is strictly controlled. The mkdir system call creates the only legal form of circular linkage, and the rmdir system call deletes this cycle correctly.
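
    A minimal sketch of the counting scheme as described above (my own illustration, not actual UNIX kernel code, which keeps link counts and open counts separately):

        #include <stdlib.h>

        typedef struct inode {
            int refs;          /* links plus open file descriptions */
            /* ... block pointers, sizes, etc. ... */
        } inode;

        /* Called on link() or open(). */
        void retain(inode *ip)
        {
            ip->refs++;
        }

        /* Called on unlink() or close(); reclaim at zero. */
        void release(inode *ip)
        {
            if (--ip->refs == 0)
                free(ip);      /* no links and no open descriptors remain */
        }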

    This scheme cannot be used in Amoeba because there is no constraint on what links are placed in directories, and because the resource servers, usually file servers, are not informed when capabilities are duplicated or stored. Thus, Amoeba servers are unable to manage reference counts for the objects they manage, and therefore, they are unable to determine from such counts when the objects should be deleted.

    Instead, Amoeba servers are expected to remember the time at which each object was last referenced and, when the last reference is sufficiently far in the past, to reclaim the object. Any user wishing to guarantee the preservation of an object must occasionally use that object, and, most notably, the directory servers are expected to regularly (but slowly) sweep through each directory they maintain, touching the objects listed in that directory. When a directory goes untouched for a sufficient time, the directory server will delete it, and the result will be that the objects it references will no longer be touched and will themselves eventually be deleted, unless some other process still has a capability for those objects and touches them.
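
    A hedged sketch of this aging policy (the names and the 30-day threshold below are invented, not taken from the real servers): every use of an object refreshes its last-touched time, the directory sweep counts as a use, and a periodic pass reclaims anything untouched for too long.

        #include <stdlib.h>
        #include <time.h>

        #define MAX_AGE (30L * 24 * 60 * 60)   /* assumed policy: 30 days */

        typedef struct object {
            time_t         last_touched;
            struct object *next;       /* all objects held by this server */
        } object;

        static object *all_objects;

        /* Called on every use of an object, including the slow sweep the
         * directory server makes over the objects its directories list. */
        void touch(object *o)
        {
            o->last_touched = time(NULL);
        }

        /* Run occasionally by the server: reclaim anything that has gone
         * untouched for longer than MAX_AGE. */
        void age_sweep(void)
        {
            time_t now = time(NULL);
            object **p = &all_objects;
            while (*p != NULL) {
                if (now - (*p)->last_touched > MAX_AGE) {
                    object *dead = *p;
                    *p = dead->next;   /* unlink from the list and reclaim */
                    free(dead);
                } else {
                    p = &(*p)->next;
                }
            }
        }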

  4. Exceptions

    Exception handling is an important area of programming language and operating system design that is frequently handled very poorly! The UNIX and C signal mechanism (see the man page for sigvec and related pages) is basically a warmed-over version of the exception handling features of the PL/I programming language, the language originally designed by the IBM SHARE user group in the mid-1960s and first used for operating system implementation in the MULTICS system.
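
    A small self-contained example of the UNIX/C mechanism in its modern POSIX form (sigaction is the standardized successor to the older sigvec interface): the handler is an ordinary procedure that the system calls when the signal is delivered, after which execution resumes where it left off.

        #include <signal.h>
        #include <stdio.h>
        #include <unistd.h>

        static volatile sig_atomic_t interrupted = 0;

        /* The handler is just a procedure called on signal delivery; it
         * records the event and returns, staying async-signal-safe. */
        static void on_sigint(int sig)
        {
            (void) sig;
            interrupted = 1;
        }

        int main(void)
        {
            struct sigaction sa;
            sa.sa_handler = on_sigint;
            sa.sa_flags = 0;
            sigemptyset(&sa.sa_mask);
            sigaction(SIGINT, &sa, NULL);

            while (!interrupted)
                pause();               /* wait for a signal to arrive */

            printf("caught SIGINT; control resumed after the handler\n");
            return 0;
        }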

    A better exception handling model is found in the Ada programming language. Here, exception handlers are not procedures called when an exception occurs; rather, a handler is effectively entered by a longjmp or non-local goto, causing the stack to unwind to the environment of the procedure that installed the handler.
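
    The contrast can be suggested in plain C with setjmp and longjmp (my own analogy, not Ada code): control is transferred back to the environment of the routine that installed the handler, abandoning the intervening stack frames, rather than calling a handler procedure at the point of the exception.

        #include <setjmp.h>
        #include <stdio.h>

        static jmp_buf handler_env;   /* environment of the installer */

        static void deeply_nested_work(void)
        {
            /* something goes wrong several calls down the stack */
            longjmp(handler_env, 1);  /* unwind back to the installer */
        }

        static void do_work(void)
        {
            deeply_nested_work();
            printf("never reached\n");
        }

        int main(void)
        {
            if (setjmp(handler_env) == 0) {
                do_work();            /* the normal path */
            } else {
                printf("handler entered in the installer's environment\n");
            }
            return 0;
        }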

    It is important to ask about the purpose of user-installed exception handlers. There are two common uses for these:

    Sadly, exception handling in operating systems today is essentially frozen at the UNIX or PL/I model, and exploration of new directions in this area seems to be stifled by market forces that demand standardization of the operating system interface.