22C:116, Lecture 39, Fall 2000

Douglas W. Jones
University of Iowa Department of Computer Science

Mach
Mach was designed at Carnegie Mellon University with DARPA funding, with the specific goal of producing a distributed foundation on which secure UNIX systems could be implemented. The goal of providing absolute compatibility with UNIX forced some compromises in the design of this system, but nonetheless, it incorporates many useful ideas.
Mach is based on a microkernel, a term that is perhaps a misnomer. As originally proposed, kernels were supposed to be small, in the way that Mach is. The developers of UNIX, however, took over this term and applied it to a fairly large lump -- the UNIX kernel -- forcing more recent developers to emphasize with the prefix micro that their kernels were small, as all kernels were supposed to be when the term was coined.
The Mach kernel provides process, thread, memory object, communications port and message management. As with Amoeba and other modern kernel-based systems, it does not include file systems or other high-level abstractions.
Mach has been available for almost 20 years, and it has been installed on many machines. The NeXT system used Mach, and MacOS X is also Mach based. (Notably, Steve Jobs founded NeXT when he left Apple, and MacOS X was developed at Apple after Jobs returned.) Mach, however, was already fairly stable by the time the first versions of MacOS were introduced, and it was very stable long before Microsoft Windows came to market; nobody should treat Mach as an untried experimental system of some kind.
Mach Processes
Each Mach process is made up of a virtual address space and a collection of threads. Processes are passive! The kernel maintains, on behalf of each process, four communications ports. Each process has an exception port; whenever a thread in the process raises an exception, for example, through division by zero, an illegal memory reference, or a similar problem, the kernel sends a message to the exception port. This message identifies the cause of the exception in detail.
Messages sent to a process's process port are interpreted by the kernel. All kernel operations on processes that may be carried out by other processes are implemented by messages to the process port instead of by kernel calls. As such, no kernel calls take process IDs as parameters; instead, operations that might take that form are formulated as messages to the process port of the process being manipulated. Process termination, suspension, resumption, priority changes and so on are all handled this way.
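The idea of replacing kernel calls on process IDs with messages to a process port can be sketched as a toy model. This is not Mach's actual interface: the Process class, the send and kernel_service_process_port functions, and the message layout are all hypothetical names invented for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Process:
    """Toy stand-in for a Mach process; all field names are hypothetical."""
    name: str
    state: str = "running"
    priority: int = 0
    process_port: deque = field(default_factory=deque)  # pending messages


def send(port, message):
    """Enqueue a message on a port (modeled here as a simple queue)."""
    port.append(message)


def kernel_service_process_port(proc):
    """Model the kernel interpreting each message queued on a process port."""
    while proc.process_port:
        msg = proc.process_port.popleft()
        op = msg["op"]
        if op == "suspend":
            proc.state = "suspended"
        elif op == "resume":
            proc.state = "running"
        elif op == "set_priority":
            proc.priority = msg["value"]
        elif op == "terminate":
            proc.state = "terminated"
        else:
            raise ValueError(f"unknown operation: {op}")


child = Process("child")
# The caller never names a process ID; it only holds the child's process port.
send(child.process_port, {"op": "set_priority", "value": 5})
send(child.process_port, {"op": "suspend"})
kernel_service_process_port(child)
print(child.state, child.priority)
```

The point of the design is that the right to manipulate a process is exactly the right to send to its process port, so access control reduces to control over who holds that port.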
The bootstrap port is used to start a process. The initial thread of the process typically begins by reading from the bootstrap port; when a process is started, the first thing its creator does is send a message to that process's bootstrap port containing the information that process needs in order to run. Typically, this message will include the ports that make up its standard environment and the parameters with which it was created.
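A minimal sketch of this start-up handshake follows, using Python threads and queues as stand-ins for Mach threads and ports. The message layout (a dictionary with "ports" and "args" entries) and every name in the code are assumptions made for the example, not part of Mach.

```python
import queue
import threading


def initial_thread(bootstrap_port, results):
    """Model the initial thread of a newly created process."""
    # The first thing the new process does is block on its bootstrap port.
    boot = bootstrap_port.get()
    results["args"] = boot["args"]
    # The standard environment arrives as ports inside the bootstrap message.
    boot["ports"]["stdout"].put("started with " + " ".join(boot["args"]))


bootstrap_port = queue.Queue()   # stands in for the child's bootstrap port
stdout_port = queue.Queue()      # one port of the child's standard environment
results = {}

child = threading.Thread(target=initial_thread, args=(bootstrap_port, results))
child.start()

# The creator's first action: send the environment and parameters.
bootstrap_port.put({"ports": {"stdout": stdout_port}, "args": ["-v", "input.txt"]})

child.join()
line = stdout_port.get()
print(line)
```

Because the child blocks until the bootstrap message arrives, the creator is free to start the child first and decide on its environment afterwards.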
Finally, note that each process is initially created with one thread, and this thread has a thread port. Messages sent to the thread port of a thread are interpreted by the kernel as operations on that thread. Because capabilities for all threads of a process will typically be found in the C-list of that process, it is possible for threads to operate on each other's states as well as on their own.
Memory
Mach has a paged model of memory, with the virtual address space of a process divided up into regions, where regions are separated by undefined areas of the address space, and regions are divided up into pages. Regions may correspond to segments on those machines where the address space is paged and segmented, but the term region is used in order to decouple the user-level notions of addressing from the implementation in terms of any particular memory management unit. The Mach kernel does not handle page faults!
Instead, when a thread attempts to reference a memory location that is not currently addressable, a message is sent by the kernel to the exception port of the process in which that thread is running. The message contains the information necessary for the fault handler -- a thread of some process -- to handle the fault and then restart the thread that caused the fault.
Note that this implies that all threads of a process will have the same fault handler! This makes sense because they share the same address space!
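The fault path described above can be sketched as a single-threaded toy model. The page size, the message fields, and the names Process, fault_handler and read_byte are assumptions made for the example; in a real system the handler would be a separate thread, possibly in another process.

```python
from collections import deque

PAGE_SIZE = 4096  # assumed page size for this sketch


class Process:
    """Toy process: one address space and one exception port."""
    def __init__(self):
        self.pages = {}                 # page number -> bytearray (mapped pages)
        self.exception_port = deque()   # shared by every thread of the process


def fault_handler(proc):
    """User-level handler: take one fault message and make the page addressable."""
    msg = proc.exception_port.popleft()
    page = msg["address"] // PAGE_SIZE
    proc.pages[page] = bytearray(PAGE_SIZE)   # supply a zero-filled page
    return msg["thread"]                      # identifies the thread to restart


def read_byte(proc, thread_id, address):
    """Model a memory reference made by one thread of the process."""
    page, offset = divmod(address, PAGE_SIZE)
    if page not in proc.pages:
        # The kernel does not resolve the fault itself; it only reports it.
        proc.exception_port.append(
            {"kind": "page_fault", "thread": thread_id, "address": address})
        restarted = fault_handler(proc)
        assert restarted == thread_id
    return proc.pages[page][offset]


proc = Process()
value = read_byte(proc, thread_id=1, address=0x2010)   # faults, then succeeds
other = read_byte(proc, thread_id=2, address=0x2020)   # same page, no fault
print(value, other, sorted(proc.pages))
```

The second reference comes from a different thread but finds the page already mapped, which is the shared address space at work: one handler serves every thread of the process.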
The kernel services for memory management are quite primitive: Physical page frames may be allocated and associated with a page of a region, and they may be deallocated. Regions are first class objects and may be shared.
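As a rough illustration of these primitives, the sketch below models a region as an object that two address spaces can both map, with page frames attached to and detached from individual pages. The class names, method names and the 16-byte frame size are hypothetical and chosen only to keep the example small.

```python
class Region:
    """A first-class region: a range of pages, some backed by page frames."""
    def __init__(self, n_pages):
        self.n_pages = n_pages
        self.frames = {}   # page index -> frame contents

    def allocate_frame(self, page):
        """Associate a fresh frame with one page of the region."""
        if not 0 <= page < self.n_pages:
            raise IndexError("page outside region")
        self.frames[page] = bytearray(16)   # tiny frame, for illustration only

    def deallocate_frame(self, page):
        """Release the frame backing a page, if there is one."""
        self.frames.pop(page, None)


class AddressSpace:
    """Maps base addresses to regions; gaps between regions are undefined."""
    def __init__(self):
        self.regions = {}   # base address -> Region

    def map_region(self, base, region):
        self.regions[base] = region


shared = Region(n_pages=4)
a, b = AddressSpace(), AddressSpace()
a.map_region(0x10000, shared)
b.map_region(0x80000, shared)   # same region, different base address

shared.allocate_frame(0)
a.regions[0x10000].frames[0][0] = 42          # written through address space a
seen_by_b = b.regions[0x80000].frames[0][0]   # read through address space b
print(seen_by_b)
```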
Threads
Mach threads are lightweight, but implemented by the kernel. Thread creation and management are all done by kernel calls, either directly or by sending messages to the process port of the process. Mach threads are fairly conventional kernel threads, so little more needs to be said about them.
Ports
Mach ports are not, as in DEMOS or even Amoeba, attributes of a process. Instead, ports are first class objects. Thus, both the sender and receiver of a message must name the port through which it is sent. Ports may have multiple recipients, for example, processes that cooperatively share the load of processing messages to a particular port.
Mach processes have capability lists, where the capability lists indicate what ports the process may reference!
All threads within one process, and all processes that share one memory region, must run on the same shared-memory machine, but ports may be shared by processes running on many different machines in the network environment. Port creation and destruction are primitive kernel operations, and capabilities for ports may be included in messages, along with data.
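A toy model may help tie together first-class ports, capability lists, and capabilities carried in messages. It is a sketch under simplifying assumptions: the Port and Process classes are hypothetical, a capability is just a Python reference, and no distinction is drawn between send and receive rights.

```python
from collections import deque


class Port:
    """A first-class port: a message queue owned by no particular process."""
    def __init__(self):
        self.messages = deque()


class Process:
    """Toy process whose only way to name a port is an index into its C-list."""
    def __init__(self, name, c_list):
        self.name = name
        self.c_list = list(c_list)   # capabilities: ports this process may use

    def send(self, cap_index, data, caps=()):
        port = self.c_list[cap_index]
        port.messages.append({"data": data, "caps": list(caps)})

    def receive(self, cap_index):
        msg = self.c_list[cap_index].messages.popleft()
        self.c_list.extend(msg["caps"])   # capabilities in the message join the C-list
        return msg["data"]


requests, reply = Port(), Port()
client = Process("client", [requests, reply])
worker_a = Process("worker_a", [requests])   # two recipients share one port
worker_b = Process("worker_b", [requests])

# Each request carries a capability for the reply port along with its data.
client.send(0, "job 1", caps=[reply])
client.send(0, "job 2", caps=[reply])

for worker in (worker_a, worker_b):
    job = worker.receive(0)                  # the reply capability lands at index 1
    worker.send(1, f"{worker.name} finished {job}")

answers = [client.receive(1), client.receive(1)]
print(answers)
```

Neither worker could name the reply port until a message delivered a capability for it, which is the sense in which the capability list determines what ports a process may reference.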