22C:116, Lecture 40, Fall 1999

My last formal lecture of the millenium
by the odometer rollover definition of millenium

Douglas W. Jones
University of Iowa Department of Computer Science

  1. Distributed Virtual Memory

    One of the interesting ideas that was fully developed on Mach involves the support of shared segments on a distributed system. The key to this development is that the Mach kernel does not handle page faults, but rather, the kernel simply hands faults to the appropriate exception handler, outside the kernel. This allowed a number of new approaches to page fault handling to be explored, including the construction of fault handlers that allow segments to be shared across a network.

    The basic idea is most easily understood if we assume that there are only two states for each page of the virtual address space: The page is either associated with a page frame, allowing read-write access, or the page is not associated with a page frame, so any attempt to access the page will produce an exception.

    Given this minimal virtual addressing model, shared segments can be implemented on a network as follows:

    Note that, from the point of view of a user of the shared segment, the performance of this scheme will be similar to using demand paged virtual memory for that segment, with pages copied to and from disk. The difference is that the pages of the segment are not stored on backing storage, they are stored in other copies of the segment on other machines; as a result, when a page is not in the local physical segment, it may be changed by some other process running on some other machine.

  2. Improved Performance

    The simple scheme outlined above allows only one copy of each page of a shared segment. In real shared memory applications, most of the accesses to shared variables are to read the variable, and only occasionally does a process change the value of a variable. This simple implementation of shared segments supports this common access pattern very poorly!

    The system would perform better if multiple copies of a page could be created when multiple processes are reading that page. This can be done! The solution involves use of read-only protection for duplicated pages, as follows:

    This model allows any number of copies of a page to be shared, so long as no process attempts to write that page. As soon as a process writes a page, all other copies of that page will be invalidated and all future attempts to read that page will result in new copies of the updated page being sent over the net to the machines where read operations were being done.

  3. Write Conflicts

    There is one problem with the protocol outlined above: What if two machines attempt simultaneous writes to a shared page. As written above, the result will be that the fault handlers on each machine send "invalidate" messages to the others, and all copies of the page will be invalidated!

    We can fix this by adding a conflict resolution rule: If fault handler A receives an "invalidate page x" message from B after it has sent "invalidate page x" messages and while it is awaiting replies saying "invalidated", it compares the unique ID of A with that of B. If A has the winning ID, A does not invalidate page x, and replies saying "not invalidated". If A has the winning ID, A invalidates page x and replies normally. This guarantees that, in case of a conflict between two machines, exactly one machine will retain a read-write copy of the page.

  4. Connections to other work

    The idea outlined above was originally invented as a hardware algorithm for cache coherency in shared memory machines that use snooping cache technology. Such machines are in widespread use today! In these machines, a local cache is implemented, in hardware, for each CPU, and the only memory accesses that use the shared main memory are those that result from cache misses. The problem of maintaining a coherent view of shared memory through all of these caches is called the cache coherency problem, and the solution outlined above is one example of a cache coherency protocol.