22C:18, Lecture 34, Fall 1996

Douglas W. Jones
University of Iowa Department of Computer Science

Memory Management and Virtual Memory

On small computers, the memory addresses issued by the CPU are the same addresses actually delivered to the memory hardware, but one characteristic of all large-scale machines, whether they are mainframes or high-performance microcomputers, is that they incorporate virtual memory mechanisms. Virtual memory mechanisms typically rest on a memory management unit (MMU) that translates addresses issued by the CPU, known as virtual addresses, into addresses useful in real memory, or physical addresses. The following figure describes this:

            A very simple machine

   _____           Address           ________
  |     |-------------------------->|        |
  | CPU |                           | Memory |
  |_____|<------------------------->|________|
                    Data

           A typical large machine

          Virtual   _____   Physical
   _____  Address  |     |  Address  ________
  |     |--------->| MMU |--------->|        |
  | CPU |          |_____|          | Memory |
  |_____|<------------------------->|________|
                    Data

The first machine to incorporate something analogous to a modern memory management unit was the Ferranti Atlas Computer, introduced around 1960; the unsuccessful IBM System 360/67, introduced around 1968, and the successful IBM 370 mainframe family both incorporate such mechanisms. Minicomputers introduced at around the same time, between 1970 and 1980, incorporated such mechanisms, and these finally appeared on the Intel 80386 and Motorola 68020 in the late 1980's.

The combination of a memory management unit and operating system software usually hides most of the memory management unit from programmers. Typically, the memory management unit serves the following functions:

Paged Address Translation

Most memory management units translate virtual addresses into physical addresses in a fairly standard way. A virtual address is divided into two fields, a virtual page number and a word-in-page (or byte-in-page) field. Physical addresses are divided similarly into a physical page number and a word-in-page field. Typically, translation only applies to the page numbers, while the word-in-page field is passed through unmodified:

	Virtual Address
      _____________________
     |___________|_________|
          page     word-in
         number     page
           |          |_____________________
           |    _____________               |
           |___|             |              |
               |   Address   |              |
               | Translation |___           |
               |_____________|   |          |
                                page     word-in
                               number     page
                            _____________________
                           |___________|_________|
	                      Physical Address

Historically, virtual addresses were frequently thought of as being larger than physical addresses. This is because the original motivation for the incorporation of virtual address translation into hardware was to allow the operating system to provide the illusion of a large memory on a machine that only had a small physical memory. In the 1970's and 1980's, a number of minicomputers and microcomputers were sold with physical addresses larger than their virtual addresses; on these machines, the virtual memory address translation hardware was used primarily to aid the operating system in managing and protecting the memory resources. Today, many machines are sold with virtual and physical addresses that are formatted identically; typically, both are 32 bits.

Programmers are usually not concerned with the structure of addresses -- that is, programmers usually think of an address as indivisible without some bits being assigned to word-in-page and others to page-number. Page boundaries show up occasionally, for example, in the layout of memory, where regions marked as read-only or read-write must be aligned on page boundaries, but this is rarely of great concern to the programmer because we leave it to system software such as the linker to determine what goes where.

Most memory management units use very similar approaches to address translation: In effect, the heart of the MMU is a table of virtual page numbers and their corresponding physical page numbers, and the fundamental job of the MMU, on each and every memory cycle, is to search this table for the virtual page number and output the corresponding physical page number.

If the search fails, the MMU requests a trap -- typically a bus trap or an address translation trap. It is entirely up to the system software to handle such traps. If the search is successful, the CPU can access memory as if nothing unusual happened during the memory cycle. Typically, the table entries also include bits to indicate, for each page, whether the data in that page is read-only or read-write, and whether or not it is legal to execute instructions from that page. The MMU responds to violations of these restrictions by requesting a trap.
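
The lookup just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any particular MMU: the 12-bit word-in-page field, the table contents, and the names `translate` and `TranslationTrap` are all invented for the example, and a dictionary stands in for the hardware search.

```python
PAGE_BITS = 12  # assumed width of the word-in-page (byte-in-page) field


class TranslationTrap(Exception):
    """Raised at the points where the MMU would request a trap."""


# Hypothetical table: virtual page number -> (physical page number, allowed accesses)
table = {
    0x00000: (0x00042, {"read", "execute"}),  # read-only code page
    0x00001: (0x00017, {"read", "write"}),    # read-write data page
}


def translate(virtual_address, access):
    """Translate a virtual address for the given kind of access."""
    page = virtual_address >> PAGE_BITS
    offset = virtual_address & ((1 << PAGE_BITS) - 1)
    if page not in table:
        # The search failed: system software must handle this trap.
        raise TranslationTrap("no translation for page %#x" % page)
    frame, allowed = table[page]
    if access not in allowed:
        # The page is known, but this kind of access is not permitted.
        raise TranslationTrap("%s not permitted on page %#x" % (access, page))
    # Only the page number changes; the offset passes through unmodified.
    return (frame << PAGE_BITS) | offset
```

With this table, `translate(0x1234, "read")` yields `0x17234`, while a write to address `0x0010` or any access to an unlisted page raises `TranslationTrap`.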

As described here, address translation sounds slow, since it involves a search of all entries in an address translation table. In fact, all successful MMU designs are very fast, involving only a small fraction of the memory cycle time for address translation. The reason for this is that the translation process is done using special hardware that typically uses indexing or hashing to narrow the range of the search and then uses fast parallel hardware to compare all likely candidates in parallel.

Memory Management on the Hawk

The Hawk memory management unit is fairly typical of modern memory managers. We divide both virtual and physical addresses as follows:

 _______________________________________ ___________________ ___
|_______________________________________|___________________|___|
|31                                   12|11                2|1 0|
                  page                         word in      byte
                 number                          page      in word

The Hawk MMU only deals with page numbers, translating virtual page numbers (as issued by user programs) to physical page numbers. In effect, the memory manager contains an array of up to 1 million 32-bit registers, indexed by the virtual page number, where the registers contain the following information:

 _______________________________________ _______________________
|_______________________________________|_______________|_|_|_|_|
|31                                   12|11            4|3     0|
              physical page                              R W X V
                 number

If the V bit is zero, the contents of the register are invalid. If the register is valid, the R, W and X bits indicate, respectively, whether it is legal to read data from this page, write data to this page or execute instructions from this page.

Because the Hawk virtual address structure allows a million distinct virtual page numbers, a table containing the address translation information for the entire address space contains one million entries. Such a table is called the page-table for the address space, or sometimes the memory map for the address space.
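
The field layouts above translate directly into shifts and masks. The following Python sketch is offered only as an illustration: the function names are invented, and the bit positions of R, W, X and V (bits 3 down to 0, in that order) are read off the figure and should be treated as an assumption.

```python
def split_address(address):
    """Split a 32-bit Hawk address into (page number, word in page, byte in word)."""
    page = (address >> 12) & 0xFFFFF  # bits 31..12
    word = (address >> 2) & 0x3FF     # bits 11..2
    byte = address & 0x3              # bits 1..0
    return page, word, byte


def decode_entry(entry):
    """Decode a 32-bit translation entry into (physical page, R, W, X, V)."""
    frame = (entry >> 12) & 0xFFFFF   # bits 31..12
    r = bool(entry & 0x8)             # assumed bit 3: read permitted
    w = bool(entry & 0x4)             # assumed bit 2: write permitted
    x = bool(entry & 0x2)             # assumed bit 1: execute permitted
    v = bool(entry & 0x1)             # assumed bit 0: entry is valid
    return frame, r, w, x, v


# A 20-bit page number gives 2**20 entries; at 4 bytes each,
# a full page table occupies 4 megabytes.
PAGE_TABLE_ENTRIES = 1 << 20
PAGE_TABLE_BYTES = PAGE_TABLE_ENTRIES * 4
```

For example, `split_address(0x12345678)` gives page `0x12345`, word `0x19E` and byte `0`. The "one million entries" is exactly 1,048,576.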

Memory management units that can hold the full page-table are very rare. There are a number of reasons for this: The cost of the unit itself, the fact that few users bother attaching 4 gigabytes of memory to their machines, and the fact that it would take too long to update the state of the memory management unit if a million entries had to be changed with each update. As a result, the page table for a user's address space is usually stored in memory, and when a user program attempts to address a page that is not known to the MMU, a trap occurs and it is up to the system software to load the required entry into the MMU.

The subset of the page table stored inside the MMU is sometimes called the TLB or translation lookaside buffer. The MMU looks in this buffer for the information it needs to translate an address, and if it does not find that information, it requests a trap. Inserting a new page into the TLB registers may overwrite a previously used register, and because of constraints on register usage imposed by the fast search algorithms used, this may occur even if only a few TLB registers are in use.
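
To see how an insertion can displace an entry while most registers are still empty, consider a toy direct-mapped buffer in which the low bits of the virtual page number select the register. This is a sketch under assumptions, not the organization of the Hawk TLB: the size of 8 registers and the names `tlb_insert` and `tlb_lookup` are invented for illustration.

```python
TLB_SIZE = 8  # assumed number of TLB registers; a power of two

# Each slot holds (virtual page number, entry), or None when empty.
tlb = [None] * TLB_SIZE


def tlb_insert(page, entry):
    """Store a translation in the slot selected by the low bits of the page number.

    Returns whatever was previously in that slot, or None if it was empty.
    """
    slot = page % TLB_SIZE
    evicted = tlb[slot]
    tlb[slot] = (page, entry)
    return evicted


def tlb_lookup(page):
    """Return the entry for this page, or None where the MMU would request a trap."""
    held = tlb[page % TLB_SIZE]
    if held is not None and held[0] == page:
        return held[1]
    return None
```

Here, inserting page 3 and then page 11 displaces page 3, because both select slot 3, even though seven of the eight registers have never been used.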

The MMU guarantees that writing a new value into the TLB will not overwrite the most recently inserted value, and it independently guarantees that writing a value into the TLB for virtual page x will not overwrite the TLB entry for virtual pages x+1 or x-1! These guarantees are actually needed!

Many memory management units were designed as afterthoughts to architectures that were originally designed with no intent of incorporating such a feature, but in the case of the Hawk, we know that most computers on the market require such units, so it is natural to design for one from the start and to assume that the design can be integrated with the CPU.

On the Hawk, access to the memory management unit is through the TMA register and a pair of instructions, MMUSET and MMUGET. The TMA register holds a virtual address, either to inquire about the address translation for that address or to set up how that address should be translated. MMUSET sets the address translation for the page addressed by the TMA register, and MMUGET inquires about how this page's address is currently translated.

Memory Management Fault Handling

The handler for a typical memory management fault must do the following: First, determine whether the virtual address in question is known to the MMU (using MMUGET). If it is unknown, the cause of the fault is simple, and the solution is to look in the operating system data structures (usually an address map) and tell the MMU how the address should be translated, using MMUSET, then return from trap.

If the address is known to the MMU, then the cause must be that the user tried to access the page in question in an illegal way, for example, by writing on a read-only page. In this case, a typical operating system will abort the user program.
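
The handler logic described above can be modeled in Python as follows. This is a sketch under assumptions, not Hawk code: the class `ToyMMU` and its methods `mmuget` and `mmuset` are hypothetical stand-ins for the MMUGET and MMUSET instructions, the address map is modeled as a dictionary, and aborting when the address map has no valid entry is an assumption the text does not address.

```python
V = 0x1  # valid bit, assumed to be bit 0 of an entry as in the entry format above


class ToyMMU:
    """Hypothetical stand-in for the MMU and its MMUGET and MMUSET instructions."""

    def __init__(self):
        self.entries = {}

    def mmuget(self, page):
        # An unknown page reads back as 0, which has the V bit clear.
        return self.entries.get(page, 0)

    def mmuset(self, page, entry):
        self.entries[page] = entry


def handle_fault(mmu, address_map, virtual_address):
    """Return "retry" if the fault was repaired, "abort" otherwise."""
    page = virtual_address >> 12
    if mmu.mmuget(page) & V:
        # The MMU knows this page, so the access itself was illegal.
        return "abort"
    entry = address_map.get(page, 0)
    if not (entry & V):
        # Assumption: no valid mapping in the address map means abort.
        return "abort"
    mmu.mmuset(page, entry)  # tell the MMU how to translate this page
    return "retry"           # return from trap and retry the access
```

A first fault on a mapped page loads the entry and returns "retry"; a second fault on the same page finds the entry already present and returns "abort".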