Assignment 12, Solutions

Part of the homework for 22C:60, Fall 2009
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

  1. Background: Look at the illustration entitled "A complete system" in chapter 14 of the notes. Note that the figure includes two L1 caches upstream of the MMU, the I cache and the D cache, and one L2 cache downstream.

    Consider this program. It works if run in RAM on the Hawk emulator, but it will not work if it is run on the system described in the figure referenced above. If you try to run it in ROM, of course, it fails with a bus trap:

    SHIFT:  ; expects R3 = register to shift (from 6 to 15)
            ;         R4 = shift count (from 0 to 15)
            LOAD    R5,CODE
            OR      R5,R3
            SL      R4,8
            OR      R5,R4
            STORE   R5,INSTR
            NOP
    INSTR:  NOP                     ; do the shift
            NOP
            JUMPS   R1
            ALIGN   4
    CODE:   SL      R0,16
            NOP
    

    a) Why are so many NOP instructions needed? (0.5 points)

    The STORE instruction interprets the address as being word aligned, while instructions are halfwords. Therefore, the label INSTR could refer to the second halfword of a word, in which case the preceding NOP is in the same word, or it could refer to the first halfword of a word, in which case the following NOP is in the same word. In either case, the extra NOPs ensure that the stored word overwrites only NOP instructions.
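
    The alignment argument can be checked with a small model. The sketch below is written in Python rather than Hawk assembly, and it assumes 16-bit instructions packed two per 32-bit word and a STORE that ignores the two low address bits. The function name halfwords_overwritten and the example addresses are hypothetical.

```python
def halfwords_overwritten(store_addr):
    """Return the halfword addresses touched by a 32-bit STORE.

    Assumption: STORE ignores the two low bits of the address and
    writes the whole aligned word containing that address.
    """
    word_addr = store_addr & ~0x3          # force word alignment
    return [word_addr, word_addr + 2]      # the two halfwords in that word

# INSTR on a word boundary: the NOP after INSTR shares its word.
print(halfwords_overwritten(0x1000))   # [4096, 4098]
# INSTR in the second halfword: the NOP before INSTR shares its word.
print(halfwords_overwritten(0x1002))   # [4096, 4098]
```

    Whichever halfword INSTR occupies, the store also covers one neighboring halfword, which is why that neighbor must be a NOP.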

    b) Why doesn't this code work correctly on the system in the figure? (0.5 points)

    The instruction at the label INSTR may already be in the I cache. If so, the instruction fetch will return the stale cached value and not the value just stored, because the STORE goes through the D cache and does not update the I cache.
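
    A toy model makes the failure concrete. The Python sketch below assumes a split-cache system with no mechanism for keeping the I cache consistent with stores; the class name SplitCacheSystem, the address 0x100, and the instruction strings are all hypothetical stand-ins, not a model of the actual Hawk hardware.

```python
class SplitCacheSystem:
    """Toy model: one RAM, a separate I cache, and no coherence logic."""

    def __init__(self, ram):
        self.ram = dict(ram)
        self.icache = {}   # address -> cached instruction

    def fetch(self, addr):
        # Instruction fetch: use the I-cache copy if present, else fill it.
        if addr not in self.icache:
            self.icache[addr] = self.ram[addr]
        return self.icache[addr]

    def store(self, addr, value):
        # Data store: updates RAM (via the D cache) but never the I cache.
        self.ram[addr] = value


INSTR = 0x100                        # hypothetical address of the label
system = SplitCacheSystem({INSTR: "NOP"})
system.fetch(INSTR)                  # an earlier call left INSTR in the I cache
system.store(INSTR, "SL R6,3")       # the STORE in SHIFT
print(system.fetch(INSTR))           # NOP
```

    The final fetch returns the old NOP even though RAM now holds the shift instruction.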

  2. Background: Consider a system designed so that cache memory circuit cards can be plugged into the memory bus in parallel with RAM cards. You plug in a 4K Acme brand cache card, and the system speeds up by a factor of 5.

    a) You liked this improvement, so you buy a second Acme 4K cache card and plug it into the bus. You get no improvement. Why? (0.5 points)

    The two caches are identical. Therefore, they cache copies of the same memory locations, so the second cache and the first cache both miss at the same time and both hit at the same time.
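
    This can be illustrated with a short simulation. The Python sketch below assumes, purely for illustration, that the Acme card is a tiny fully associative cache with least-recently-used replacement; the class name LRUCache and the address trace are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny fully associative cache with LRU replacement (an assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()   # front of the dict is least recently used

    def access(self, addr):
        """Return True on a hit; on a miss, load addr, evicting the LRU line."""
        if addr in self.lines:
            self.lines.move_to_end(addr)
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)
        self.lines[addr] = True
        return False


trace = [0, 1, 2, 0, 3, 1, 4, 0, 2, 3]
first, second = LRUCache(3), LRUCache(3)
hits_first = [first.access(a) for a in trace]
hits_second = [second.access(a) for a in trace]
either = [a or b for a, b in zip(hits_first, hits_second)]

print(hits_first == hits_second)        # True
print(sum(either) == sum(hits_first))   # True
```

    Because both cards see the same bus traffic and apply the same policy, the second card never hits when the first one misses.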

    b) You trade in the second Acme 4K cache card for a 4K card from Zenith, Acme's competitor, and suddenly your system is 30 percent faster. Why was it possible that a competitor's cache card would improve things when a second identical card did not? And why are two caches in parallel only likely to offer a small speedup? (0.5 points)

    Acme and Zenith caches, while interchangeable, must use different cache replacement policies, so it is sometimes the case that one cache holds the desired memory location while the other does not. The improvement is not spectacular because their cache replacement policies agree with each other 70 percent of the time. (This is not surprising; both caches are probably trying to approximate least-recently-used replacement, but they use different approximations.)
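
    A small simulation shows how two policies can complement each other. The Python sketch below assumes, only for the sake of example, that one card uses LRU replacement and the other uses FIFO replacement; the problem does not say which policies the cards really use, and the class name TinyCache, the capacity, and the trace are hypothetical.

```python
from collections import OrderedDict


class TinyCache:
    """Fully associative cache; policy is 'lru' or 'fifo' (both assumed)."""

    def __init__(self, capacity, policy):
        self.capacity = capacity
        self.policy = policy
        self.lines = OrderedDict()   # front of the dict is the next victim

    def access(self, addr):
        """Return True on a hit; on a miss, load addr and evict the victim."""
        if addr in self.lines:
            if self.policy == "lru":
                self.lines.move_to_end(addr)   # FIFO ignores hits
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)
        self.lines[addr] = True
        return False


trace = [1, 2, 1, 3, 2, 2, 1, 2]
lru, fifo = TinyCache(2, "lru"), TinyCache(2, "fifo")
lru_hits = [lru.access(a) for a in trace]
fifo_hits = [fifo.access(a) for a in trace]
either = [a or b for a, b in zip(lru_hits, fifo_hits)]

print(sum(lru_hits), sum(fifo_hits), sum(either))   # 3 3 4
```

    On this trace each cache alone hits three times, but together they hit four times, since the two policies usually agree and only occasionally keep different lines.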

  3. Background: Trap handlers sometimes return to the caller, for example, when a trap handler implements some aspect of a virtual machine. In other cases, the trap handler might kill the running application, for example, when the running application has accessed a nonexistent memory address.

    Programmers sometimes want a third option, the option of having the trap handler raise a software exception. In this case, the default handler for the exception would terminate the application, but the user would be free to install their own handlers.

    Consider using the exception model from Chapter 13 to implement the ILLEGAL_INSTRUCTION exception in the context of Machine Problem 5, where the trap handler sometimes returns after emulating a shift instruction and other times raises an exception.

    a) Explain why the code fragment from Chapter 13 entitled "Raising an exception, the general case" should not be used to raise an exception from within a trap handler. (0.5 points)

    The code jumps to the handler directly, with R2 set correctly, but it does not restore the processor status word. If there is an MMU, it might jump wildly, failing to turn the MMU back on as it returns. Even without an MMU, it may leave the system running at a privileged level when the handler is user-level code.

    b) Explain how you would raise the exception from within the trap handler. (0.5 points)

    The easiest solution is to use the normal return from trap instruction sequence to restore the registers and PSW. Therefore, the code to set R2 to the trap handler's activation record would change the saved value of R2 and not the contents of the actual register, and the code to jump to the handler would change the saved PC and not the actual program counter.
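
    The idea of editing the saved state can be sketched with a toy model. The Python code below is not Hawk code; it assumes a simplified CPU with only a PC, R2, and a privilege level standing in for the PSW, and every name and address in it is hypothetical.

```python
USER, PRIVILEGED = "user", "privileged"


class CPU:
    """Toy CPU: program counter, R2, and a privilege level (stand-in for PSW)."""

    def __init__(self):
        self.pc, self.r2, self.level = 0x2000, 0x8000, USER

    def trap(self, handler_pc):
        # Trap entry: save the user state and enter privileged mode.
        saved = {"pc": self.pc, "r2": self.r2, "level": self.level}
        self.pc, self.level = handler_pc, PRIVILEGED
        return saved

    def return_from_trap(self, saved):
        # Normal return sequence: restores PC, R2, and PSW together.
        self.pc, self.r2, self.level = saved["pc"], saved["r2"], saved["level"]


EXCEPTION_HANDLER_PC = 0x3000   # hypothetical exception handler address
EXCEPTION_HANDLER_AR = 0x9000   # hypothetical activation record address

cpu = CPU()
saved = cpu.trap(handler_pc=0x0100)

# Raise the exception by editing the saved state, not the live registers.
saved["pc"] = EXCEPTION_HANDLER_PC
saved["r2"] = EXCEPTION_HANDLER_AR
cpu.return_from_trap(saved)

print(hex(cpu.pc), hex(cpu.r2), cpu.level)   # 0x3000 0x9000 user
```

    Because the ordinary return path does the actual transfer, control arrives in the exception handler with the user's privilege level restored.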

    Notes: Do not turn in code for this; the problem asks for explanations. Adequate explanations can be written in three sentences or fewer. We reserve the right to disregard or penalize long essays offered in answer to either part.

    Finally, if you imagine that the user had the MMU turned on at the time of the trap, it may actually help you understand the answers.