Assignment 12, due May 5

Solutions

Part of the homework for 22C:60 (CS:2630), Spring 2014
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

On every assignment, write your name legibly as it appears on your University ID card! Homework is due on paper at the start of class on the day indicated (usually Friday). Exceptions will be made only by advance arrangement (excepting "acts of God"). Late work must be turned in to the TA's mailbox (ask the CS receptionist in 14 MLH for help). Never push homework under someone's door!

  1. Background: Consider the skeleton for a trap handler given in section 13.3 of the Hawk manual. In that skeleton, the label myCODE is used for the subroutine that actually handles the trap, while the body of the skeleton itself is devoted to saving the complete CPU state before calling myCODE and restoring the entire CPU state after return from myCODE.

    A problem: Assume you are working on an undefined instruction trap handler. Your job is to make the instruction trap handler redefine all undefined instructions as being equivalent to a halfword no-op instruction. That is, whenever the CPU encounteres an undefined instruction, the trap handler should increment the PC by 2 and return from trap making no other changes to the registers. Your answer should be the complete code of myCODE for this trap handler.

    Hint: It is not big code, perhaps 6 instructions, including subroutine entry and return. (1.5 points).

    The following code exactly follows the instructions.

    ; the myCODE field of the INSSAVEAREA points to this subroutine
    INSCODE:; link through R3 - the code in 13.3 does that! (why?)
            ; no parameters
            LIL     R4,INSSAVEAREA  ; base of saved registers
            LOAD    R5,R4,svPC
            ADDSI   R5,2
            STORE   R5,R4,svPC      ; savedPC = savedPC + 1
            JUMPS   R3              ; return
    

    Some students, as an alternative to following the instructions, tried to write the minimal trap service routine that would convert all undefined instructions to a 16-bit no-op. Here is a good solution doing that:

    LCSAVE  =       .
    ; part 1, lifted with minimal modification from 13.3
    .       =       INSTRTRAP
            CPUSET  R2,TSV
            LIL     R2,INSSAVEAREA
            STORE   R1,R2,svR1
            LIL     R1,INSCODE
            JUMPS   R1
    
    ; part 2, the bare bones continuation of part 1
    .       =       LCSAVE
            CPUGET  R1,PSW
            STORE   R1,R2,svPSW     ; -- save the PSW
            CPUGET  R1,TPC
            ADDSI   R1,2            ; -- this changes the PSW
            CPUSET  R1,TPC          ; savedPC = savedPC + 1
            LOAD    R1,R2,svPSW
            CPUSET  R1,PSW          ; -- restore the PSW
            LOAD    R1,R2,svR1
            CPUGET  R2,TSV
            RTT                     ; return from trap
    

    Writing the kind of code given in the second example is like walking barefoot on broken glass -- you have to be very very careful. This is why the code in the manual simply saves everything and then sets things up so that the real trap service routine looks just like a normal subroutine (except for the unnecessary oddity of linking through R3).

    On the other hand, the second solution given here would be much more efficient -- if the efficiency of no-op instructions is something you care about.

  2. Background: Consider a problem where your program is manipulating an intensively used data structure with a size of n bytes, where your computation takes t time, and computation theory says that t=O(n).

    Your machine has a cache with a capacity of 10,000 bytes where the cache allows memory operations in 1 ns. Main memory is much much larger, and has an access time of 20 ns. Assume that the program is small, so you can ignore the size of the program itself, paying attention only to the data.

    A problem: Sketch a graph of t as a function of n. Theory predicts a straight line (and we didn't give you enough information for you to give the slope of the line. In practice, the line is not straight, and there is a kink in the line. On your graph, indicate the correct value of n where the line kinks, make the line kink in the correct direction, and give the ratio of the slopes above and below the kink. (0.5 points).

    As the data structure grows from 0 to 10,000 bytes, the running time t rises at a constant slope. Above 10,000 bytes, the curve is also a straight line, but the slop is 20 times steeper.

    The sketch given here has a very sharp break from the side where the data structure fits in cache to the side where the speed of the main RAM dominates. With real data structures, the break is rarely that abrupt. The cache usually continues to contribute to the performance for data structures that are too large to fit entirely in the cache, but the contribution falls as the data structure grows. The net result is a rounded elbow in the curve.

  3. Background: In the Hawk, the virtual address and physical address have the same format, so the MMU takes a 30 bit virtual address in and outputs a 30 bit physical address.

    A question: Aren't memory addresses 32 bits on the Hawk? Why did it say 30 bits here? (0.5 points).

    The Hawk CPU uses the least significant 2 bits of the memory address to select bytes or halfwords, but all load and store instructions operate on full words, so the least significant 2 bits of the address are not needed by the memory. Instruction fetches might need a 31 bit address, but the CPU could easily be built to always fetch the full word holding the current pair of instruction halfwords, again, using onlhy a 30 bit word address when communicating with memory.

  4. Background: In the Hawk, the virtual address and physical address have the same format, so the MMU takes a 30 bit virtual address in and outputs a 30 bit physical address. The Hawk MMU interface is describe in Secitons 1.3.4.6 and 1.3.5 of the Hawk manual.

    The marketplace these days demands that large machines support RAM sizes bigger than 4 gigabytes. A larger physical address could be supported by dedicating unused bits in the MMUDATA register to the physical page number.

    A question: How many bits are there in the largest physical (word) address that could be supported on a modified version of the Hawk without major changes to the way the MMU interface operates? (0.5 points).

    Bits 6 to 11 (6 bits) are marked as unused in MMUDATA and could therefore be used as part of the physical page number, thus expanding the physical address space by a factor of 32, from 4 gigabytes to 128 gigabytes.