Assignment 12, due May 1

Solutions

Part of the homework for CS:2630, Spring 2015
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

On every assignment, write your name legibly as it appears on your University ID card! Homework is due on paper at the start of class on the day indicated (usually Friday). Exceptions will be made only by advance arrangement (excepting "acts of God"). Late work must be turned in to the TA's mailbox (ask the CS receptionist in 14 MLH for help). Never push homework under someone's door!

  1. Background: The code to virtualize the LOADCC instruciton distributed in the file mp6.a includes a block of code to update the processor status word, ending with the comment savearea.pswsv = psw | cc.

    a) How does this instruction get the new value of the condition codes that the instruction is supposed to set? If you read the manual for LOADCC, you will see that the condition code setting is complicated, with one bit that tells you whether the loaded value has any NUL bytes in it, yet the code doesn't seem to check for this at all. (0.3 points)

    It uses the LOADSCC instruction to get the operand of the virtual LOADCC instruction. These set the condition codes identically, so the code then uses CPUGET to get the entire PSW into R10.

    b) The code to update the condition codes could update just the least significant 4 bits of the PSW, but it updates many more bits. How many bits of the saved PSW does it the code actually update? You can answer this from inspecting the code. (0.3 points)

    The code uses the mask 0000FFFF16 to select the bits from the PSW set by the LOADSCC instruction, while it uses the mask FFFF000016 to select the bits from the user's saved PSW that will not be changed. Thus, it may change 16 bits of the PSW.

    c) The extra bits of the PSW that get updated (see part b) contain a field of the PSW other than the condition codes. What is the name of this field? (You may need to review Chapter 1 of the Hawk manual.) (0.2 points)

    The bcd carry field.

    d) What is the use of the field of the processor status word discussed in part c)? (You may need to review Chapter 9 of the class notes.) (0.2 points)

    The bcd carry field seems to be set by the ADD instruction and used by the ADJUST instruction when it is used to correct the sum for BCD or excess 3 arithmetic -- forms of decimal arithmetic that can be done with reasonable efficiency on a binary computer.

  2. Background: Look back at the (rather archaic) keyboard interface defined in Chapter 12, and assume the interrupt enable bit is wired as suggested near the end of Chapter 13.

    Furthermore, assume that the keyboard is connected to trap vector entry C016.

    In addition, assume that the three least-significant bits of the level field of the processor status word are decoded as a 3-bit integer. If this field of the level is nnn, this interrupts are enabled via trap-vector entries 1nnn and lower. For example, if the level field is 0010, then interrupts using trap-vector entries 1000, and 1001 and 1010 are enabled, but 1011 and up are disabled.

    a) Suppose someone hits a key on the keyboard. What combination of values in the keyboard device control register and the processor status word will permit the keyboard to interrupt the CPU? (0.5 points)

    i. The IE (interrupt enable) bit in the keyboard status and control register must be set.
    ii. The level field of the PSW must be set to a value in the range 0000-0100 (inclusive) or in the range 1000-1100.

    b) When control enters the interrupt service routine for the keyboard, the level field of the PSW is set to 0000. Within the interrupt service routine for a slow device like the keyboard, it might be important to allow higher priority devices to interrupt the interrupt service routine. What instruction sequence should the interrupt service routine contain in order to enable higher priority interrupts? (Hint, it is an LIW followed by a CPUSET, what you need to work out are the arguments.) (0.5 points)

    You will write code that resembles this:

            LIW     R3,#30000000
            CPUSET  R3,PSW
    

    The value 3 enables all devices interrupting through trap vector entries 8, 9, 10 and 11 while blocking interrupts at higher numbered (lower priority) levels.

    Commentary: The new PSW will remain in effect until the saved PSW is restored at the end of the interrupt service routine or until the code changes it to some other value (for example, to disable interrupts during the sequence of instructions used to restore the CPU state during a return from interrupt).

  3. Background: Suppose you are interested in the performance of algorithm X, an algorithm you just invented and named. One thing that worries you is the speed of the algorithm, so you devise a test. You write a program to measure the how long the algorithm takes (t) as a function of the size of the data (s). Your theoretical work suggests that a plot of the how long the algorithm takes t vs s ought to be a straight line.

    Like essentially all theoretical computer scientists, your theoretical work is based on the assumption that the memory access time is a constant.

    You take data covering a huge range of sizes. For values of s below approximately 16000, your t vs s curve is straight. Above this, the t vs s curve bends upward, until s exceeds 128000. Above this, the curve seems straight again, but the slope is steeper.

    a) Is your theoretical analysis of Algorithm X wrong, or did your theory overlook something. Explain. (0.4 points)

    The theory overlooked the fact that, with a cache memory, the speed of memory access is not constant.

    b) Estimate the size of the cache on this machine? (0.3 points)

    The straight line from a data structure size of 0 to 16K suggests that you measured the effective speed of the cache in that range, while things were slowed by cache misses for data structures exceeding 16K. This means that the cache probably had a capacity of 16,384.

    c) Suggest a way to estimate the ratio of the memory speeds of the cache and the main memory on this machine, using your t vs s curves. (0.3 points)

    When the data structure size is only somewhat above 16K, there are a mixture of cache hits and misses, but as the data structure grows larger than the cache, the speed should asymptotically approach that of the main memory. The curve between sizes of 16K and 128K seems to be dominated by this asymptotic approach, but above 128K, the apparent straight line suggests that you can ignore the cache here.

    Thus, to estimate the ratio of the cache speed to the main memory speed, take the ratio of the slopes of the curve below 16K and above 128K.

    Note: There is a category of algorithm design called "cache aware" design that aims to design algorithms that perform well even when the data is much larger than the cache. Cache aware algorithms matter particularly to users of "big data" like Google and Amazon.