Exam 3: Final

Solutins and Commentary

Part of the homework for 22C:60, Fall 2010
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Grade Distributions

Final Exam

Mean   = 6.14
Median = 4.9   X
               X
           X X X   X
           X X X   X                     X
    ___X___X_X_X_X_X_X_X_______X_________X_______
      0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20

All Exams

Mean   = 17.81
Median = 17.2    X         X   X X X   X                           X
    ___________X_X_X_____X_X___X_X_X_X_X_X_______X_________________X_____
      4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30. 32. 34. 36

All Machine Problems

                                                       X
Mean   = 23.38                                         X
Median = 24.5                                          X
                                   X     X             X     X X   X
    _____________________________X_X_____X_______X_X_X_X___X_X_X_X_X_
      0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30

Top 10 of 12 Homeworks

Mean   = 21.69                             X             X
Median = 21.8                    X         X     X   X   X     X
    ___________________________X_X_____X___X_X_X_X___X_X_X_X___X_X___
      0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30

Sum Total Scores and Letter Grades

Mean   = 62.88
Median = 64.4
                                   X
           X                       X     X
    _______X_X_X___X_X___X_X___X_X_X_X___X_X_X_X_______________X_X___
      36. 40. 44. 48. 52. 56. 60. 64. 68. 72. 76. 80. 84. 88. 92. 96
Grades     - - C C C C + + - - B B B B + + - - A A A A + +

Background: Consider these Hawk instructions: (i) LIL R3,X (ii) LEA R3,X and (iii) LOAD R3,X (Assume that the address X is defined as a label).
a) One of these loads a different value in R3 than the other two. Which one and what is the difference? (1.0 points)

LOAD loads the contents of memory location X, while the other two load the value of the constant X, albeit in different ways.
5 did well, while one earned half credit by giving a partial correct answer. 15 singled out LEA with answers that implied that LOAD and LIL somehow loaded similar values. 3 singled LIL, presumably distracted by the issues at the center the next part.

b) One of these has a different constraint on the label X than the other two. Which one and what is the difference? (1.0 points)

LIL requires that X be representable as a 24 bit two's complement value. The other two represent X using a 16-bit two's complement displacement from the program counter.
6 got partial credit, singling out LIL but saying X was limited to 16 bits with the other two. 4 more said X could be a full word for the other two. The rest earned no credit, mostly by giving answers that made no sense.
Background: In the 1960's, a number of ones complement computers were built. Control Data Corporation made a number of these. Of them, the CDC 160 was the smallest, with a 12-bit word. The University of Iowa Physics Department had several that were used to control physics experiments.
Custom hardware had to be built to recover the data from one of these experiments that was stored on magnetic tape. Each consecutive value on the tape corresponds to a voltage from minus 2047 millivolts to plus 2047 millivolts. Your job is to convert each 12-bit one's complement integer to a 32-bit two's complement integer.
a) Assume that the least significant 12 bits of R3 hold a value just extracted from the tape. The most significant bits are undefined, they could be ones or zeros. Write Hawk code to sign-extend the 12-bit data to 32 bits. There are several ways of doing this. Inefficient code will be penalized. (1.0 points)
```
        SXT     R3,11
```
1 gave this answer, 2, for small penalties, gave the argument 12. 3 tried very complex sequences of shift instructions that almost worked, for less than half credit. 9 gave incredibly complex answers that were, ultimately, nonsense. The rest left it blank. Here is a sequence of shifts that would do the same thing:
```
        SL      R3,10   ; 20 bit shift requires 2 instructions
        SL      R3,10   ;  (the sum of the shift counts must be 20)
        SR      R3,8
        SR      R3,12   ; using a signed shift sign extend the result
```
b) Assume that R3 holds a one's complement value (it has already been widened to a full 32 bits). Write Hawk code to convert it to the equivalent two's complement value. Inefficient code will be penalized. (1.0 points)
```
        TESTR   R3
        BNR     L
        ADDSI   R3,1
L:
```
2 did well. 6 just incremented, forgetting that positive two's complement values are equal to their one's complement equivalents. 5 gave answers that were essentially nonsense, the rest left it blank.
The whole point of this question was to ask if students understood the different number systems, and to test this operationally and not just with a question involving memorization or looking up something in the text.
c) We could have solved the problem by first converting the 12-bit value from one's complement to two's complement and then widening it to 32 bits. Explain why the widening can be safely done either before or after the conversion to two's complement. (0.5 points)

Sign extension works equally well on one's and two's complement numbers, so it works equally well to increment based on the old sign bit or to widen and then increment based on the new sign bit.
5 did well, 2 thought negative zero was a problem (is is not!), and 7 earned half credit with answers that were on the right track. 4 gave answers that were essentially nonsense, the rest left it blank.
Precision machine tools in the U.S. frequently have an accuracy of 0.0001 in. Your boss has decided to represent all dimensions in inches with fixed-point data with 14 bits after the point. Assume x is such a variable. Some of your input and output is given in units of 0.0001 inches. Assume i is a variable of this type. The formula to convert between these is:
i = x × (10000/16384)
10000/16384 = 0.1001110001₂ (exactly)
Assume that the machine-tool controller uses a Hawk CPU, and you have access to LONGMULU, a routine that takes two 32-bit unsigned integer operands in R3 and R4 and returns the 64-bit product in the double-word R3 (least significant bits) and R4 (most significant bits).
Your job in this assignment is to convert x in R3, to i, leaving the result in R3.
a) Write Hawk code to solve this problem by calling LONGMULU, preceeded and followed by any code needed to put the arguments in place get the results into the right registers. (1.0 points)
```
        LIW     R4,#9C400000    ; 32 places after the point
        ADDSI   R2,R2,ARSIZE
        LIL     R1,LONGMULU
        JSRS    R1,R1
        ADDSI   R2,R2,-ARSIZE
        MOVE    R4,R3           ; keep only the integer part of the result
```
1 did well, 2 had almost correct answers, 9 did not adjust the result properly in the context of the multiplier they opted to use. Two forgot to adjust the activation record for the call. 4 gave answers that were essentially nonsense, although quite long, while a few left it blank.
Several of the nonsense answers appeared to be copied from the solutions to exam 2, even though this question asked for the inverse of the computation required on that exam. Several answers attempted to do something like this, although none completely:
```
        LIW     R4,#9C4         ; 12 places after the point
        ADDSI   R2,R2,ARSIZE
        LIL     R1,LONGMULU
        JSRS    R1,R1
        ADDSI   R2,R2,-ARSIZE
                                ; keep only the integer part of the result
        SLU     R3,12           ; (sufficient if values are small enough)
        SR      R4,11
        SR      R4,9            ; get the high 12 bits of the result
        OR      R3,R4           ; get the high 12 bits of the result
```
b) Write Hawk code to solve this problem using a series of shift and add instructions. If you need a temporary register, use R1. (1.0 points)
```
        MOVE    R1,R3
        SRU     R3,4
        ADDSRU  R3,R1,1
        ADDSRU  R3,R1,1
        ADDSRU  R3,R1,3
        ADDSRU  R3,R1,1
```
There were no perfect answers. Among the small problems plaguing those who came close, 2 forgot to use unsigned shifts, 1 shifted left, 1 omitted the copy to R1 and then used R3 as the second operand on all the shifts, 1 forgot the final shift, 1 added an extra shift, and 2 used ADDSRU for the first shift, forgetting that ADDSRU does its shift after the addition.
9 students gave nonsense answers, 5 left it blank. Among the nonsense answers, several tried to give answers that would have been correct in the context of the second midterm.
There were also a few students who tried to give answers based on an alternative approach, multiplying by 10000 and then dividing by 16384. This is not a good idea because it gives incorrect results for large initial values, but within the range of values where it works, it is OK.
```
        ; first multiply by 10000
        ADDSL   R3,R3,2         ; times 5
        ADDSL   R3,R3,2         ; times 5 (gives times 25)
        SL      R3,2            ; times 4 (gives times 100)
        ADDSL   R3,R3,2
        ADDSL   R3,R3,2
        SL      R3,2            ; repeated gives times 10000
        ; then divide by 16384
        SRU     R3,14
```
Background: Another file of antique 12-bit data has come to light. The only information you have is an old faded scrap of paper with these examples:

011100000000 -- positive infinity
111100000000 -- negative infinity
x111xxxxxxxx -- other error values
011011111111 = 7.984 -- floating point
011000000000 = 4.000
010010000000 = 1.500
010000000000 = 1.000
110000000000 = -1.000
001100000000 = 0.500
000100000000 = 0.125
000010000000 = 0.0625
000001000000 = 0.0312
000000000000 = 0.000

To talk about this system, number the bits from 0 (the rightmost) to 11 (the leftmost).
a) Which bits are the exponent field, including the exponent's sign? (0.5 points)

8 to 10
10 did well, 2 left the problem blank. The most common wrong answer 9 to 11, but a few thought it was a 2-bit or a 4-bit exponent field.

b) What number system is used for the exponent and how is zero represented in this system? (0.5 points)

biased, with zero represented as 100
8 did well, 2 did not give the bias, some gave strange answers, and 8 left it blank.
A disturbing number gave the representation for zero in the whole number system, a largely irrelevant issue in the discussion of a biased exponent.

c) Identify any special exponent values that are outside the system you identified above. Very briefly (in one or two one words each) say what makes each such value special. (0.5 points)

111 for non-numeric values, 000 for denormalized values.
4 did well, 2 omitted 111, 3 omitted 000, 2 got a bit of partial credit for giving the full representation of zero (which is a denormalized value).

d) Which bits are the mantissa, including the mantissa's sign? (0.5 points)

11 is the sign, 7 to 0 are the rest of the mantissa.
8 did well. 5 forgot about the sign bit. 3 left this blank.

e) What number system is used for the mantissa? (0.5 points)

It is a fixed point signed magnitude number.
2 gave good ansers, 3 forget to mention that it is signed, 1 said it was 2's complement, an unlikely choice that was not contradictted by the examples given. 9 left thi blank.

f) Where is the point in the mantissa (between which bits in the word)? (0.5 points)

Between bit 7 and 8.
10 did well. 3 aid between 8 and 9, 1 said between 9 and 10 (without making compensating changes to the bias on the exponent). 3 left it blank or gave nonsense answers.

g) Are there any hidden bits in the mantissa? If so, what bit numbers do they "hide under", what is their value, and how is this value determined? (0.5 points)

Yes, under bit 8, 1 for normalized numbers, 0 for non-normalized.
3 did well, 4 gave ambiguous answers about where or how many hidden bits there are. 9 forgot to explain when the hidden bit was 1 and when it was 0. the rest left it blank or gave nonsense answers.
Background: Consider the flipflop circuit given to the right.
A problem: Complete this timing diagram: (3 points)

4 gave correct solutions for parts c, d, e, f, g and h. Many wrong answers appeared random. 1 student gave just the blackened parts and left everything else blank. It is impossible to categorize all of the wrong answers.
In the solution given above, in lines c, d, e and f, solidly blackened lines show the parts that can be directly derived just from the inputs a and b (step 1). Hollow enhanced lines show results that can be derived from the inputs plus the results of step 1 (step 2). Lightly enhanced lines (all in output c) can be derived from step 2 (step 3), finally, the rest of outputs c, d, e and f can be derived by examining the behavior of the RS flipflops embedded in the master side of this complex flipflop.
In grading this exam, the answer given for the slave RS flipflop to produce outputs g and h was always graded in terms of the answers given for d and e. The net result, incidentally, is an edge-triggered master-slave type-d flipflop, and this particular design has been in widespread use since the late 1960's.
Background: Consider this alternative exception model presented in Midterm Exam II: The global variable for an exception, for example, constraint_error is a single word, the address of the current handler description record. The activation record of any routine that contains an exception handler must contain a three-word handler description record. The fields of this might be declared as follows in SMAL code:
```
; handler description record (HDR) format
EXAR    =       0       ; address of activation record of handler
EXPC    =       4       ; address of activation record of handler
EXPRIOR =       8       ; saved address of the previous HDR
HDRSIZE =       12
```
Assume that MYHDR is a 3-word long local variable that holds the handler description record for a try-catch block in the current subroutine and that MYHAND is the label on the handler for this block in the current subroutine. The global variable CONSTRAINT_ERROR is an exception (a pointer to a HDR), and there is a default handler already installed.

a) Write SMAL Hawk code for entry to a try-catch block that installs MYHAND. (4.0 points)
```
        STORE   R2,R2,MYHDR+EXAR

        LEA     R3,MYHAND
        STORE   R3,R2,MYHDR+EXPC

        LIL     R1,CONSTRAINT_ERROR
        LOADS   R3,R1           ; get pointer to prior HDR
        STORE   R3,R2,MYHDR+EXPRIOR

        LEA     R3,R2,MYHDR
        STORES  R3,R1           ; finish install
```
Nobody did perfectly, but many got significant parts right. 3 copied the code to install an exception handler verbatim from the notes. Unfortunately, the notes used a different exception model, so that code was irrelevant. 2 more copied the solution to exam 2. That exam used the same exception model, but that code was the code to throw an exception, not the code to install a handler. 5 more gave strange code with no obvious function and 1 left this problem blank.
b) Suppose you wanted to raise a constraint-error exception from within the MMU trap handler (for example, in response to a MMU trap caused by a program referencing memory through an invalid page-table entry. Explain why the following code for throwing a constraint-error exception (lifted from the solutions to Exam II) would not work correctly in this context: (1 point)
```
        LIL     R3,CONSTRAINT_ERROR
        LOADS   R4,R3           ; follow pointer to handler description
        LOAD    R1,R4   EXPC    ; get handler address
        JUMPS   R1              ; go there
```
This code would attempt to transfer control to the handler leaving the MMU turned off, while the application expected to run with the MMU on. It does not restore the PSW correctly, nor does it account for the need to translate the virtual address of the user's constraint error data structure to a physical address for use within the handler.
4 gave good answers, 4 more gave answers that were at least partially correct. Among the entirely wrong anwer, 2 attempted to give some kind of Hawk code where the central requirement of the question was to give an explanation.
Background: Each page table entry on the Hawk includes 5 bits, CRWXV indicating whether it is legal to Cache addresses in that page, whether Read, Write and Execute operations are legal to that page, and whether the page is Valid.
Notes: When a computer includes multiple cache memories on the path from CPU to memory, the L1 cache is closest to the CPU, while the L2 cache is closer to memory. The L1 cache may be split into an I cache (code) and a D cache (data). Self-modifying code stores data in RAM and then executes it, for example, loading a program and then jumping to it.
a) If the MMU sits between the L1 and L2 caches on the way to memory, which cache does the C bit in the page table entry naturally disable, and what is awkward about having the C bit operate on the other cache. (1 point)

The MMU does not even know there is an L1 cache, and the L1 cache could be so fast that it finishes its operation before the MMU can react. In contrast, it would be easy for the MMU to send, with the physical address, an extra bit saying "do cache" or "don't cache", depending on the C-bit from the page table.
4 did well, 2 gave backward anwers with reasons that made enough sense to earn partial credit, and 3 gave partial explanations that earned partial credit. 11 gave answers that made no sense, touching on irrelevant issues such as the distinction between I and D cache.

b) If I cache and D cache are distinct, what must self-modifying code do before executing the code that was just modified? (1 point)

Flush (invalidate all entries in) the I cache.
3 did well here, 3 more gave answers that made enough sense to earn partial credit. 12 gave answers with no obviou connection to the problem or that were evidence of significant misunderstanding.