Exam 3: Final

Solutions and Commentary

Part of the homework for 22C:60 (CS:2630), Fall 2011
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Grade Distributions

Exam 3

mean   = 6.72          X
median = 7.0           X         X X
               X X     X   X     X X         X   X
  _______X_____X_X_____X___X___X_X_X_________X___X_X_X_____
    0 . 1 . 2 . 3 . 4 . 5 . 6 . 7 . 8 . 9 . 10. 11. 12. 13

Total Exams 1 to 3

                                     X
mean   = 17.93                       X
median = 16.5                        X
                           X     X X X X X               X
  _____________________X___X___X_X_X_X_X_X_X___X___X_X___X_____X___
    0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30

Total Machine Problems 1 to 5

                                         X
mean   = 16.88                           X
median = 18.0                          X X     X
                             X       X X X X   X       X
  _______________X___X_______X_X_X_X_X_X_X_X___X_______X__
    0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24.

Total Top 10 of 12 Homework Scores

mean   = 22.03                                           X
median = 23.3                                X         X X
                             X               X X       X X   X
  _____________________X_____X_______X_X_X___X_X_X_X_X_X_X_X_X_____
    0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30

Overall Total and Grade Distribution

mean   = 56.84
median = 56.1                    X   X   X
                                 X   X X X X
  _______X_____________X_______X_X_X_X_X_X_X_X___X___X_X_X_____X_X_____
    20. 24. 28. 32. 36. 40. 44. 48. 52. 56. 60. 64. 68. 72. 76. 80. 84
     - D D D D + + - - C C C C + + - - B B B B + + - - A A A A + +

Solutions and Commentary

Background: Consider the problem of going to the label THERE on the Hawk.
a) Your program contains the code BR PLACE. It used to work, but you added some code, and now the assembler gives an "out of bounds" error. How would you rewrite this line to fix the error? (1 point)

________JUMP____PLACE_______________________

____________________________________________

____________________________________________

About 3/5 of the class got this right. A few got partial credit for answers involving syntax errors JUMPS PLACE or chains of branches to branches to branches. Popu

b) Your program also contains BGT PLACE and your additions cause another "out of bounds" error here. How would you rewrite it to fix the error? (1 point)

________BLE_____L___________________________

________JUMP____PLACE_______________________

__L:________________________________________

Only 2 got this right. Partial credit was available to those who used chains of branches to branches, or who accidentally flipped the sense of the conditional by using BGT. 2/5 earned no credit.

c) Your program also contains JSR R1,SUBR and as you added a huge amount of code to your program, this also stopped working. How would you rewrite it to fix the error? (1 point)

________LIL_____R1,SUBR_____________________

________JSRS____R1,R1_______________________

____________________________________________

Only 3 got this right. Many earned partial credit, substituting LOAD for LIL or JUMPS for JSRS. 1/2 earned no credit.

_5 _6 _E _3

_1 _2 _3 _4

_7 _8 _C _3

A Problem: Show the machine code for LIW R3,#12345678 Give your answer in the blanks to the right as a sequence of 16-bit halfwords in hex. (1 point)

3 got this right. Common errors earning partial credit included incorrect byte order or (worse) inconsistent byte order, incorrect opcodes (for some reason, ORIS was harder than LIL), and breaking up the segments of the long constant incorrectly (an issue distinct from byte order). and Over 1/3 earned no credit.
Note that this question was extremely predictable. Everyone should have expected something like it on the exam, given that it followed naturally from problems that gave people trouble on the first midterm.

SUBR: TBIT R4,0 BBS SUB1 SL R3,1 SUB1: TBIT R4,1 BBS SUB2 SL R3,2 SUB2: TBIT R4,2 BBS SUB3 SL R3,4 SUB3: TBIT R4,3 BBS SUB4 SL R3,8 SUB4: TBIT R4,4 BBS SUB5 SL R3,16 SUB5: JUMPS R1

A Problem: You inherited the code shown to the right from a programmer who was careless about writing any comments. What does it do? A correct answer will give an overall explanation of what this this code does, and not a line-by-line commentary. (2 points)

_______ Subroutine to shift R3 left ________

_______ by the one's complement of R4. _____

_______ (only R4 bits 0 to 4 matter) _______

__________ eg: R3 = R3 << ~R4 ______________

Nobody did perfectly. The best answer, given by 1/6 of the class, was R3 = R3 * 2^{2^R4}. There is something bizarre about the double exponentiation in this answer, but at least is expresses the idea that R3 is shifted an amount determined by R4. Over half the class earned no credit.
Note: There was an error that made the problem harder than intended: The ones complement was never intended. All of the BBS instructions should have been BBR, so it should have done R3 = R3 << R4.

Background: Consider this subroutine to convert to integer from a familiar 6-bit floating point format:

        ;  |_|_|_|_|_|_| s = mantissa sign
        ;  |s|  e  | m | e = exponent, 111: not a number
        ;                              100: represents zero
        ;                              000: not normalized
        ;                m = mantissa, 0.XX if exponent = 000
        ;                              1.XX for other exponents

        FTOINT: MOVE    R4,R3   ; copy the number f (for exponent)
                SR    = R4,2
                TRUNC = R4,3    ; exp = (f >> 2) & 7
                CMPI  = R4,#7
                BEQ     FTOINN  ; if (f = 7) go throw NAN exception
                CMPI  = R4,#2
                BLE     FTOIRZ  ; if (f <= 2) go return zero
                MOVE    R5,R3   ; -- copy f so we can check the sign later
                TRUNC = R3,2
                ADDSI = R3,4    ; mant = (f & 3) + 4 -- set the hidden bit
                SL      R3,1    ; mant = mant << 1   -- a bit extra precision
        FTOILP: CMPI  = R4,#3
                BEQ     FTOILQ  ; while (exp != 3) {  -- shift point over
                SR      R3,1    ;   mant = mant >> 1
                ADDSI   R4,-1   ;   exp = exp - 1
                BR      FTOILP  ; }
        FTOILQ: ADDSI   R3,1    ; -- round to the nearest integer
                SR      R3,1    ; mant = (mant + 1) >> 1 -- discard extra bit
                BITTST= R5,5
                BBR     FTOIQT  ; if (sign bit of f is set) {
                NEG     R3      ;   mant = -mant
        FTOIQT:                 ; }
                JUMPS   R1      ; return mant
        FTOIRZ: LIS     R3,0
                JUMPS   R1      ; return 0
        FTOINN: LIL     R1,NAN_EXCEPTION
                LOAD    R2,R1,EXAR
                LOAD    R1,R1,EXHAND
                JUMPS   R1      ; throw NAN_EXCEPTION

a) This code rounds to the nearest integer as it does the conversion. Cross out all instructions involved with rounding so what remains returns the result truncated toward zero. (1.5 points)

The last two strikeouts (the ADDSI and SR instructions) were identified by most of the class. The first strikeout above (the SL instruction) was harder; it balances the SR done after rounding. About 1/5 of the class got this.

b) Among the instructions that remain, mark the ones that would need changes to make this code convert from IEEE format to integer. (2.0 points)

This question was only reasonable because thie 6-bit floating point format is actually fully compatable with IEE format except that the exponent and mantissa fields have been greatly shortened. The lines marked by = must be changed. In all cases, the value that needs changing is in the final field of the instruciton. In each case, this value relates to the size of some field of the floating point number.
This was a hard problem. Many students appear to have marked lines almost at random, and it is difficult to generalize about the errors.

A Problem: You set out to measure the speed of a program that uses a linear search. You expect the program to take time proportional to the size of the data structure being searched. When you run the code and plot the run time as a function of the data structure size, you get the curve shown to the right. The curve has "knees" at about 2K and 8K where the slope changes abruptly.
a) How big is the L1 cache on this machine? (0.5 point)

_____2K_____________________________________
b) How many levels of cache have you discovered by running your program? (0.5 point)

_____2______________________________________
c) Which cache limits the speed for data structures filling between 2K and 8K bytes? (0.5 point)

_____L2_____________________________________

This was the easiest question on the exam. Half the class did perfectly on all 3 parts.
A few people got partial credit on part a), giving 8K, and 1/5 gave truly odd answers ranging from 1.5K to 32K.
In general, part b) was the easiest. Over 2/3 gave correct answers and the remainder got partial credit by stating that there were 3 cache levels.
1/6 of the class had serious trouble with part c), giving answers such as I cache or D cache or even worse. The information given in the question hints at a multilevel cache, but it does not give any evidence of any separation between I cache and D cache.
Background: Consider the following code from the Hawk monitor:
```
        INT     GETCHAR
GETCHAR:                        ; get char from keyboard
                                ; return char in R3, wipe out R4
        LIW     R4,KBDBASE+KBDSTAT
GETCLP:                         ; loop awaiting input
    =   LOADSCC R3,R4           ; test status
    =   BZS     GETCLP
        LIW     R4,KBDBASE+KBDDATA
        LOADS   R3,R4           ; get char
        JUMPS   R1              ; return
```
Recall that the keyboard status register contains a ready bit (bit zero), an error bit (bit 6), and an interrupt enable bit (bit 7). The error bit is set when a key is pressed while the ready bit is already set.
a) Many students working on MP5 noticed that if they typed too fast, their program lost control. Identify and explan the error in the above code that caused this. (2 points)

__ It does not distinguish between the _____

__ error and ready conditions, and it ______

__ does not reset the error bit. ___________

____________________________________________

1/8 did well here, and 2/5 earned no credit. Partial credit was offered to those who identified the problem area in the code but failed to offer clear explanations of what went wrong.

b) This error can be fixed by replacing two lines of the above code with three lines. Give the correction here: (2 point)

_____=__LOADS___R3,R4_______________________

_____=__BITTST__R3,0____;_(test_KBDRDY)_____

_____=__BBR_____GETCLP______________________

1/5 did well here, while 2/5 earned no credit. A significant number got partial credit for inserting code from the notes without proper regard to context -- substituting LOAD R3,R4,KBDSTAT for the LOADS instruction above. This fails because the index registers are used differently in the code in the notes and the code quoted above from the Hawk monitor.
Background: Consider the problem of rotating the contents of R3 8 places right. With each rotation step, all the bits are shifted right except the least significant bit, which replaces the former sign bit. You could just repeat the following instructions 8 times:
```
                SRU     R3,1            ; move everything one place right
                ADJUST  R3,CMSB         ; put the old LSB in the new MSB
```
This takes a total of 16 instructions to rotate 8 places. We can do it in 5 instruction:
```
                MOVE    R1,R3           ; make a copy of the low 8 bits
                SRU     R3,8            ; move the high 24 bits 8 places right
                
                SL____  R1,12_______
                
                SL____  R1,12_______    ; move the low bytes up to the high end
                
                OR____  R3,R1_______    ; merge the results
```
a) Fill in the missing code in the blanks provided above. (2 points)

Only 2 did well, while 3/5 of th class earned no credit. Common errors among those earning partial credit included 1/6 of the class, who invented the SLU instruction (presumably Shift Left Unsigned), and 1/5 of the class who used a shift count of 24 on the SL instruction, despite the fact that the shift count field of the instruction halfword is only 4 bits.

b) This could have been done in one less instruction. How? Hint: MOVE was a mistake. (1 point)

___ Change MOVE R1,R3 to MOVESL R1,R3,12 ___

___ and eliminate one SL instruction _______

Just 1 did well here, while 4/5 of the class earned no credit. The remainder used a shift count of 24 on the MOVESL instruction, earning partial credit.

c) If this had been a 16-bit rotate, it could have been even faster. Why? (1 point)

___ We can use MOVESL R1,R3,16 _____________

___ with no added SL instruction ___________

2 did well, while the remainder earned no credit.
Background: Consider the following two bits of code to move a 64-bit quantity from the location pointed to by R3 to the location pointed to by R4.
```
                LOADS   R5,R3        ; move low half    
                STORES  R5,R4
                LOAD    R5,R3,4      ; move high half
                STORE   R5,R4,4

                LOADS   R5,R3        ; get low half
                LOAD    R6,R3,4      ; get high half
                STORES  R5,R4        ; put low half
                STORE   R6,R4,4      ; put high half
```
A problem: Why is the second fragment of code faster on most modern computers? (1 point)

___ it cuts pipeline delays by not doing ___

___ LOAD then STORE on the same register ___

1/2 earned full credit, while 2/5 earned no credit. Among those earning partial credit, a common failure was to assert that the instructions could run in parallel or simultaneously (as opposed to partially in parallel), or to hint at the existence of some kind of pipeling or parallelism while not coming out and saying it.