Exam 3: Final
Solutions and Commentary
Part of the homework for 22C:60 (CS:2630), Fall 2011
mean   = 6.72          X
median = 7.0           X         X X
               X X     X   X     X X         X   X
  _______X_____X_X_____X___X___X_X_X_________X___X_X_X_____
    0 . 1 . 2 . 3 . 4 . 5 . 6 . 7 . 8 . 9 . 10. 11. 12. 13
                                     X
mean   = 17.93                       X
median = 16.5                        X
                           X     X X X X X               X
  _____________________X___X___X_X_X_X_X_X_X___X___X_X___X_____X___
    0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30
                                         X
mean   = 16.88                           X
median = 18.0                          X X     X
                             X       X X X X   X       X
  _______________X___X_______X_X_X_X_X_X_X_X___X_______X__
    0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 
mean   = 22.03                                           X
median = 23.3                                X         X X
                             X               X X       X X   X
  _____________________X_____X_______X_X_X___X_X_X_X_X_X_X_X_X_____
    0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30
mean   = 56.84
median = 56.1                    X   X   X
                                 X   X X X X
  _______X_____________X_______X_X_X_X_X_X_X_X___X___X_X_X_____X_X_____
    20. 24. 28. 32. 36. 40. 44. 48. 52. 56. 60. 64. 68. 72. 76. 80. 84
     - D D D D + + - - C C C C + + - - B B B B + + - - A A A A + +
a)
Your program contains the code BR PLACE.  It used to work, but you
added some code, and now the assembler gives an "out of bounds" error.
How would you rewrite this line to fix the error?
(1 point)
________JUMP____PLACE_______________________
____________________________________________
____________________________________________
About 3/5 of the class got this right. A few got partial credit for answers involving syntax errors JUMPS PLACE or chains of branches to branches to branches. Popu
b)
Your program also contains BGT PLACE and your additions cause
another "out of bounds" error here.  How would you rewrite it to fix the error?
(1 point)
________BLE_____L___________________________
________JUMP____PLACE_______________________
__L:________________________________________
Only 2 got this right. Partial credit was available to those who used chains of branches to branches, or who accidentally flipped the sense of the conditional by using BGT. 2/5 earned no credit.
c)
Your program also contains JSR R1,SUBR and as you added
a huge amount of code to your program, this also stopped working.
How would you rewrite it to fix the error?
(1 point)
________LIL_____R1,SUBR_____________________
________JSRS____R1,R1_______________________
____________________________________________
Only 3 got this right. Many earned partial credit, substituting LOAD for LIL or JUMPS for JSRS. 1/2 earned no credit.
| 
                         _5 _6 _E _3
 | 
3 got this right. Common errors earning partial credit included incorrect byte order or (worse) inconsistent byte order, incorrect opcodes (for some reason, ORIS was harder than LIL), and breaking up the segments of the long constant incorrectly (an issue distinct from byte order). Over 1/3 earned no credit.
Note that this question was extremely predictable. Everyone should have expected something like it on the exam, given that it followed naturally from problems that gave people trouble on the first midterm.
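The question itself is not reproduced above, so the following Python sketch only illustrates the "segments of the long constant" issue mentioned in the commentary. It assumes that LIL loads a 24-bit constant and that ORIS shifts the register left 8 bits and ORs in an 8-bit constant; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def split_constant(value):
    """Split a 32-bit constant into (high 24 bits, low 8 bits)."""
    value &= MASK32
    return value >> 8, value & 0xFF

def lil_oris(high24, low8):
    """Model LIL followed by ORIS (assumed semantics, not taken from the exam)."""
    reg = high24 & 0xFFFFFF              # LIL: load the 24-bit piece
    reg = ((reg << 8) | low8) & MASK32   # ORIS: shift left 8, OR in 8 bits
    return reg
```

Under these assumptions the split is 24 bits then 8 bits, not two 16-bit halves, and this choice is independent of the byte order used when the instructions are written out in memory.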
| 
    SUBR:   TBIT    R4,0
            BBS     SUB1
            SL      R3,1
    SUB1:   TBIT    R4,1
            BBS     SUB2
            SL      R3,2
    SUB2:   TBIT    R4,2
            BBS     SUB3
            SL      R3,4
    SUB3:   TBIT    R4,3
            BBS     SUB4
            SL      R3,8
    SUB4:   TBIT    R4,4
            BBS     SUB5
            SL      R3,16
    SUB5:   JUMPS   R1
 | 
Nobody did perfectly. The best answer, given by 1/6 of the class, was R3 = R3 * 2^(2^R4). There is something bizarre about the double exponentiation in this answer, but at least it expresses the idea that R3 is shifted an amount determined by R4. Over half the class earned no credit.
Note: There was an error that made the problem harder than intended: The ones complement was never intended. All of the BBS instructions should have been BBR, so it should have done R3 = R3 << R4.
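A small Python model may make the difference between the printed and the intended code concrete. It is a sketch that assumes 32-bit registers and considers only the low 5 bits of R4, as the code does; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def subr_as_printed(r3, r4):
    """BBS skips each shift when the tested bit of R4 is set."""
    for i in range(5):
        if not (r4 >> i) & 1:
            r3 = (r3 << (1 << i)) & MASK32   # SL R3,2**i
    return r3

def subr_as_intended(r3, r4):
    """With BBR, each shift is done only when the tested bit is set."""
    for i in range(5):
        if (r4 >> i) & 1:
            r3 = (r3 << (1 << i)) & MASK32   # SL R3,2**i
    return r3
```

In this model the intended version shifts R3 left by the low 5 bits of R4, while the printed version shifts by the ones complement of those 5 bits, that is, by 31 minus their value.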
        ;  |_|_|_|_|_|_| s = mantissa sign
        ;  |s|  e  | m | e = exponent, 111: not a number
        ;                              100: represents zero
        ;                              000: not normalized
        ;                m = mantissa, 0.XX if exponent = 000
        ;                              1.XX for other exponents
        FTOINT: MOVE    R4,R3   ; copy the number f (for exponent)
                SR    = R4,2
                TRUNC = R4,3    ; exp = (f >> 2) & 7
                CMPI  = R4,#7
                BEQ     FTOINN  ; if (exp == 7) go throw NAN exception
                CMPI  = R4,#2
                BLE     FTOIRZ  ; if (exp <= 2) go return zero
                MOVE    R5,R3   ; -- copy f so we can check the sign later
                TRUNC = R3,2
                ADDSI = R3,4    ; mant = (f & 3) + 4 -- set the hidden bit
                SL      R3,1    ; mant = mant << 1   -- a bit extra precision
        FTOILP: CMPI  = R4,#3
                BEQ     FTOILQ  ; while (exp != 3) {  -- shift point over
                SR      R3,1    ;   mant = mant >> 1
                ADDSI   R4,-1   ;   exp = exp - 1
                BR      FTOILP  ; }
        FTOILQ: ADDSI   R3,1    ; -- round to the nearest integer
                SR      R3,1    ; mant = (mant + 1) >> 1 -- discard extra bit
                BITTST= R5,5
                BBR     FTOIQT  ; if (sign bit of f is set) {
                NEG     R3      ;   mant = -mant
        FTOIQT:                 ; }
                JUMPS   R1      ; return mant
        FTOIRZ: LIS     R3,0
                JUMPS   R1      ; return 0
        FTOINN: LIL     R1,NAN_EXCEPTION
                LOAD    R2,R1,EXAR
                LOAD    R1,R1,EXHAND
                JUMPS   R1      ; throw NAN_EXCEPTION
a)
This code rounds to the nearest integer as it does the conversion.
Cross out all instructions involved with rounding so what remains
returns the result truncated toward zero.
(1.5 points)
The last two strikeouts (the ADDSI and SR instructions) were identified by most of the class. The first strikeout above (the SL instruction) was harder; it balances the SR done after rounding. About 1/5 of the class got this.
b) Among the instructions that remain, mark the ones that would need changes to make this code convert from IEEE format to integer. (2.0 points)
This question was only reasonable because the 6-bit floating point format is actually fully compatible with IEEE format except that the exponent and mantissa fields have been greatly shortened. The lines marked by = must be changed. In all cases, the value that needs changing is in the final field of the instruction. In each case, this value relates to the size of some field of the floating point number.
This was a hard problem. Many students appear to have marked lines almost at random, and it is difficult to generalize about the errors.
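To see why only field-size constants change, here is a Python sketch of the field extraction with the widths as parameters. The function `unpack` is hypothetical; the 8-bit exponent and 23-bit mantissa are the standard IEEE single-precision widths, and the sketch deliberately says nothing about exponent bias.

```python
def unpack(f, exp_bits, mant_bits):
    """Split a sign/exponent/mantissa word into its three fields."""
    sign = (f >> (exp_bits + mant_bits)) & 1        # top bit
    exp = (f >> mant_bits) & ((1 << exp_bits) - 1)  # middle field
    mant = f & ((1 << mant_bits) - 1)               # low field
    return sign, exp, mant

# 6-bit exam format: 3-bit exponent, 2-bit mantissa
toy = unpack(0b110110, 3, 2)
# IEEE single precision: 8-bit exponent, 23-bit mantissa
ieee = unpack(0xC0200000, 8, 23)   # bit pattern of -2.5
```

The shift counts, field masks and sign bit position all follow from `exp_bits` and `mant_bits`, which is the sense in which each marked constant relates to the size of a field.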
 
a)
How big is the L1 cache on this machine?
(0.5 point)
_____2K_____________________________________
b)
How many levels of cache have you discovered by running your program?
(0.5 point)
_____2______________________________________
c)
Which cache limits the speed for data structures filling
between 2K and 8K bytes?
(0.5 point)
_____L2_____________________________________
This was the easiest question on the exam. Half the class did perfectly on all 3 parts.
A few people got partial credit on part a), giving 8K, and 1/5 gave truly odd answers ranging from 1.5K to 32K.
In general, part b) was the easiest. Over 2/3 gave correct answers and the remainder got partial credit by stating that there were 3 cache levels.
1/6 of the class had serious trouble with part c), giving answers such as I cache or D cache or even worse. The information given in the question hints at a multilevel cache, but it does not give any evidence of any separation between I cache and D cache.
        INT     GETCHAR
GETCHAR:                        ; get char from keyboard
                                ; return char in R3, wipe out R4
        LIW     R4,KBDBASE+KBDSTAT
GETCLP:                         ; loop awaiting input
    =   LOADSCC R3,R4           ; test status
    =   BZS     GETCLP
        LIW     R4,KBDBASE+KBDDATA
        LOADS   R3,R4           ; get char
        JUMPS   R1              ; return
Recall that the keyboard status register contains a ready bit (bit zero), an error bit (bit 6), and an interrupt enable bit (bit 7). The error bit is set when a key is pressed while the ready bit is already set.
a)
Many students working on MP5 noticed that if they typed too fast, their
program lost control.  Identify and explain the error in the above code
that caused this.
(2 points)
__ It does not distinguish between the _____
__ error and ready conditions, and it ______
__ does not reset the error bit. ___________
____________________________________________
1/8 did well here, and 2/5 earned no credit. Partial credit was offered to those who identified the problem area in the code but failed to offer clear explanations of what went wrong.
b)
This error can be fixed by replacing two lines of the above code with
three lines.  Give the correction here:
(2 points)
_____=__LOADS___R3,R4_______________________
_____=__BITTST__R3,0____;_(test_KBDRDY)_____
_____=__BBR_____GETCLP______________________
1/5 did well here, while 2/5 earned no credit. A significant number got partial credit for inserting code from the notes without proper regard to context -- substituting LOAD R3,R4,KBDSTAT for the LOADS instruction above. This fails because the index registers are used differently in the code in the notes and the code quoted above from the Hawk monitor.
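The difference between the original test and the corrected one can be modeled in a few lines of Python. This is only a sketch: the bit positions come from the description of the status register above, while the constant and function names are hypothetical.

```python
KBD_READY = 1 << 0   # ready bit (bit zero)
KBD_ERROR = 1 << 6   # error bit (bit 6)
KBD_IE    = 1 << 7   # interrupt enable bit (bit 7)

def original_test(status):
    """LOADSCC / BZS: leave the loop whenever the status is nonzero."""
    return status != 0

def corrected_test(status):
    """LOADS / BITTST R3,0 / BBR: leave the loop only when ready is set."""
    return (status & KBD_READY) != 0
```

Once the error bit has been set and not reset, the status register stays nonzero even when no character is ready, so the original test falls through immediately while the corrected test keeps waiting.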
                SRU     R3,1            ; move everything one place right
                ADJUST  R3,CMSB         ; put the old LSB in the new MSB
This takes a total of 16 instructions to rotate 8 places. We can do it in 5 instructions:
                MOVE    R1,R3           ; make a copy of the low 8 bits
                SRU     R3,8            ; move the high 24 bits 8 places right
                
                SL____  R1,12_______
                
                SL____  R1,12_______    ; move the low bytes up to the high end
                
                OR____  R3,R1_______    ; merge the results
a) Fill in the missing code in the blanks provided above. (2 points)
Only 2 did well, while 3/5 of the class earned no credit. Common errors among those earning partial credit included 1/6 of the class, who invented the SLU instruction (presumably Shift Left Unsigned), and 1/5 of the class who used a shift count of 24 on the SL instruction, despite the fact that the shift count field of the instruction halfword is only 4 bits.
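The five-instruction sequence can be checked with a short Python model. It is a sketch assuming 32-bit registers, with SRU as a logical right shift and SL as a left shift that discards bits shifted out; the function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def rotate_right_8(x):
    r1 = x                       # MOVE R1,R3
    r3 = x >> 8                  # SRU  R3,8
    r1 = (r1 << 12) & MASK32     # SL   R1,12
    r1 = (r1 << 12) & MASK32     # SL   R1,12 (24 places in total)
    return r3 | r1               # OR   R3,R1
```

The 24-place left shift is split into two shifts of 12 because a single shift count of 24 does not fit in the 4-bit shift count field.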
b)
This could have been done in one less instruction.  How?  Hint:
MOVE was a mistake.
(1 point)
___ Change MOVE R1,R3 to MOVESL R1,R3,12 ___
___ and eliminate one SL instruction _______
Just 1 did well here, while 4/5 of the class earned no credit. The remainder used a shift count of 24 on the MOVESL instruction, earning partial credit.
c)
If this had been a 16-bit rotate, it could have been even faster.  Why?
(1 point)
___ We can use MOVESL R1,R3,16 _____________
___ with no added SL instruction ___________
2 did well, while the remainder earned no credit.
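The answers to parts b) and c) can be modeled the same way in Python. This sketch assumes 32-bit registers and that MOVESL copies its source shifted left by the given count; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def rotate_right_8_short(x):
    r1 = (x << 12) & MASK32      # MOVESL R1,R3,12
    r3 = x >> 8                  # SRU    R3,8
    r1 = (r1 << 12) & MASK32     # SL     R1,12
    return r3 | r1               # OR     R3,R1

def rotate_right_16(x):
    r1 = (x << 16) & MASK32      # MOVESL R1,R3,16
    r3 = x >> 16                 # SRU    R3,16
    return r3 | r1               # OR     R3,R1
```

The 8-place rotate still needs one SL because 24 places cannot be covered by a single shift, while the 16-place rotate needs a left shift of only 16, which the answer to part c) does in the MOVESL alone.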
                LOADS   R5,R3        ; move low half
                STORES  R5,R4
                LOAD    R5,R3,4      ; move high half
                STORE   R5,R4,4

                LOADS   R5,R3        ; get low half
                LOAD    R6,R3,4      ; get high half
                STORES  R5,R4        ; put low half
                STORE   R6,R4,4      ; put high half
A problem:
Why is the second fragment of code faster on most modern computers?
(1 point)
___ it cuts pipeline delays by not doing ___
___ LOAD then STORE on the same register ___
1/2 earned full credit, while 2/5 earned no credit. Among those earning partial credit, a common failure was to assert that the instructions could run in parallel or simultaneously (as opposed to partially in parallel), or to hint at the existence of some kind of pipelining or parallelism while not coming out and saying it.
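A rough way to see the effect is to count load-use hazards in each fragment. The Python sketch below assumes a simple pipeline in which an instruction stalls one cycle if it uses a register loaded by the instruction immediately before it; real pipelines differ in detail, and the tuple encoding of instructions is hypothetical.

```python
def count_load_use_stalls(program):
    """Count adjacent pairs where a load's destination is used at once."""
    stalls = 0
    for (prev_op, prev_dst, _), (_, _, cur_srcs) in zip(program, program[1:]):
        if prev_op == "load" and prev_dst in cur_srcs:
            stalls += 1
    return stalls

# Each entry is (operation, destination register, source registers).
fragment1 = [
    ("load",  "R5", ("R3",)),        # LOADS  R5,R3
    ("store", None, ("R5", "R4")),   # STORES R5,R4
    ("load",  "R5", ("R3",)),        # LOAD   R5,R3,4
    ("store", None, ("R5", "R4")),   # STORE  R5,R4,4
]
fragment2 = [
    ("load",  "R5", ("R3",)),        # LOADS  R5,R3
    ("load",  "R6", ("R3",)),        # LOAD   R6,R3,4
    ("store", None, ("R5", "R4")),   # STORES R5,R4
    ("store", None, ("R6", "R4")),   # STORE  R6,R4,4
]
```

Under this assumption the first fragment stalls twice and the second not at all, because using R6 for the high half moves each STORE one instruction away from the LOAD it depends on.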