Exam 3: FinalSolutions and Commentary
Part of
the homework for 22C:60, Fall 2007
|
Median = 9.0 X X X X X X _________________X___X_X_X_X_____X_X_X_X_X___X___X_X_X___X_X_______ 0 . 1 . 2 . 3 . 4 . 5 . 6 . 7 . 8 . 9 . 10. 11. 12. 13. 14. 15.
Mean = 23.55 X X X X X X X X X ________________X_X_X_X_X___X_X_X___X_X_X_X_X___________________ 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30. 32. 34. 36. 38. 40
mean = 19.01 X X X X X X X X X _______X_______________X___X_X_X_X___X_X_X_X_______X_X_X__________ 0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30.
mean = 22.25 X X X X X X X X _______________________X_____X___X_X___X_X___X___X_X_X___X_X_X___X_______ 0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30. 32. 34
mean = 64.81 X X X X X X __________________X_____X___X_____X_X_X_X_X_X_X_______X_X___X___X_X___X_ 20. 24. 28. 32. 36. 40. 44. 48. 52. 56. 60. 64. 68. 72. 76. 80. 84. 88 D + | - C + | - B + | - A +
LOAD R3,PARRAY LOAD R4,10 LOOP: STORES R0,R3 ADDSI R3,4 ADDSI R4,-1 BGT LOOP
a) Given that ARRAY is a global array of 10 integers, write appropriate SMAL code to declare ARRAY and PARRAY (1 point)
_____________COMMON__ARRAY,40_______________________________________________ _____PARRAY:_W_______ARRAY__________________________________________________
1/3 did perfectly here. 1/3 did well except for giving an odd size for ARRAY (the most common odd size was 10, forgetting that the size is in bytes, but several gave sizes in the thousands).
b) Write high level language code equivalent in effect to the HAWK code fragment given above in some procedural high-level language. (1 point)
_____for (i = 0; i < 10; i++) array[i] = 0;_________________________________ ____________________________________________________________________________ ____________________________________________________________________________ ____________________________________________________________________________
1/4 did well. Almost as many earned no credit at all. The most common solutions that earned partial credit involved deep focus on the control structure while forgetting the loop body (the STORES). Fully half the class made this mistake. Typically, these partial credit answers did not abstract very far from the assembly language code, retaining one line of code per line of assembly code.
c) Loop unrolling might improve the performance of this code. Show the body of the unrolled loop that you would replicate once for each iteration. (1 point)
_____________STORES___R0,R3_________________________________________________ _____________ADDSI____R3,4__________________________________________________ ____________________________________________________________________________
1/3 did well. 1/10 earned no credit. The most common error was to include code in the unrolled version to manipulate R4. 1/3 made this mistake. Since R4 is the loop counter, it serves no purpose when the loop is fully unrolled.
d) Suppose the CPU has a small I-cache. Explain how loop unrolling could actually slow the execution of this code. (1 point)
_____In the unrolled loop, no instruction is ever executed twice____________ _____so the cache cannot help as it does in the iterative version___________ _____If the cache offers more than a 2 times speedup, iterative is faster.__
1/4 did well here. 1/4 earned partial credit by saying that the unrolled version could overflow the instruction cache, but did not explain the performance consequences. 1/2 gave irrelevant answers.
INT POWER10 POWER10: ; on entry, R3 = a, R4 = b ; on exit, R3 = a * 10**b LOOP: a: TESTR R4 b: BLE QUIT ; while (b > 0) { c: SL R3,1 d: ADDSL R3,R3,2 ; a = a * 10 e: ADDSI R4,-1 ; b = b - 1 f: BR LOOP ; } QUIT: JUMPS R1 ; return a
a) Suppose this is run on a pipelined machine. How could you rearrange the code above to minimize pipeline interlocks? (The labels a,b,c,d,e,f are included above so you can give your answer by listing the labels in the order you want the instructions rearranged.) (1 point)
_____________a b c e d f____________________________________________________
1/3 did well here. 2/3 earned no credit by proposing reorderings that changed the meaning of the code. There was no partial credit. Typical changes to the meaning involved changes to the number of times lines c and d are executed, so that the POWER10 routine would no longer compute the reqired value.
b) Suppose you needed to call POWER10 from a separately assembled source file. Write the code you would include in the calling source file (or in a header file used by the calling source file) to support linking to POWER10. (1 point)
_____________EXT_____POWER10________________________________________________ _____PPOWER10:_W_____POWER10__________________________________________________ ____________________________________________________________________________
This was the easiest question on the exam. Only a few students had any difficulty.
c) When a subroutine calls POWER10 is there any need to push and pop the activation record? Why? This is a short answer question! (1 point)
_____________No. POWER10 does not use its activation record_______ _____________nor does it call any other subroutines_________________________
1/4 did well here. 1/3 said No, and then failed to offer a useful explanation, earning half credit. 1/6 gave wrong answers.
2/5 did well here. 1/4 earned no credit. More had difficulty with the inputs than the outputs. A significant number of those who had difficulty added R and S labels on the intermediate paths in the multiplexor that are neither inputs or outputs. These labels were not asked for in the question and are misleading in any attempt to understand the circuit.
_____________ADDSL___R3,R3,2________________________________________________ _____________ADDSL___R3,R3,2________________________________________________ _____________SR______R3,6___________________________________________________ ____________________________________________________________________________ ____________________________________________________________________________ ____________________________________________________________________________
1/3 did well, 2/5 earned no credit. Partial credit was offered to those who divided before multiplying (for a significant loss of precision) or who managed to multiply by strange values not associated with the problem. Among those who did well, there was considerable variation in optimization, so full credit was given for the optimal answer, while small penalties were offered for solutions requiring 4 or 5 instructions.
An exception is a one-word global variable holding a pointer to the activation record of the current handler. If an activation record contains a handler, it must begin as follows:
RETAD = 0 ; return address HANDLER = 4 ; address of the handler's code OLDEX = 8 ; previous value of the exceptionAssume that PRANGE is the (constant) pointer to the global variable RANGE holding the range exception. a) Write code to raise the range exception using this new model. (1 point)
_____________LOAD____R2,PRANGE______________________________________________ _____________LOADS___R2,R2__________________________________________________ _____________LOAD____R1,R2,HANDLER__________________________________________ _____________JUMPS___R1_____________________________________________________ ____________________________________________________________________________
This was a hard problem. One student had a good answer. Many students clearly started writing too soon, copying irrelevant code from the text even though the problem statement said that this design was an alternative to the design in the text.
In thinking about this code, it is important to understand that the text in the problem statement really is a complete description of an alternative to the scheme in the text. Single words carry heavy import here.
b) Given that MYHAND is the label on a handler, write code to save the old range exception and install MYHAND as the current handler. (2 points)
_____________LOAD____R1,PRANGE______________________________________________ _____________LOADS___R1,R3__________________________________________________ _____________STORE___R3,R2,OLDEX____________________________________________ _____________STORES__R2,R1__________________________________________________ _____________LEA_____R1,MYHAND______________________________________________ _____________STORE___R1,R2,HANDLER__________________________________________ ____________________________________________________________________________ ____________________________________________________________________________ ____________________________________________________________________________
As above, this was a hard question. Nobody got it right, although tere was significant partial credit. Certain elements, such as the initial loading of an index register with the pointer to the exception variable, were fairly obvious, as were the loading of the address of the handler and the need to store that address somewhere.
Some students opted to copy the code from the notes, although it used a completely different data structure. That was not worthless, since it did include these key parts.
c) In comparing the exception handling mechanism in the text with the one described here, there are space-time tradeoffs, but there is also a difference in expressive power. How is this scheme less flexible than the scheme discussed in the notes? (2 points)
_____________This alternative model does not permit one subroutine__________ _____________to install handlers simultaneously for multiple exceptions,____ _____________and it makes it messy to nest try/catch blocks for a single____ _____________exception in one subroutine.___________________________________ ____________________________________________________________________________ ____________________________________________________________________________
One got this right. Many answers suggested that students had not taken the time to understand the question. The fact that an exception is a one-word variable, for example, does not limit its expressive power. That word is a pointer to a block of memory (the activation record) that has many fields. This exception handling model is, incidentally, perfectly appropriate for languages such as C++ and Java where there is only one exception, and the (logical) type of the exception is passed as a parameter to the handler.
; see if the protection violation was CPUGET ; all registers have been saved ; R2 points to register save area a: LOAD R3,R2,PCSAVE ; get the saved program counter b: LOADS R4,R3 c: EXTH R5,R4,R3 ; the the halfword it points to ; R4 is the instruction that caused the trap! d: LIL R6,#F0F0 e: AND R5,R6 ; clear all but the opcode bits f: CMPI R5,OPCPUGET ; is it a CPUGET instruction? g: BEQ ISCPUGET h: LOAD R3,R2,PSWSAVE ; it is not CPUGET! i: CPUSET R3,PSW ; restore saved PSW j: LOAD R1,PPROGERR k: LOAD R2,R1,EXAR l: LOAD R1,R1,EXHAND m: CPUSET R1,TPC n: CPUGET R0,TPC ; raise PROGRAM_ERROR exceptionRecall that R2 points to the register save area, and that the identifiers R1SAVE, R2SAVE etc. are used for the offsets of the saved values of registers.
a) Lines j through l begin to throw a program-error exception. In the normal sequence of instructions for throwing an exception, lines m and n would be replaced by JUMPS R1. Why did we need to use this strange code instead? (1 point)
_____________JUMPS R1 and the CPUSET/CPUGET sequence above have the_________ _____________same effect on the PC, but the CPUGET instruction also_________ _____________restores the LEVEL field of the PSW.___________________________
1/3 got this right. One got partial credit for an answere that was relevant but never got to the point.
b) Suppose we decide that all trap handlers will return to the user program by a jump to TRAPEXIT (that was the code to restore registers and return). The above code doesn't do this! Instead, it does its own return in line n. Write replacement code for lines h through n that has the same net effect (raising the program error trap) but does so by altering the saved registers and then jumping to TRAPEXIT where they are restored. (3 points)
_____________LOAD____R1,PROGERR_____________________________________________ _____________LOAD____R3,R1,EXAR_____________________________________________ _____________STORE___R3,R2,R2SAVE___________________________________________ _____________LOAD____R3,R1,EXHAND___________________________________________ _____________STORE___R3,R2,PCSAVE___________________________________________ _____________JUMP____TRAPEXIT_______________________________________________ ____________________________________________________________________________ ____________________________________________________________________________
One got this right, 1/3 got no credit. In general, troubles revolved around storing to R2SAVE and PCSAVE. A surprising number of students had difficulty with the final JUMP and tried replacing it with sequences ending with a JUMPS. This new code is significantly simpler to understand than the original code it replaces!