We have 64 opcodes for length 1, 128 for length 2, and 64 for length 3.
         _________________________________________________
	|___I1__|___I2__|___I3__|___I4__|___I5__|___I6__|_...
            |       |       |       |       |       |
            o---    |       |       |       |        -- 
            |  _|_  |       |       |        --o-----  |
            | |   | |       |        --o-----o-|---  | |
            | | F | |       o--------o-|---  | |   | | |
            | |___| o-------|------  | |   | | |   | | |
            |   |   |       |     _|_|_|_ _|_|_|_ _|_|_|_
            |    ---|-------|-----\0_1_2/-\0_1_2/-\0_1_2/
         ___|_______|_______|___  ___|_______|_______|_____
	|__P1B1_|__P1B2_|__P1B3_||__P2B1_|__P2B2_|__P2B3_|_...
Part B, C and D: Give the column labels on the truth-table ... How big is this ROM? ... Give the contents of 2 rows of this ROM ...
 F  I1B1 I1B2 | Mux Control
 -------------+-------------
      0    0  |      0
      0    1  |      1
      1    0  |      1
      1    1  |      2
The stupid design, with no optimization, needs a 4-word ROM with 2 bits per word. A better design eliminates the ROM entirely and simply uses I1B1 and I1B2 (the two most significant bits of I1) to directly control custom-designed 3-input multiplexors.
This solution is predicated on the fact that later stages in the pipeline don't mind receiving nonsense in unneeded bytes of the instructions. If they require that unneeded bytes be cleared or filled with special default values, a more complex solution will be required.
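The ROM-based version of this decoder can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, bytes are assumed to be 8 bits wide, and the relation "mux control k means the instruction is k+1 bytes long" is an assumption inferred from the truth table, not something stated explicitly above.

```python
# Hypothetical sketch; names and the 8-bit byte width are assumptions.

# The 4-word, 2-bit ROM from the truth table, indexed by (I1B1, I1B2).
MUX_ROM = [0, 1, 1, 2]


def mux_control(i1: int) -> int:
    """Return the mux control value for the first instruction byte i1 (0..255)."""
    top_two = (i1 >> 6) & 0b11  # I1B1 and I1B2, the two most significant bits
    return MUX_ROM[top_two]


def instruction_length(i1: int) -> int:
    """Assumed relation: mux control k corresponds to a (k+1)-byte instruction."""
    return mux_control(i1) + 1


def opcode_counts() -> dict:
    """Count how many of the 256 possible first bytes select each length."""
    counts = {1: 0, 2: 0, 3: 0}
    for i1 in range(256):
        counts[instruction_length(i1)] += 1
    return counts


if __name__ == "__main__":
    print(opcode_counts())  # {1: 64, 2: 128, 3: 64}
```

Under these assumptions the counts agree with the 64, 128 and 64 opcodes quoted for lengths 1, 2 and 3.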
Notice that the valid bits play no part in this! They are extremely important in the instruction register shift logic, and they determine whether the instruction presented to the pipeline can be executed, but they play no necessary role in data routing.
Boldface indicates changes in the following.
-- assume the following named index registers:
--   AR - pointer to the current activation record
--   NR - points to new activation record (temporarily)
--   RA - register holding return address (temporarily)
-- if P points to an activation record
--   P->RA holds the return address
--   P->DL holds the saved dynamic link
--   P->SL holds the saved static link
-- we assume that ARS, the size of each activation
-- record is known statically; it may vary from
-- place to place depending on the number of
-- temporaries currently needed, but each time
-- any particular instruction is executed, the
-- size is fixed.

--------------------
-- calling sequence
        ADD     NR = AR + ARS
        -- copy parameters into fields of *NR
        -- save registers into fields of *AR
        MOVE    NR->SL = DST_CONTEXT
        CALL    RA,DST
        -- restore registers from *AR

--------------------
-- receiving sequence
DST     MOVE    NR->RA = RA
        MOVE    NR->DL = AR
        MOVE    AR = NR

--------------------
-- return sequence
        MOVE    RA = AR->RA
        MOVE    AR = AR->DL
        JUMP    *RA
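To see how AR, NR and RA move through a call and return, the three sequences can be mimicked with a toy Python model. This is a sketch under loud assumptions: the Machine class, the dictionary-based memory, and the integer "addresses" are all hypothetical, and parameter passing and register saving are left out.

```python
# Hypothetical model of the sequences above; the memory layout is an assumption.

class Machine:
    def __init__(self):
        self.mem = {0: {}}  # activation record address -> its fields
        self.AR = 0         # pointer to the current activation record
        self.NR = 0         # pointer to the new activation record (temporary)
        self.RA = None      # return address (temporary)

    def calling_sequence(self, ars, dst_context, return_address):
        self.NR = self.AR + ars                 # ADD  NR = AR + ARS
        self.mem[self.NR] = {}
        self.mem[self.NR]["SL"] = dst_context   # MOVE NR->SL = DST_CONTEXT
        self.RA = return_address                # CALL RA,DST
        self.receiving_sequence()

    def receiving_sequence(self):
        self.mem[self.NR]["RA"] = self.RA       # MOVE NR->RA = RA
        self.mem[self.NR]["DL"] = self.AR       # MOVE NR->DL = AR
        self.AR = self.NR                       # MOVE AR = NR

    def return_sequence(self):
        self.RA = self.mem[self.AR]["RA"]       # MOVE RA = AR->RA
        self.AR = self.mem[self.AR]["DL"]       # MOVE AR = AR->DL
        return self.RA                          # JUMP *RA


if __name__ == "__main__":
    m = Machine()
    m.calling_sequence(ars=8, dst_context=0, return_address=100)
    m.calling_sequence(ars=12, dst_context=8, return_address=200)
    print(m.AR)                 # 20
    print(m.return_sequence())  # 200
    print(m.AR)                 # 8
```

Note that ARS differs between the two calls, matching the assumption that the record size is fixed at each call site but may vary from place to place.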
Part A: Does this alternative model offer a potential performance advantage? Why or why not?
Removing instructions from the pipeline as soon as possible eliminates potential operand conflicts by making the operands available sooner. Furthermore, once the instruction is completed, the resulting bubble in the pipeline will insulate the upper level in the pipe from stalls in lower levels.
Part B: Does this alternative model pose potential problems for interrupt and trap handlers? Explain!
If the instruction sequence A B C is executed in a conventional pipeline, A will be completed first and C will be completed last. If, however, instruction A is completed in stage 5 of the pipe, while B is completed in stage 3 and C is completed in stage 1, they will be completed in the order C B A (with one instruction entering the pipe per clock cycle, C finishes one cycle before B and two cycles before A). In sum, the alternative leads to out-of-order execution.
Out-of-order execution requires new interlock logic, but it also poses problems with interrupts. Consider the sequence A B C D where A enters the pipeline first but B is completed first. If an interrupt is requested after B is completed but before A is completed, we must find a way to execute A and then C, but not B, on return. Either that, or we must find a way to inhibit out-of-order execution before the interrupt is handled.
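The completion order for any assignment of completion stages can be checked with a toy timing model. This is a minimal sketch, assuming one instruction enters stage 1 per clock cycle and nothing stalls; the function names are hypothetical, and ties are broken in program order by assumption.

```python
# Hypothetical timing model: one instruction issued per cycle, no stalls.

def completion_order(program):
    """program is a list of (name, completion_stage) pairs in program order.

    An instruction issued in cycle i that completes in stage s is assumed
    to finish in cycle i + s - 1. Returns the names sorted by finish cycle,
    with ties broken in program order.
    """
    finish = []
    for issue_cycle, (name, stage) in enumerate(program, start=1):
        finish_cycle = issue_cycle + stage - 1
        finish.append((finish_cycle, issue_cycle, name))
    finish.sort()
    return [name for _, _, name in finish]


def is_out_of_order(program):
    """True if the completion order differs from program order."""
    return completion_order(program) != [name for name, _ in program]


if __name__ == "__main__":
    example = [("A", 5), ("B", 3), ("C", 1)]
    print(completion_order(example))
    print(is_out_of_order(example))
```

In a conventional pipeline every instruction completes in the same stage, so is_out_of_order returns False; once completion stages differ, a later instruction can finish ahead of an earlier one, which is the situation an interrupt handler must cope with.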