Always, on every assignment, please write your name legibly as it appears
on your University ID and on the class list! All assignments will be due
on Fridays at the start of class, and unless there is what insurance companies
call "an act of God", the only exceptions to this rule will be by advance
Consider again (you may be getting tired of this) the example architecture
from the midterm. This machine has two pipeline stages that commit the results
of instructions: the ALU/MEM stage commits the results of store instructions,
while the OS stage commits the results of LOAD, LOAD-ADDRESS and OPERATE
instructions.
The trap conditions that might concern us on this machine include traps
raised because of unimplemented physical memory addresses, write operations
to read-only memory, and any unimplemented ALU operations.
a) What pipeline stage(s) cause(s) traps?
b) When a trap condition is detected:
- The contents of some pipeline stages are invalidated.
- The contents of some pipeline stages are unchanged.
- The contents of one stage become a special instruction
that saves some version of the PC in the trap-save register.
- The PC in the INSTRUCTION-FETCH stage is changed to the first location
in the trap service routine.
What stages have their contents invalidated, what stages are allowed to
finish this pipeline cycle, what stage injects this new instruction, and
what version of the PC is saved?
Hint: PC-relative addressing also required that we have multiple
versions of the PC.
c) Are traps implemented in this way precise or imprecise?
Consider a pipelined machine with a 1 nanosecond pipeline clock
cycle, where the memory word is large enough to contain two
instructions, so every other instruction fetch involves one memory
cycle, on the average, and where 1 instruction in 4 is a load
instruction and 1 instruction in 8 is a store instruction. Assume
an I-cache that achieves a 10% miss rate, and a write-through D-cache
with a 10% miss rate. Both caches share one memory port through
memory arbitration logic.
a) How many memory cycles per second does this processor generate?
(You might also want to express your answer in nanoseconds per memory cycle.)
b) What fraction of the instruction fetches will be blocked by memory
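One way to organize the arithmetic for part a) is to count main-memory cycles
per instruction and then scale by the clock rate. The sketch below is only an
illustration under assumptions that the problem statement does not spell out:
the pipeline issues one instruction per clock cycle with no stalls, every
store goes to main memory because the D-cache is write-through, and a store
miss does not trigger an extra read. All function and variable names are made
up for this example.

```python
from fractions import Fraction


def memory_cycles_per_instruction(
    fetch_refs=Fraction(1, 2),    # two instructions per word: 1 fetch reference per 2 instructions
    load_frac=Fraction(1, 4),     # 1 instruction in 4 is a load
    store_frac=Fraction(1, 8),    # 1 instruction in 8 is a store
    icache_miss=Fraction(1, 10),  # I-cache miss rate
    dcache_miss=Fraction(1, 10),  # D-cache miss rate
):
    """Average main-memory cycles generated per instruction (assumed model)."""
    ifetch = fetch_refs * icache_miss  # only I-cache misses reach memory
    loads = load_frac * dcache_miss    # only D-cache load misses reach memory
    stores = store_frac                # assumption: write-through sends every store to memory
    return ifetch + loads + stores


CLOCK_NS = 1  # pipeline clock period in nanoseconds

# Assumption: one instruction per clock cycle, so instructions/second = 10**9 / CLOCK_NS.
per_instr = memory_cycles_per_instruction()
cycles_per_second = per_instr * 10**9 / CLOCK_NS
ns_per_memory_cycle = CLOCK_NS / per_instr

print(per_instr, cycles_per_second, ns_per_memory_cycle)
```

Exact fractions are used so that the rates in the problem statement combine
without rounding. If stalls for memory arbitration are taken into account, the
instruction rate drops and the figures change accordingly.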
In the notes for lecture 36, there is an algorithm given for
write-through-cached-ram. Rewrite this so that, instead
of using linear search, it uses a small, very fast 256-word RAM,
as described in the next section of the notes. Assume that the system word
size for data and addresses is 32 bits. If you want, you may use two RAMs to
build the cache, one for data and one for tags.
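As a rough illustration of the idea, and not the algorithm from the notes, a
direct-mapped organization replaces the linear search with a single indexed
lookup. The sketch below assumes word addressing, so the low 8 bits of the
32-bit address select one of the 256 cache words and the remaining 24 bits
form the tag. It also assumes that a write updates the cache line as well as
main memory. The class name, the method names, the use of a Python dict as
main memory, and the read counter are all hypothetical, and `None` in the tag
RAM stands in for the valid bit that hardware would need.

```python
LINES = 256             # size of the small fast RAMs, in words
INDEX_BITS = 8          # log2(LINES)
INDEX_MASK = LINES - 1  # selects the low 8 bits of a word address


class WriteThroughCache:
    """Direct-mapped, write-through cache sketch with separate data and tag RAMs."""

    def __init__(self, memory):
        self.memory = memory        # main memory: dict mapping word address -> word
        self.data = [0] * LINES     # data RAM
        self.tags = [None] * LINES  # tag RAM; None marks an invalid line
        self.mem_reads = 0          # counts reads that had to go to main memory

    @staticmethod
    def _split(addr):
        # Low bits index the RAMs, high bits are kept as the tag.
        return addr & INDEX_MASK, addr >> INDEX_BITS

    def read(self, addr):
        index, tag = self._split(addr)
        if self.tags[index] == tag:  # hit: one lookup, no search
            return self.data[index]
        # Miss: fetch from main memory and replace whatever was in the line.
        self.mem_reads += 1
        value = self.memory.get(addr, 0)
        self.tags[index] = tag
        self.data[index] = value
        return value

    def write(self, addr, value):
        index, tag = self._split(addr)
        self.memory[addr] = value  # write-through: main memory is always updated
        self.tags[index] = tag     # assumption: the written word is also cached
        self.data[index] = value
```

In this sketch, addresses 5 and 261 map to the same line because they share
their low 8 bits, so alternating reads of the two miss every time. That
conflict behavior is the price of replacing the search with direct indexing.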