Final Exam, Spring 2003

Part of 22C:122, Spring 2003
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Note: This is an open-book, open-notes exam. Please write your answers in the exam booklet supplied. Leave blank lines between your answers to different questions, and leave a margin down the side of each page. Please write your name in the form it appears on your university ID card; write your name in the blank provided on the front cover. This exam is worth 30 points and has 15 parts! Allocate about 8 minutes per part.

General Background: Several problems on this exam refer to the PµCPL Architecture (Pipelined MicroController with Programmable Logic), a more detailed version of the architecture proposed in the study questions distributed for this exam. Here are the relevant parts of the PµCPL specification:

A 16-bit word
1. 64K word-addressable memory address space
2. low memory (starting at zero) is read-only
A 4-stage pipe:
1. IF -- Instruction Fetch
2. AC/OF -- Address Computation / Operand Fetch
3. MA/OP -- Memory Access / Operate
4. RS -- Result Store
7 General Registers, R[1] to R[7], one word each
- R[0] is a special case

Instruction Formats:

	 ___ _____ _____ _______________ 
	|0 0|  r  |  x  |      disp     |  store M[R[x] + disp] = R[r]
	|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|
	|0 1|  r  |  x  |      disp     |  load R[r] = M[R[x] + disp]
	|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|
	|1 0|  r  |  x  |      disp     |  immed R[r] = R[x] + disp
	|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|
	|1 1|  r  |  x  |    op   |  d  |  operate R[r] = R[x] op R[d]
	|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|

Special cases for R[0]

R[0] can mean either the program counter or the constant zero, or in contexts where neither of these makes sense, the a value of 000 in the r, x or d field can be used to select other opcodes. This will be the subject of some of the exam questions.
Programmability

The interpretation of the op field of the operate instructions by the ALU stage of the pipeline is entirely programmable and will be the subject of some of the exam questions.

The Exam

Background: The PµCPL Architecture looks like there is no room for expansion in the instruction set because all 4 combinations of 2 opcode bits are taken! This is not true because R[0] is a special case.
Consider each of the following and identify the appropriate interpretation or appropriate interpretations of R[0]. It could mean PC, it could mean the constant zero, it might be useful as both, and it might be that neither makes sense; in that case, you have found room to expand this instruction set. An example is given at the end of this list:
```
	 ___ _____ _____ _______________ 
Part a)	|0 0|  r  |0 0 0|      disp     |  store M[ ?   + disp] = R[r]
	|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|
Part b)	|0 0|0 0 0|  x  |      disp     |  store M[R[x] + disp] =   ?
	|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|
Part c)	|0 1|  r  |0 0 0|      disp     |  load R[r] = M[ ?   + disp]
	|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|
Part d)	|0 1|0 0 0|  x  |      disp     |  load  ?   = M[R[x] + disp]
	|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|
Part e)	|1 0|  r  |0 0 0|      disp     |  immed R[r] =  ?   + disp
	|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|
Example	|1 0|0 0 0|  x  |      disp     |  immed ?    = R[x] + disp
	|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|
```
In the final example, if R[0] cannot mean the constant zero, since storing to a constant is meaningless. If R[0] means PC, however, we have a very useful instruction, an indexed branch; with a displacement of zero, this is a classic function return, and if x=0 also means the program counter, this could be a PC-relative branch.
Background: One thing missing from the PµCPL instruction set is any specialized support for function calls. For the sake of this question, ignore what you did in the previous question and simply assume that R0 always refers to the program counter. Consider the following proposal for a function call linked through register 2:
```
	IMMED 2,0,3     R[2] = PC + 3
	LOAD  PC,PC,1   PC = M[PC + 1]
        CONST F         -- address of F
```
Part a) Suppose we'd had enough "opcode space" to allow a dedicated call instruction, or more properly, a branch and link instruction that did everything a normal PµCPL jump instruction does but also saves the return address in a register. How much faster, in clock cycles, would this new call be than the call given above?
Part b) If the PµCPL pipeline is implemented the naive way, the CPU will read the constant F from memory twice. If we add some forwarding logic, we can eliminate one of these fetches. Explain why the double read, explain where the value of F can be forwarded from to eliminate the extra read, and describe, at an abstract level, the forwarding predicate that is required.
Background: In typical microcontrollers, all of RAM and ROM is on the same chip as the processor. The small size of the main memory on such systems and the fact that no off-chip data paths are required for access to memory makes it possible to use memory with a cycle time that is closely matched to the CPU clock rate.
A question: What memory arbitration and cache structure would you recommend for use in the PµCPL? A top-level description will suffice.
Background: Assume that the PµCPL operate instructions are entirely based on programmable logic. That is, the meaning of the op field is entirely dependent on the values loaded into the program registers of the programmable logic. Assume that of the data paths in the load, store or immediate operations use programmable logic. Assume that, at power-up, the processor begins executing code at memory location zero.
Part a) What must you assume about the mechanisms for accessing the programmable logic in order to bootstrap this machine?
Part b) What must you assume about the mechanisms for accessing the programmable logic in order to allow interrupt handling on this machine?
Background: A fully general programmable logic system for the PµCPL operate instructions may not be very useful. Assume, instead, that there is a general purpose ALU with a shift network attached to one input, a shift network attached to the result output, and programmable carry propagation between successive digits.
Part a) How many bits would you expect to be needed control a general purpose shift network for the PµCPL? Explain your answer! That is, what might you expect each bit or group of bits control?
Part b) A typical ALU has 4-bits of function select, if we ignore control over carry propagation. Given this, how many bits would you want in the op field for complete control over the ALU and shifters?
Part c) Here is a proposal for the programmable logic in our ALU: In addition to a carry propagation control register, we allow each bit of control input to the ALU and shifters to be computed as the programmed and of any of the bits in the op field, or their inverses. How many bits does it take to program each bit of control input, and how many control registers would you use to hold these bits?
Part d) Combining your answers to parts b and c, how many control registers are required to control the programmable ALU stage of our pipeline?
Background: If a PµCPL program first sets one of the operate stage programmed logic control registers, and also, in an adjacent instruction, does an operate instruction, there may be a need for pipeline interlocks. The logic control register load could be finished in one of the following pipeline stages:
- AC/OF -- Address Computation / Operand Fetch
- MA/OP -- Memory Access / Operate
- RS -- Result Store
Problem: The interlock required and the need for interlocks depends on which stage does the work. For each of the possibilities listed above, is interlock logic required? In each case, must an operate instruction be blocked to await the loading of a logic control register, or must the loading be blocked to await completion of an operate instruction?