Assignment 8, due Mar 28

Solutions

Part of the homework for 22C:60 (CS:2630), Spring 2014
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

On every assignment, write your name legibly as it appears on your University ID card! Homework is due on paper at the start of class on the day indicated (usually Friday). Exceptions will be made only by advance arrangement (excepting "acts of God"). Late work must be turned in to the TA's mailbox (ask the CS receptionist in 14 MLH for help). Never push homework under someone's door!

Background: Examine the circuit diagram for the adder modified to perform logic at the near the end of Chapter 8 of the notes. Assume your ALU includes a bank of 32 of these circuits with the carry out of each connected to the carry in of the next.

A problem: Make a table showing all possible combinations of f₁, f₂, f₃, and c₀ (the carry in to the least significant bit). For each row in this table (that is, for each combination of the inputs) document the function performed by this bank of logic circuits. (for example, one row might compute A+B+1, while some other row might compute A&B, and yet another row might perform some useless function. (1.0 points)

f₁ f₂ f₃ c₀ Function

0 0 0 0 —
0 0 0 1 —
0 0 1 0 a & b
0 0 1 1 —
0 1 0 0 —
0 1 0 1 —
0 1 1 0 —
0 1 1 1 —
1 0 0 0 a ⊕ b
1 0 0 1 —
1 0 1 0 a | b
1 0 1 1 —
1 1 0 0 a + b
1 1 0 1 a + b + 1
1 1 1 0 —
1 1 1 1 —

A quick question: What is loop unrolling? (Hint: Quoting Wikipedia here would be a very bad idea. It gives an awful definition, and then gives some excellent examples. Chapter 9 in the notes also gives good examples. (0.4 points)

Loop unrolling involves repeating the body of the loop in order to reduce the number of iterations.

Background: Compare the code for faster binary multiplication on the Hawk (about 1/5 of the way into Chapter 9) with the code for fast signed multiplication on the Hawk. The most obvious difference between these two is that the signed version has a special case for the sign bit of the multiplier before the main loop of the multiplication. There are some other important differences.
A Problem They shift different things. Very briefly state what each routine shifts and which way it shifts it. (0.4 points)

The routine for faster binary multiplication shifts the multiplier right and the multiplicand left.
The routine for fast signed multiplication shifts both the multiplier and the product left.

f₁	f₂	f₃	c₀	Function
0	0	0	0	—
0	0	0	1	—
0	0	1	0	a & b
0	0	1	1	—
0	1	0	0	—
0	1	0	1	—
0	1	1	0	—
0	1	1	1	—
1	0	0	0	a ⊕ b
1	0	0	1	—
1	0	1	0	a \| b
1	0	1	1	—
1	1	0	0	a + b
1	1	0	1	a + b + 1
1	1	1	0	—
1	1	1	1	—

A quick question: Give the fastest code you can work out to multiply by 75 on the Hawk. (0.4 points)

        ADDSL  R3,R3,2    ; R3 times 5 \
        ADDSL  R3,R3,2    ; R3 times 5  > since 75 = 3 × 5 × 5
        ADDSL  R3,R3,1    ; R3 times 3 /

Background: In C or C++ if i is declared as a 64-bit unsigned integer, you can shift i right 12 bits by writing i>>12. Now, suppose you have to do this on the Hawk. If we assume that the 64-bit value is in R3 (the least significant 32 bits) and R4 (the most significant bits), we could do this with 12 repetitions of the following three instructions:
```
        SRU     R3,1
        SRU     R4,1
        ADJUST  R1,CMSB
```
The above code is from section 11.2 of the Hawk manual.
A Problem: There is a much faster way to do this using 2 12-bit right-shifts, a 20-bit left shift and a small number of auxiliary instructions. Give the best code you can for this. The very best code is surprisingly compact. (0.8 points)
```
        MOVESL  R5,R4,10
        SL      R5,10
        SRU     R3,12
        SRU     R4,12
        OR      R4,R5
```