Assignment 10, due Nov 7
Solutions
Part of the homework for CS:2630, Fall 2019
Background: Look at the IEEE floating point format documented in chapter 11. Imagine a very similar floating point format for 16-bit floating point values, with the following structure:
bit:    15 | 14 13 12 11 | 10 09 08 07 06 05 04 03 02 01 00
field:  s  |     exp     |               mant
As with the IEEE format, the maximum exponent value (here, 1111) is reserved for NaN (not a number), and the minimum value (here, 0000) is used for un-normalized values. Otherwise, the exponent is biased so that 0111 means 0.
As with the IEEE format, there is a hidden bit, so the normalized mantissa 01001010010 represents a mantissa value of 1.01001010010. Also as with the IEEE format, the hidden bit is zero when the exponent has its minimum value of 0000.
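The field layout and the hidden-bit rule can be checked with a small decoder. The following is a minimal sketch in C, not part of the assignment: the names `decode16` and `scale2` are hypothetical, and it assumes that, as in the IEEE format, un-normalized values use an effective exponent of 1 - bias = -6.

```c
#include <assert.h>
#include <math.h>    /* only for the NAN macro */
#include <stdint.h>

/* Multiply x by 2^e using exact repeated doubling or halving. */
static double scale2(double x, int e) {
    for (; e > 0; e--) x *= 2.0;
    for (; e < 0; e++) x /= 2.0;
    return x;
}

/* Hypothetical helper: decode one value in the 16-bit format above. */
double decode16(uint16_t bits) {
    int sign = (bits >> 15) & 0x1;   /* bit 15 */
    int exp  = (bits >> 11) & 0xF;   /* bits 14..11, bias 7 */
    int mant = bits & 0x7FF;         /* bits 10..0 */
    double frac = mant / 2048.0;     /* mantissa bits as a fraction, mant / 2^11 */
    double value;

    if (exp == 0xF) {
        return NAN;                          /* 1111 is reserved */
    } else if (exp == 0) {
        value = scale2(frac, -6);            /* hidden bit 0 (assumed exponent -6) */
    } else {
        value = scale2(1.0 + frac, exp - 7); /* hidden bit 1 */
    }
    return sign ? -value : value;
}
```

For example, `decode16(0x3A52)` has exponent 0111 and the mantissa 01001010010 used in the text, so it decodes to 1.2900390625, which is the binary value 1.01001010010.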
a) What is the binary representation of the smallest positive nonzero value in this number system, and give an algebraic expression in decimal for the value it represents, along with a decimal expression in scientific notation for that value, to the appropriate number of significant figures. (0.5 points)
0 0000 00000000001 = $(1.0 \times 2^{-11}) \times 2^{-6} = 1.0 \times 2^{-17} \approx 7.63 \times 10^{-6}$
b) What is the binary representation of the largest positive legitimate value in this number system, and give an algebraic expression in decimal for the value it represents, along with a decimal expression in scientific notation for that value, to the appropriate number of significant figures. (0.5 points)
0 1110 11111111111 = $(2.0 - 2^{-11}) \times 2^{7} \approx 2.56 \times 10^{2}$
        MOVE    R5,R4           ; bug fixed from original version
        SR      R5,--?--
        ADD     R5,R5,R3
a) Give the shift count that should be used on the SR instruction to replace the --?--. (0.5 points)
7 - 4 = 3
b) Give the instruction and its operand(s) that should be added to the above code (and indicate where this goes relative to the SR instruction) so that the result of the SR is rounded and not truncated. Note: There are two ways to do this, one with an instruction before the SR and one with a different instruction after the SR. (0.5 points)
Solution 1:
        ADDSI   R5,4
        SR      R5,3
Solution 2:
        SR      R5,3
        ADDC    R5,R0
The second solution is more obscure and rests on the fact that the last bit shifted out of R5 is left in the C condition code, so adding the C bit to R5 increments it only if the most significant of the discarded bits was 1.
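The two approaches can be compared in C. This is only a sketch under stated assumptions: it models the register as an unsigned 32-bit value (so the C shift is well defined), it assumes x + 4 does not wrap around, and the function names `round_before` and `round_after` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Solution 1: add half of the divisor (4 = 8/2), then shift right 3 places. */
uint32_t round_before(uint32_t x) {
    return (x + 4) >> 3;
}

/* Solution 2: shift first, then add the last bit shifted out.
   That bit is bit 2 of x, the most significant of the three discarded
   bits, and it stands in here for the C condition code. */
uint32_t round_after(uint32_t x) {
    uint32_t carry = (x >> 2) & 1;
    return (x >> 3) + carry;
}
```

Both functions return 1 for x = 11 (11/8 = 1.375) and 2 for x = 12 (12/8 = 1.5), so a discarded fraction of one half or more rounds up in either version.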
Background: The vector dot product is the sum of the products of corresponding vector elements. We can write this in C as:
float dotprod( const float * a, const float * b, int len ) {
        float acc = 0.0;
        while (len > 0) {
                acc = acc + ((*a) * (*b));
                a = a + 1;
                b = b + 1;
                len = len - 1;
        }
        return acc;
}
A problem: Write the equivalent SMAL Hawk code. (1.0 points)
DOTPROD:; expects R3 = a -- pointer to first element of an array
        ;         R4 = b -- pointer to first element of an array
        ;         R5 = len -- count of elements in arrays a and b
        ;         the floating point unit is already on and selected
        ; returns R3 = the vector dot product of a and b
        ; uses    R6,7 = temporaries for accessing a and b
        ;         FPA0 = acc, the accumulator for the product
        ;         FPA1 = term, holds each term
        LIS     R6,0
        COSET   R6,FPA0         ; float acc = 0
        TESTR   R5
        BLE     DOTPQT          ; if (len > 0) {
DOTPLP:                         ;   do {
        LOADS   R6,R3
        LOADS   R7,R4
        COSET   R6,FPA1
        COSET   R7,FPA1+FPMUL   ;     float term = (*a) * (*b)
        ADDSI   R3,4            ;     a = a + 1 -- move to next element
        ADDSI   R4,4            ;     b = b + 1
        COGET   R6,FPA1
        COSET   R6,FPA0+FPADD   ;     acc = acc + term
        ADDSI   R5,-1           ;     len = len - 1
        BGT     DOTPLP          ;   } while (len > 0)
DOTPQT:                         ; }
        COGET   R3,FPA0
        JUMPS   R1              ; return acc
There are, of course, many possible solutions. The above solution makes an effort to give the floating-point coprocessor some time after each operation is initiated before it asks for a result.
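The comments in the Hawk code describe a control structure slightly different from the original while loop: an if wrapped around a do-while, with the pointer increments placed between the multiply and the add. The following C sketch mirrors that structure. The name `dotprod_inverted` is hypothetical, and the C version only imitates the ordering of the operations, not the coprocessor timing.

```c
#include <assert.h>

/* Dot product restructured to follow the comments in the Hawk code. */
float dotprod_inverted(const float *a, const float *b, int len) {
    float acc = 0.0f;
    if (len > 0) {
        do {
            float term = (*a) * (*b); /* start the multiply */
            a = a + 1;                /* advance the pointers in between */
            b = b + 1;
            acc = acc + term;         /* then accumulate the term */
            len = len - 1;
        } while (len > 0);
    }
    return acc;
}
```

With a = {1, 2, 3} and b = {4, 5, 6}, the result is 4 + 10 + 18 = 32, and a length of zero returns 0 without touching either array, just as the while loop version does.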