Assignment 6, due Jun 28

Part of the homework for CS:2630, Summer 2018
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

On every assignment, write your name legibly as it appears on your University ID card! Homework is due on paper at the start of class on the day indicated (Tuesday or Thursday). Exceptions will be made only by advance arrangement (excepting "acts of God"). Late work must be turned in to the TA's mailbox (ask the CS receptionist in 14 MLH for help). Never push homework under someone's door!

  1. Background: the Hawk LOAD and STORE instructions, in both long and short form, ignore the least significant 2 bits of the effective address and operate on aligned words in memory. Occasionally, there is a real need to access non-aligned data, for example, to load a non-aligned word. A non-aligned word is a sequence of 4 bytes starting at any address in memory.

    A problem: Write SMAL Hawk code for a subroutine called LOADS because it does the same thing as the instruction with that name, except that it works for non-aligned memory addresses. It takes an address in R3 and returns the value of the word at that memory location in R3.

    Your code should be neat and appropriately commented. Note that all the computation it does can be done in R3 to R7 without any calls to other subroutines, so it does not need an activation record to hold a return address or local variables. This can be done in 16 instructions, but you are more likely to take closer to 20 instructions.

    LOADS:  ; expects R3 -- addr a memory address, with no alignment constraints
            ; returns R3 -- value the word at that address
            ; uses R4-R7 to hold the 4 bytes of the word before combining them
    
            LOADS   R4,R3
            EXTB    R4,R4,R3        ; b0 = M[addr]
    
            ADDSI   R3,1
            LOADS   R5,R3
            EXTB    R5,R5,R3        ; b1 = M[addr + 1]
    
            ADDSI   R3,1
            LOADS   R6,R3
            EXTB    R6,R6,R3        ; b2 = M[addr + 2]
    
            ADDSI   R3,1
            LOADS   R7,R3
            EXTB    R7,R7,R3        ; b3 = M[addr + 3]
    
            MOVE    R3,R7           ; value = b3
            ADDSL   R3,R6,8         ; value = (value << 8) + b2
            ADDSL   R3,R5,8         ; value = (value << 8) + b1
            ADDSL   R3,R4,8         ; value = (value << 8) + b0
    
            JUMPS   R1              ; return value
    

    The above is the tightest solution, but the ending isn't necessarily the most obvious. You could end with something like the following:

            EXTB    R7,R7,R3        ; b3 = M[addr + 3]
    
            MOVSL   R3,R7,8
            ADD     R3,R3,R6        ; value = (b3 << 8) + b2
    
            SL      R3,8
            ADD     R3,R3,R5        ; value = (value << 8) + b1
    
            SL      R3,8
            ADD     R3,R3,R4        ; value = (value << 8) + b0
    
            JUMPS   R1              ; return value
    

    You could also use the OR instruction instead of the ADD instruction to combine the parts, and you could use ORIS to shift the pieces 8 places left. Here is yet another solution, neither optimal nor obvious:

            EXTB    R7,R7,R3        ; b3 = M[addr + 3]
    
            MOVE    R3,R4		; value = b0  -- no alignment needed
    
    	LIS	R4,1
    	STUFFB	R3,R5,R4	; value = value + (b1 << 8))
    
    	LIS	R4,2
    	STUFFB	R3,R6,R4	; value = value + (b2 << 16)
    
    	LIS	R4,3
    	STUFFB	R3,R7,R4	; value = value + (b2 << 24)
    
            JUMPS   R1              ; return value
    

  2. Background: Another string routine that you could write is strcmp(s1,s2). This takes two memory addresses, s1 and s2 as parameters, where each points to a null-terminated character string. The return value is zero if the two strings are equal, positive if s1 comes alphabetically after s2 and negative if if s1 comes alphabetically before s2.

    In more detail, it returns zero if all characters of s1 and s2 are identical up to the trailing null. Otherwise, they must differ at some character i, which is to say s1[i]-s2[i] is not zero. In this case, it returns a positive value if s1[i]-s2[i] > 0, and a negative value if s1[i]-s2[i] < 0.

    a) Write code in a high level language notation for strcmp. You can safely act as if strings are simply arrays of characters. (0.5 points)

    int strcmp(char * s1, char * s2) {
        int i = 0;
        while (s1[i] == s2[i]) {
            if (s1[i] == '\0') break
            i = i + 1
        }
        return s1[i] - s2[i]
    }
    

    b) Write SMAL Hawk code for STRCMP. As much as is possible, use the algorithm you documented above, and make sure your code incorporates comments that convey that algorithm and explain any optimization involved. (1.0 points)

    STRCMP: ; expects R3 = s1, a pointer to a null terminated string
            ; expects R4 = s2, a pointer to a null terminated string
            ; uses    R5 = t1, a character from s1
            ;         R6 = t2, a character from s2
            ;         R7 = dif, result of comparison
            ; instead of using array indexing, it increments the string pointers
    STRCLP:                         ; while (true) -- loop exits are by break
            LOADS   R5,R3
            EXTB    R5,R5,R3        ;   t1 = * s1
    
            LOADS   R6,R4
            EXTB    R6,R6,R4        ;   t2 = * s2
    
            SUB     R7,R5,R6        ;   dif = t1 - t2
            BNE     STRCQT          ;   if (dif != 0) break
    
            TESTR   R5
            BZS     STRCQT          ;   if (t1 == NUL) break
    
            ADDSI   R3,1            ;   s1++
            ADDSI   R4,1            ;   s2++
    
            BR      STRCLP
    STRCQT;                         ; }
            MOVE    R3,R7
            JUMPS   R1              ; return dif
    

  3. Background: The LOADSCC and and LOADCC instructions load a value from memory and set the condition codes. For signed operands, it would be the same to just use LOADS or LOAD followed by CMPI 0 and then use one of BLT, BLE, BEQ, BNE, BGT or BGE to do the comparison.

    On the other hand, if th eoperands are unsigned and we set the condition codes with LOADSCC or LOADCC and then test them with one of BLTU, BLEU, BEQ, BNE, BGTU or BGEU the result will sometimes be wrong.

    a) Which of the above branches will produce wrong results? To answer this question, look up which condition codes these inspect and look at how the instruction sets those condition codes. (0.3 points)

    BLTU, BLEU, BGTU and BGEU all produce wrong results after LOADSCC or LOADCC because those load instructions set the C condition code to mean something other than carry (or not borrow) after subtraction from zero.

    Note that this isn't really a problem, because unsigned values can't be negative, so after comparison with zero, BZR can be used instead of BGTU, BZS can be used instead of BLTU, a simple BR will substitute for BGEU, and since no unsigned value is less than zero, any BLTU you might be tempted to use after comparing with zero can be simply deleted.

    b) What useful purpose is solved by the way the LOADSCC or LOADCC instructions set the condition code or codes that cause the problems you discovered in part a? The answer to this is in Chapter 7 of the notes. (0.2 points)

    The C condition code is set by LOADSCC or LOADCC to indicate that some byte in the loaded value was zero. This is useful when trying to do efficient operations on null-terminated text strings in 4 character blocks.