Machine Problem 5, due Nov. 10

Part of the homework for CS:2630 (22C:60), Fall 2014
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Background: The Sparrowhawk architecture is a subset of the Hawk architecture; see Section 16 of the Hawk manual for the official details of the subset. All programs that run correctly on the Sparrowhawk run identically on the Hawk, but because the Sparrowhawk does not support any 32-bit instructions, many Hawk programs will give Instruction Trap errors when run on the Sparrowhawk.

A version of the Hawk monitor exists for the Sparrowhawk. If you would have used these commands to link and run a Hawk program under the Hawk monitor:

[yourname@serv15]$ link main.o subroutine.o
[yourname@serv15]$ hawk link.o

you can use these commands to link the same code under the Sparrowhawk monitor and run it on the Sparrowhawk:

[yourname@serv15]$ link -m sparrowmon.o main.o subroutine.o
[yourname@serv15]$ sparrowhawk link.o

Of course, if the code you link contains any 32-bit instructions such as LIL or ADDI, the code will give an Instruction Trap error.

One approach to nearly complete compatibility with the Hawk is a set of macros that replaces hawk.h with new macros that stay within the Sparrowhawk instruction set. For example, where hawk.h uses something like this macro to assemble the LIL instruction:

MACRO LIL =dst,=const
  B #E0 ! dst
  T const
ENDMAC

We could, instead, use something like this:

MACRO LIL =dst,=const
  LIS   dst, (const >> 16)
  ORIS  dst, (const >>  8) & #FF
  ORIS  dst, (const      ) & #FF
ENDMAC

Substituting sparrowhawk.h macros for the hawk.h macros does this. Note, however, the "something like" qualifier used twice above! The hawk.h macros do more than was mentioned above because they try to detect illegal arguments (misuse of registers or constants too big for 24 bits). The LIL macro in sparrowhawk.h tries to use as few instructions as necessary to load the constant, omitting one or both ORIS instructions, if possible, and it supports external symbols by a clever trick.
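To make the "as few instructions as necessary" idea concrete, here is a rough Python sketch that models the decision, not the actual sparrowhawk.h macro text. It assumes LIS loads an 8-bit constant sign-extended to 32 bits, that ORIS shifts the register left 8 bits and ORs in an 8-bit constant, and that LIL takes a signed 24-bit constant; check these against the Hawk manual. The names plan_lil and run are invented for this illustration, and external symbols are not handled.

```python
MASK32 = 0xFFFFFFFF

def lis(const8):
    """Model of LIS (assumed): load 8 bits, sign-extended to 32 bits."""
    b = const8 & 0xFF
    return (b - 0x100 if b & 0x80 else b) & MASK32

def oris(reg, const8):
    """Model of ORIS (assumed): shift left 8 bits, then OR in 8 bits."""
    return ((reg << 8) | (const8 & 0xFF)) & MASK32

def plan_lil(const):
    """Hypothetical helper: shortest LIS/ORIS sequence for a 24-bit constant."""
    if not -(1 << 23) <= const < (1 << 23):
        raise ValueError("constant does not fit in 24 bits")
    if -(1 << 7) <= const < (1 << 7):
        nbytes = 1          # LIS alone is enough
    elif -(1 << 15) <= const < (1 << 15):
        nbytes = 2          # LIS, then one ORIS
    else:
        nbytes = 3          # LIS, then two ORIS
    shift = 8 * (nbytes - 1)
    plan = [("LIS", (const >> shift) & 0xFF)]
    for s in range(shift - 8, -1, -8):
        plan.append(("ORIS", (const >> s) & 0xFF))
    return plan

def run(plan):
    """Hypothetical helper: execute a plan and return the 32-bit result."""
    reg = 0
    for op, operand in plan:
        reg = lis(operand) if op == "LIS" else oris(reg, operand)
    return reg

if __name__ == "__main__":
    for c in (5, -1, 200, 0x1234, 0x123456):
        print(c, plan_lil(c), hex(run(plan_lil(c))))
```

Under these assumptions, a constant that fits in 8 signed bits costs one instruction, one that fits in 16 signed bits costs two, and only the remaining 24-bit constants need all three.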

As a result, if this is your Hawk program:

        TITLE   "Something"
;       various comments

        USE     "hawk.h"

...  the body of your program ...

        END

You must rewrite it like this to use the Sparrowhawk instruction set.

        TITLE   "Something"
;       various comments

        USE     "sparrowhawk.h"

...  the body of your program ...

        CONSTPOOL
        END

The call to the CONSTPOOL macro at the end is needed to support use of external symbols as arguments to LIL. Without this, you would have difficulty writing code such as LIL R1,PUTCHAR to do output.

The posted solution to MP4 has been tested under sparrowhawk.h and it runs perfectly with no changes to the body of the program. The test program mp4test.o is Sparrowhawk compatible, so if you make the changes above to a solution to MP4, you should be able to link and run it on the Sparrowhawk.

Problems with sparrowhawk.h: The code for LIL in sparrowhawk.h is reasonably optimized. The same cannot be said for the code for the LOAD, STORE and related instructions. These always compute their effective address in R1, even if they could do better, and they always use LIS followed by ORIS even when they could use just LIS to load an 8-bit displacement.

Using R1 for effective address computation makes the macros simpler, but in the case of LOAD, LOADCC and LEA, the macros could compute the effective address directly in the destination register, reducing the number of conflicts with Hawk programs that use R1.

A second problem is that LEACC (which is also ADDI) does something stupid: It first computes the effective address, and then it uses a TESTR instruction to set the condition codes. A smarter version of this macro would simply use ADD to both compute EA and set the condition codes.

Any time the code uses more instructions than strictly necessary, the code gets bigger, and this will break some control structures that use branches by pushing the destination address out of the relatively short range of legal branch addresses.
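The effect of code growth on branches can be sketched with a little arithmetic. The Python below is an illustration only. It relies on the fact that every Sparrowhawk instruction is 16 bits, and it assumes that a branch displacement is a signed 8-bit count of halfwords; that range is an assumption here, so check the Hawk manual for the exact rule. The names code_size and branch_reaches are invented for this sketch.

```python
HALFWORD = 2  # bytes per 16-bit Sparrowhawk instruction

def code_size(instruction_counts):
    """Bytes of object code, given the instruction count of each macro call."""
    return HALFWORD * sum(instruction_counts)

def branch_reaches(halfwords_skipped, disp_bits=8):
    """True if a displacement of this many halfwords fits (assumed 8-bit signed)."""
    lo = -(1 << (disp_bits - 1))
    hi = (1 << (disp_bits - 1)) - 1
    return lo <= halfwords_skipped <= hi

if __name__ == "__main__":
    # A hypothetical loop body of 40 macro calls, at 3 or 4 instructions each.
    for per_call in (3, 4):
        body = [per_call] * 40
        halfwords = sum(body)
        print(per_call, code_size(body), branch_reaches(halfwords))
```

In this made-up case, one extra instruction per macro call grows the body from 120 to 160 halfwords, which is enough to push a forward branch over it out of the assumed range.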

The Problem: Your job is to write better versions of these macros so that code assembled with sparrowhawk.h and your modifications runs faster and takes up less space, and offers fewer problems when porting Hawk code to the Sparrowhawk.

Mechanics: The sparrowhawk.h macro file has a switch embedded in it. If you write this in your program:

        USE     "sparrowhawk.h"

you will get the full set of Sparrowhawk macros. If, on the other hand, you write:

        STRICTSPARROW = 1
        USE     "sparrowhawk.h"

you will get only support for the 16-bit short instructions, with no provisions for "faking" the 32-bit instructions from the Hawk. This is useful when writing code that is specifically optimized for the Sparrowhawk, and it allows you to write your own code. For MP5, your test program (perhaps a Sparrowhawk version of MP4) should begin like this:

        STRICTSPARROW = 1
        USE     "sparrowhawk.h"
        USE     "mp5.h"

Here, mp5.h will contain your improved versions of the macros for LEA, LEACC, LOAD, LOADCC, JSR, STORE and LIL, along with the derived instructions TEST, ADDI, CMPI, JUMP and LIW. A version of mp5.h is provided online that is nothing more than a quotation from sparrowhawk.h. Use this as a starting point for your solution.

Note that assembly listing files, by default, do not list the lines of text produced during macro expansion. By default, the assembly listing file only contains the object code generated by that expansion. When you are debugging a macro, it is useful to be able to see how macros are expanded. To do this, use the SMAL LIST directive. If you simply put LIST 1000 at the head of an assembly language program, it will turn on listing for all of the macro expansion everywhere in the program. This may be too much. The assembly of a single branch instruction under the macros in hawk.h will produce 17 lines of listing, most of them macro code for error checking. If you put LIST +1 and LIST -1 in a macro body surrounding something you want listed, that will be listed while listing for the remainder of the macro body is suppressed. If one macro calls another and you want to see how something in the second macro works, use LIST +2 and LIST -2 around that.

The macros in sparrowhawk.h and mp5.h use some tricks:

qLCSAVEq, qEACOMPq and similar names are used for symbols that should never be used outside this header file. The SMAL assembler does not have real local symbols, so we use the letter q as a quote mark on names we wish were local.
IF LEN(arg)>0 is used to see if a macro parameter is present, for example, to distinguish between PC-relative and indexed versions of the LOAD instruction.
IF TYP(arg)=0 is used to see if a macro parameter is absolute. Nonzero assembly-language types indicate relocatable or external values.
qCONSTPOOLq is used as a second location counter for a block of memory immediately after the memory used to hold the code being assembled. All constants that cannot be stored in-line in the Sparrowhawk code are put here.

Submission: As with MP1, use the Online Coursework Submission tool provided by the Liberal Arts Linux server cluster. Your macros must be in a file named mp5.h (the name is case-sensitive). This is the name you will type in response to the File/directory name prompt. When the submission tool prompts for Course, type c_060. Select the assignment directory mp5.

Grading: We will use your macros to assemble a benchmark program. Given that a working solution is distributed to the class to begin with, everyone should be able to turn in a working solution. Solutions will be judged by how much they reduce the size of the object code for the benchmark. Turning in code that makes no improvement indicates that you have done essentially no work.

Of course, the format of the solution will be evaluated. There is generally little reason for elaborate comments for these kinds of macros, but of course, you must take credit for your work in opening comments of the file, and penalties will be assessed for ugly indenting and for sprawling or poorly organized code.