22C:18, Lecture 27, Fall 1996

Douglas W. Jones
University of Iowa Department of Computer Science

A Problem for the Compiler Writer

In writing a compiler for the example calculator language, much of the work can be done by verbatim copying of blocks of code; to compile the first part of the calculator program E10E1+P, all that is needed is to string together verbatim copies of the instruction blocks for E, 1, 0, E, 1, and +.

Compiling the final P command of the example calculator program is more difficult. The problem is that the Hawk instruction set makes it very difficult to code a call to a print procedure (or any other procedure) without using some form of relative addressing.

Here is the simplest block of code that the compiler can generate when it compiles a P command:

	JSR  R1,PRINT
This uses PC relative addressing, and only works if the PRINT command is within plus or minus 32K bytes of the call. Additionally, our assembler cannot assemble this if PRINT is an external symbol. Alternately, consider:
	LOAD R1,PPRINT
	JSRS R1,R1
This uses PC relative addressing to find the pointer to PRINT, which must be within 32K of the call, but PRINT itself may be anywhere. Another choice is:
	LIL  R1,PRINT>>8
	ORIS R1,PRINT&#FF
	JSRS R1,R1
This produces a block of code that contains no relative addresses, but the assembler cannot assemble it because it is unable to apply either the >> or the & operator to relocatable symbols. Finally, consider:
	LOAD	R1,PPRINT
	JSRS	R1,R1
	BR	BEYOND
PPRINT:	W	PRINT
BEYOND:
This sequence also produces a block of code that contains no relative addresses. The problem here is that it contains an operand that must be word aligned, PPRINT. Copying this block of code to an arbitrary location in a sequence of instructions will therefore have only a 50% chance of working correctly.

In fact, all of the above can be made to work, but each will require that certain problems be solved!

Keeping all blocks word aligned

The final solution to the above problem may be the simplest. This block of code can be assembled correctly so long as PPRINT is correctly aligned. To assure this, what we need to do is align the entire block of code and make sure that, when the entire block is aligned, the label PPRINT is also aligned.

To do the latter, we can look up the number of halfwords in each instruction in the Hawk manual and add up the number from the start of the block to the label we want to align:

	LOAD	R1,PPRINT	; 2 halves
	JSRS	R1,R1		; 1 half
	BR	BEYOND		; 1 half
PPRINT:	W	PRINT		; 2 halves
BEYOND:				; 0 halves
Here, we are lucky, because PPRINT will always be 2 full words (4 halfwords) beyond the start of this block. If it was not, we could insert NOP instructions to pad things out and guarantee its alignment.

Now, how do we guarantee that this block begins on a word boundary? Easy! We simply make sure that the previous block ends on a word boundary. To assure this, we make all blocks begin on word boundaries, and we add NOP instructions, as needed, to make all blocks an even number of halfwords long. The block given in the previous section of the notes for enter is already an even number of halfwords in length, but the blocks for the arithmetic operations are odd, so we have to add no-ops, for example:

plus:   LOADS   R3,R2
        ADDSI   R2,-4
        LOADS   R4,R2
        ADD     R4,R4,R2
        STORES  R4,R2
        NOP		; added to even up the length
How do we guarantee that the whole stream of instructions begins on a word boundary? The SMAL linker guarantees that each block of storage allocated by the COMMON directive begins on a word boundary, so as long as we begin our code at the start of a COMMON, alignment will be guaranteed.

One pleasant advantage of this approach is that all copying of code blocks can be done a word at a time instead of requiring halfword addressing. This will speed the actual compilation process, but the added no-ops and the awkward call structure will slow the execution of the compiled code.

Generating Load Immediate Long sequences

We cannot simply use the assembler to assemble the following block of code and then copy that block into place, unless we abandon relocatable assembly and the use of external symbols:

	LIL  R1,PRINT>>8
	ORIS R1,PRINT&#FF
	JSRS R1,R1
We can, however, write code that puts this block of code into memory when needed. Consider the following C outline of a routine to do so:
	makejsr( lc, addr )
	halfword **lc;
	word addr;
	/* generate a jsr to addr,
           store the code starting at loc */
	{
		/* first put the LIL in place */
		*(byte *)*lc = addr >> 24;   /* high byte of value */
		((byte *)*lc)++;
		*(byte *)*lc = 0xE1;         /* the opcode and register */
		((byte *)*lc)++;
		**lc = (addr >> 8) & 0xFFFF; /* low halfword of value */
		(*lc)++;
		/* then put the ORIS in place */
		*(byte *)*lc = addr&FF;      /* value */
		((byte *)*lc)++;
		*(byte *)*lc = 0xC1;         /* the opcode and register */
		((byte *)*lc)++;
		/* finally put the JSRS in place */
		**lc = 0xF131;		     /* the opcode and registers */
	}
In effect, the above code bypasses the assembler and directly places the desired machine code in memory at the indicated location counter, advancing the location counter by bytes and halfwords as it stores the consecutive bytes and halfwords of the instructions in the block.

Generating Relative Addresses

We can use a similar approach to generate the sequence of bytes and halfwords needed to do a relative memory reference! For example, consider the problem of generating this code sequence:

	LOAD R1,PPRINT
	JSRS R1,R1
Here, we assume that the operand of the LOAD instruction is somewhere within 32K bytes of the instruction itself; for example, in our compiler, we could put pointers to all of the compiler's routines in fixed locations in RAM, perhaps just before the start of the code buffer. If your compiled code needs to call the PRINT, ENTER, DIGIT, ADD and SUB procedures, for example, you might declare your code buffer and procedure pointers as follows:
	COMMON	CODEBUF,#1000
LOCSAVE	=	.
.	=	CODEBUF
; pointers to routines needed by compiled code:
PPRINT	W	PRINT
PENTER	W	ENTER
PDIGIT	W	DIGIT
PADD	W	ADD
PSUB	W	SUB
; the compiled code follows!
CODE	=	.
.	=	LOCSAVE
Of course, different solutions to this problem may require more or fewer procedures!

To generate a call to PRINT, the compiler might call the following routine:

;-----------------------------
; R2 points to activation record with this format
;	0		; save return address
CALPAR= 2		; activation record size
	
CALLPRINT:		; generate a call to PRINT
			; on entry R3 points to where code should be put
			; updates R3 to point just beyond generated code
			; uses R4,R5
	STORES	R1,R2
	LIL	R4,#F1E0; first halfword of LOAD R1, PC-relative
	LOADS	R5,R3
	STUFFH	R5,R4,R3
	STORES	R5,R3	; halfword stuffed in place
	ADDIS	R3,2	; update code pointer
	LOAD	R4,PPPRINT
	ADDIS	R2,CALPAR
	JSR	R1,PCREL; generate a PC relative address to PPRINT
	ADDIS	R2,-CALPAR
	LIL	R4,#F131; JSRS R1,R1 instruction
	LOADS	R5,R3
	STUFFH	R5,R4,R3
	STORES	R5,R3	; halfword stuffed in place
	ADDIS	R3,2	; update code pointer
	LOADS	R1,R2
	JUMPS	R1	; return
Here, we have assumed that PPPRINT holds the address of PPRINT; this is needed because PPRINT is not relatively addressable from this code. The key to the above code is the procedure PCREL that it calls to generate a PC relative address! In C, the code for PCREL might look like the following:
halfword * pcrel( lc, addr )
halfword * lc;
halfword * addr;
{
	*lc = (int)addr - ((int)lc + 2);
	return (halfword *)((int)lc + 2);
}
This is exactly the same computation the assembler does when it generates PC-relative addresses! In fact, tracing through the above code is not very different in complexity from tracing through the macros in the hawk.macs file that instructs the SMAL assembler in how to assemble Hawk instructions.