22C:18, Lecture 10, Fall 1996

Douglas W. Jones
University of Iowa Department of Computer Science

Hawk Output

The Hawk machine allows text output through a memory mapped display. This is the same display technology used by most PC and workstation class machines, although the Hawk version we are using is somewhat limited in its capabilities. Specifically, because our Hawk emulator is accessible using minimal dialup facilities, its display hardware is text-only.

Our display hardware is attached to the Hawk machine at address #FF000000. The hardware contains two read-only registers, one giving the number of rows and one giving the number of columns on the machine's display, and it contains the "video RAM" that is used to hold the text being displayed. The size of the video RAM depends on the window size you set up before starting the Hawk emulator. If you are using a standard 24 line by 80 character display, the Hawk emulator will directly control a 9 line by 80 character region on the bottom half of the display. This occupies 720 bytes or 180 words of the memory address space.
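The arithmetic behind those figures can be sketched in C. This is only an illustration, assuming one byte per displayed character and four bytes per word; the function names vram_bytes and vram_words are made up for this example.

```c
#include <stdint.h>

/* Bytes of video RAM for a text display, assuming one byte per character. */
uint32_t vram_bytes(uint32_t rows, uint32_t cols)
{
	return rows * cols;
}

/* The same size in 32-bit words, rounded up to a whole number of words. */
uint32_t vram_words(uint32_t rows, uint32_t cols)
{
	return (vram_bytes(rows, cols) + 3) / 4;
}
```

For the 9 by 80 region described above, vram_bytes(9, 80) is 720 and vram_words(9, 80) is 180.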

The file /group/22c018/hawk.sysdef contains standard definitions for access to the video RAM region of memory. These are:

DISPBASE = #FF000000 ; base address of the display interface
DISPROWS = 0         ;  offset of the rows register
DISPCOLS = 4         ;  offset of the columns register
DISPTEXT = #100      ;  offset of the start of video ram

Assuming that your assembly program includes the line

	USE "/group/22c018/hawk.sysdef"

the program can reference memory location DISPBASE+DISPROWS to find out how many rows of characters are in the display window and it can access DISPBASE+DISPTEXT to reference the character displayed in the upper left corner of the display.
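As a rough sketch of how these definitions combine, the following C function computes the address of the character at a given row and column. It assumes that the video RAM holds one byte per character, stored row by row starting with the upper left corner, and that rows and columns are numbered from zero; neither assumption is stated above, and the name char_address is made up for this example.

```c
#include <stdint.h>

#define DISPBASE 0xFF000000u	/* base address of the display interface */
#define DISPTEXT 0x100u		/* offset of the start of video RAM */

/* Address of the character at (row, col) on a display that is cols wide.
   Assumes row-major layout, one byte per character, numbering from zero. */
uint32_t char_address(uint32_t row, uint32_t col, uint32_t cols)
{
	return DISPBASE + DISPTEXT + row * cols + col;
}
```

Under these assumptions, char_address(0, 0, 80) is #FF000100, the upper left corner, and char_address(1, 0, 80) is #FF000150, the start of the second row.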

This file is comparable to header files in C or C++; the symbol DISPBASE is a pointer to a structure, and this structure has fields for ROWS, COLUMNS and TEXT. A C data structure definition equivalent to the above would thus be something like the following:

struct disp {
	long int disprows;
	long int dispcols;
	char padding[0x100-8];
	char disptext[ ... ];
};
struct disp * dispbase = (struct disp *) 0xFF000000;

The problem with finding an exact expression for things like this in C is that the size of the text array is unknown until you read the rows and columns fields of the structure. These fields are set by hardware, not by software. The curious field called padding is used to position the text field properly. This assumes that the C compiler allocates consecutive memory locations for the fields of the structure; C compilers are expected to do this!
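One way to check that the padding really does put the text field at offset #100 is to ask the compiler. The sketch below is an approximation, not the definition used in the course files: it uses uint32_t in place of long int (which is 64 bits under many current compilers), and it declares the text as a C99 flexible array member, which is one way of saying "size unknown until run time". The function name disptext_offset is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

struct disp {
	uint32_t disprows;		/* offset 0: rows register */
	uint32_t dispcols;		/* offset 4: columns register */
	char     padding[0x100 - 8];	/* fills the gap up to offset 0x100 */
	char     disptext[];		/* video RAM; length known only at run time */
};

/* Byte offset of the text field, as computed by the compiler. */
size_t disptext_offset(void)
{
	return offsetof(struct disp, disptext);
}
```

If the compiler lays out the fields consecutively, disptext_offset() returns 0x100, matching DISPTEXT.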

Hawk Memory Addressing

To access the video RAM, we need to get the address of the video interface into a register. We can access the contents of an arbitrary memory address by first loading that address in a register and then using that address. For example, to read the number of columns on our display, we could do this:

	LIW	R1,DISPBASE+DISPCOLS
	LOADS	R2,R1

This loads R2 using the address loaded in R1. If many different related memory locations are to be accessed, we can load the address once and then use it many times, using what is called indexed addressing. Here is an example:

	LIW	R1,DISPBASE
	LOAD	R2,R1,DISPROWS
	LOAD	R3,R1,DISPCOLS

This is equivalent to the following bit of pseudo C code:

	r2 = dispbase->disprows;
	r3 = dispbase->dispcols;

Recall that the long version of the LOAD instruction adds a 16 bit constant, DISPROWS or DISPCOLS in this example, to the contents of a register, and then uses the sum as a memory address. Thus, this example loads R2 with the contents of the rows register, and R3 with the contents of the columns register. In effect, we are using the LOAD instruction to access fields of a record -- a C struct.
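The base-plus-offset computation can be tried out on an ordinary computer by treating a byte array as a stand-in for memory. This is only a sketch: load_indexed and store_indexed are made-up names, the array is not real Hawk memory, and the words are kept in whatever byte order the host machine uses.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for LOAD dst,base,offset: fetch the 32-bit word at base+offset
   from a simulated memory image (host byte order). */
uint32_t load_indexed(const unsigned char *mem, uint32_t base, uint32_t offset)
{
	uint32_t word;
	memcpy(&word, mem + base + offset, sizeof word);
	return word;
}

/* Matching store, used here only to set up the simulated registers. */
void store_indexed(unsigned char *mem, uint32_t base, uint32_t offset,
		   uint32_t value)
{
	memcpy(mem + base + offset, &value, sizeof value);
}
```

With the base of a simulated display interface in hand, load_indexed(mem, base, 0) plays the role of LOAD R2,R1,DISPROWS and load_indexed(mem, base, 4) plays the role of LOAD R3,R1,DISPCOLS.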

Sometimes, we need to compute an address without immediately referencing that address. For example, if we are about to make a sequence of modifications to the video RAM, we are likely to want to load the address of the video RAM first. We do this using the LEA or load effective address instruction, as follows:

	LIW	R1,DISPBASE
	LEA	R4,R1,DISPTEXT

In C, this can be approximated as:

	r4 = dispbase->disptext;
	r4 = & dispbase->disptext;

In C, the above lines are effectively equivalent, although the second line will give a warning message under some compilers. The reason is that the text field of our structure is an array, and assigning an array name is the same as assigning the address of the array, while assigning variable names assigns the contents of the variable. This is one of the worst features of C and of languages derived from it!
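The claim that the array name and the address of the array refer to the same place can be checked directly. In this small sketch, struct demo and same_address are made up for the illustration; the two pointers have different types, which is why some compilers warn about the second form, but they hold the same address.

```c
#include <stdint.h>

struct demo {
	uint32_t rows;
	char     text[16];
};

/* Returns 1 if the array name and the address of the array agree. */
int same_address(struct demo *d)
{
	char *a = d->text;		/* array name: pointer to first element */
	char (*b)[16] = &d->text;	/* address of the whole array */
	return (void *)a == (void *)b;
}
```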

C also lets us descend to a lower level, since it allows us to directly manipulate addresses and untyped data. Here is another way of looking at what is going on with the above bits of assembly code:

	#define dispbase 0xFF000000
	#define	disprows 0
	#define	dispcols 4
	#define	disptext 0x100
	{
		char* r1 = (char*)dispbase;
		long int r2 = *((long int*)(r1 + disprows));
		long int r3 = *((long int*)(r1 + dispcols));
		char* r4 =          r1 + disptext  ;

Recall that a variable of the C type char holds one byte, while a variable of type long int holds one 32 bit word. Unfortunately, the type int may be either 16 or 32 bits, depending on the compiler. The type char* is a pointer to a byte, and the cast prefixes (char*) and (long int*) force the value of the following expression to be interpreted as a pointer to a character or a pointer to a long integer. The prefix unary operator * causes a memory reference to the object pointed to by an expression (assuming that expression has a type that is a pointer type).

So, what should we do with this? Consider the following C code, in the context of the above:

	char  r5 = '-';
	int   r6;
	for (; r2 > 0; r2--) { /* for each row */
		for (r6 = r3; r6 > 0; r6--) { /* for each col */
			*r4 = r5;
			r4 ++;
		}
	}

This bit of code puts rows times cols characters in memory, filling the video RAM with dashes.
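The same loops can be packaged as a function and run against an ordinary array standing in for the video RAM. This is a sketch for experimenting on a machine without a Hawk display; the name fill_display and the idea of returning the final pointer are additions for this example.

```c
/* Fill rows*cols consecutive bytes, starting at text, with the character ch.
   Returns a pointer just past the last byte written. */
char *fill_display(char *text, long rows, long cols, char ch)
{
	for (; rows > 0; rows--) {			/* for each row */
		for (long c = cols; c > 0; c--) {	/* for each col */
			*text = ch;
			text++;
		}
	}
	return text;
}
```

Calling fill_display(buf, 9, 80, '-') on a 720 byte buffer writes a dash into every byte of buf and returns buf + 720.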

Hawk Byte Addressing

Translating this to assembly code requires solving one little problem, that of assigning to a single byte in memory. The Hawk architecture allows byte addressing, but the basic load and store instructions only address whole words, ignoring the least significant two bits of each address.
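Ignoring the least significant two bits of an address amounts to a simple masking operation. The following sketch shows the split of a byte address into the address of the containing word and the position of the byte within that word; the function names are made up for this example.

```c
#include <stdint.h>

/* Address of the word containing the byte at addr (low two bits cleared). */
uint32_t word_address(uint32_t addr)
{
	return addr & ~(uint32_t)3;
}

/* Which of the four bytes in that word addr selects (0 to 3). */
uint32_t byte_in_word(uint32_t addr)
{
	return addr & 3;
}
```

For example, the byte at #FF000103 lives in the word at #FF000100, and it is byte number 3 of that word.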

The Hawk solution to byte addressing is provided by the stuff and extract instructions. The stuff instructions can be used to "stuff" a byte into position within a word, while the extract instructions can be used to extract one byte from a word. To fetch an arbitrary byte from memory, first fetch the word holding that byte, then extract it. To store a byte at an arbitrary location in memory, first fetch the word surrounding that location, stuff the byte into position and then put the word back.
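In C, stuffing and extracting can be approximated with shifts and masks. The sketch below assumes that byte 0 of a word is the least significant byte, so that the low two bits of the address select a shift of 0, 8, 16 or 24 bits; this byte numbering is an assumption of the example, and stuff_byte and extract_byte are made-up names, not Hawk instructions.

```c
#include <stdint.h>

/* Extract the byte of word selected by the low two bits of addr.
   Assumes byte 0 is the least significant byte of the word. */
uint32_t extract_byte(uint32_t word, uint32_t addr)
{
	unsigned shift = (addr & 3) * 8;
	return (word >> shift) & 0xFF;
}

/* Return word with the byte selected by addr replaced by the low 8 bits
   of byte; the other three bytes are unchanged. */
uint32_t stuff_byte(uint32_t word, uint32_t byte, uint32_t addr)
{
	unsigned shift = (addr & 3) * 8;
	uint32_t mask = (uint32_t)0xFF << shift;
	return (word & ~mask) | ((byte & 0xFF) << shift);
}
```

Under this numbering, stuffing #2D into byte 1 of #11223344 gives #11222D44, and extracting byte 2 of #11223344 gives #22.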

This may appear unnecessarily expensive, particularly in the light of the fact that many computers have single machine instructions for loading and storing bytes. In fact, these instructions are fairly expensive! To store a byte in an arbitrary memory location, most machines do exactly what the Hawk machine does, loading the word into a register somewhere in the CPU, stuffing the byte into position, and then storing the result. Thus, the Hawk simply makes the programmer explicitly state the work that the CPU would be doing anyway.

For example, to store a byte in memory, translating *r4=r5 (in C) to the SMAL Hawk assembly language, the following sequence of Hawk instructions will suffice:

	LOADS	R7,R4
	STUFFB	R7,R5,R4
	STORES	R7,R4
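The three instructions above can be mimicked in C on an array of words standing in for memory. This is a sketch under the assumption that byte 0 of a word is the least significant byte; store_byte is a made-up name, and the comments show which instruction each step corresponds to.

```c
#include <stdint.h>

/* Store the low 8 bits of byte at byte address addr in a memory made of
   32-bit words. Assumes byte 0 is the least significant byte of a word. */
void store_byte(uint32_t *mem, uint32_t addr, uint32_t byte)
{
	uint32_t index = addr >> 2;		/* which word holds the byte */
	unsigned shift = (addr & 3) * 8;	/* where the byte sits in it */
	uint32_t mask = (uint32_t)0xFF << shift;

	uint32_t word = mem[index];				/* LOADS  R7,R4 */
	word = (word & ~mask) | ((byte & 0xFF) << shift);	/* STUFFB R7,R5,R4 */
	mem[index] = word;					/* STORES R7,R4 */
}
```

Only the selected byte changes; the other three bytes of the word, and all other words, are written back exactly as they were read.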

Completing the Example

The C code given above can now be translated to SMAL Hawk code:

	TITLE	Program to fill the screen
	USE	"/group/22c018/hawk.macs"
	USE	"/group/22c018/hawk.sysdef"

        MACRO   LIW =dst, =const
          LIL   dst, const >> 8
          ORIS  dst, const & #FF
        ENDMAC

.	=	#1000
	S	.

	LIW	R1,DISPBASE
	LOAD	R2,R1,DISPROWS
	LOAD	R3,R1,DISPCOLS
	LEA	R4,R1,DISPTEXT
	LIS	R5,'-'

OUTERLOOP:		; for each row (decrementing R2)
	
	MOVE	R6,R3	; setup for inner loop

INNERLOOP:		; for each column (decrementing R6)

	LOADS	R7,R4
	STUFFB	R7,R5,R4
	STORES	R7,R4   ; store '-' in m[R4]
	ADDSI	R4,1	; bump pointer into text
	ADDSI	R6,-1	; decrement inner loop counter
	BGT	INNERLOOP

	ADDSI	R2,-1	; decrement outer loop counter
	BGT	OUTERLOOP

	JUMP	0

This example program runs correctly under the Hawk interpreter as currently installed on the IBM and Silicon Graphics machines we have, and it runs eccentrically but not completely incorrectly on the HP machines we have. The error on the HP machines should be corrected shortly (it's HP's fault; they moved to a new and faulty version of the curses package in their new operating system release).

The final JUMP 0 transfers control to location zero. As long as your program is running in the low half of ROM, this is legal, and, if you haven't experimented with breakpoints and other advanced debugging methods in the interpreter, it will halt the interpreter because, by default, there is a breakpoint set at location zero.
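The LIW macro defined at the top of the program builds a 32 bit constant from two instructions. The following C sketch shows one reading of how that works. It assumes that LIL loads a 24 bit constant with sign extension and that ORIS shifts the register left 8 bits before ORing in an 8 bit constant; these semantics are assumptions of the example and are not spelled out above, and the function names lil, oris and liw are made up.

```c
#include <stdint.h>

/* Assumed behavior of LIL: load a 24-bit constant, sign extended to 32 bits. */
uint32_t lil(uint32_t c24)
{
	c24 &= 0xFFFFFFu;
	return (c24 & 0x800000u) ? (c24 | 0xFF000000u) : c24;
}

/* Assumed behavior of ORIS: shift left 8 bits, then OR in an 8-bit constant. */
uint32_t oris(uint32_t reg, uint32_t c8)
{
	return (reg << 8) | (c8 & 0xFFu);
}

/* The LIW macro: high 24 bits via LIL, low 8 bits via ORIS. */
uint32_t liw(uint32_t constant)
{
	return oris(lil(constant >> 8), constant & 0xFFu);
}
```

Under these assumptions, liw(0xFF000000) reproduces DISPBASE exactly, since the sign extension done by LIL is shifted out of the register by ORIS.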

Operating System Access to Output

The above code is ugly! The standard Hawk operating system allows far more convenient access to the display through a library of procedures for display access. These are:

DSPINI -- initialize display package
DSPAT -- move to coordinates (x,y)=(R3,R4)
DSPCH -- display character from R3
DSPST -- display string pointed to by R3
DSPHX -- display hex number from R3
DSPDEC -- display decimal number from R3

To fully understand the use of these system functions, we will have to discuss procedure calling! Here is a small example, the classic Hello World program:

	TITLE   Hello World Program, by D. Jones
	USE    "/group/22c018/hawk.macs"
	USE    "/group/22c018/hawk.system"

	S       START		; set starting address

START:
	LOAD	R2,PSTACK	; set up the calling stack
	CALL	DSPINI
	LEA	R3,HELLO
	CALL	DSPST
	CLR	R1
	JUMPS	R1		; stop!

HELLO:  ASCII   "Hello World!",0

	END

The CALL macro uses R2 for the stack pointer in the procedure calling stack, and it uses R1 for procedure linkage. Thus, programs that use the system routines must not use R1 and R2 except as illustrated above.