Up to this point, we have looked fairly informally at how a person can hand translate high-level-language code to assembly language. The time has come to take a more formal look at this problem. We will look at this problem through the perspective of a very simple example language, the language of a simple RPN calculator. This calculator has the following commands:
Input Output Comments
EP 0
E5P 5
E53P 53 Eddd pushes the decimal number ddd
EEPP 0 0
E5E3PP 3 5
E5E3+P 8 5 + 3 = 8
EE5PP 5 0
EE5-P -5 0 - 5 = -5
A Pascal implementation of this calculator is trivial, assuming that the
ends of strings are marked by a distinctive character, endofstring:
procedure calculate(var line: string);
const stacksize = 80;
var stack: array[0..stacksize] of integer;
sp: 0..stacksize;
i: 0..stringsize;
begin
sp := 0;
i := 0;
while s[i] <> endofstring do begin
case s[i] of
'E':begin
stack[sp] := 0;
sp := sp + 1;
end;
'0'..'9':
stack[sp-1] := stack[sp-1] * 10
+ (ord(s[i]) - ord('0'));
'P':begin
sp := sp - 1;
write(stack[sp]);
end;
'+':begin
sp := sp - 1;
stack[sp-1] := stack[sp-1] + stack[sp];
end;
'-':begin
sp := sp - 1;
stack[sp-1] := stack[sp-1] - stack[sp];
end;
end;
i := i + 1;
end;
end;
The above routine does no error checking, something that should be remedied
in any practical implementation!
From a user's perspective, it would be most convenient to allow a programmer to enter a full line of input before the calculator does any computation, starting the calculations anew with each line of input. The Hawk monitor provides a routine, KBGETS, to read a null-terminated string from the keyboard. We can rewrite the above in terms of KBGETS, along with calls to various other monitor routines to provide for better output format to produce the following main program:
program Calculator(input,output)
var line: string;
procedure clear;
var i; int;
begin
for i = 0 to 70 do dspch(' ');
end;
begin { main program }
repeat
dspat(10,2);
dspch('>'); { output a prompt }
kbgets(line);
dspat(10,3); { clear previous output }
clear;
dspat(10,3); { setup for output }
calculate(line);
dspat(10,2); { clear previous input }
clear;
forever;
end;
This main program can be translated to a Smal Hawk program as follows:
TITLE Calculator main program USE "/group/22c018/hawk.macs" S START ; put the procedure call stack in unused memory EXT UNUSED PSTACK: W UNUSED ; linkage for external procedures in the monitor EXT DSPINI,DSPAT,DSPCH,KBGETS PDSPINI:W DSPINI PDSPAT: W DSPAT PDSPCH: W DSPCH PKBGETS:W KBGETS ; linkage for external procedures in this program EXT CALC PCALC: W CALC ; the line buffer COMMON LINE,80 PLINE: W LINE ; the main program START: LOAD R2,PSTACK ; set up calling stack LOAD R1,PDSPINI JSRS R1,R1 ; initialize display LOOP: LIS R3,10 LIS R4,2 LOAD R1,PDSPAT JSRS R1,R1 ; at (10,2) LIS R3,'>' LOAD R1,PDSPCH JSRS R1,R1 ; output prompt LOAD R3,PLINE LOAD R1,PKBGETS JSRS R1,R1 ; get line LIS R3,10 LIS R4,3 LOAD R1,PDSPAT JSRS R1,R1 ; at (10,3) JSR R1,CLEAR ; erase old output, if any LIS R3,10 LIS R4,3 LOAD R1,PDSPAT JSRS R1,R1 ; at (10,3) LOAD R3,PLINE LOAD R1,PCALC JSRS R1,R1 ; calculate on the line LIS R3,10 LIS R4,2 LOAD R1,PDSPAT JSRS R1,R1 ; at (10,2) JSR R1,CLEAR ; erase old input, if any BR LOOP ; and then get next line ;------------------------------- CLEAR: ; routine to clear 70 characters ; may wipe out R3-8 ; does not use R2 MOVE R8,R1 ; set aside return address LIL R7,70 CLEARL: LIS R3," " LOAD R1,PDSPCH JSRS R1,R1 ; output blank ADDSI R7,-1 ; count it BGT CLEARL; ; iterate if more blanks JUMPS R8 ; return ENDIn coding the calculate procedure, one of the most interesting design decisions that comes up is where to put the stack. In high level languages such as C or Pascal, we were forced to put the stack in an explicitly declared array, and we can do this in assembly language. In assembly language, we have a new option, however, that of using the procedure calling stack, as illustrated in the following code:
TITLE Calculator
USE "/group/22c018/hawk.macs"
; linkage for external procedures in the monitor
EXT DSPCH,DSPST,DSPDEC
PDSPCH: W DSPCH
PDSPST: W DSPST
PDSPDEC:W DSPDEC
;-------------------------------
INT CALC
CALC: ; procedure to handle one line of calculator input
; expects R3 = pointer to line
; uses R2 for the calculator stack and called AR's!
MOVE R8,R1 ; save return address
MOVE R9,R3 ; save pointer to command string
MOVE R10,R2 ; save pointer to stack base
LOOP:
LOADS R3,R9
EXTB R3,R3,R9 ; get a character
BZR NONNULL
MOVE R2,R10 ; restore stack (clean off unpopped stuff)
JUMPS R8 ; return if null
NONNULL:
LEA R4,JUMPTAB ; setup for case statement
ADDSL R3,R4,2 ; index into jumptab by the character
LOADS R3,R3 ; get the jump table entry
JUMPS R3 ; go process the entry
;-------------------------------
; jump table with one entry for each ASCII code
; because the table is complete, no bounds checking is needed
ALIGN 4
JUMPTAB:
W ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; control
W ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; chars
W ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; control
W ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; chars
W SPACE,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; !"#$%&'
W ERROR,ERROR,ERROR,PLUS, ERROR,MINUS,ERROR,ERROR ; ()*+,-./
W DIGIT,DIGIT,DIGIT,DIGIT,DIGIT,DIGIT,DIGIT,DIGIT ; 01234567
W DIGIT,DIGIT,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; 89:;<=>?
W ERROR,ERROR,ERROR,ERROR,ERROR,ENTER,ERROR,ERROR ; @ABCDEFG
W ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; HIJKLMNO
W PRINT,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; PQRSTUVW
W ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; XYZ[\]^_
W ERROR,ERROR,ERROR,ERROR,ERROR,ENTER,ERROR,ERROR ; `abcdefg
W ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; hijklmno
W PRINT,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; pqrstuvw
W ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR,ERROR ; xyz{|}~
;-------------------------------
; Each of the following blocks of code handles one calculator command
; The following are not procedures; the code for Each calculator command
; ends by advancing the input pointer and going to the loop top!
; On entry to each of these, the following registers are in use:
;
; R2 - points to the first free word beyond the stack top
; R8 - the return address
; R9 - points to the current calculator command
; R10 - points to the bottom word on the the calculator stack
;-------------------------------
SPACE:
ADDSI R9,1 ; advance pointer into command string
JUMP LOOP
;-------------------------------
ERROR:
LEA R3,ERMSG1
LOAD R1,DSPST
JSRS R1,R1 ; output error message prefix
LOADS R3,R9
EXTB R3,R3,R9 ; get the character
LOAD R1,DSPCH
JSRS R1,R1 ; output it
LEA R3,ERMSG2
LOAD R1,DSPST
JSRS R1,R1 ; output error message suffix
ADDSI R9,1 ; advance pointer into command string
JUMP LOOP
ERMSG1: ASCII "Error '",0
ERMSG2: ASCII "' ",0
ALIGN 2
;-------------------------------
PLUS:
ADDSI R2,-4 ; pop the stack
CMP R2,R10 ; check for underflow!
BLE ERROR ; complain if underflow
LOADS R3,R2 ; get old (popped) stack top
LOAD R4,R2,-4 ; get item below it (current top item)
ADD R4,R4,R3 ; add them
STORE R4,R2,-4 ; put the result back on top of the stack
ADDSI R9,1 ; advance pointer into command string
JUMP LOOP
;-------------------------------
MINUS:
ADDSI R2,-4 ; pop the stack
CMP R2,R10 ; check for underflow!
BLE ERROR ; complain if underflow
LOADS R3,R2 ; get old (popped) stack top
LOAD R4,R2,-4 ; get item below it (current top item)
SUB R4,R4,R3 ; subtract them
STORE R4,R2,-4 ; put the result back on top of the stack
ADDSI R9,1 ; advance pointer into command string
JUMP LOOP
;-------------------------------
DIGIT:
CMP R2,R10 ; check for empty stack!
BLE ERROR ; complain if empty
LOAD R4,R2,-4 ; item on stack top
LOADS R3,R9 ; the character needs getting again
EXTB R3,R3,R9 ; so get it (known to be a digit)
ADDI R3,-"0" ; convert it to an integer
ADDSL R4,R4,2 ; multiply stack top by 5
ADDSL R4,R3,1 ; multiply by 2 and add digit
STORE R4,R2,-4 ; return to stack top
ADDSI R9,1 ; advance pointer into command string
JUMP LOOP
;-------------------------------
ENTER:
STORES 0,R2 ; push zero
ADDSI R2,4
ADDSI R9,1 ; advance pointer into command string
JUMP LOOP
;-------------------------------
PRINT:
CMP R2,R10 ; check for empty stack!
BLE ERROR ; complain if empty
ADDSI R2,-4 ; pop
LOADSCC R3,R2 ; check sign
BNR PRINTP
LIS R3,"-" ; if negative
LOAD R1,DSPCH
JSRS R1,R1 ; print - first
LOADS R3,R2 ; then recover the number
NEG R3,R3 ; and negate it
PRINTP:
CLR R4
LOAD R1,DSPDEC
JSRS R1,R1 ; print in field width zero
LIS R3," "
LOAD R1,DSPCH
JSRS R1,R1 ; print trailing blank
ADDSI R9,1 ; advance pointer into command string
JUMP LOOP
END
Note that the above code includes error checking for stack underflow
(two many pop operations) and for illegal calculator commands, but it
does not give any explanation with its error messages, nor does it
check for stack overflow.
Checking for stack overflow is unneeded so long as the maximum length of an input string is known. In our main program, we assumed that the maximum was 80 characters, so the maximum calculator program would be 80 consecutive push operations. So long as the stack is at least 80 words long, this program cannot produce a stack overflow.
In fact, the maximum command length is not known because our calculator uses the KBGETS monitor routine, and KBGETS shares a well known flaw with the standard C library routine gets() -- it does nothing to limit the length of a string to that of the string variable it is passed! This is a major flaw in the standard C library; it has been the source of many system failures, and if you experiment with this calculator program, you will discover that it behaves strangely for excessively long input lines.
The case statement in the Pascal code above has been translated to an indexed indirect jump through the jump table in the calculator. This is very fast, and it gains speed by having an exhaustive jump table, thus eliminating any need for run-time bounds checking.