22C:18, Lecture 24, Fall 1996

Douglas W. Jones
University of Iowa Department of Computer Science

Object Oriented Programming in Assembly Language

In writing programs with polymorphic data structures in assembly language, commenting plays a crucial role. Consider the problem of declaring the data structures for a polymorphic tree node, for example, the parse tree nodes defined in the previous section:

	class NODE;
	begin
	    virtual: procedure print;
	    virtual: integer procedure evaluate;
	end NODE;

We might translate this to assembly language as follows;

	;----------------
	; Offsets of fields in a parse tree node.
	; this is a polymorphic type; all parse tree nodes
	; begin as follows:

	PTNPRINT = 0	; pointer to print procedure
			; expects R3 = pointer to node to print
			; uses R2 as activation record pointer
			; wipes out R3-7

	PTNEVAL  = 4	; pointer to evaluate procedure
			; expects R3 = pointer to node to evaluate
			; returns R3 = value of node
			; uses R2 as activation record pointer
			; wipes out R3-7

Note that calling sequences are defined by the comments with the pointers; it is up to the code for the procedures for each kind of tree node to conform to these comments.

It would be quite possible to put all of the pieces of the program in the same source file, but object oriented programming methods are most useful for very large programs, where the code for each class is sufficiently large to be worthy of its own source file. If this is done with the parse tree node classes, the above code will be needed by the source code for each derived class (internal and leaf nodes, in our example). The cleanest way to do this in SMAL is to put the above code in a file; in the following, we'll assume it's in a file called ptn.h (using the same .h convention that C and C++ use).

Of the two classes, LEAF and INTERNAL, LEAF is clearly the simpler, since it involves no recursion. The original version of this was:

	NODE class LEAF;
	begin
	    int value;

	    procedure print;
	    begin
		write(value);
	    end print;

	    procedure evaluate;
	    begin
		evaluate := value;
	    end evaluate;

	end LEAF;

We can convert this to assembly language as follows:

		TITLE	Leaf nodes in a parse tree

	; leaf nodes are a kind of parse tree node, so we include
	; the generic fields of such nodes
		USE	"ptn.h"

	; in addition, each leaf node has the following fields:
	LEAFVAL  = 8	; value of the leaf
	LEAFSIZE = 12	; size of a leaf node

	; linkage to external routines
		EXT	MALLOC,DSPDEC
	PMALLOC:W	MALLOC
	PDSPDEC:W	DSPDEC
	
	;------------------------
	INT	LEAFINIT

	LEAFINIT:	; procedure to allocate an initialize a leaf
			; returns R3 = pointer to leaf node
			; uses R4-6
			; does not use R2
		LIS	R3,LEAFSIZE
		LOAD	R1,PMALLOC	; allocate leaf node
		JSRS	R1,R1		; wipes out R4-6
		LEA	R4,LEAFPRINT	; setup pointers to methods
		STORE	R4,R3,PTNPRINT
		LEA	R4,LEAFEVAL
		STORE	R4,R3,PTNEVAL
		JUMPS	R1		; return

	;------------------------
	; R2 points to AR with the following format:
	;	0	  return address
	LEAFPAR=4	; size of AR
	
	LEAFPRINT:	; method to print a leaf node
			; conforms to interface in ptn.h
		STORES	R1,R2		; save return address
		LOAD	R3,R3,LEAFVAL	; get value
		LIS	R4,1		; set minimum field width
		LOAD	R1,PDSPDEC	; assume it wipes out R3-7
		ADDSI	R2,LEAFPAR
		JSRS	R1,R2		; output the value
		ADDSI	R2,-LEAFPAR
		LOADS	R1,R2		; recover return address
		JUMPS	R1		; return

	;------------------------
	LEAFEVAL:	; method to evaluate a leaf node
			; conforms to interface in ptn.h
		LOAD	R3,R3,LEAFVAL	; get value
		JUMPS	R1		; return

The above print and eval routines are essentially trivial. The interesting part above is the init routine. Since the original class definition had no specific code for initialization, everything in this init routine is implicit, the allocation of a leaf node using MALLOC and the initialization of the virtual method pointers to point to the specific methods used for leaf nodes. In any real application, it is likely that additonal initialization code would be included!

Note that the initializer above is the only externally callable routine. The linker never needs to know about LEAFPRINT and LEAFEVAL; users gain access to these only through the pointers in parse tree nodes returned by LEAFINIT.

The code for internal nodes is more complex. The original code, in Simula 67, is:

	NODE class INTERNAL
	begin
	    ref (NODE) left;
	    char op;
	    ref (NODE) right;

	    procedure print;
	    begin
		write('(');
		left.print;
		write(op);
		right.print;
		write(')');
	    end print;

	    procedure evaluate;
	    begin
		if op = '+' then
		    evaluate := left.evaluate + right.evaluate
		else if op = '-';
		    evaluate := left.evaluate - right.evaluate
		else if op = '*';
		    evaluate := left.evaluate * right.evaluate
		else if op = '/';
		    evaluate := left.evaluate * right.evaluate
		else error;
	    end evaluate;

	end INTERNAL;

This translates to the following assembly code:

		TITLE	Internal nodes in a parse tree

	; internal nodes are a kind of parse tree node, so we include
	; the generic fields of such nodes
		USE	"ptn.h"

	; in addition, each internal node has the following fields:
	INTLEFT  = 8	; pointer to left subtree, a parse tree node
	INTRIGHT = 12	; pointer to right subtree, a parse tree node
	INTOP    = 16	; operator, a character
	INTSIZE  = 20	; size of subtree (rounded up to a multiple of 4)

	; linkage to external routines
		EXT	MALLOC,DSPCHR
	PMALLOC:W	MALLOC
	PDSPCHR:W	DSPCHR
	
	;------------------------
	INT	INTINIT

	INTINIT:	; procedure to allocate and init internal node
			; returns R3 = pointer to node
			; uses R4-6
			; does not use R2
		LIS	R3,INTSIZE
		LOAD	R1,PMALLOC	; allocate leaf node
		JSRS	R1,R1		; wipes out R4-6
		LEA	R4,INTPRINT	; setup pointers to methods
		STORE	R4,R3,PTNPRINT
		LEA	R4,INTEVAL
		STORE	R4,R3,PTNEVAL
		JUMPS	R1		; return

	;------------------------
	; R2 points to AR with the following format:
	;	 0	  return address
	INTPPTR= 4	; pointer to node
	INTPAR = 8	; size of AR
	
	INTPRINT:	; method to print an internal node
			; conforms to interface in ptn.h
		STORES	R1,R2		; save return address
		STORE	R3,R2,INTPPTR	; save pointer to node
		LIS	R3,"("
		LOAD	R1,PDSPCHR	; assume R3-R7 wiped out
		JSRS	R1,R1		; output (
		LOAD	R3,R2,INTPPTR
		LOAD	R3,R3,INTLEFT	; get node's left pointer
		LOAD	R1,R3,PTNPRINT	; get print method of left
		ADDI	R2,INTPAR
		JSRS	R1,R1		; call print method (R3-R7 wipeout)
		SUBI	R2,-INTPAR
		LOAD	R3,R2,INTPPTR
		LOAD	R3,R3,INTOP	; get node's operator
		LOAD	R1,PDSPCHR	; assume R3-R7 wiped out
		JSRS	R1,R1		; output operator
		LOAD	R3,R2,INTPPTR
		LOAD	R3,R3,INTRIGHT	; get node's right pointer
		LOAD	R1,R3,PTNPRINT	; get print method of right
		ADDI	R2,INTPAR
		JSRS	R1,R1		; call print method (R3-R7 wipeout)
		SUBI	R2,-INTPAR
		LIS	R3,")"
		LOAD	R1,PDSPCHR	; assume R3-R7 wiped out
		JSRS	R1,R1		; output )
		LOADS	R1,R2
		JUMPS	R1		; return

	;------------------------
	; R2 points to AR with the following format:
	;	 0	  return address
	INTEPTR= 4	; pointer to node
	INTELFT = 8	; save result of left subtree
	INTEAR = 12	; size of AR

	INTEVAL:	; method to evaluate an internal node
			; conforms to interface in ptn.h
		STORES	R1,R2		; save return address
		STORE	R3,R2,INTEPTR	; save pointer to node
		LOAD	R3,R3,INTLEFT	; get node's left pointer
		LOAD	R1,R3,PTNEVAL	; get eval method of left
		ADDI	R2,INTEAR
		JSRS	R1,R1		; call eval method (R3-R7 wipeout)
		SUBI	R2,-INTEAR
		STORE	R3,R2,INTELFT	; save result
		LOAD	R3,R2,INTEPTR	; save pointer to node
		LOAD	R3,R3,INTRIGHT	; get node's right pointer
		LOAD	R1,R3,PTNEVAL	; get eval method of right
		ADDI	R2,INTEAR
		JSRS	R1,R1		; call eval method (R3-R7 wipeout)
		SUBI	R2,-INTEAR
		STORE	R4,R3,INTELFT	; get left result in R4
		LOAD	R5,R2,INTPPTR
		LOAD	R5,R5,INTOP	; get node's operator
		CMPI	R5,"+"
		BNE	INTENP		; skip if not plus
		ADD	R3,R3,R4	;   handle add
		BR	INTQT
	INTENP:	CMPI	R5,"-"
		BNE	INTENM		; skip if not minus
		SUB	R3,R4,R3	;   handle subtract
		BR	INTQT
	INTENM:
		---- code for other operators goes here
	INTQT:	LOADS	R1,R2
		JUMPS	R1

In the above code, the code for initialization is essentially the same as that for leaf nodes! In a more fully developed example, the initialization code would typically be more complex because of the addition of code to handle the actual details of creating an appropriate parse tree for some particularl string, but all such detail has been ommitted here.

The code given for the print and evaluate routines for internal nodes is potentially recursive. Each of these routines has an identical recursion control structure; they differ only in what is done around that recursion, printing in the one case, and evaluating in the other.

An important thing to note is that there is no apparent control structure here to terminate the recursion! The reason is that the particular routine called with each recursive call is determined by the methods of the node being processed, and these methods differ between leaf nodes and internal nodes. Thus, it is the choice of method that determines which calls are recursive and which are not; this choice is made at the time the tree is built, not at the time it is traversed.