22C:18, Lecture 30, Summer 1997

Douglas W. Jones
University of Iowa Department of Computer Science

Calls when part of the Activation Record is in Registers

If part of the current activation record is in registers and part of it is in memory as discussed in the previous lecture, procedure calls must be more complicated. Our previous lecture ended with the following idea:

Given this complicated notion of an activation record, our challenge is to arrange things so that the necessary registers are saved through a procedure call and restored on return. This is complicated by the possibility that calls will be nested, as in the following little example:
	p( 1, q(2), 3);
We expect to represent this in our RPN style language as something like the following:
	mark
	  pushi 1
	  mark
	    pushi 2
	  funcall q
	  pushi 3
	proccall p
In the above, indenting has been used to make it clear which mark corresponds to which call! Note that we have used proccall (procedure call) to indicate that the call to p does not return a value on the stack top, while funcall (function call) indicates that a value will be returned.

At a high level, the mark macro simply needs to save all the registers, so that the following push operations build a parameter list for the procedure or function being called. The mark macro must also remember how many registers it saved so that the call macro can restore those registers. Here is one solution to this problem:

	MACRO mark
	  IF SP > R3
	    SAVEREGS R3
	  ENDIF
	  PUSH SP
	ENDMAC

	MACRO pcall p
	  JSR R1,p
          POP SP  
          IF SP > R3
            RESTORE SP
          ENDIF
	ENDMAC

	MACRO fcall
	  JSR R1,p
          POP SP  
          IF SP > R3
	    MOVE R3,SP
            RESTORE SP
          ENDIF
          SP = SP + 1
	ENDMAC
These macros are simple only because all the hard work has been moved into four new macros. SAVEREGS and RESTORE save and restore the contents of registers, while PUSH and POP manage a privsate little stack used to remember what registers are in use in case there is nesting of function calls.

Ideally, SAVEREGS would be written something like the following:

	MACRO SAVEREGS
	  for i = R3 to SP-1 do
            STORE i,R2,AR
	    AR = AR + 1
          endfor
        ENDMAC
Unfortunately, our macro assembler does not have for loops in it, so we've got to use recursive macros to do the iteration described above:
	MACRO SAVEREGS =r
	  STORE r,R2,AR
	  AR = AR + 1
          IF (r+1) < SP
	    SAVEREGS r+1
          ENDIF
        ENDMAC

	MACRO RESTORE =r
          AR = AR - 1
          LOAD r-1,R2,AR
          IF (r-1) > R3
            RESTORE r-1
          ENDIF
        ENDMAC
Note that registers are saved and restored in the opposite order, and that the actual accesses to the memory half of the activation record are arranged as push and pop operations.

An Assembly Time Stack

The above macros required that we maintain a stack, at assembly time, to manage nesting of control structures such as mark and call. Our problem is to manage this stack using the macros PUSH and POP (we'll also define TOP, a macro to inspect the top item on the stack without popping it). We'll maintain this stack in a set of assembly time symbols; if we have 5 items on the stack, it would look like the following:

      qSTK1q  -- the bottom of stack
      qSTK2q
      qSTK3q
      qSTK4q
      qSTK5q  -- the top item on the stack
We'll use the symbol qSPq for the assembly-time stack pointer, incrementing on push and decrementing on pop. The problem we face is how to combine the value of this symbol with the string qSTKq to make the stack? The answer lies in the macro LABGEN which simply concatenates its three actual parameters. This lets us write our stack macros as:
	qSPq = 0

	MACRO	PUSH x
	  qSPq = qSPq + 1
	  LABGEN (qSTK),qSPq,(q = x)
	ENDMAC

	MACRO	POP x
	  LABGEN (x = qSTK),qSPq,(q)
	  qSPq = qSPq - 1
	ENDMAC

	MACRO	TOP x
	  LABGEN (x = qSTK),qSPq,(q)
	ENDMAC
The LABGEN macro must accept its first and final parameters as strings while it converts the value of its middle parameter to a decimal number. Our macro assembler solves this problem for us with its parameter passing modes so we can write LABGEN as:
	MACRO   LABGEN (a),=b,(c)
	  a'b'c
	ENDMAC
Recall that our macro assembler elides single quote marks when it expands a macro!

Control Structure Macros -- Loops

The label generation mechanism discussed above is sufficient to allow us to write macros that implement the common control sturctures of high level languages! Consider the following idea for implementing (infinite) loops:

	MACRO loop
	  LOOPTOP = .
	ENDMAC

	MACRO endloop
	  JUMP   LOOPTOP
	ENDMAC
Here, we assume that loops are never nested! Backward branches are easy because the destination address is known at the time the location containing the branch is reached during the assembly process. We can use our assembly-time stack macros to allow loop nesting, although with infinite loops, such nesting is of questionable value:
	MACRO loop
	  PUSH   .
	ENDMAC

	MACRO endloop
	  POP    qTEMPq
	  JUMP   qTEMPq
	ENDMAC
Forward branches, something we need for loop exits, add a new challenge! In the above code, the solution is to simply avoid the problem! One solution to this problem is to use automatic label generation to make a label Lxxx at the top of each loop and Qxxx at the end of each loop. We'll number our blocks (loops, if-statements, etc), so L1 will be the top of the first loop, L2 the top of the second, and so on:
	qBLOCKNUMq = 0

	MACRO loop
	  qBLOCKNUMq = qBLOCKNUMq + 1
	  PUSH   qBLOCKNUMq
	  LABGEN (L),qBLOCKNUMq,(:)
	ENDMAC

	MACRO endloop
	  POP    qTEMPq
	  LABGEN (JUMP L),qTEMPq,()
	  LABGEN (Q),qTEMPq,(:)
	ENDMAC

        MACRO break
	  TOP    qTEMPq
	  LABGEN (JUMP Q),qTEMPq,()
        ENDMAC
The above macro set includes a break macro that works exactly as the break statement in C (or C++); it exits the nearest enclosing loop! There is a problem with this macro set that will come up when we consider using the same assembly time stack for other purposes; the nearest enclosing loop won't be the same as the nearest enclosing control structure that uses this stack. As a result, we'll have to do a bit of extra work:
	qBLOCKNUMq = 0
	qBREAKNUMq = 0

	MACRO loop
	  PUSH   qBREAKNUMq
	  qLOOPNUMq = qBLOCKNUMq + 1
	  PUSH   qBLOCKNUMq
	  LABGEN (L),qBLOCKNUMq,(:)
	  qBREAKNUMq = qBLOCKNUMq
	ENDMAC

	MACRO endloop
	  POP    qTEMPq
	  LABGEN (JUMP L),qTEMPq,()
	  LABGEN (Q),qTEMPq,(:)
	  POP    qBREAKNUMq
	ENDMAC

	MACRO break
	  LABGEN (JUMP L),qBREAKNUMq,()
	ENDMAC
This is a bit ugly, but we've now built a foundation strong enough to allow us to write macros for if/else/endif and other common control structures.

Control Structure Macros -- Select/Case

As an extreme example of this, lets define the following macros:

Within each select/endselect block, we'll define the following assembly time symbols:
	MACRO select max,min
	  qBLOCKNUMq = qBLOCKNUMq + 1
	  PUSH   qBREAKNUMq

	  SP = SP - 1
	  LABGEN (CMPI  SP,MX),qBLOCKNUMq,()
	  LABGEN (BGT   TO),qBLOCKNUMq,()
	  LABGEN (SUBI  SP,MN),qBLOCKNUMq,()
	  LABGEN (BLT   TO),qBLOCKNUMq,()

	  LABGEN (LEA   R1,T),qBLOCKNUMq,()
          ADDSL SP,R1,2
	  LOADS	R1,SP
	  JUMPS R1

	  LABGEN (TO),qBLOCKNUMq,(:)
	  LABGEN (JMP O),qBLOCKNUMq,()

	  qTEMPq = .
	  LABGEN (qTEMP1q = O),qBLOCKNUMq,()
	  LABGEN (qMEMFILLq qTEMP1q,TS),qBLOCKNUMq,()
	  . = qTEMPq

	  LABGEN (qTEMPq = MN),qBLOCKNUMq,()
	  LABGEN (TM),qBLOCKNUMq,( = qTEMPq)
	  LABGEN (MX),qBLOCKNUMq,( = #80000000)
	  LABGEN (MN),qBLOCKNUMq,( = #7FFFFFFF)
	  qBREAKNUMq = qBLOCKNUMq
	  PUSH  qBLOCKNUMq
	ENDMAC

	MACRO case =i

	  TOP	qTEMP1q
	
	  LABGEN (qTEMPq = MX),qTEMP1q,()
	  IF i > qTEMPq
	    LABGEN (MX),qTEMP1q,( = i)
	  ENDIF
	  LABGEN (qTEMPq = MN),qTEMP1q,()
	  IF i < qTEMPq
	    LABGEN (MN),qTEMP1q,( = i)
	  ENDIF

	  qTEMPq = .
	  LABGEN (qTEMP2q = T),qTEMP1q,( + (i << 2))
	  LABGEN (. = qTEMP2q - TM),qTEMP1q,()
	  W   qTEMPq
	  . = qTEMPq

	ENDMAC

	MACRO otherwise
	  TOP	qTEMP1q
	  LABGEN (O),qTEMP1q,(:)
	ENDMAC

	MACRO endselect
	  POP   qTEMP1q

	  ALIGN 4
	  LABGEN (T),qTEMP1q,(:)
	  LABGEN (qTEMPq = MX),qTEMP1q,()
	  LABGEN (qTEMPq = qTEMPq = MN),qTEMP1q,()
	  . = . + (qTEMPq << 2)
	  LABGEN (TS),qTEMP1q,( = qTEMPq)

	  LABGEN (qTEMPq = O),qTEMP1q,()
	  IF \ DEF(qTEMPq)
	    LABGEN (O),qTEMP1q,(:)
	  ENDIF

	  LABGEN (Q),qTEMP1q,(:)
	ENDMAC
The above macros are just a bit daunting looking, but together, they work to make a good case/select statement that exactly replicates the semantics of the corresponding C (and C++) statements, with one small exception.

The SELECT macro relies on the macro qMEMFILLq to fill the case table with pointers to the otherwise clause; as with the qSAVEREGSq macro defined previously, an iterative formulation would have been nice, but a recursive formula must be used instead. Consider the following:

	MACRO qMEMFILLq val,=num
	  IF num > 0
	     W val
	     qMEMFILLq val, num-1
	  ENDIF
	ENDMAC
If the select table is large, this macro will be deeply recursive, and this may cause the assembler to fail because of a macro stack overflow. Rewriting the macro as follows eliminates this problem:
        MACRO qMEMFILLq val,=num
          IF num > 0
            W val
            qMEMFILLq val, num >> 1
	    IF (num & 1) = 1
              qMEMFILLq val, num >> 1
            ELSE
              qMEMFILLq val, (num >> 1) - 1
            ENDIF
          ENDIF
        ENDMAC