Given the assembly language declaration of a node in a lexicographic tree, How can we write programs to do things with this? Recall the SMAL Hawk declarations:
; lexicographic tree of strings. KEYSIZE = 16 ; size of key field ; fields of each node in the tree: LEFT = 0 ; pointer to left subtree RIGHT = 4 ; pointer to right subtree KEY = 8 ; an array of characters ; size of each node in the tree: NODESIZE= 8 + KEYSIZEOne thing we frequently do with trees is traverse them. Consider the problem of doing an in-order traversal, printing the keys of each node. In Pascal, this might involve the following code:
procedure traverse(ptr: noderef); begin if ptr <> nil then begin traverse( ptr^.left ); write( ptr^.key ); traverse( ptr^.right ); end; end;In SMAL Hawk assembly language, we can duplicate this logic, exactly! The following bit of code is written following cookbook methods, clearly separating receiving sequence from body, and using the full calling sequence for every call:
;------------------------- ; activation record structure ; 0 ; return address (4 bytes) TRAPTR = 4 ; save pointer to node (4 bytes) TRAAR = 8 ; size of activation record TRAVERSE: ; traverse tree and print keys ; expects R3 = pointer to node ; uses R--- STORES R1,R2 ; save return address and parameter STORE R3,R2,TRAPTR ; if pointer is nil, quit TESTR R3 ; 3 BZS TRAQT ; 3 ; call traverse(ptr^.left) LOAD R3,R3,LEFT ADDI R2,TRAAR JSR R1,TRAVERSE ADDI R2,-TRAAR ; 2 ; call dspstr(ptr^.key) LOAD R3,R2,TRAPTR LEA R3,R3,KEY ADDI R2,TRAAR ; 1 LOAD R1,PDSPSTR JSRS R1,R1 ADDI R2,-TRAAR ; 1 ; call traverse(ptr^.right) LOAD R3,R2,TRAPTR LOAD R3,R3,RIGHT ADDI R2,TRAAR ; 2 JSR R1,TRAVERSE ADDI R2,-TRAAR ; windup and return TRAQT: ; 3 LOADS R1,R2 ; recover return address JUMPS R1 ; returnThis code can be optimized. The numbered comments above refer to optimizations; these are discussed below:
;------------------------- ; activation record structure ; 0 ; return address TRAPTR = 4 ; save pointer to node TRAAR = 8 ; size of activation record TRAVERSE: ; traverse tree and print keys ; expects R3 = pointer to node TESTR R3 ; see if we have a valid pointer BZS TRAQT ; quit if null STORES R1,R2 ; save return address STORE R3,R2,TRAPTR ADDI R2,TRAAR LOAD R3,R3,LEFT JSR R1,TRAVERSE LOAD R3,R2,TRAPTR-TRAAR LEA R3,R3,KEY LOAD R1,PDSPSTR JSRS R1,R1 LOAD R3,R2,TRAPTR-TRAAR LOAD R3,R3,RIGHT JSR R1,TRAVERSE ADDI R2,-TRAAR LOADS R1,R2 ; recover return address TRAQT: JUMPS R1 ; returnThis code is only 17 instructions long, while the original is 21 instructions; this is a savings of about 1/5. In fact, the savings is greater. Assuming that half the calls made have a nil pointer, half the calls to the old version would require executing 6 instructions where in the optimized version, only 3 instructions are needed.
A fair guess at the actual improvement results from adding the time taken for a non-nil call to the procedure to the time taken for a nil call under optimized and non optimized versions. These are 21+6 or 27 versus 17+3 or 20. We can conclude from this that the speed of tree traversal under the optimized version would be about 75% (20/27) of that under the original.
To achieve even better performance, we would have to abandon the framework established by the original code. The optimizations we did above adhered to this framework; such optimizations are sometimes referred to as peephole optimizations, since they can be made by looking at small parts of a program without attention to the global program structure. Global optimization of this program begins by rewriting the original procedure to move the tests for nil pointers:
procedure traverse(ptr: noderef); { called only when suspected that ptr = nil } begin if ptr <> nil then traverse1( ptr ); end; procedure traverse1(ptr: noderef); { called only when ptr known not to be nil } begin if ptr^.left <> nil then traverse1( ptr^.left ); write( ptr^.key ); if ptr^.right <> nil then traverse1( ptr^.right ); end;Here, we save the cost of procedure calls for each nil pointer by putting the test outside each call instead of inside of it. To do this, we had to introduce a second procedure to take care of testing the root of the tree for nil. For each nil pointer in the tree, this saves a call and a return instruction, and in addition, it eliminates the need for a TESTR instruction because the condition codes can be set by the LOADCC instruction used to get the left or right link.