10. Separate Assembly and Objects

Part of CS:2630, Computer Organization Notes
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Classes and Objects

In the last chapter, we discussed many distinct number representations, along with the implementation of operations ranging from addition and subtraction, to multiplication and division. In an object-oriented programming language such as C++ or Java, we can define each of these classes or data types with a specific linguistic construct, the class definition. A class definition encapsulates all of the details of the representation of all objects in the class, so that outside that class definition, operations on objects of that class are carried out only by the methods defined for that class.

Assembly and machine language programmers cannot completely hide the details of the data types with which they work. Any time an object is loaded into registers, its size is exposed, and any time specific machine instructions are used to manipulate an object, these instructions must be used with full knowledge of their effect on the object's representation as well as on its abstract value.

Nonetheless, an assembly language programmer on a machine such as the Hawk can go a considerable distance toward isolating the users of objects from the details of their representations. We can do this in several steps, first isolating the code used to implement operations from the code that uses those operations, second, providing clean interface specifications for the objects, and finally allowing for polymorphic classes, that is, mixtures of different but compatable representations.

Separate Assembly

Suppose you have a set of subroutines, for example, multiply and divide, that you want to break off from a large program. Since the mid 1950's, there have been assembly languages that allowed such routines to be separately assembled before use in other programs. We have already been using such tools here in order to call routines in the Hawk Monitor without having to include that code as part of each source file.

There are at two good reasons to separate a program into multiple source files, no matter what programming language is being used. First, smaller source files can be easier to edit. Real-world application programs are frequently huge, with total sizes measured in thousands or even millions of lines of code. Second, separating an application into separate pieces lets us isolate program components that have already been tested from those currently under development. Some files may contain standard components that are used in many applications, while other files are unique to one application. The Hawk monitor is a good example of the later.

In assembly language programs, and to some extent, in C and C++ programs, there is another reason to separate programs into multiple source files. In these languages, some identifiers are defined locally within one source file, while other identifiers are global to all source files. In C and C++, for example, static functions and static global variables are local to the source file in which they are defined, while other functions and globals are genuinely global across all source files that make up an appliction. The use of the type qualifier static in these languages is eccentric, in that the word does not say what it means, but the idea remains quite useful despite its misnaming.

In SMAL, all identifiers are, by default, purely local to the source file in which they are declared, but they may be made global using explicit INT and EXT declarations. INT X says that the symbol X defined internally in this source file is to be globally visible. If X is not given a value in this file, it is an error.

EXT X says that the definition of the symbol X is external to this source file. This source file must not give X a value. Some other source file must contain the declaration INT X and that source file must define X.

If we wanted to put our own integer multiply and divide routines in a separate source file from the program or programs that used them, we might structure the file like this:

Outline of a separately assembled source file for the Hawk
       
        TITLE   "intmuldiv.a, integer multiply and divide"
	USE	"hawk.h"
        INT     INTMUL
        INT     INTDIV

        SUBTITLE "multiply"

INTMUL:
        ; expects R3 = multiplier
        ;         R4 = multiplicand
        ; ... other interface specifications

        ... Code for multiply

        SUBTITLE "divide"

INTDIV:
        ; expects R3 = low 32 bits of dividend
        ;         R4 = high 32 bits of dividend
        ; ... other interface specifications

        ...  Code for divide

        END
       

This file, intmuldiv.a, contains no main program and has no starting address. Some aspects of the file illustrate good programming practice. For example, the title of the file begins with the file name and the remainder briefly describes the contents, so assembly listings are always self-descriptive. The subtitles before each routine add additional documentation. Many assemblers provide similar documentation support, and where they do not, careful use of comments can achieve the same effect. The SMAL assembler uses the subtitles in the source file to create a table of contents for the listing file.

We adopted a naming convention here that is worth noting. Instead of calling our multiplication and division routines MULTIPLY and DIVIDE, we have named them INTMUL and INTDIV. There could be many different multiply and divide routines in a large program, perhaps for integers, long integers, floating point numbers and complex numbers. If the program was in a language like C++ or Java, we would distinguish between these using overloading rules or using object prefixes such as a.multiply(), where this means "the multiply routine belonging to the class of the variable a." In assembly language, we can't easily do either of these, but we can prefix the name of the subroutine with the name (or an abbreviation of the name) of the class to which it belongs.

The one feature of the SMAL file given above that is specific to separate assembly is the use of INT directives to declare the identifiers INTMUL and INTDIV. These directives tell assembler that the definitions of these identifiers must be given internal to this file so that they can be be exported for use elsewhere. Without these directives, the file would assemble correctly, but it could not be linked to other files that use the routines it contains.

The user of a separately compiled or assembled collection of routines typically needs to write out a list of declarations in order to access each routine. Nothing in most assembly languages requires that these declarations be gathered together, but we will do so. Specifically, we will gather all declarations needed to use the contents of intmuldiv.a into a file called intmuldiv.h, and we ask users of our multiply and divide routines write USE "intmuldiv.h" near the head of their source files. When the assembler processes the USE directive, it reads the contents of intmuldiv.h, so the result is exactly the same as if the entire contents of intmuldiv.h was substituted into the source file in place of the USE directive. Files with the suffix .h are frequently called header files; more properly, they are interface specification files. This is exactly the same way that C and C++ programmers have long solved the same problem.

The header file for users of a separately assembled source file
        ; intmuldiv.h -- interface specification for intmuldiv.a

        EXT     INTMUL  ; integer multiply
                        ; link through R1
                        ; expects R3 = multiplier
                        ;         R4 = multiplicand
                        ; ... other interface specifications

        EXT     INTDIV  ; integer divide
                        ; link through R1
                        ; expects R3 = low 32 bits of dividend
                        ;         R4 = high 32 bits of dividend
                        ; ... other interface specifications

Again, we have decorated our source file with comments that are not, strictly speaking, necessary. We have given our file a title, but it is included as a regular comment, not with a TITLE directive because we don't usually want the header file to be listed in an assembly listing. Additionally, we have included comments giving the complete interface definition for each routine listed in our header file.

The key lines in our header file are the EXT declarations for INTMUL and INTDIV. These tell the assembler that these symbols are defined externally in a different source file. These symbols must not be locally defined in any source where they are defined with an EXT directive. With these definitions in place, the assembler will tell the linker to find the definition exported from some other object file. If no object file exports a definiton, the linker will report an error.

This header file assumes that the linker can correctly link references to INTMUL and INTDIV. In general, this will only be true if the calling routine uses W directives to reference these symbols. There is a special case, however, that we will use routinely. If the actual physical address assigned to INTMUL and INTDIV is less than about 8 million, the linker can correctly link references made with the 3-byte T directive. This is the form of address used in the LIL instruction, and in most of our examples, we will not need to link to addresses above the 8 million byte mark.

The current version of the Hawk linker does not allow PC relative indexed addressing to be used to reference external symbols. This forces us to use a different calling sequence for externally defined subroutines than we use for those defined internally, that is, locally in the same source file as their caller. Here is an fragment of an assembly source file that uses the separately assembled integer multiply and divide routnes given above:

Using separately assembled subroutines
        TITLE   "main.a"
	USE	"hawk.h"
        USE     "monitor.h"
        USE     "intmuldiv.h"

        ... whatever code is needed

        LIL     R1,INTMUL
        JSRS    R1,R1           ; call intmul

        ... whatever code is needed

In general, the code for all methods associated with an abstract class will go in the same source file. This file ends up serving as class implementation file, while the header file holds the interface definition.

Sometimes, a group of subroutines to share variables with each other. This is done using class variables in Java and static variables in C. The SMAL assembly language provides COMMON blocks to serve this purpose. Suppose, for example, we wanted to keep statistics on the number of times INTMUL and INTDIV were called by a program. We could do this as follows:

Use of a COMMON block for static variables
        TITLE   "intmuldiv.a, integer multiply and divide"
        USE	"hawk.h"
        INT     INTMUL
        INT     INTDIV

        COMMON  INTSTATS,STATSIZE
MULCNT  =       0       ; count of calls to INTMUL
DIVCNT  =       4       ; count of calls to INTDIV
STATSIZE=       8

        SUBTITLE "multiply"

INTMUL:
        ... Code for multiply

        LIL     R5,INTSTATS     ; prepare to access intstats fields
        LOAD    R6,R5,MULCNT
        ADDSI   R6,1
        STORE   R6,R5,MULCNT    ; mulcnt = mulcnt+1

        ... More code for multiply

As already mentioned, Common blocks in SMAL have global names and are used to allocate space for global variables. Here, the global variable INTSTATS holds a record containing two one-word fields, MULCNT and DIVCNT. The existence of this common block cannot be hidden from other programs, but the size and contents are not exported, so the names used for these are local to one source file.

The COMMON directive requires the programmer to give the size of the memory block being declared. The pattern we used for this is similar to the pattern we used for activation records. Each field of the COMMON block has a symbolic name, and the block size is named last.

To reference a field of the common block, we load a register with the block's address and then reference fields in exactly the same way we reference fields of an activation record. The only difference is that R2 is not used. Local variables go in the activation record, global and static variables go in common blocks. The Hawk linker is smart enough to allow the following alternative form of reference to a field of the common block:

An alternative way to reference a COMMON block field
        LIL     R5,INTSTATS+MULCNT
        LOADS   R6,R5
        ADDSI   R6,1
        STORES  R6,R5           ; mulcnt = mulcnt+1

Exercises

a) Rewrite the recursive FIBONACCI routine from Chapter 6 so that it uses a common block to count the number of calls to FIBONACCI (recursive calls as well as calls from outside).

b) Write header files for the DSPNUM and FIBONACCI routines from Chapter 6, assuming that they were each assembled in from a separate source file.

c) What code must be added to DSPNUM and FIBONACCI from Chapter 6 in order to allow them to be assembled from separate source files.

d) Header files are not really needed in C or in SMAL. Rewrite the code above for a user of INTMUL and INTDIV so it does not use intmuldiv.h. Instead, the bare minimum from intmuldiv.h should be part of the caller's source file.

Linkage

Given the source files for a program with many separate components, the assembler and linker provide tools to combine them into one executable program. Consider a main program, main.a that calls subroutines in intmuldiv.a and the Hawk monitor. If main.a or intmuldiv.a are changed, they can be re-assembled to produce new versions of main.o or intmuldiv.o. Before we run the program, we need to combine these object files to make an executable file. The linker does this, combining main.o with intmuldiv.o to produce the executable file link.o.

The path from source code to executable object code
source code main.a intmuldiv.a
assembled by smal main.a smal intmuldiv.a
object code main.o monitor.o intmuldiv.o
linked by link main.o intmuldiv.o
executable code:link.o

The linker does a number of jobs. It combines the two object files main.o and muldiv.o, but it also adds in monitor.o automatically, so that the user program can call on monitor routines. By default, these are linked at addresses in ROM. In addition, it takes all of the common blocks mentioned in any of the separate object files and appends them end to end, by default assigning them to memory addresses in RAM. In allocating space for all of the code files and common blocks, the linker takes care to make each file or block begin at a memory address that is divisible by 4, so that aligned data within the code file or common block will be physically aligned in memory. The -d linker option makes it output a dump of the final values it assigned to external symbols.

Exercises

e) Assuming that dspnum.a and fibonacci.a are the source files for separately assembled versions of the subroutines given in chapter 6, and that you have a main program main.a, give the sequence of commands you would issue in order to assemble these pieces, link them, and make them ready to execute as a file called link.o.

An example, a stack

To illustrate these ideas, consider the problem of creating a package that defines a stack of words, with the operations push and pop. There must also be an operation to initialize the stack to empty. First, of course, we need to define the interface to this package:

Interface Specification for a stack package
        ; stack.h -- interface specification for stack.a
        ; for all routines:
                ;   R1 = return address
                ;   R2 = pointer to activation record
                ;   R3-7 = parameters and temporaries
                ;   R8-15 = guaranteed to saved and restored

        EXT     STKINI  ; initialize stack

        EXT     PUSH    ; push operation
                        ; expects R3 = word to push

        EXT     POP     ; pop operation
                        ; expects R3 = word popped

Before starting to implement this package, we must decide on a data representation for stacks. Should we use a linked list? An array with a stack pointer? Here, we opt for the latter, using a common block to hold the stack pointer and the array. In addition, we opt for the convention that, between calls to the stack routine, the stack pointer points to the top item on the stack.

Implementation for a stack package
        TITLE   "stack.a, implementation of a stack"
        USE     "hawk.h"

STACKSIZE = #1000

        COMMON  STACK,STACKSIZE
;       The first word of this common is the stack pointer sp,
;       later words are the stack itself.
;       Between calls, the stack pointer points to the top item.
;       For calling conventions, see stack.h

        SUBTITLE "STKINI -- initialize stack"
        INT     STKINI
STKINI: ; uses    R3
        LIL     R3,STACK
        STORES  R3,R3           ; stack.sp = &stack[0]
        JUMPS   R1              ; return

        SUBTITLE "PUSH -- push one word on the stack"
        INT     PUSH
PUSH:   ; expects R3 = data to push
        ; uses    R4 = stack
        ; uses    R5 = stack.sp
        LIL     R4,STACK        ; -- set up to access the stack
        LOADS   R5,R4           ; -- get the stack pointer
        ADDSI   R5,4
        STORES  R5,R4           ; stack.sp = stack.sp + 4
        STORES  R3,R5           ; * stack.sp = data
        JUMPS   R1              ; return

        SUBTITLE "POP -- pop one word from the stack"
        INT     POP
POP:    ; uses    R4 = stack
        ; uses    R5 = stack.sp
        ; returns R3 = data popped
        LIL     R4,STACK        ; -- set up to access the stack
        LOADS   R5,R4           ; -- get the stack pointer
        LOADS   R2,R5           ; data = * stack.sp
        ADDSI   R5,-4
        STORES  R5,R4           ; stack.sp = stack.sp - 4
        JUMPS   R1              ; return data

This stack package implementation contains no error checking for stack overflow or for underflow (popping from an empty stack). Both should probably raise exceptions. The interface is also defective. Many stack applications need tools to check for an empty stack, and there is none. Finally, we ought to allow the user to specify the limit on the stack size instead of setting it arbitrarily to 100016 bytes.

Exercises

f) Write a main program that makes minimal use of the stack, just initializing it and then pushing and popping a value.

g) Modify the pop routine given here so that it refuses to pop from an empty stack.

Object Oriented Notations

At this point, it should be clear that any object, say a stack, is represented by a sequence of memory locations holding the variables that compose its representation, and that the methods of a class are simply subroutines that operate implicitly on that object. This simple view is an oversimplification that is only applicable when there is just a single instance of the class. In fact, that situation is quite common. Few applications use multiple stacks, for example, and many of the classes in java programs are "wrapper classes" with no data part at all. Such classes serve merely to group a number of related subroutines.

What if we have multiple instances of a class? In this case, we must move beyond our simple model. Polymorphic classes, that is, classes with multiple coexisting implementations, require even more complexity, but for now, we will ignore polymorphism.

Consider the problem of implementing vectors and vector addition. A vector in two dimensions is a coordinate pair (x,y). The sum of two vectors (x,y) and (x',y') is (x+x',y+y'). That is, the sum is a vector where each component is the sum of the corresponding operand components. In a language such as C++ or Java, if we needed to work with vectors, we might define a vector class. To sum two vectors a and b, we might write a.add(b). That is, we would pass b as a parameter to the add method of object a.

This notation, a.add(b), is only really needed if the class of a is polymorphic. If a is not polymorphic, we might just as well write vector_add(a,b) (the particular punctuation depends on the programming language). This notation is not really object oriented, but rather, it simply views the class as a package to hold a number of related subroutines.

Object oriented versus conventional calls
Object oriented     x = a.add( b )
    y = c.add( d )
 
Conventional     x = integer_add( a, b )
    y = vector_add( c, d )
 

The point is, so long as polymorphism is not involved, the mechanisms we have already discussed in the context of the stack example are sufficient. Here is how we might define the vector class and the vector add operation. Of course, in practice we will likely want more vector operations, perhaps initialization and dot product, but those do not concern us here.

Interface Specification for a vector package
        ; vector.h -- interface specification for vector.a"

VECSIZE	=	8	; size of one vector in bytes

        ; for all routines:
                ;   R1 = return address
                ;   R2 = pointer to activation record
                ;   R3-7 = parameters and temporaries
                ;   R8-15 = guaranteed to saved and restored

        EXT     VECADD  ; vector add
                      	; expects R3 = dst, points to destination vector
                      	;         R4 = src, points to source vector

Note that the interface specification exports the size of the vector, but nothing about its representation. Exporting the size allows users of vectors to statically allocate vectors in common blocks or allocate them in activation records, but without any information about the internal structure of a vector, users cannot manipulate them without the help of the vector package. Of course, within the implementation, there must be details of how the representation of a vector is constructed.

In many object oriented languages, a pointer to an object is called an object handle. We pass pointers to the objects instead of the values of the objects because we do not want the caller worrying about the details of passing an object. Passing a pointer to an object is done identically whether that object is one byte or ten thousand words long.

Exercises

h) Write the implementation for vector add, assuming that vectors are ordered pairs of integers.

i) Write a main program that allocates two vectors, both global variables, and then demonstrates how to add them. Given that we have no mechanism for initializing vectors or using their value (aside from by addition), this program will be of no particular use.

Polymorphism

Whenever there are multiple implementations of the same interface specification for some class, we say that we have a polymorphic class. Consider the class stream, with subclasses input_stream, output_stream and perhaps window, a subclass of output_stream. This is a class that was understood to be polymorphic as far back as the 1960's. In fact, one of the streams of thinking that led to object-oriented programming grew out of thinking about this class.

Input/output devices provide a natural source of polymorphic examples because most operating systems allow multiple and very different types of physical devices to support the same abstractions. An internal hard drive, an external USB hard disk drive, and an external USB flash memory drive may all support the exact same abstract behavior while resting on very different combinations of hardware and software to implement that behavior.

Consider a program in an object oriented language that allows polymorphic classes. When such a program is translated to assembly language, each object of each class that implements the same interface must begin with an indication of its type. Without this, we would have no way, when operating on a polymorphic object, to determine which implementation of the operation was appropriate. There are several ways to do this; some of these give faster access to methods of objects of polymorphic classes, while others give more compact representations for objects of such classes.

The fastest way to access methods of objects that are instances of polymorphic classes is to include pointers to the applicable methods directly in each instance of each object. For objects that are instances of the stream class or its subclasses, we might do this as follows:

A fast implementation of a polymorphic stream class
graphical description of fast class implementation

The first word of every stream object o would be a pointer to the code of the put method to be used for calls to o.put(), and the second word would point to the code for o.get(). In a practical system, there would be additional pointers in the object for every other method applicable to the class. After these pointers, the object would continue with the actual data specific to that object. All members of a particular subclass would have the same values for the method pointers, and the same structure for the data. In assembly language, we might describe the stream class as follows, most likely in the stream.h file:

Code for a fast implementation of a polymorphic stream class
   
; format of a stream object (stream of bytes)
;   for all stream operations,
;     on entry, R3 = pointer to stream object
;     on exit,  R3 = pointer to stream object

; all compatable stream objects begin with
; pointers to the implementations of each method
PUT     =       0       ; put a character on the stream
			;   takes R4 as the value to put
GET     =       4       ; get a character from the stream
			;   returns R4 as the value gotten

; the following fields of each stream object depend on the
; particular stream subclass and usually include a
; representation for one particular type of stream
   

This stream representation was first used in an operating system designed by Christopher Strachey and his students in the 1960's. The paper describing that operating system was one of the very first clear presentations of object oriented polymorphism. The other early presentations of this idea come from work done by the developers of Simula 67, the programming language from which we get our modern object oriented terminology and notation.

Fast access to a method of a polymorphic stream
; given R3, a pointer to an initialized stream object
LOAD    R1,R3,PUT       ; get pointer to the appropriate PUT method
JSRS    R1,R1           ; call the method

The initializer or constructor for a class cannot follow this pattern because initializers must specify the precise class of the object they are constructing. Unless the initializer is included in some kind of factory object, initializers are typically called directly. Consider this initializer for keyboard stream objects:

Initializer for a polymorphic keyboard object
        INT     KBDINIT
KBDINIT:
        ; given R3, a pointer to an uninitialized keyboard stream object
        LEA     R4,KBDINIT
        STORE   R4,R3,INIT      ; initialize pointer to the INIT method
        LEA     R4,KBDPUT
        STORE   R4,R3,PUT       ; initialize pointer to the PUT method
        LEA     R4,KBDGET
        STORE   R4,R3,GET       ; initialize pointer to the GET method
	...			; other initialization could go here
        JUMPS   R1

This initializer does not allocate space for the object because we are not discussing dynamic storage allocation here. The Hawk monitor does not contain a heap manager, and heap management is comple enough that we defer that topic to courses on operating systems. Note that this keyboard initializer uses LEA to load pointers to each keyboard method into the new object. This is appropriate because these methods should be defined in the same source file as the initializer. Also note the INT declaration for KBDINIT. This would typically be the only INT declaration for any of the methods because the others should only be called through initialized keyboard stream objects. One of these, KBDPUT, should just raise an exception, since output to a keyboard is meaningless.

For simple objects like streams, the implementation given above may be appropriate, but consider what happens if we have a class such as "floating point number" with many implementations, including single precision, double precision, and possibly arbitrary precision. Floating point numbers allow a huge number of operations: add, subtract, multiply, divide, sine, cosine, exponentiate, truncate, round, and many more. If each floating point object began with pointers to all of the applicable methods, this list would generally exceed the size of the number representaiton itself by a wide margin.

To avoid this huge storage overhead, we can store the pointers to the methods in a compact table elsewhere, for example, in the same memory area that holds the code for the class. This makes sense because the table of method pointers is constant for each class implementation. It is common to refer to this table of method pointers as the class descriptor, and to refer to the pointer to the class descriptor stored in each object as the tag field of that object, since it is the field that identifies the class to which the object belongs. Most implementations of object-oriented programming languages use this approach, storing the tag field of the object in the first word of the object.

A compact polymorphic stream class implementation
graphical description of fast class implementation

In assembly language, the header file for the class would typically contain the declarations used to describe this data structure. For example, for a polymorphic stream class, the stream.h header file might contain the following:

A compact representation for a polymorphic stream class
   
; format of a stream object
;   for all stream operations,
;     on entry, R3 = pointer to stream object
;     on exit,  R3 = pointer to stream object

; all stream objects begin with a class descriptor pointer
TAG     =       0       ; pointer to descriptor
;                         private fields follow, depending on subclass

; descriptor fields, pointers to the implementations of each method
PUT     =       0       ; pointer to put sugroutine
			;   takes R4 as the value to put
GET     =       4       ; pointer to subtract routine
			;   returns R4 as the value gotten
        ...

This leads to the following code for a polymorphic call to the put method:

Access to a method of an compactly represented polymorphic stream
     
; assume R3 points to the stream, and R4 holds the character to put.
LOADS   R1,R3           ; follow tag to descriptor of object's class
LOAD    R1,R1,PUT       ; get pointer to the put method
JSRS    R1,R1           ; call the method
     

This is the approach used in most modern object oriented languages such as C++. What this means is that calls to methods of polymorphic classes cost, typically, one or at most two more instructions than calls to methods where there is only one possible subroutine. Good compilers always avoid using this more expensive mechanism whenever the method can be uniquely determined at compile time, and use it only where polymorphism is present.

Here is how a keyboard object might be initialized, from the code that goes in the file kbdstream.a:

Initializer for a keyboard stream object
        INT     KBDINIT
KBDINIT:
        ; given R3, a pointer to an uninitialized keyboard stream object
        LEA     R4,KBDDESCR
        STORES  R4,R3
	...			; other initialization could go here
        JUMPS   R1

        ALIGN   4
KBDDESCR:
        W       KBDPUT
        W       KBDGET
        ...

Languages like Java also do this, but then they add a new layer of inefficiency by including large numbers of default attributes in every instance of every class. Java objects all know their own names, for example, so each object representation holds a pointer to the string constant holding the object's name. This information is, of course, of great value during debugging, but it adds significantly to the memory requirements of large programs.

For this reason, object oriented languages almost always include a large collection of built-in types that are outside the class hierarchy of the language. Built in integer, character and floating point numbers are implemented directly with the hardware's natural data types and are not referenced by handles, but rather, are included directly in the objects where they appear.

Exercises

j) Write an implementation OUTSTR, output stream, that calls PUTCHAR for each character put on the output stream. Attempts to get input from an output stream should always return zero (ideally, these ought to raise an unimplemented operation exception).