13. Exceptions and Traps

Part of CS:2630, Computer Organization Notes
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Background

To this point, we have focused on purely sequential control transfers, with no provision for exceptional conditions. In fact, programmers never need to go beyond these simple models, but by adding exceptions to our programming languages and interrupt mechanisms to our hardware, we can write programs that are smaller and more readable. To illustrate this, consider the following simple program:

A simple subroutine with no exception handling
int distance( int x1, int y1, int x2, int y2 ) {
    /* return cartesian distance from <x1, y1> to <x2, y2> */
    return squareroot( square( x1 - x2 ) + square( y1 - y2 ) );
}

Assuming that square() and squareroot() do what they say, this code looks straightforward. That is, until you consider the possibility that something might go wrong. Addition, subtraction and squaring can all produce overflows, and a well-written square-root routine will always include provisions for some kind of response to a negative argument.

A classic way to deal with errors within functions is to define some return value as an error flag. For example, since neither the distance between two points nor the square of a number can be negative, we could use negative values as error indicators. One of the reasons that the IEEE standard floating point representation includes NaNs (values that are not a number) is to support this. Confining ourselves to the purely sequential subset of C and C++, this approach leads to the following rather ugly code:

Clumsy exception handling makes code hard to read
   
int distance( int x1, int y1, int x2, int y2 ) {
    /* return cartesian distance from <x1, y1> to <x2, y2>
       returns a negative value if there is an overflow */
    int squarex = square( x1 - x2 );
    int squarey = square( y1 - y2 );
    if ((squarex < 0) || (squarey < 0)) {
        /* taking the square produced an overflow */
        return -1;
    } else {
        return squareroot( squarex + squarey );
    }
}
   

The above code becomes even harder to read if it tries to detect addition or subtraction overflows. This led, in the late 1960s, to programming languages with what we now call exception handling mechanisms. The language PL/I developed by IBM in the mid 1960s had a crude mechanism, and Pascal and C, both developed around 1970, had primitives that could be used for exception handling. The modern model of exception handling matured with the development of the Ada language in the late 1970s. Variants of the Ada model have been incorporated into C++, Java and other recent languages.

Java's exception handler model is fairly widely understood: Throwing an exception passes an object of a Throwable class that may be caught by the catch clause of a try-catch block. Try-catch blocks may have multiple catch clauses that work much like the alternatives in a case-select construct. When an exception is thrown, it is delivered to the first catch that takes an argument of a class compatable with the Throwable that was thrown. C++ is similar. This implies a search of the available exception handlers whenever an exception is thrown, so C++ and Java style guides generally require that exceptions only be used as the last resort and not for routine control transfers. Here is a small example illustrating the Java exception syntax:

An exception handler in Java
static void X() {
    boolean K; // a local variable
    try {
        ... the body of X, code here or called
        ... from here might throw an IOException
    } catch (IOException e) {
        ... the handler for IOException
    }
}

Exceptions in some other languages are less complex. In Ada, for example, exceptions are not part of any class hierarchy and carry no value. The terminology is also different. In Ada, exceptions are raised, not thrown, and they are handled by when clauses at the end of a block that has an exception clause. Wherever the exception is raised in Ada, the result is a transfer of control to the current handler for that exception, with no need for a search. At any instant, each exception defined in the program must have exactly one active handler. Here is an Ada example:

An exception handler in Ada
procedure X is
    K: BOOLEAN; -- a local variable
begin
    ... the body of X, code here or called
    ... from here might raise Constraint_Error
exception
    when Constraint_Error =>
        ... the handler for constraint errors
end X;

In both examples, a subroutine named X installs a handler for an exception. If there was already a handler in place for this exception, that handler will be out of service until the body of X finishes. Within the body of X and any subroutines X calls, raising the exception will cause an immediate transfer of control to the handler, except if there is a more local handler for that exception. The exceptions used in each example are both predefined by the language definitions.

Note that raising an exception does more than just transfer control to the handler. It also restores the context of the handler. In the above Ada code, because the Constraint_Error handler is part of the subroutine X, the handler has access to any and all local variables of X; in this case, the boolean variable K. From this, we conclude that if local variables are stored in activation records pushed onto a stack, raising an exception must restore the stack to its state at the time the handler was installed.

Exercises

a) Modify the distance() routine given above so that it not only returns a negative value if the calls to square() or squareroot() involve overflows, but also if any of the sums or differences produce an overflow. Note that C and C++ do integer arithmetic without reporting any exceptions, so you must use explicit if statements to do this. Also, you cannot test (a+b)>max when max is the largest possible integer, so you need to do some simple algebra to make this comparison work.

b) What happens to the variables of a program when an exception is raised and handled? Which variables can be trusted to have well determined values, and of these, which will have their values unchanged from the time the handler was installed? Which variables have undetermined or ill defined values as a result of the exception being raised?

Exceptions in C

C does not have any built-in exception handling syntax, but it does contain tools that allow construction of exception handlers. These were added to the C standard library as an afterthought. The interface to these is in the setjmp.h header file, which defines the type jmp_buf and the functions setjmp and longjmp.

The official descriptions of these routines are not particularly helpful. The setjmp(jb) function is defined as returning zero when called, but having the side effect of initializing the jmp_buf named jb so that a call to longjmp(jb,x) will transfer control to the return point of that call to setjmp, returning the value x. Without an example, this doesn't make any particular sense, but in fact, it is sufficient to allow us to demonstrate how to construct the analog of a try-catch block in C:

An exception handler in C
jump_buf BoundsError;

void X() {
    int i; // a local variable
    if (!setjmp( BoundsError )) { /* try this */
        ... the body of X, code here or called
        ... from here might call longjmp( BoundsError, 1 )
    } else { /* catch BoundsError */
        ... the handler for BoundsError
    }
}

The above if statement behaves like a try-catch block because the normal return from a call to setjmp is zero, which is taken as false, so the negated if-condition is true and the body of the if statement is executed. When the code calls longjmp with a non-zero second parameter, this becomes the return value, code returns to inside the expression giving the if condition, but this time, the nonzero value is taken as true, negated, and the else clause is executed.

But wait! How can setjmp(jb) change the value of the jmp_buf named jb? Aren't parameters in C passed by copying their values to the called function? Therefore, shouldn't it be impossible for this to change jb? The answer lies in the definition of the type jb. This is declared as follows inside setjmp.h:

typedef __jbthing jumpbuf[__jbsize];

That is, each instance of jumpbuf is an array of some number of things. As such, when you pass a jumpbuf as a parameter, you are following the C rules for passing an array, which is to say, the actual parameter is the address of the first array element, not a copy of the array.

The details of what is in the array and how big it is depend on details that C hides from the user. These details depend not only on the CPU being used, but on how the compiler makes use of the resources on that CPU.

This code given above illustrating the use of C's long jumps for exception handling is only halfway to being an example of civilized exception handling. Yes, a call to longjmp(BoundsError,1) will transfer control to the handler in the body of X, but:

We can fix these errors by writing more code:

A better exception handler in C
jump_buf BoundsError;

void X() {
    int i; // a local variable
    jmp_buf BoundSave;
    memcpy( BoundSave, BoundsError, sizeof( jmp_buf ) );
    if (!setjmp( BoundsError )) { /* try this */
        ... the body of X, code here or called
        ... from here might call longjmp( BoundsError, 1 )
        BoundsError = BoundSave;
        memcpy( BoundsError, BoundSave, sizeof( jmp_buf ) );
    } else { /* catch BoundsError */
        memcpy( BoundsError, BoundSave, sizeof( jmp_buf ) );
        ... the handler for BoundsError
    }
}

The above code behaves remarkably like an Ada or Java exception handler. The former handler is restored at the end of the home-built try block, so on normal exit, where the exception has not been thrown, the former handler is back in place. If the exception is thrown, the home-built catch block begins by restoring the old handler. The old handler must be restored at the start of the catch block because code in the catch block could throw the exception, and if it does, the catch block must pass control to an outer catch block.

In the above code, we wanted to write the assignment statement

BoundSave = BoundsError;
but we were forced to write:
memcpy( BoundSave, BoundsError, sizeof( jmp_buf ) );
unfortunately, the variables are jump buffers, and as mentioned above, type jmp_buf is an array type. C does not allow assignment of one array to another, and neither the number of elements nor the nature of each element is available, so we can't even write a for loop to copy the elements.

The solution to this problem is the function memcpy, part of the strings.h collection in the C standard library. This function copies a block of memory from one place to another.

Implementing setjmp and longjmp

The documentation for setjmp and longjmp gives no real explanation of what goes in the jump buffer. Looking at historical documentation from the late 1970s, it is apparent that the entire long-jump package is an afterthought. It is not documented in the first edition of Kernighan and Ritchie from 1978, but it shows up in the 7th Edition Unix manual from 1979.

As an afterthought entirely implemented in the subroutine library without support from the compiler, the jump buffer may only hold information available to a called subroutine. A good guess is that the jump buffer holds the return address passed to setjmp, the value of the stack pointer (in the case of the Hawk, R2), and perhaps more saved registers. The documentation for setjmp warns that local variables of the routine that called setjmp may or may not be correctly restored depending on the compiler settings and other details. If a version of setjmp for the Hawk saved R8 to R15 in the jump buffer, optimized code using them to hold local variables would behave differently from code where those variables were stored in the activation record.

Exercises

c) Write SMAL Hawk code for SETJMP and LONGJMP that are equivalent to the routines in the C standard library; this requires you to define the structure of the jump buffer.

d) Consider writing TRY(e){block}CATCH(e){handler} in a C program, where TRY and CATCH are macros, and e is an exception, that is, the name of a jmp_buf. To throw, an exception in this context, use the macro THROW(e). Write C preprocessor #define statements for these macros. (Hint: They are ugly, and they add a few layers of nested curly braces around the ones in the user code given above.)

Exceptions at the Machine Level

The above discussion of implementing exceptions was based on C standard library routines that were an afterthought. If you start from scratch, you can do better. The data structure of an exception can be quite simple. Each exception just needs only two fields:

If we are writing SMAL Hawk code, it makes sense to put the declarations applicable to all exceptions in a header file for use anywhere exceptions are used.

The header file for Hawk exception handling
; exceptions.h -- declarations applying to exceptions

; an exception is a record structure with the following fields
EXHAND  =       0       ; one word, the address of the handler
EXAR    =       4       ; one word, the handler's activation record
EXSIZE  =       8       ; the total size of the exception record

In Ada, there are a number of predefined exceptions, CONSTRAINT_ERROR for example, is raised or thrown by an attempt to use an array index outside the legal range, by an attempt to follow a null pointer or by an attempt to assign a value to a variable outside that variable's legal range of values. To allow access to a CONSTRAINT_ERROR exception in assembly language, we can use the following code, duplicated in in each source file where this exception is raised or handled:

Declaring an exception in a Hawk program
           
        USE     "exceptions.h"

        COMMON  CONSTRAINT_ERROR, EXSIZE
           

This code merely declares a global variable, an uninitialized a block of EXSIZE bytes named CONSTRAINT_ERROR. Each file that repeats this code will be able to load its address into a register. Recall that repeating a common declaration creates only one shared copy of that common block.

Suppose a routine within the program wants to install a handler for constraint errors. The routine must include, in its activation record, space to save the previous handler for the exception. For example, if a routine wants to handle CONSTRAINT_ERROR, it could define CONSSV as a local variable of size EXSIZE. If we define this as a normal local variable, the scope of CONSSV would be the entire subroutine where it is used. In reality, though, it is only needed between the opening of the try block and the start of the handler itself. In the code given here, we will use an alternative approach, adding the save location to the activation record only temporarily where it is needed.

We begin by presenting a try-catch block written in SMAL Hawk assembly language using macros to do the dirty work, and then we will define the macros we need. The definitions for these macros belongs in exceptions.h.

Overall structure of a SMAL Hawk try-catch block
               
        TRY     CONSTRAINT_ERROR, MYHAND

        ... body of try clause

        CATCH   CONSTRAINT_ERROR, MYHAND

        ... body of handler

        ENDTRY  MYHAND
               

Unlike a try-catch block in Java or Ada, we have to name the exception we are trying to catch at both the start and end of the try clause, and we need to name the entire block with a name for this handler. The reason for this will be apparent when we dig into the macros involved.

Start macro for a try-catch block
        MACRO   TRY EXEP, HAND  ; uses R1,R4-5
	  ; begin a try block for exception EXP and handler HAND
          LIL   R1,EXEP

          LOAD  R4,R1,EXHAND    ; -- first save old handler on stack
          LOAD  R5,R1,XAR
          STORE R4,R2,ARSIZE+EXHAND
          STORE R5,R2,ARSIZE+EXAR
          
          LEA   R4,HAND         ; -- then install new handler
          STORE R4,R1,EXHAND
          STORE R2,R1,EXAR

  ARSIZE  =     ARSIZE + EXSIZE ; -- finish pushing the save location
        ENDMAC

The code in this macro is written assuming no optimization of stack pointer increments and decrements, so all displacements to local variables from R2 are positive and the stack pointer must be incremented and decremented immediately before and after each call. Also, this code assumes that R1 is available, so the stack pointer must be stored in the activation record.

The most confusing feature of this code is where it stores the old handler in the activation record. Instead of creating a named field of the activation record, the code uses the top of the activation record (displacement ARSIZE) as the index of the new field, and then it increments ARSIZE by the size of the field. In effect, entering the try block pushes a saved handler on the stack.

In planning the CATCH macro, note that it must do two things. Just as in the C code presented above, it must include code to end the try block, restoring the old handler if no exception was raised, and it must include code to start the catch block, again restoring the old handler.

Macro to end the try block and start the catch block
        MACRO   CATCH EXEP, HAND; uses R1,R4-5
	  ; add handler to try block for exception EXEP and handler HAND
  ARSIZE  =     ARSIZE - EXSIZE ; -- start pop of save location

          LIL   R1,EXEP
          LOAD  R4,R2,ARSIZE+EXHAND
          LOAD  R5,R2,ARSIZE+EXAR
          STORE R4,R1,EXHAND
          STORE R5,R1,EHAR      ; -- old handler restored
          BR    HAND'QT         ; -- skip over catch block
  HAND:
          LIL   R1,EXEP
          LOAD  R4,R2,ARSIZE+EXHAND
          LOAD  R5,R2,ARSIZE+EXAR
          STORE R4,R1,EXHAND
          STORE R5,R1,EHAR      ; -- old handler restored
        ENDMAC

There is one obscure feature of SMAL used above. Inside a SMAL macro definition, apostrophes are elided. So, the apostrophe in HAND'QT will be removed in the expansion. Since HAND is a macro formal parameter, if the macro is called with MYHAND as the actual parameter, the resulting symbol is MYHANDQT. The purpose of the apostrophe here is to allow the macro processor to recognize the formal parameter name.

Compared to the above, the final macro we've used, for finishing the handler, is trivial. All it does is generate the final label so that the end of the catch block can branch around the handler:

Macro to end the try block and start the catch block
        MACRO   ENDTRY HAND
	  ; finish try-catch block for handler HAND                     
HAND'QT:
        ENDMAC

The final macro we need for this approach to exception handling is one to throw an exception:

Throwing an exception
        MACRO   THROW EXCEP       ; uses R1,R2
          ; throw exception EXCEP
          LIL     R1,EXCEPTION
          LOAD    R2,R1,EXAR      ; -- restore stack context
          LOAD    PC,R1,EXHAND    ; -- transfer control to handler      
        ENDMAC

Note that R3 is not used, so values placed in R3 before an exception is thrown are available immediately in the handler. In effect, R3 can be used to pass an object to the handler for Java-style exception handling.

It should be noted that throwing an exception directly from within a try-catch block using THROW is simply an expensive alternative to using a branch instruction directly to the handler. If the only places an exception is thrown are directly within the try-catch block and not from called subroutines, there is no reason to use this entire mechanism.

Throwing a local exception
        BR      MYHAND          ; go to a local handler                 

In languages like Java, the compiler forces every subroutine that might throw an exception to declare that fact, and then it checks to make sure that no call is made to such a subroutine outside the context of a try-catch block. This is a bit ugly, and it places significant limits on how exceptions are used.In assembly language or other languages with less strict checking, it is useful to have some kind of default handler attached to every exception.

Default handlers should be installed for all exceptions in the main program. Typically, default handlers simply print out a message saying something like "unhandled exception" with the name of the exception and then terminate the application.

Exercises

e) Write a Hawk main program to install a default CONSTRAINT_ERROR handler before calling an external routine APP. The handler should print an error message and exit.

f) The macros above do not allow for optimized pushing and popping the stack pointer. The discussion of activation records in Chapter 6 gave an alternate model, where R2 is incremented just once on entry to a subroutine and decremented just once before exit. Between these two points, the local variable LOC would be addressed at displacement LOC-ARSIZE. Write new versions of TRY, CATCH and ENDTRY that work in this environment. Note that the hard part of this problem is dealing with the dynamic growth of the activation record in TRY and its shrinkage in CATCH.

An Example

To illustrate these exception handling tools, consider the problem of adding support for exceptions to an unsigned multiply routines from Chapter 9. Multiplication can produce values too large to store in one word, so it is appropriate to throw a constraint error ohen this occurs.

Unsigned multiplication raising an exception on overflow
MULTIPLY:
        ; expects R3 = unsigned multiplier
        ;         R4 = unsigned multiplicand
        ; uses    R5 = temporary copy of multiplier
        ; returns R3 = unsigned product -- or CONSTRAINT_ERROR thrown
        MOVE    R5,R3           ; -- move the multiplier
        CLR     R3              ; product = 0;
        TESTR   R5
        BZS     MULTLX          ; if (multiplier != 0) {
MULTLP:                         ;   do { -- loop exits by break
        SRU     R5,1            ;     multiplier = multiplier >> 1;
        BCR     MULTE2          ;     if (multiplier was odd) {
        ADD     R3,R3,R4        ;       product = product + multiplicand;
        BCR     MULTE1          ;       if carry set {
        THROW   CONSTRAINT_ERROR;         throw constraint error
MULTE1:                         ;       }
MULTE2:                         ;     }
        TESTR   R5
        BZS     MULTLX          ;     if (multiplier != 0) break;
        SL      R4,1            ;     multiplicand = multiplicand << 1;
        BCR     MULTE3          ;     if carry set {
        THROW   CONSTRAINT_ERROR;       throw constraint error
MULTE3:                         ;     }
        BR      MULTLP          ;   }
MULTLX:                         ; }
        JUMPS   R1              ; return product;

This code tests the carry bit to detect unsigned overflows after both the ADD and SL. On detecting an overflow, the code throws the exception. This code can be optimized. Each call to the THROW macro generates exactly the same sequence of 4 instructions. There is no need to duplicate this sequence, it can be written just once with conditional branches to it from each place where that exception is thrown.

On modern pipelined CPUs, it turns out that it is faster to fall through a branch than to follow it, so in general, code for uncommon situations should be placed at the target of a branch, while the normal case should be to fall through, not taking the branch.
 

Alternative code to throw constraint error
        BCS     RCONERR      ; if carry set, throw constraint error
        ...

; common code used to throw constraint error exception from anywhere
RCONERR:
        THROW   CONSTRAINT_ERROR

 

Suppose distance(x1,y1,x2,y2) computes the distance between two points (x1y1) and (x2y2) as ((x1x2)2 + (y1y2)2)1/2, where squaring is done with the above multiply routine. The following code calls this and outputs the distance in decimal, for example, saying "39", but if there is an range-error exception, it outputs "too big". To do this, it installs a handler.

In the following code, stack-pointer adjustment has been optimized out between the calls to DISTANCE and PUTDEC, but not between the other subroutine calls. This is because the stack pointer must be in the same place on entry to the try-catch block, on entry to the handler, and therefore, on exit from the try-catch block. The easiest way to do this is to avoid all stack-pointer optimizations at these points.

There are other ways to handle exceptions. Consider representing an exception with a global variable that is a pointer to an exception object. The object itself goes in a local variable in the block that installed the current handler. This local variable holds the addresses of the handler, its activation record pointer and the previous value of the global pointer.

This alternative approach makes saving and restoring exceptions a little faster because there is no need to copy the contents of the exception object. On the other hand, the code to raise an exception is slower because it has to follow a static pointer to the current exception object instead of using a static location directly. If exceptions are thrown rarely but there are many local handlers, this will probably lead to a performance improvement.
 

Code that installs a handler
; activation record format for the Show Distance routine
RETAD   =       0
X1      =       4       ; saved parameters
Y1      =       8
X2      =       12
Y2      =       16
ARSIZE  =       20

SHOWDIST:       ; expects (R3,R4) = (x1,y1) point 1
                ;         (R5,R6) = (x2,y2) point 2
        STORES  R1,R2           ; -- save return address
        STORE   R3,R2,X1
        STORE   R4,R2,Y1
        STORE   R5,R2,X2
        STORE   R6,R2,Y2        ; -- save parameters

                                ; try {
        TRY     CONSTRAINT_ERROR,MYHAND

        LOAD    R3,R2,X1        ;   -- parameters
        LOAD    R4,R2,Y1
        LOAD    R5,R2,X2
        LOAD    R6,R2,Y2
        ADDI    R2,ARSIZE
        JSR     R1,DISTANCE     ;   R3 = distance( x1, y1, x2, y2 );

        LIS     R4,1            ;   -- parameter 1
        LIL     R1,PUTDEC
        JSRS    R1,R1           ;   putdec( R3, 1 );
        ADDI    R2,-ARSIZE
				; } catch {
        CATCH	CONSTRAINT_ERROR,MYHAND

        LEA     R3,SDTBIG       ;   -- parameter
        LIL     R1,PUTS
        ADDI    R2,ARSIZE
        JSRS    R1,R1           ;   puts( "too big" );
        ADDI    R2,-ARSIZE

	ENDTRY	MYHAND		; }

        LOADS   R1,R2
        JSRS    R1,R1           ; return

SDTBIG: ASCII   "too big",0

 

Exercises

g) Give code for saving the old handler and installing a new one, code for restoring the old handler, and code for raising an exception, using the alternative representation for exceptions suggested at the end of the section above. Does save code space? Does it save stack space?
 

Traps: Exceptions Detected by Hardware

While the overflow and carry condition codes of the Hawk and other computers can be used to detect exception conditions in integer arithmetic, the most important hardware exception detection mechanisms center around detecting illegal instructions, illegal memory references, and other major program faults. Most processors can detect some or all of the following:

System Reset
Someone pushed the reset button on the computer system, or the machine was just turned on.

Bus Error
An attempt was made to read or write a nonexistant memory location or to write to a read-only location. Following uninitialized or null pointers can cause this.

Illegal Instruction
An instruction was executed that is not defined in the instruction set of this computer. Accidentally jumping to a random memory location can cause this.

Privilege Violation
An instruction was executed that is not permitted in the current operating mode. In the case of the Hawk, the CPUGET and CPUSET instructions are only permitted if the level field of the processor status word is not 11112.

In general, the central processors of modern computers have no understanding of the exception handling mechanisms of high-level programming languages. Instead, processors have a different exceptiom model, referred to as a trap mechanism. Trap mechanisms were introduced on computers of the 1950s and 1960s, well before the exception models of high-level languages. In almost all cases, trap handlers are part of the operating system, not applicaton programs.

Traps on the Hawk operate as follows: First, the instruction that caused the trap is aborted. All traps other than the reset trap can be traced back to the execution or attempted execution of some instruction. Then, several things are saved. The address of the instruction that caused the trap is saved in a special register, TPC or the trap program counter. If the trap involved an illegal memory address, that address is saved in another register, TMA or the trap memory address. At the same time, the level field of the processor status is saved in the old-level field. Finally, after it is saved, the program counter is set to a value determined by the trap, and the level field of the processor status word is set to zero (the most permissive level).

The Hawk processor status word
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
level old level BCD carry N Z V C

It is important to note that the Hawk guarantees that traps will always be detected before an instruction has any side effects on registers or memory. Therefore, the old-level field of the processor status word, the value of the TPC register and the contents of general purpose registers 1 to 15 completely describe the state of the program immediately before it executed the instruction that caused the trap.

When a trap occurs, the new program counter value is set depending on the cause of the trap. Each component of the processor that detects trap conditions provides a 4-bit code to identify the cause; this code sets bits 4 to 7 of the new program counter value, with the remaining bits set to zero. As a result, memory locations from 0 up to 10016 are reserved for trap processing. These locations constitute the trap vector of the Hawk. The following locations in the trap vector are used by the trap hardware described here:

The Hawk Trap Vector
System Reset 0000000016   Power-on or external reset.
Bus Error 0000001016 Illegal memory reference.
Illegal Instruction 0000002016 Use of an unimplemented opcode.
Privilege Violation   0000003016 CPUGET or CPUSET when level is 11112.

Higher numbered locations in the trap vector are available for use by coprocessors, and they are used by interrupt service routines, topics we will discuss later.

We need tools to examine and set the special fields of the processor status word and the TPC and TMA registers. These are all manipulated using the CPUSET and CPUGET instructions. For example, CPUGET R4,TPC loads the trap program counter TPC into R4, and CPUSET R4,PSW stores the contents of R4 into the processor status word. These instructions can access 16 special registers within the processor. In commercial processors, special registers are commonly included for self testing, and sometimes, the interfaces for devices closely allied to the processor. There is also a TSV register, advertised as the trap-save register but actually just 32 bits with no committed purpose – although we will need it as a save location during the entry to a trap handler.

When CPUGET is used to load a special CPU register into register zero, this is interpreted loads the program counter, and it has the side effect of restoring the level field of the PSW to the old level. Thus, if the only instruction of a trap handler is CPUGET R0,TPC the effect is to completely undo the effects of the trap and try, again, to execute the instruction that caused it. Doing this without any other code to handle the trap is almost certain to lead to another trap, putting the processor into an infinite loop. The hawk.h file defines the instruction RTT, meaning return from trap as an alternative name for CPUGET R0,TPC.

Exercises

h) What special registers are available within the Hawk processor and how are they encoded for reference by the CPUSET and CPUGET instructions; you will have to look this up in the Hawk manual.

i) Why is the CPUSET instruction dangerous enough that it forces a trap when a user tries to execute it when the level field is 11112?

A Trap Handler

The default trap handlers installed by the Hawk monitor simply print out the contents of the trap program counter and trap memory address registers before halting the program. This helps during debugging, but if a program needs to handle the exception, it is not enough. To handle an exception, the program must install code in the appropriate location in the trap vector to jump to its own trap handler. This handler should then translate from the Hawk's trap model to the exception handling model of your program.

Each entry in the Hawk trap vector is only 16 bytes long, which is 8 halfwords or 4 words. If we have no desire to be able to examine or restore registers, this is more than enough to allow a jump to the user's own trap service routine:
 

Minimalist code to intercept bus error traps
.       =       #10             ; bus error entry in the trap vector
        LIL     R1,BUSER
        JUMPS   R1              ; go to rest of handler

 

This minimalist code destroys R1, which might prevent us from writing a handler that displays R1 for debugging. The Hawk monitor uses a special register to solve this problem, TSV, the trap-save register. TSV exists only to solve this one problem and is never used by application programs.
 

Better code to intercept bus error traps
.       =       #10             ; bus error entry in the trap vector
        CPUSET  R1,TSV          ; save R1
        LIL     R1,BUSER
        JUMPS   R1              ; go to rest of handler

 

This second version of the trap vector code given above uses a total of 8 bytes of code. There is room to squeeze in four more halfwords of code, if we wanted to do so.

Given either of the above bits of code, we can write a trap handler to convert bus error traps into constraint error exceptions, using the Ada convention that the exception CONSTRAINT_ERROR is to be raised whenever a program tries to use an invalid pointer.

Better code to intercept bus error traps
        COMMON  CONSTRAINT_ERROR,EXSIZE

BUSERR:                ; intercept bus error exceptions
        LIL     R1,CONSTRAINT_ERROR
        LOAD    R2,R1,EXAR      ; R2 = CONSTRAINT_ERROR.EXAR
        LOAD    R1,R1,EXHAND
        CPUSET  R1,TPC
        CPUGET  R0,TPC          ; PC = CONSTRAINT_ERROR.EXHAND
                                ; note:  PSW.level is restored!

If we wanted our default trap handler to be able to print out all of the registers as they were at the time of the trap, we would have to add more code to the above to save the values of registers in some dedicated memory area before using any registers for trap handling. This is illustrated in the next section.

Exercises

j) Modify the bus error trap handler given above so it saves copies of every register it modifies in a common block called TRAPSAVE, including the copy of R1 that was initially saved in TSV. This lets the trap handler recover this information if it is needed.

Emulation Traps and Virtual Machines

Traps have important uses other than exception handling. Of these, one of the most important is in the creation of a virtual machine that supports features not present in the actual hardware. To a user writing machine language code, a virtual machine appears to be hardware, except that some instructions take much longer to execute than others. These are the instructions that are executed by software in trap handlers, not by hardware.

The term virtual machine borrows from the field of optics. Systems of mirrors and lenses may cast several images. Some are real. If you put an image sensor where a real image appears to be, it will capture that image. Others are said to be virtual. They look like images, but if you put the sensor where a virtual image seems to be, you get nothing but a blur. Virtual images are always seen through lenses, reflected in mirrors, or near holograms.

From the programmer's viewpoint, a virtual machine looks like a real machine, but it is an artifact created by a mixture of hardware and software. The Hawk virtual machine is created entirely by software, the Hawk emulator. Virtual machines that mix real hardware with software are created using trap handlers.

The Hawk has several unimplemented opcodes. Executing any of them causes an illegal instruction trap. In addition, if the level field of the processor status word is 11112, CPUSET and CPUGET cause privilege violation traps. As we will see later, this, in combination with a memory management unit, allows us to securely confine an untrusted program to a virtual machine with access to only limited parts of main memory.

The opcode 0001dddd 1000ssss2 is unimplemented on the Hawk. As a halfword, with the register fields set to zero, this is 801016. We can write an unimplemented instruction trap handler to make this into a virtual register-to-register multiply instruction. The software to do this will need a way to access any register, so our handler needs to save everything in a register save area on entry and then restore it on exit.

Defining a Register Save Area
        COMMON  SAVEAREA,SAVESIZE
PCSAVE  =       0  << 2         ; displacements in the save area
R1SAVE  =       1  << 2
R2SAVE  =       2  << 2
        .
        .
        .
RESAVE  =       14 << 2
RFSAVE  =       15 << 2
PSWSAVE =       16 << 2
                                ; additional temporaries needed
                                ; by the trap handler can go here
SAVESIZE=       17 << 2

Note that we arranged the registers in order so we can use a 4-bit register number as an array index. The location labeled PCSAVE is entry zero in this array because when the program counter is referenced by the register field of an instruction, it is register zero. This does not matter in our code to save and restore registers, but it matters inside a trap handler that does virtual machine emulation.

The code to save the registers begins in the trap vector, right after saving R2 in the TSV register. Each entry in the trap vector is 16 bytes, so we will use this space before jumping to the remainder of the handler.

Unimplemented instruction trap vector code
.       =       #20             ; unimplemented instruction trap
        CPUSET  R2,TSV          ; save R2 temporarily
        LIL     R2,SAVEAREA
        STORE   R1,R2,R1SAVE    ; save R1 properly
        LIL     R1,UNIMP
        JUMPS   R1              ; go to the handler
        ; the above code occupies exactly 16 bytes!

Why did we lay out the save area as if it was an activation record? Why did we use R2 as our save-area pointer? The reason is, within the trap handler, we use the save area in much the same way we have used the stack. By using R2 to point to the save area, the handler will look, to a large extent, just like a subroutine which is called in an odd way and must return oddly. If we want to call subroutines within the trap handler, we can even add space for a stack at the end of the register save area. When a stack is needed by a trap handler, it is usually small.

The remainder of the trap handler is closely tied to the above code. On entry, it must assume that only R1 has been completely saved, and that R2 is still saved in TSV. The fact that the Hawk LOAD and STORE instructions do not touch the condition codes is essential to the correct functioning of this code, since the PSW has yet to be saved. The following code saves the entire CPU state:

Saving the rest of the registers
; unimplemented instruction trap handler continuation

UNIMP:  ; expects R2 points to the register save area
        ;         R1 has been saved properly
        ;         TSV still holds the saved value of R2
        ;         PSW (including CCs) still unsaved
        CPUGET  R1,PSW
        STORE   R1,R2,PSWSAVE   ; PSW is now properly saved
        CPUGET  R1,TSV
        STORE   R1,R2,R2SAVE    ; R1 is now properly saved
        CPUGET  R1,TPC
        STORE   R1,R2,PCSAVE    ; PC is now properly saved
        STORE   R3,R2,R3SAVE    ; continue saving R3 to R15
        STORE   R4,R2,R4SAVE
        STORE   R5,R2,R5SAVE
                .
                .
                .
        STORE   R13,R2,RDSAVE
        STORE   R14,R2,RESAVE
        STORE   R15,R2,RFSAVE   ; entire CPU state is now saved

The code for detecting the instruction we want to emulate goes next, but before we distract ourselves with this, we will focus on the problem of returning from the trap. We do this by restoring the registers, and then, finally, we use CPUSET and CPUGET to restore the program counter and processor status word.

Restoring the registers
; return from any trap handler
TRAPEXIT:
        ; expects R2 points to the register save area
        LOAD    R15,R2,RFSAVE   ; start restoring registers
        LOAD    R14,R2,RESAVE
        LOAD    R13,R2,RDSAVE
                .
                .
                .
        LOAD    R5,R2,R5SAVE
        LOAD    R4,R2,R4SAVE
        LOAD    R3,R2,R3SAVE    ; R15 to R3 now all restored
        LOAD    R1,R2,PCSAVE
        CPUSET  R1,TPC          ; PC partially restored
        LOAD    R1,R2,PSWSAVE
        CPUSET  R1,PSW          ; PSW now restored
        LOAD    R1,R2,R1SAVE    ; R1 restored without changing CCs
        LOAD    R2,R2,R2SAVE    ; R2 restored without changing CCs
        CPUGET  R0,TPC          ; return from trap!

The above code uses the strange semantics of R0 on the Hawk machine. In the context of the CPUGET instruction, R0 refers to the program counter, so the final instruction above loads the program counter from the TPC register. This has the side effect of restoring the level field of the PSW.

In general, different machines use different but equally difficult to follow instruction sequences for trap entry and exit. The net result is almost always an orderly block of memory holding the complete saved CPU state.

The code that does the actual work of our virtual machine, between saving and restoring the registers, begins by finding out what instruction was attempted. If it was not 0001dddd 1000ssss2, the instruction we are trying to emulate, we should raise the same exception that the original handler would have raised.

Checking what instruction caused the trap
; see if the unimplemented instr was 0001dddd 1000ssss (#8s1d)
        ; expects all registers have been saved
        ;         R2 points to the register save area
        LOAD    R3,R2,PCSAVE    ; get the saved program counter
        LOADS   R4,R3
        EXTH    R5,R4,R3        ; get the halfword it points to
                                ; -- the instruction that trapped!
        LIL     R6,#F0F0
        AND     R6,R5           ; clear all but the opcode bits
        CMPI    R6,#8010        ; is it the target instruction?
        BEQ     ISMULT

        LIL     R3,PROGERR      ; not the target, raise exception!
        LOAD    R4,R3,EXAR
        STORE   R4,R2,R2SAVE    ; edit saved processor state
        LOAD    R4,R1,EXHAND
        STORE   R4,R2,PCSAVE
        JUMP    TRAPEXIT        ; normal trap exit code finish it

In the above, once we separated out the instruction or instructions we intend to virtualize, the remaining illegal instructions are converted to program error exceptions.

In this code, references to saved registers refer to the values in the usere's registers at the time of the trap, completely independent of the current values in the hardware registers. The distinction between the saved register values and the actual registers is crucial in understanding how the following emulation code works.

With the above in place, we can extend the user's virtual machine by writing the code that gives meaning to the undefined opcode we selected for multiply. The code begins by getting copies of the user's saved source and destination registers, and after it multiplies them, it returns the result to the user's saved destination register. In addition, the handler increments the saved program counter before return.

Making an unimplemented instruction do multiplication
ISMULT: ; illegal instruction trap, instruction was #8s1d
        ; expects R2 points to register save area
        ;         R3 is the saved program counter
        ;         R5 is the halfword pointed to by PC, #8s1d
        MOVE    R6,R5           ; copy the instruction
        SR      R5,8
        TRUNC   R5,4            ; R5 is the src register number
        TRUNC   R6,4            ; R6 is the dst register number

        ADDSI   R3,2
        STORE   R3,R2,PCSAVE    ; increment saved PC

        ADDSL   R5,R2,2         ; R5 points to saved dst register
        ADDSL   R6,R2,2         ; R6 points to saved src register

        LOADS   R3,R5           ; R3 is the dst operand
        LOADS   R4,R6           ; R4 is the src operand

        ... code to multiply R3 = R3 * R4 not using R5 and R6 ...

        STORES  R3,R5           ; update the saved dst register
        JUMP    TRAPEXIT        ; go return to user

The above code ignored some issues. It did not set the saved condition codes to tell about the result; it would make sense to report if the result was zero, and if we are using signed multiply, the sign of the result. If the result was too big for 32 bits, setting the overflow bit would make sense. Also, like the built-in hardware arithmetic instructions, the result should not be saved when it is R0.

Now that we have defined a multiply instruction, we should extend our assembly language with a multiply macro added to hawk.h. Except for the comments, the following macro is the as we would have used if the Hawk had a built-in multiply instruction with the opcode our trap handler uses.

How to assemble the new instruction
        ; note: the following uses an unimplemented
        ; opcode and is implemented by emulation using
        ; a trap service routine.
        MACRO   TIMES   =dst,=src
          B #10 | (dst - R0)
          B #80 | (src - R0)
        ENDMAC

In this macro, we wrote dst-R0 and src-R0 in order to allow the assembler to do some error checking. By explicitly subtracting R0 from each register, we force the assembler to check that dst and src are relocated relative to the same relocation base as is used for the registers. This trick is used in the hawk.h file, where an external symbol is used as the relocation base in defining the symbols R0 to R15. This external symbol is never defined, and it never needs definition because its value is subtracted from every place where register numbers are actually used to form values loaded into memory.

Exercises

k) Find all of the opcodes in the Hawk instruction set that cause illegal instruction traps. Do this by searching the lists of opcodes in the appendices of the Hawk manual.

l) What code should you add to the start of the main program if you want user code to be able to use the TIMES instruction defined here?

m) Suppose you wanted to be able to call Hawk monitor routines from your trap handler, for example, to use the multiply or divide routines in the monitor. Some of these use R2 as a stack pointer. What code would you add to the code given above above for trap handlers in order to allow this? (Hint: We used R2 as a pointer to the register save area for a reason.)

n) Assume you are not allowed to call subroutines from the trap handler. Add the multiply code to the figure above in place of the text ... code to multiply R3 = R3 * R4 ... (You will have to add this code in-line, without calling a subroutine).

o) The virtual multiply instruction given above does not set the condition codes. Add code so that it sets the saved values of N and Z to report whether the result is negative or zero before it returns.

Interrupts

The trap handling mechanisms described above serve a second purpose on most computers, handling external events signalled by input-output devices. When these mechanisms are triggered by a device, we refer to the event as an interrupt, not a trap, but the action taken by the CPU remains the same. The control transfer uses the trap vector, so it is also called the interrupt vector, and the code to save and restore registers is the same. Only the cause of the event and the handler are different.

While the transfers of control to trap and interrupt handlers are quite similar, the handlers differ. With the exception of traps used to create virtual machines, trap handlers usually raise exceptions in the running program to indicate some type of failure. In contrast, interrupt handlers usually handle input-output activity and then return to the interrupted program without making any changes immediately visible to that program.

In essentially all modern computers, the fetch execute cycle of the hardware is more complex than we have indicated up to this point, in part because of the need to handle exceptional conditions, and in part because of the need for interrupt handling. The Hawk processor is quite typical; here is a more honest and complete presentation of the algorithm implemented by the Hawk processor:

The fetch-execute cycle
int trapcode, interruptcode;
while (TRUE) do {
        try {
                fetch_and_execute_one_instruction();
        } catch( buserror ) {
                trapcode = 0x10;
        } catch( illegalinstr ) {
                trapcode = 0x20;
        } catch( privilegeviol ) {
                trapcode = 0x30;
        }
        if (trapcode == 0) {
                trapcode = interruptcode;
                interruptcode = 0;
        }
        if (trapcode != 0) {
                PSW.old_level = PSW.level;
                PSW.level = 0;
                TPC = PC;
                PC = trapcode;
                trapcode = 0;
        }
}

Under normal circumstances, where instruction raise no exceptions and no devices request interrupts, the extra logic shown here does nothing. It behaves like the simple fetch-execute cycle that most programmers imagine underlying their code. If an instruction raises a trap, on the other hand, or if an external device requests an interrupt, things are different. The extra logic shown above would slow down a software implementation, but if this is done in hardware, there is no run-time cost except when there is actually an interrupt or trap.

Interrupts and traps both use the same trap vector. In the case of traps, it is up to the hardware inside the processor to specify which trap vector entry will be used. In the case of interrupts, it is up to the external device to specify this. Typically, a set of wires in the input-output bus will be used to convey the entry number when a device requests an interrupt. The above code shows assignment of full-sized memory addresses to trapcode and interruptcode, but in a real system, only a few bits of these actually vary. If the Hawk was implemented as a microprocessor chip, 4 pins on the chip would be used for the trap and interrupt codes, while the remaining bits would always be set to zero.

On the software side, interrupt service routines save and restore registers in exactly the same way that trap service routines do. Between saving and restoring registers, what an interrupt service routine does is usually completely different. The primary job of an interrupt service routine is to do the input/output that the device ready condition indicates can be done. So, if a serial output device requests an interrupt, the corresponding interrupt service routine should output one character. The details of this are the primary subject of a significant part of a typical course on operating systems.

External devices should only raise their interrupt request (IRQ) lines when both the deivice is ready to perform an input or output operation and the program running on the central processor is ready to deal with it. All modern computers have are hierarchies of interrupt enable bits. Each device has its own interrupt enable bit, and there are also one or more global interrupt enable bits. In the case of the Hawk, the global interrupt enable bits are the level field of the processor status word. If this field is 00002, all interrupts are disabled, while 11112 enables all interrupts. Intermediate values enable just some.

Typically, the ready bit in each device's status register determines when that device will attempt to request an interrupt, but only when that device's interrupt enable bit is set. The device-specific interrupt enable bit is completely separate from the global interrupt enable bits in the central processor. The Hawk keyboard interface register is typical:

Interrupt request logic in the keyboard status register
Keyboard Status Register

The Hawk level field is typical of modern processors. It determines the interrupt priority level. Higher priority devices can interrupt lower priority devices, but not the reverse. Many processors have a set of outputs that directly report the current priority level to external logic, and that logic gives meaning to these bits. In the case of the Hawk, there are two good ways to do this. First, each bit of the level field could be used to enable a different group of devices. In that case, we could have 5 levels, encoded as 00002, 00012, 00112, 01112, and 11112. This allows simple interface hardware, but it is only appropriate if there are only a few device classes.

The other approach is to add a priority interrupt decoder. This allows for 16 levels, encoded as a 4-bit binary number. Mixtures of these schemes are perfectly possible. For example, the most significant bit of the level field on the Hawk enables the memory management unit, so we could use the least significant 3 bits to create 8 different interrupt levels. This mixed model allows use of trap vector entries 8 to 15, at addresses 8016 to F016 for the corresponding interrupt levels.

Assigning interrupt priority involves two issues: First, if two devices request interrupts at the same time, the priority determines which device wins. Second, if one interrupt service routine is running, the priority of a device requesting an interrupt determines whether that request will be granted before the first routine terminates.

On the Hawk, if two interrupt service routines use separate register save areas, then the lower priority interrupt service routine can re-enable interrupts between the time it saves its registers and the time it restores them. It must not enable its own interrupt, but it can set the level field of the processor status word (using the CPUSET instruction) to allow all higher priority interrupts.

Exercises

p) If each bit of the level field enables interrupts for one device, and if the keyboard interrupt enable bit in the level field of the processor status word is 00102, what instructions should be used just after saving all of the registers to enable all other interrupts?

q) If the least significant 3 bits of the level field of the PSW are used to encode a binary interrupt level, with 00002 meaning disable all interrupts and 01112 meaning enable all interrupts, and if keyboard interrupts are enabled whenever the interrupt level is 00102 or greater, what instructions should be used just after saving all of the registers to enable all other interrupts?

r) If interrupts were enabled within an interrupt service routine, what code must be inserted just before the code to restore the values of the registers in order to make sure that the job of restoring the registers is not itself interrupted.

s) Suppose two different devices have equal priority and they use the same trap vector entry. How can the software determine which device requested the interrupt? Hint: Each device has its own ready bit and its own interrupt enable bit.