Compiler Construction

In the following code, copied from the previous lecture notes, there are a huge number of common subexpressions involved in the nesting relationships.

x: procedure
    a: var int32;
    b: procedure(bup:@xrecord)
        c: var int32;
        c = bup@.a;
    end;
    d: procedure(dup:@xrecord)
        e: var int32;
        f: procedure(fup:@drecord);
           fup@.e = fup@.dup@.a; fup@.dup@.a = fup@.dup@.a + 1; b(fup@.dup);
        end;
        f(this);
        if (dup@.a < 2) then d(dup);
    end;
    a = 0; d(this);
end;

The question is, how can we get rid of these common subexpressions. One old solution, and it turns out, a solution that breaks if we move from pure nesting to an object-oriented model, is to use a data structure called a display.

The Display

The basic idea of a display is that we build a global array indexed by nesting level holding the pointers to the current activation records at each level. We call this array the display. On entry to each procedure, we must save the previous display entry for that procedure's level, and on exit, we restore that entry. This lets us rewrite the above as follows:

x: procedure -- nesting level 1
    dsave: genericpointer;
    dsave = display[ 1 ];
    display[ 1 ] = self;
    a: var int32;
    b: procedure -- nesting level 2
        dsave: genericpointer;
        dsave = display[ 2 ];
        display[ 2 ] = self;
        c: var int32;
        c = display[ 1 ]@.a;
        display[ 2 ] = dsave;
    end;
    d: procedure -- nesting level 2
        dsave: genericpointer;
        dsave = display[ 2 ];
        display[ 2 ] = self;
        e: var int32;
        f: procedure -- nesting level 3
            dsave: genericpointer;
            dsave = display[ 3 ];
            display[ 3 ] = self;
            display[ 2 ]@.e = display[ 1 ]@.a;
            display[ 1 ]@.a = display[ 1 ]@.a + 1;
	    b;
            display[ 3 ] = dsave;
        end;
        f(this);
        if (display[ 1 ]@.a < 2) then d;
        display[ 2 ] = dsave;
    end;
    a = 0; d;
    display[ 1 ] = dsave
end;

Note in the above, that we got rid of large numbers of common subexpressions but at the expense of a complex sequence of assignments at the entry to every subroutine and one assignment at the end.

We don't need all of these assignments, however. If a subroutine does not declare any local subroutines, there will be no up-level addressing to the local variables of that subroutine, so it does not need to modify the display. If a subroutine does not declare any local variables that can be the targets of up-level addressing, it also does not need this code. An optomizing compiler will only generate the extra code when it is needed.

Another thing to note is that the subscripts in the array addressing are always constants, so we can rewrite display[1] as, for example display1. As a result, while we have used the display as if it is a global array, in fact, it is a collection of global variables, one per nesting level.

One approach to writing a middling-quality optimizing compiler is to limit the number of nesting levels and then set aside that many general-purpose registers for the display. This leads to good machine code at the expense of what, to the programmer, appear to be arbitrary (and usually small) limits on the number of nesting levels.

Problems with the Idea

Unfortunately, displays begin to break when we allow either objects with methods or if we allow subroutines to be passed as parameters. In either case, we end up with the possibility of a single call that jumps from one context to another where the new context needs more than one change to the display. Up-links handle this naturally, the display idea does not. Consider this little bit of nested code:

a: procedure -- level 1
    b: type record -- level 2
        c: var int32
        d: procedure -- level 3 -- a method
            c = 1
        end;
    end;
    e: var b; -- one object
    f: var b; -- another object
    e.b;
    f.b;
end;

In the above, calls to the e.b and f.b involve changing the display at levels 2 and 3. If you look at the original example we began with, all calls were either to code at an outer level, the same level, or one level in. Once you allow classes and objects, or once you allow subroutines to be passed as parameters, it becomes possible to call a subroutine (procedure, function or method) that is nested at any level. This breaks the idea of a display.

Now look at what happens if we allow calls to subroutines that were passed as parameters:

a: procedure -- level 1
    b: procedure( c: procedure ) -- level 2
        d: var int32
        c 
    end 
    e: procedure -- level 2
        f: var int32
        g: procedure -- level 3
            f = 1
        end
        b( g )
    end
end

In the above, the call to c within b is really a call to g. If we try to use a display, the call to c must somehow restore the display entry at level 2 so that g can reference f. If we do not do this, the display at level 2 will stil lpoint to the activation record for b containing c and d.

The idea of a display can almost be rescued to handle procedure parameters. The rescue involves replacing a generic display indexed by nesting level with one global variable per procedure holding a pointer to the current activation record for that procedure. On entry to a procedure, the previous value of that procedure's display entry is saved and restored just as it was with the level scheme. Unfortunately, this idea breaks if a recursive procedure passes a pointer to its own local variable to itself in a recursive call.

a: procedure( b: procedure )
    c: var int32
    d: procedure
         c = 1
       end
    d
    b
    a( d )
end

In the above, after the first recursive call, where d is passed as a parameter, the direct call to d and the call to b both call d, but the direct call to d modifies c in the topmost activation record for a stored on the stack, while calling b modifies the c in an older activatio record for a.

Notice that the counterexamples that break the different versions of the display are very difficult to construct and understand. As a result, the idea of a display has continued to appeal to compiler writers despite its shortcomings.

Lecture 32, The Display

The Overhead of Up-level Addressing

The Display

Problems with the Idea