Lecture 26, More Dynamic Semantics

Part of the notes for 22C:196:003 (CS:4908:0003)
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

References

The previous lecture rested, in large part, on this example. In this restatement of the example, however, we have added two lines at the end. These two lines illustrate another issue:

x: type record
    a: const 5;
    b: type a .. a+3;
    c: var b;
    d: type record
        e: var b;
        f: var b;
        e = c;
        f = e;
    end;
    g: var d;
    h: var d;
end;
y: var x;
z: var x;

p: var y.d; -- added
q: var z.d; -- added

The new variables in the outer block are both of very similar types, the record type d declared local to the record type x. You might imagine referring to this as the type x.d, but in this case, that is not correct because the initializer for d references a non-local variable, c. In the initialization of p, c refers to y.c, while in the initialization of q, it refers to z.c.

Had it not been for this issue, we could have referred to x.d, and in fact, it would not be probematic to permit references to the type x.b because this type is simple, a mere subrange with statically determined bounds. The problem is, it takes considerable work to determine that x.b (or any other type definition) is well defined. We must ask whether the type (or any type on which it rests) has a definition that depends on up-level addressing. To avoid this, we simply forbid access to public fields of types, and only permit reference to fields of type instances.

Comparison with C and C++

In C, there are three distinct contexts in which declarations are permitted, global declarations, local declarations, and declarations of fields of structures. The declaration int i; is legal in all three contexts, but other declarations are only legal in a subset of these contexts. Functions must be global, for example.

In C++, a new category is added, Class definitions. Again, int i; is legal in this new context, and functions defined as members of a class become methods of that class. The notation of C++ is irregular, however, with a.b being used for references to components of an object or to components of a structure, while the notation a::b is used for references to components of a class.

The design of Kestrel is an attempt to eliminate this irregularity. Thus, where C++ uses the class std to bundle such things as standard input-output streams, an appropriate way to do this in Kestrel might be to have std be a global variable of an anonymous class. Consider something like this:

std: var record
    stream: type record
        put: procedure (c: char)
            ...
        end;
        get: function
            ...
        end;
    end;
end;

In the context of this definition, the type std.stream is available to Kestrel programmers the way std::stream is available to C++ programmers. That is not to say that the standard I/O mechanisms of Kestrel should be defined in order to ape the style of C++, but rather we are demonstating that it can be done that way should that approach prove to be desirable.

Summary of Attributes

Constants have just a value, where that value has a type. We constrain the value to be statically defined.

Subrange Types (whether anonymous or named) have a base type (integer, character, or some enumeration type), and two constants, the bounds, giving the maximum and minimum permitted values. The base type is determined by the type of the constants, and the two bounds must have the same base type. Note that the base types integer and character are themselves predefined subranges, since our language does not support unbounded integers.

Enumerated Types (again, whether anonymous or named) can be effectively eliminated as they are encountered by the compiler. The list of members of the enumeration can be added to the environment as constant definitions (constants with that enumeration as their base type). The type itself can be recorded as a subrange type, with its maximum and minimum set to match the maximum and minimum members of the enumeration.

Array Types (whether anonymous or named) have an element type and an index type. The element type is unconstrained, the index type must be a subrange type.

Pointer Types (whether anonymous or named) have a referent type, the type of the object pointed to.

Record Types (whether anonymous or named) have a list of components given by the block used to define the record. This list includes components that are constants, types and variables. At run time, each record has implicit knowledge of the record global to it, the up link. Every record type had an initializer, code that is run when an instance of that type is allocated. The initializer is given a copy of the up link when it is called.

Optimization: If a record type includes no code (that is, neither statements in its initializer, nor variable components that have initializers), there is no need for an initializer for this record.
Optimization: If a record type includes no code that uses the up link, (that is, neither the initializer (if any) nor any code in subsidiary blocks refers to identifiers global to this block), then there is no need for an up link.

Procedures and Functions are essentially record types, where a call consists of allocation, initialization and then deallocation, all in quick succession. As a result, procedures and functions have all of the attributes of record types. In additionm, procedures and functions have parameter lists -- essentially lists of public variable components that receive values from the caller, and functions have a return type, essentially the type of an anonymous public component that is inspected by the caller immediately before deallocation of the block.