The Hawk emulator supports a number of commands to help in observing what a program does:
By default, it shows the registers rather frequently, seriously slowing the execution of Hawk programs with the effort needed to update the screen. Setting the internal timer to 1 updates the screen after every instruction or every other instruction. Setting it to 1000 gives only a few updates per second, and setting it to 10000 makes execution very fast. These numbers are in hexadecimal, and formally, what is being counted is memory cycles (32-bit fetches and stores).
Warning! If control does not return to the location where the iterate command was issued, the emulator will run until you hit control-C. On the HP machines, you may really have to bang on control-C to get it to notice (the HP system is still buggy, but at least the worst of its bugs are fixed!)
To make effective use of the proceed command, open a second window and use it to look at the assembly listing of the program you are debugging. Then, you can note the location of some instruction that you want to execute up to, enter the address, and then hit p.
The Hawk emulator will jump to location #0010 if a program attempts to address nonexistent memory or to write in ROM, and it will jump to location #0020 if a program attempts to execute an unimplemented instruction. These jumps are called traps, and the locations #0010 and #0020 are called the trap vector.
To simplify debugging of programs that "run wild" and cause a trap, we have provided a minimal "operating system" to monitor these addresses and recover some information on how your program got there. This system, monitor.o, may be included with your program in two ways. Either include it when you run the emulator:
    hawk /group/22c018/monitor.o yourcode.o

Or, include it into your program directly, by starting your source file as follows:
    TITLE yourcode
    USE "/group/22c018/hawk.macs"
    USE "/group/22c018/monitor.o"

With this monitor, when your program runs wild, the monitor will print an error message and show the program counter, offending memory address, and program status word at the time of the trap. The program counter shown is the address from which the instruction was fetched that committed the offence. The memory address, in the case of a bus trap, is the illegal memory address that was referenced. The monitor program terminates with a jump to zero; this will usually halt the emulator unless a breakpoint was set at the time.
Note that, when the monitor catches a trap, it uses registers 1 to 7 to print its error message, but before jumping to location zero (to force a halt, assuming no odd breakpoints are set), it restores these registers and the PSW to their state at the time of the trap. Thus, the display of those registers on the screen when the monitor halts is correct!
When allocating an array or a record, it is natural to imagine the following:
As a result, the SMAL Hawk assembler includes an ALIGN directive (actually a macro in the hawk.macs file) that can be used to force alignment:
    ALIGN 1  ; align to a byte boundary
    ALIGN 2  ; align to a halfword boundary
    ALIGN 4  ; align to a word boundary

Consider the following C declaration and its naive translation to SMAL:
    struct rec {
        char a;
        int b;
        char c;
        int d;
    } array[2] = { { 'x', 1, 'y', 2 },
                   { 'z', 3, 'w', 4 } };

This declares an array named array of 2 records of 4 fields each, with initial values given. A naive translation of this data structure to SMAL would be:
    array:  B   'x'   ; array[0].a
            W   1     ; array[0].b
            B   'y'   ; array[0].c
            W   2     ; array[0].d
            B   'z'   ; array[1].a
            W   3     ; array[1].b
            B   'w'   ; array[1].c
            W   4     ; array[1].d

In memory, the SMAL assembler would store the following:
      byte   3     2     1     0
          -----------------------
         | #00 | #00 | #01 | 'x' | 1
         |-----------------------|
         | #00 | #02 | 'y' | #00 | 2
         |-----------------------|
         | #03 | 'z' | #00 | #00 | 3  word
         |-----------------------|
         | 'w' | #00 | #00 | #00 | 4
         |-----------------------|
         | #00 | #00 | #00 | #04 | 5
          -----------------------

This is exactly 5 words, 4 characters plus 4 full-word integers, but writing a program to access some field of an arbitrary array element is very messy! Even on machines that support non-aligned memory references, there is a significant performance penalty! Reading a non-aligned word operand from memory takes two memory cycles, and on many machines, writing a non-aligned word operand to memory takes 4 memory cycles (two reads and two writes, although many machines have hardware to speed up writes when they follow immediately after a read from the same location).
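To see where the two memory cycles come from, here is a minimal sketch in C of a non-aligned read. It assumes a memory that can only be accessed one aligned word at a time, and the byte numbering of the diagram, where byte 0 of a word is the least significant. The name load_unaligned and the array mem are invented for this illustration; they are not part of the Hawk emulator or the SMAL assembler.

```c
#include <stdint.h>

/* Read a 32-bit word starting at any byte address from a memory that
   can only be accessed one aligned word at a time.  Byte 0 of each
   word is the least significant byte, as in the diagram. */
uint32_t load_unaligned(const uint32_t *mem, uint32_t byte_addr)
{
    uint32_t index = byte_addr / 4;        /* word holding the first byte */
    uint32_t shift = (byte_addr % 4) * 8;  /* bit position of that byte */

    if (shift == 0)
        return mem[index];                 /* aligned: one memory cycle */

    /* Non-aligned: two memory cycles, then splice the two pieces. */
    uint32_t low  = mem[index] >> shift;
    uint32_t high = mem[index + 1] << (32 - shift);
    return low | high;
}
```

If mem holds the five words of the naive layout, load_unaligned(mem, 1) fetches array[0].b by reading words 1 and 2 of the diagram (mem[0] and mem[1] in C, which counts from zero) and joining the parts. A non-aligned write would need the same two reads, followed by masking and two writes.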
Because of this performance penalty, even if the Hawk machine tried to make non-aligned memory references look inexpensive, we would be better off storing this array in memory as follows:
    array:  B   'x'   ; array[0].a
            ALIGN 4
            W   1     ; array[0].b
            B   'y'   ; array[0].c
            ALIGN 4
            W   2     ; array[0].d
            B   'z'   ; array[1].a
            ALIGN 4
            W   3     ; array[1].b
            B   'w'   ; array[1].c
            ALIGN 4
            W   4     ; array[1].d

The effect of this is to store the array in memory as follows:
      byte   3     2     1     0
          -----------------------
         |/////|/////|/////| 'x' | 1
         |-----------------------|
         | #00 | #00 | #00 | #01 | 2
         |-----------------------|
         |/////|/////|/////| 'y' | 3
         |-----------------------|
         | #00 | #00 | #00 | #02 | 4
         |-----------------------|    word
         |/////|/////|/////| 'z' | 5
         |-----------------------|
         | #00 | #00 | #00 | #03 | 6
         |-----------------------|
         |/////|/////|/////| 'w' | 7
         |-----------------------|
         | #00 | #00 | #00 | #04 | 8
          -----------------------

This wastes a significant amount of storage (it comes close to doubling the amount of memory required, in this example), but all of the fields of the array are easy to fetch and manipulate.
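With the aligned layout, finding a field is simple arithmetic: each record occupies 16 bytes, and each field sits at a fixed offset in its own word. The following C sketch assumes exactly the layout in the diagram above. The names REC_SIZE, OFF_A through OFF_D, field_address, load_word_field and load_char_field are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

/* Record size and field offsets, in bytes, for the aligned layout. */
enum { REC_SIZE = 16, OFF_A = 0, OFF_B = 4, OFF_C = 8, OFF_D = 12 };

/* Byte address of a field of array[i], given the address of array[0]. */
uint32_t field_address(uint32_t base, uint32_t i, uint32_t offset)
{
    return base + i * REC_SIZE + offset;
}

/* Fetch an integer field: the address is word-aligned, so one read. */
uint32_t load_word_field(const uint32_t *mem, uint32_t base,
                         uint32_t i, uint32_t offset)
{
    return mem[field_address(base, i, offset) / 4];
}

/* Fetch a character field: it is byte 0 of its own word, so one read
   and one mask to discard the three unused bytes. */
uint8_t load_char_field(const uint32_t *mem, uint32_t base,
                        uint32_t i, uint32_t offset)
{
    return (uint8_t)(mem[field_address(base, i, offset) / 4] & 0xFF);
}
```

For example, with the array stored starting at address 0, load_word_field(mem, 0, 1, OFF_D) reads array[1].d from the last word of the diagram with a single memory reference.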
In the Pascal programming language, any array or record declaration may be preceded by the keyword packed. This tells the compiler that it is OK to pack the fields of the array or record as tightly as possible, even if this requires complex and slow code to access components of the resulting structure. C has nothing analogous to this! The semantics of C requires that record fields be allocated in memory in the order they appear in the declaration, while in Pascal, the compiler may reorganize records for more efficient storage.
With the example record, to force more efficient storage allocation, a C programmer can group all character and short-integer fields together, or, at least, group character fields in groups of 4 and short-integer fields in groups of 2. Doing this for the example gives:
    struct rec {
        char a;
        char c;
        int b;
        int d;
    } array[2] = { { 'x', 'y', 1, 2 },
                   { 'z', 'w', 3, 4 } };

This would typically imply a structure such as the following on a machine that did not allow non-aligned words:
      byte   3     2     1     0
          -----------------------
         |/////|/////| 'y' | 'x' | 1
         |-----------------------|
         | #00 | #00 | #00 | #01 | 2
         |-----------------------|
         | #00 | #00 | #00 | #02 | 3
         |-----------------------|    word
         |/////|/////| 'w' | 'z' | 4
         |-----------------------|
         | #00 | #00 | #00 | #03 | 5
         |-----------------------|
         | #00 | #00 | #00 | #04 | 6
          -----------------------

If the machine allowed non-aligned words, the C programmer might be advised to write:
    struct rec {
        char a;
        char c;
        char pad1,pad2; /* unused fields for padding */
        int b;
        int d;
    } array[2] = { { 'x', 'y', '#', '#', 1, 2 },
                   { 'z', 'w', '#', '#', 3, 4 } };

This explicitly adds extra unused fields to force the integer fields to be aligned on a word boundary, thus allowing single cycle access to those fields. C (or C++) programmers should generally avoid this kind of fiddling with the details of data structure allocation except when the last iota of speed or size must be squeezed out of a program! Furthermore, these kinds of fiddles depend immensely on the details of the CPU and compiler being used. What leads to a significant improvement on an Intel Pentium may do nothing for a DEC Alpha, or vice versa!
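Since the outcome depends on the compiler, it can help to ask the compiler what it actually did. The following C sketch declares the three versions of the record discussed here and uses sizeof to compare them. The struct tags rec_original, rec_grouped and rec_padded and the two function names are invented for this illustration, and the subtraction in the first function assumes that grouping never makes the record larger.

```c
#include <stddef.h>

/* The three field orderings discussed in the text. */
struct rec_original { char a; int b; char c; int d; };
struct rec_grouped  { char a; char c; int b; int d; };
struct rec_padded   { char a; char c; char pad1, pad2; int b; int d; };

/* Bytes saved per record by grouping the character fields together.
   Assumes the grouped record is no larger than the original. */
size_t bytes_saved_by_grouping(void)
{
    return sizeof(struct rec_original) - sizeof(struct rec_grouped);
}

/* Bytes of padding the compiler still added to the hand-padded record:
   its total size minus the sizes of its six declared fields. */
size_t hidden_padding_in_padded(void)
{
    return sizeof(struct rec_padded) - (4 * sizeof(char) + 2 * sizeof(int));
}
```

On a compiler with 4-byte integers aligned on word boundaries, we would expect the first function to return 4 and the second to return 0, but this is exactly the kind of detail that varies from one CPU and compiler to the next.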
When programming in assembly language, on the other hand, you must be aware of how fields are packed. If one word is allocated for each character field in a structure, the code is simple. If multiple characters are packed per word, the code is somewhat more complex. If full-word variables are allocated so they straddle word boundaries in memory, the code required to read or write those variables is far more complex.
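As an illustration of the middle case, here is a minimal C sketch of the shifting and masking needed when four characters are packed per word. It assumes 32-bit words with character 0 in the least significant byte, as in the diagrams above; the names get_char and put_char are hypothetical, and assembly code for the Hawk would have to carry out the same steps with its own instructions.

```c
#include <stdint.h>

/* Extract character number i (0 to 3, 0 = least significant byte)
   from a word holding four packed characters. */
uint8_t get_char(uint32_t word, unsigned i)
{
    return (uint8_t)(word >> (8 * i));
}

/* Return a copy of word with character number i (0 to 3) replaced
   by c; the other three characters are left unchanged. */
uint32_t put_char(uint32_t word, unsigned i, uint8_t c)
{
    uint32_t mask = (uint32_t)0xFF << (8 * i);   /* selects byte i */
    return (word & ~mask) | ((uint32_t)c << (8 * i));
}
```

Reading a packed character costs a shift and a truncation on top of the memory read. Writing one requires reading the whole word, clearing the old byte, inserting the new one, and writing the word back.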