The Hawk emulator supports a number of commands to help in observing what a program does:
By default, it shows the registers rather frequently, and the effort needed to update the screen seriously slows the execution of Hawk programs. With the internal timer set to 1, the screen is updated after every instruction or every other instruction. Setting it to 1000 gives only a few updates per second, and setting it to 10000 makes execution very fast. These numbers are in hex, and formally, what is being counted is memory cycles (32-bit fetches and stores).
Warning! If control does not return to the location where the iterate command was issued, the emulator will run until you hit control-C. On the HP machines, you may really have to bang on control-C to get it to notice. (We can hope HP has fixed the bugs that made catching control-C difficult.)
To make effective use of the proceed command, open a second window and use it to look at the assembly listing of the program you are debugging. Then, you can note the location of some instruction that you want to execute up to, enter the address, and then hit p.
The Hawk emulator will jump to location #0010 if a program attempts to address nonexistent memory or to write in ROM, and it will jump to location #0020 if a program attempts to execute an unimplemented instruction. These jumps are called traps, and the locations #0010 and #0020 are called the trap vector.
To simplify debugging of programs that "run wild" and cause a trap, the linker combines your program with a minimal "operating system" to monitor for traps and recover some information on how your program got there. This system, monitor.o, is automatically included with your program when you run the Hawk linker:

    link yourfile.o

With this monitor, when your program runs wild, the monitor will print an error message and show the program counter, offending memory address, and program status word at the time of the trap. The program counter shown is the address from which the offending instruction was fetched. The memory address, in the case of a bus trap, is the illegal memory address that was referenced. The monitor program terminates with a jump to zero; this will usually halt the emulator unless a breakpoint was set at the time.
Note that, when the monitor catches a trap, it uses registers 1 to 7 to print its error message, but before jumping to location zero (to force a halt, assuming no odd breakpoints are set), it restores these registers and the PSW to their state at the time of the trap. Thus, the display of those registers on the screen when the monitor halts is correct!
You will need to look at the assembler listing file to interpret error messages from the monitor and to use the debugging features discussed here! Furthermore, you will have to be aware that addresses and values marked in the assembly listing with a + are values that will be adjusted by the linker before the program is run. Consider the following assembly listing:
    SMAL32, rev  6/97.       Hello World Program, by D.   09:01:34  Page
                                                          Mon Jun 16 199

                          1         TITLE   Hello World Program, by D.
                          2         USE     "/group/22c018/hawk.macs"
                          3         USE     "/group/22c018/hawk.system"
    +000000:+00000000     4
            +00000000
            +00000000
            +00000000
            +00000000
            +00000000
            +00000000
                          5         S       START           ; set start
                          6
                          7 START:
                          8         LOAD    R2,PSTACK       ; set up th
    +00001C: F2E0  FFE0   9         CALL    DSPINI
    +000020: F1E0  FFE0  10         LEA     R3,HELLO
             F131
    +000026: F3C0  000A  11         CALL    DSPST
    +00002A: F1E0  FFE2  12         CLR     R1
             F131
    +000030: D100        13         JUMPS   R1              ; stop!
    +000032: F031        14
    +000034: 48 65 6C 6C 15 HELLO:  ASCII   "Hello World!",0
             6F 20 57 6F
             72 6C 64 21
             00
                         16
                         17         END
    no errors

This listing shows the program starting at address +00001C, but the linker places this in memory starting at address #00001000, so the actual starting address of the program is #0000101C.
Note that, unlike the Hello World program given previously, this one does not use the CALL macro or the header file for access to the system. Instead, it exposes, directly, the actual assembly code used to link to the operating system routines.
If you start the Hawk emulator on the link.o file resulting from linking the assembly output from the above, the initial emulator output will show that the program counter has exactly this value:
     HAWK EMULATOR
       /------------------CPU------------------\   /----MEMORY----\
       PC:  0000101C                R8: 00000000     001018: #0206
       PSW: 00000000  R1: 00000000  R9: 00000000     00101A: #0000
       NZVC: 0 0 0 0  R2: 00000000  RA: 00000000   ->00101C: LOAD    #2,#001000
                      R3: 00000000  RB: 00000000     001020: LOAD    #1,#001004
                      R4: 00000000  RC: 00000000     001024: JSRS    #1,#1
                      R5: 00000000  RD: 00000000     001026: LEA     #3,#001034
                      R6: 00000000  RE: 00000000     00102A: LOAD    #1,#001010
                      R7: 00000000  RF: 00000000     00102E: JSRS    #1,#1

     **HALTED**  r(run) s(step) q(quit) ?(help)

Furthermore, the emulator shows the code at this address is a LOAD instruction, loading register 2 with the contents of memory location #1000. This brings up another issue! The assembly listing shows the value +00000000 being stored in addresses +000000 to +000018. Since each of these is preceded by a + sign, each is subject to modification by the linker! The addresses are translated to #00001000 to #00001018 by the linker, and the contents of these addresses are adjusted so that they point to the operating system data areas specified by the file /group/22c018/hawk.system.
The program listing shown by the emulator is not created by examining the source code of your program! Instead, it is created by "disassembly" of the code in memory! As a result, macros in the source program are shown in expanded form. The sequence LOAD/JSRS at addresses #1020 and #1024, for example, is the result of expanding the CALL macro on line 9 of the source program! This macro was defined in the file /group/22c018/hawk.system.
If you wish to see the values the linker has assigned to symbols that weren't locally defined in your program, look at the file "link.map" produced by the linker. This is called the linkage map, or the map of the linker output, and the map for the example program above is as follows:
    SDSPPTR= #00000004
    STRAPBUF= #00000038
    CT= #00000171
    RDSPINI= #00000178
    RDSPAT= #00000186
    RDSPCH= #000001AE
    RDSPST= #000001C0
    RDSPHX= #000001D6
    RDSPDEC= #00000206
    RKBGETC= #0000024C
    RKBGETS= #00000260
    RTIMES= #000002AE
    RDIVIDE= #000002C8
    SSTACK= #00001000
    R= #00001044
    RSTACK= #00010000
    RTRAPBUF= #00011000
    RDSPPTR= #00011038
    C= #0001103C
    RUNUSED= #0001103C
    RUNAVAIL= #00020000

Note that the map file is sorted by value. An extra letter is added at the front of each identifier; those that start with R are normal external symbols. This map shows that the stack referenced on line 8 of the assembly program begins at location #00010000 (given by the value of RSTACK) and that the entry point for the DSPST routine in the operating system is at location #000001C0.
When allocating an array or a record, it is natural to imagine the following:
As a result, the SMAL Hawk assembler includes an ALIGN directive (actually a macro in the hawk.macs file) that can be used to force alignment:
            ALIGN   1       ; align to a byte boundary
            ALIGN   2       ; align to a halfword boundary
            ALIGN   4       ; align to a word boundary

Consider the following C declaration and its naive translation to SMAL:
    struct rec {
            char a;
            int b;
            char c;
            int d;
    } array[2] = { { 'x', 1, 'y', 2 }, { 'z', 3, 'w', 4 } };

This declares an array named array of 2 records of 4 fields each, with initial values given. A naive translation of this data structure to SMAL would be:
    array:  B       'x'     ; array[0].a
            W       1       ; array[0].b
            B       'y'     ; array[0].c
            W       2       ; array[0].d
            B       'z'     ; array[1].a
            W       3       ; array[1].b
            B       'w'     ; array[1].c
            W       4       ; array[1].d

In memory, the SMAL assembler would store the following:
     byte   3     2     1     0
         -----------------------
        | #00 | #00 | #01 | 'x' | 1
        |-----------------------|
        | #00 | #02 | 'y' | #00 | 2
        |-----------------------|
        | #03 | 'z' | #00 | #00 | 3   word
        |-----------------------|
        | 'w' | #00 | #00 | #00 | 4
        |-----------------------|
        | #00 | #00 | #00 | #04 | 5
         -----------------------

This is exactly 5 words, 4 characters plus 4 full-word integers, but writing a program to access some field of an arbitrary array element is very messy! Even on machines that support non-aligned memory references, there is a significant performance penalty! Reading a non-aligned word operand from memory takes two memory cycles, and on many machines, writing a non-aligned word operand to memory takes 4 memory cycles (two reads and two writes, although many machines have hardware to speed up writes when they follow immediately after a read from the same location).
Because of this, even if the Hawk machine tried to make non-aligned memory references look inexpensive, we would be better off storing this array in memory as follows:
    array:  B       'x'     ; array[0].a
            ALIGN   4
            W       1       ; array[0].b
            B       'y'     ; array[0].c
            ALIGN   4
            W       2       ; array[0].d
            B       'z'     ; array[1].a
            ALIGN   4
            W       3       ; array[1].b
            B       'w'     ; array[1].c
            ALIGN   4
            W       4       ; array[1].d

The effect of this is to store the array in memory as follows:
     byte   3     2     1     0
         -----------------------
        |/////|/////|/////| 'x' | 1
        |-----------------------|
        | #00 | #00 | #00 | #01 | 2
        |-----------------------|
        |/////|/////|/////| 'y' | 3
        |-----------------------|
        | #00 | #00 | #00 | #02 | 4
        |-----------------------|   word
        |/////|/////|/////| 'z' | 5
        |-----------------------|
        | #00 | #00 | #00 | #03 | 6
        |-----------------------|
        |/////|/////|/////| 'w' | 7
        |-----------------------|
        | #00 | #00 | #00 | #04 | 8
         -----------------------

This wastes a significant amount of storage (it comes close to doubling the amount of memory required, in this example), but all of the fields of the array are easy to fetch and manipulate.
In the Pascal programming language, any array or record declaration may be preceded by the keyword packed. This tells the compiler that it is OK to pack the fields of the array or record as tightly as possible, even if this requires complex and slow code to access components of the resulting structure. C and C++ have nothing analogous to this! The semantics of C require that record fields be allocated in memory in the order they appear in the declaration, while in Pascal, the compiler may reorganize records for more efficient storage.
With the example record, to force more efficient storage allocation, a C programmer can group all character and short-integer fields together, or, at least, group character fields in groups of 4 and short-integer fields in groups of 2. Doing this for the example gives:
    struct rec {
            char a;
            char c;
            int b;
            int d;
    } array[2] = { { 'x', 'y', 1, 2 }, { 'z', 'w', 3, 4 } };

This would typically imply a structure such as the following on a machine that did not allow non-aligned words:
     byte   3     2     1     0
         -----------------------
        |/////|/////| 'y' | 'x' | 1
        |-----------------------|
        | #00 | #00 | #00 | #01 | 2
        |-----------------------|
        | #00 | #00 | #00 | #02 | 3
        |-----------------------|   word
        |/////|/////| 'w' | 'z' | 4
        |-----------------------|
        | #00 | #00 | #00 | #03 | 5
        |-----------------------|
        | #00 | #00 | #00 | #04 | 6
         -----------------------

If the machine allowed non-aligned words, the C programmer might be advised to write:
    struct rec {
            char a;
            char c;
            char pad1,pad2; /* unused fields for padding */
            int b;
            int d;
    } array[2] = { { 'x', 'y', '#', '#', 1, 2 }, { 'z', 'w', '#', '#', 3, 4 } };

This explicitly adds extra unused fields to force the integer fields to be aligned on a word boundary, thus allowing single-cycle access to those fields. C (or C++) programmers should generally avoid this kind of fiddling with the details of data structure allocation except when the last iota of speed or size must be squeezed out of a program! Furthermore, these kinds of fiddles depend immensely on the details of the CPU and compiler being used. What leads to a significant improvement on an Intel Pentium may do nothing for a DEC Alpha or vice versa!
When programming in assembly language, on the other hand, you must be aware of how fields are packed. If one word is allocated for each character field in a structure, the code is simple. If multiple characters are packed per word, the code is somewhat more complex. If full-word variables are allocated so they straddle word boundaries in memory, the code required to read or write those variables is far more complex.