22C:18, Lecture 13, Fall 1996

Douglas W. Jones
University of Iowa Department of Computer Science

Not All Variables are Words

On the Hawk machine, the natural size for a variable is one word, but there are many applications where this is not the appropriate size. If you look at the Pascal programming language, for example, users are free to declare new types with arbitrary ranges, for example:

	type funny = -352 .. 517;

This says that variables of type funny may have values ranging from -352 to 517. The predefined type integer in Pascal has a size that depends on the host machine's wordsize, and the programmer can find out what that allows because the compiler provides a predefined constant called maxint giving the maximum legal integer value.

In C, the programmer can't arbitrarily set the upper and lower bounds on user-defined integer types. Instead, a fixed selection of integer types is provided. The standard integer types in C are listed below, along with the minimum range of values each should be able to store:

	signed char        --         -127 to +127
	unsigned char      --            0 to +255
        short int          --       -32767 to +32767
        unsigned short int --            0 to +65535
        long int           --  -2147483647 to +2147483647
        unsigned long int  --            0 to +4294967295

The ANSI standard says that every C compiler should provide the actual limits for each integer type in the header file <limits.h>. As in Pascal, the most commonly used types, char and int, are not in the above list! Compilers are free to use either signed or unsigned characters for variables of type char, and they are free to use either short int or long int variables when the user asks for int (but the compiler should disclose its choice in <limits.h>).

In both Pascal and C, this freedom, on the part of the compiler, is a major source of portability problems when programs are moved from one machine to another or even from one compiler to another, but this freedom gives compilers a licence to take advantage of whatever size and sign conventions work best on each machine!

On the hawk machine, we have the following storage units:

    byte     --        -128 to +127        or 0 to +255
    halfword --      -32768 to +32767      or 0 to +65535
    word     -- -2147483648 to +2147483647 or 0 to +4294967295

When a variable is stored in memory, you can't tell whether it is signed or unsigned, it is only a bit pattern! The decision to interpret the most significant bit of the variable as a sign bit is made when the program manipulates that variable, and the sequence of instructions used to do that manipulation determines whether the sign is there or not!

Obviously, a C compiler should use bytes for char types, halfwords for short int types, and words for long int types, but what about types int and char? Because fetching and storing words is fast, while fetching and storing halfwords is slower, a C compiler written for the Hawk should use words for variables of type int, but the question of signed versus unsigned characters requires looking deeper into the peculiarities of the architecture.

Sign Extension versus Truncation

When a fractional word variable is fetched into a register on the Hawk machine, it is extended to a 32 bit value, because the Hawk machine does not support short registers. By default, this extension involves sign extension, because the EXTB and EXTH instructions sign extend their operands. Consider the following example:

	LOADS	R4,R3
	EXTB	R4,R4,R3

This fetches, into r[4], the byte pointed to by r[3]. If the byte in memory was:

	m[r[3]] = 01100001

The value loaded in r[4] will be:

	r[4] = 00000000000000000000000001100001

If the byte in memory was:

	m[r[3]] = 10100011

The value loaded in r[4] will be:

	r[4] = 11111111111111111111111110100011

This sign extension implies that C compilers for the Hawk machine should, by default, implement type char as signed char and not unsigned char.

So, what if you want unsigned characters or unsigned halfwords? You must add instructions to your program to explicitly truncate the loaded value to 8 or 16 bits! There are many ways to do this:

Truncation with and:

The logical and operator takes corresponding bits of the the two operands and combines them, producing a one in the result only when both operands were a one, and producing a zero otherwise. Consider the above example:

	r[4] = 11111111111111111111111110100011
	and    00000000000000000000000011111111
        ---------------------------------------
        result 00000000000000000000000010100011

Thus, the following sequence of instructions will fetch an unsigned byte into a register

	LOADS	R4,R3
	EXTB	R4,R4,R3
	LIL	R5,#0000FF
	AND	R4,R5

The constant FF, while only 8 bits, was loaded using the LIL instruction because the LIS instruction sign extends its operand! Thus, LIS with the constant #FF would load #FFFFFFFF and not #000000FF. If there are many references to unsigned bytes, it would be sensible to load the constant #FF in some register globally, or at least outside any loops that reference many unsigned bytes, and then use it as needed.

Truncation with shifting:

The high bits of a register can be discarded by shifting that register left, and zeros can be shifted in to replace them by shifting that register right. As a result, the following sequence of instructions will work for loading an unsigned halfword:

	LOADS	R4,R3
	EXTH	R4,R4,R3
	SL	R5,16
	SRU	R5,16

Note that, with a 4 bit field on the shift instructions used to indicate how many places the register should be shifted, it would seem that the count of 16 would be illegal, but this is not so! Shifting a register zero places is of no particular use, so the shift instructions interpret a count of 0 as meaning 16, and coding a shift count of 16 in the assembly language is quite safe; the least significant 4 bits of this value are 0 and the extra bits above this are discarded!

Using shifting to truncate bytes is not recommended on the Hawk machine because this requires two successive 24 bit shifts, but the Hawk only allows 16 bits of shift per shift instruction. As a result, it takes two shift instructions to move 24 bits (how the job is divided between these instructions makes no difference at all), as follows:

	LOADS	R4,R3
	EXTB	R4,R4,R3
	SL	R5,16
	SL	R5,8
	SRU	R5,12
	SRU	R5,12

In the above, two successive shifts wer

Truncation with a truncate instruction:

If unsigned fractional word and byte operands are common enough in real programs, it would make sense to add a truncate instruction to the Hawk architecture. Consider the following:

	LOADS	R4,R3
	EXTB	R4,R4,R3
	TRUNC	R4,8

This TRUNC instruction zeros out all but the least significant 8 bits of the loaded register. TRUNC 16 would zero out all but the least significant 16 bits, and if someone needed it, TRUNC 5 would zero all but the least significant 5 bits. Adding a corresponding SXT instruction would allow sign extending a bit pattern of arbitrary length. These instructions will be added to the Hawk shortly, but it is an interesting exercise to try to write macros that do the same thing using the instructions currently avaialble on the machine. (Hint: consider using sequences of shifts!)

Not All Variables are Smaller than a Word

Some applications require big variables! Modern cryptographic algorithms, for example, require computation on 512 and sometimes 1024 bit values, and conversion between calendar systems, for example, between the Jewish mixed solar-lunar calendar and the Gregorian calendar we use today, requires very high precision arithmetic (some wonderful research on calendars, done by Ed Reingold at Illinois, has led to algorithms for converting between the Hindu, Jewish, approximate Moslem, Julian, Gregorian and ancient Maya calendars; only the approximate Moslem date can be computed because the Moslem calendar relies on actual observation of the new moon to determine when each month begins).

So, what do you do if you need high precision? Most machines provide some way to build high precision arithmetic out of low precision pieces. This is essential on machines with short words, but the examples above suggest how useful it is on machines with longer words. Languages like C and Pascal provide essentially no help in this area, since the fundamental problem is propagation of carry infromation from one add or subtract to the next, and it is extremely difficult to find the value of the carry bit after an add in C, and even harder in Pascal! (It can be done, if it couldn't, emulators such as the Hawk emulator couldn't be written in C or Pascal!)

At the machine language level, the typical solution to the carry propagation problem is to have a special instruction to add the carry from one add into another. Suppose, for example, we have an application that needs to add two 96 bit integers, one stored in R1-3, and the other stored in R4-6, with R1 and R4 holding the least significant 32 bits, and R3 and R6 holding the most significant 32 bits. On the Hawk, we would do this as follows:

	ADD	R1,R1,R4  ; add low bits
	ADDC	R2,R5     ; add middle bits
	ADDC	R3,R6     ; add high bits

The ADDC (add with carry) instruction adds its source operand to its destination operand, and at the same time, adds one extra if the C bit is set. Similarly, there is an SUBB (subtract with borrow) instruction that subtracts from its destination operand and in addition, subtracts one more if a borrow is needed (signalled by the carry bit being reset!).

The ADDC and SUBB instructions correctly report the condition codes for the entire long operation, not just the last 32 bits. Thus, after the above 96 bit operation, N will be set if the entire result was negative, Z if the entire result was zero, V if the entire result had an overvlow, and C if there was a carry out of the entire result.

If very high precision is needed, so that not all operands can be fetched into registers before the add or subtract is started, it becomes essential that the load and store operations used to move the component words to and from memory not meddle with the carry bit between steps! This is one reason the Hawk architecture provides both LOAD and LOADCC -- the former does not meddle with the condition codes, the latter does.