Formally, to compute the address of an element of an array, given the address of the array itself and an integer subscript, the following computation is involved:
a - address of first element of array A i - subscript s - size of an element in bytes f - subscript of first array element l - subscript of last array element e - address of A[i] error if i > l error if i < f e = a + s*(i - f)Some programming languages have simplified models of arrays. For example, in C, f is always 0, and the compiler does not check for errors in the subscript range. In Pascal, Ada and many other languages, arrays may be declared as running between two arbitrary integers, and subscripts outside that range are illegal.
Consider the following Pascal declaration:
var a: array [1 .. 10] of integer;This declares a as an array with legal subscripts from 1 to 10, where each element is an integer, typically a 32 bit word. Given that the array is a local variable of a procedure, allocated in the activation record of that procedure at offset A from the base of the activation record, and given that R3 is to be used as the array index, the indexed word of the array, a[R3], can be computed as follows:
; code to address a[i], where R3 = i CMPI R3,10 ; test upper bound BGT ERROR ADDSI R3,-1 ; adjust by lower bound BLT ERROR LEA R4,R2,A ; R4 is address of array a ADDSI R3,R4,2 ; R3 is now the address of A[i]Note in the above code that the comparison with the lower bound was combined with the adjustment of the subscript to account for the fact that the lower bound was nonzero.
This code can be simplified if we know that the array subscript is within bounds. In languages like Pascal and Ada, the type of each variable states the upper and lower bound of that variable, and declarations can be adjusted to specifically state the actual range of values a variable takes on. In such languages, it is frequently possible for the compiler to eliminate the tests of the array bounds. This can allow the above code to be reduced to:
; code to address a[i], where R3 = i ; assumes that i is known to be within bounds LEA R4,R2,A-4 ; R4 is address of A[0] ADDSI R3,R4,2 ; R3 is now the address of A[i]This code uses assembly time subtraction to adjust for the nonzero lower array bound instead of run-time computation used in the previous example. Thus, R4 is loaded with the address of the illegal array element A[0] even though A[0] does not exist -- note that the address exists, in the sense that you can add to it to get the address of legal array elements, even though the array element A[0] does not.
Here is another example, using a 2 dimensional array of characters, declared in C as:
static char screen[24][80];This array consists of 24 sub arrays, each made of 80 characters. The keyword static declares screen to be allocated at a fixed address in RAM instead of in the activation record of a procedure, so the array might be declared in an assembly language program as:
COMMON screen, 1920 ; 24 * 80 characters Pscreen:W screenBecause of linker restrictions on references to external symbols, references to the array screen would begin by loading Pscreen, the pointer to screen. To load screen[x][y] (character y on row x), the following code would be used, assuming that x and y are local variables in the current activation record:
LOAD R3,Pscreen ; r3 points to screen, the whole array LOAD R4,R2,x ; get local variable x ADDSL R4,R4,2 ; times 5 ADDSL R4,R3,4 ; times 16 plus Pscreen ; r4 now points to screen[x], one row of the array LOAD R3,R2,y ; get local variable y ADD R3,R4,R3 ; r3 now points to screen[x][y], one element of the array LOADS R4,R3 EXTB R4,R4,R3 ; r4 now holds a copy of screen[x][y]Note that no test of the array bounds was needed because C does not perform such tests, and that no subtraction to compensate for nonzero lower bounds on each dimension was needed because all C arrays are indexed starting at zero.
Compilation of case statements (switch-case in C, case in Pascal, select in Ada) is complex. This form of control structure was poorly understood in early programming languages and only came into its own in languages developed after 1970. In general, there are many alternative ways of compiling such a statement. Consider the following bit of Pascal code:
case x of 0: item(0); 1: item(1); 2: item(2); end;The most obvious suggestion most programmers would have for implementing this is to translate it to the following equivalent Pascal code:
if x = 0 then item(0) else if x = 1 then item(1) else if x = 2 then item(2) else error;This code is easy to translate to assembly code, but the translation is not particularly efficient if the number of alternatives is large.
The problem is, the straightforward chain of if statements amounts to a linear search. A good compiler will always find a more efficient way to get to the various cases. One of the fastest ways of dispatching control to the various cases is the indirect indexed jump. Consider the following C example:
switch (i) { case 1: option(1); break; case 2: option(2); break; case 4: option(4); }It would be possible to compile this as:
if (i == 1) option(1); else if (i == 2) option(2); else if (i == 4) option(4);But again, this is an inefficient linear search. Note, above, that we have illustrated a difference between C and Pascal! In Pascal, the case statement ends each case with an implicit branch to the end of the statement, and if the value of the index variable is not listed in the case list, it is an error. In C, cases in the switch statement must be followed with an explicit branch to the end of the statement -- done with a break statement, and if the index has a value that is not in the case list, it is not an error.
The efficient translation of this example cannot be directly expressed in either Pascal or C, but, but we can approximate it with a C-like pseudocode:
if ((i >= 1) !! (i <= 4) { static code * jumptab(4) = { L1, L2, Lend, L4 }; goto jumptab[i-1]; L1: option(1); goto Lend; L2: option(2); goto Lend; L4: option(4); Lend: ; }This can be translated directly to assembly code as follows, assuming that the index variable i is a local variable and that option is a local procedure:
LOAD R3,R2,i ; get the case index variable CMPI R3,4 ; test for i within bounds BGT Lend ADDI R3,-1 ; combine test of lower bound BLT Lend ; with adjustment of index LEA R4,jumptab ADDSL R3,R4,2 ; multiply index by 4 as it is used LOADS R3,R3 ; get jumptable entry JUMPS R3 ; use it! ALIGN 4 jumptab: W L1 W L2 W Lend W L4 L1: LIS R3,1 JSR R1,option BR Lend L2: LIS R3,2 JSR R1,option BR Lend L4: LIS R3,4 JSR R1,option BR Lend Lend:In the above code, note that bounds checking on the array reference is essential, and note how the test of the lower bound has been combined with adjusting the index variable so that the actual array indexing is done with a lower bound of zero.
More importantly, note that the array is stored in-line with the code, not somewhere else, and that it must be aligned.
In some contexts, the following variation on the above code is used:
LOAD R3,R2,i ; get the case index variable CMPI R3,4 ; test for i within bounds BGT Lend ADDI R3,-1 ; combine test of lower bound BLT Lend ; with adjustment of index LEA R4,jumptab ADDSL R3,R4,2 ; multiply index by 4 as it is used JUMPS R3 ; jump to array entry jumptab: JUMP L1 JUMP L2 JUMP Lend JUMP L4 L1: LIS R3,1 JSR R1,option BR Lend L2: LIS R3,2 JSR R1,option BR Lend L4: LIS R3,4 JSR R1,option BR Lend Lend:In this version, the jump table is an array of JUMP instructions instead of an array of addresses! Because the JUMP instructions are position independent, this means this code can be moved anywhere in memory without requiring changes to accomodate the relocation. On the other hand, jumping to a JUMP instruction is not recommended on modern machines that use pipelined execution because it causes a pipeline flush.
Pipelined execution is done on most modern processors. It involves fetching one instruction while the previous instruction is being executed while the results of the instruction prior to that are being stored. All forms of control transfers disrupt the pipelined execution, and jumps to jump instructions do so doubly!
Note that, when case or switch statements are compiled to use a jump table, sparse lists of case labels result in inefficient use of memory. If the only case labels are 1 and 1000, the jump table will contain 1000 entries, all but two of which point to the end of the case statement (in C) or the error handler (in Pascal). Some compilers first accumulate the list of case labels and then inspect it, selecting between alternative approaches to compiling it depending on the size of the table and the fraction of the table containing unused entries. Alternative approaches might include hashing (for very sparse tables), binary search and the indixed indirect jump described above.