22C:122/55:132 Notes, Lecture 31, Spring 2001

Douglas W. Jones
University of Iowa Department of Computer Science

Skip Instructions
Prior to the introduction of the IBM System 360 in 1965, the most common machine level conditional control structures were based on one of two models:
Conditional skip instructions were extremely common. These only worked if the instruciton size was rigidly fixed, but this was as commonly the case on early machines. If the fetch-execute cycle is:
```
	repeat
	    IR = M[PC];
	    PC = PC + 1;
	    execute
	forever
	
```
an unconditional skip instruction would do:
```
	    execute skip:
               PC = PC + 1;
	
```
Executing a skip instruction causes the CPU to skip over the immediately following instruction in the instruction stream. Typical skip instrucitons included:
- skip if accumulator negative
- skip if accumulator non-negative
- skip if accumulator zero
- skip if accumulator nonzero
- skip if accumulator even
- skip if accumulator odd
At the cost of only a small amount of extra logic, it was possible to add:
- skip if accumulator positive (non-negative and nonzero)
- skip if accumulator non-positive (negative or zero)
Particularly when these were absent, programmers became quite creative at combining skip instructions to achieve interesting effects, for example:
```
	skip if negative
	skip if nonzero   -- only executed if non-negative
	x                 -- skipped if accumulator non-positive

	skip if non-negative
	negate            -- takes the absolute value

        skip if negative
	branch            -- branches if non-negative
	
```
This approach to conditionals is natural on a single accumulator machine with a short word size, where the address field of the instruciton is used to encode the skip condition (and perhaps other non-memory-reference operations). When there are multiple accumulators, we typically add a register field to the skip instruction to indicate the register to be tested.
When the word size is large, we have an address field that invites some use:

Use the address field for a branch destination address.
In this case, what would have been a skip instruciton becomes a conditional branch instruction.

Use the address field as an operand for a compare instruciton
In this case, what would have been a skip instruciton becomes a conditional compare and branch instruction.

A very common variant on the second option above was a compare and skip instruction with the following semantics:
```
	    execute compare and skip:
	       if AC >= operand
		  PC = PC + 1;
	       if AC > operand
		  PC = PC + 1;
	
```
Typically, this was available in both an immediate form, where the operand is stored in the address field of the instruciton, and in a memory addressed form, where the operand was stored in memory at the indicated operand address. In either case, we get the following basic programming models:
```
	compare and skip, operand
	x                 -- only executed if AC < operand
	y                 -- only executed if AC <= operand
	z                 -- always executed

	compare and skip, operand
	branch x          -- only executed if AC < operand
	branch y          -- only executed if AC = operand
	z                 -- only executed if AC > operand
	
```
Clever programmers, of course, found other useful approaches to using these constructs, for example, for bounds checking:
```
	compare and skip, upper
	compare and skip, lower
	branch x                  -- AC < lower or AC = upper
	branch x                  -- AC = lower or AC > upper
	y                         -- lower < AC < upper
	
```
On some machines with a small word size, the compare and skip operation always compared the accumulator with zero; on others, it was possible to compare any of several accumulators with arbitrary constants.
This compare and skip family of instructions was present on the IBM 704 computer on which the FORTRAN language was developed in the mid 1950's. This explains the arithmetic IF statement that was the primary control structure in the original versions of the FORTRAN language. The statement IF(E)10,20,30 would goto 10 if the value of E was less than zero, 20 if E was zero, and 30 if E was greater than zero.
Because FORTRAN was one of the two dominant programming languages of the 1960's, computer architects in that era felt obligated to include 3-way conditional branch constructs in their machines.
Condition Codes
The designers of the IBM System 360 family in the mid 1960's adopted a new approach to handling conditional branches. These machines had a 2-bit status indicator in the CPU, and all instructions that could produce a result worth testing for a conditional branch set those status bits to indicate the results. These bits were called the condition code bits, and they were part of the machine's program status doubleword.
The IBM 360 condition codes were difficult to implement and difficult to interpret because the meaning of each condition code combination depended on the instruction that produced it in a somewhat unpredictable way.
The DEC PDP-11 family of computers introduced in 1970 introduced the view of condition codes that we now take for granted. On this machine, essentially every operation that routed its operands through the ALU set the condition codes in a uniform way; there were 4 condition code bits:

N
set when the ALU output is negative.
Z
set when the ALU output is zero.
V
set when there is a 2's complement overflow.
C
set when there is a carry out of the ALU.

This model was adopted with only a few changes on the Motorola 68000, the Intel 8080 and later on the Intel 80x86. The most significant changes included:
- Renaming the V condition code as the O condition code. DEC was concerned about the potential confusion of O and 0 (oh and zero), so they refused to use the letter O here.
- Different processing for the C bit in subtract instructions. DEC openly allowed the C bit to reflect the realities of 2's complement arithmetic: C means not borrow. Some programmers found that confusing, so some machine architects designed machines that inverted the sense of the carry bit after subtraction. found that ar
Given this PDP-11 set of condition codes, there are 14 useful conditional branch instructions, and if we add always and never to the list, we get 16 alternatives, suggesting that a 4-bit field of the branch instruction can be reserved to select the condition to be tested.
- branch if N reset
- branch if Z reset (branch if not equal)
- branch if V reset
- branch if C reset (branch if greater or equal, unsigned)
- branch if greater or equal (~N xor V)
- branch if greater than (~(N or Z) xor V)
- branch if greater than, unsigned (C and ~Z)
- branch never (false)
- branch if N set
- branch if Z set (branch if equal)
- branch if V set
- branch if C set (branch if less or equal, unsigned)
- branch if less than
- branch if less or equal
- branch if less or equal, unsigned)
- branch always
The branch conditons listed in the second half of this list are all the simple inverses of those listed in the first half. Note that, after using a subtract instruction to compare two numbers, testing the Z condition code tests for equality. If the two numbers are both signed 2's complement numbers, the greater-than and less-than relationships are conveyed by the sign of the result, but if there is an overflow, the sign is wrong. If the two numbers are both simple unsigned numbers, the carry bit must be tested to determine which number was greater.
On the PDP-11 and essentially all of its successors that copied this condition code model, there were groups of 16 branch instructions that allowed testing of these condition codes. In the case of the PDP-11, these instrucitons were simple 16 bit instrucitons with a very short relative branch displacement field. This meant that conditonal branches could only be taken to target instructions that were nearby. On some later machines, the conditonal branch instruction was generalized to take an arbitrary operand address to specify the destination of the branch, but the truth is, most branches are to nearby instructions, so this may have been a waste of bits in the program representation.
Condition Codes in a Pipelined Setting
Consider a pipelined machine such as our running example:
```
	          ______       _____
	      IF |______|-----|     |
	          |>___|   _  |     |
	      AC |______|-| | |     |
	          |>___|  | | |     |
	      OF |______|=| |-|  M  |
	          |>___|  |R| |     |
	      OP |______| | | |     |
	          |>___|  | | |     |
	      OS |______|=|_|-|_____|
	
```
Whether this uses the now standard PDP-11 model of condition codes, or the older IBM System 360 model, the condition codes cannot be set until the operate stage in the pipe.
If all condition code tests are done in the same pipeline stage, then the tests will see the correct condition codes and conditional branches will be executed correctly. This is possible in a simple pipeline, but it does not generalize to a superscalar pipeline, and it makes out-of-order execution difficult.
The chief designer of the DEC PDP-11 (C. Gordon Bell, well known for the textbook on computer architecture he co-authored) later concluded that one of the serious shortcomings of the PDP-11 design was that it overused the condition codes. Empirical studies of compiler output showed that the condition code values set by many instructions were never used, so the machine would have worked just as well if those instructions had not set the condition codes. Had this been the case, it would have made it far easier to apply ideas such as out-of-order execution to this machine.
In the general case, the condition codes must be treated as just another operand register. Instructions that depend on the condition codes must fetch the value of this register as they gather their operands, and instructions that set the condition codes must set this register as they set other destinations. This model adds significantly to the complexity of the pipeline interlock logic, but it introduces no new conceptual ideas.
This model is annoying enough that many designers of pipelined machines have decided to abandon the idea of condition codes. The designers of the DEC Alpha, for example, eliminated condition codes in favor of instructions that tested the values in registers. So, you could branch if register positive, zero, or negative. Exceptional conditions such as overflow that fit so neatly into the condition code model were relegated to exception handlers -- on the assumption the use of 64 bit general registers would eliminate almost all of the overflows that traditionally plague programmers on machines with short word sizes.
Another alternative is to return to skip instructions. If there is no instruction reordering or if the interlock logic prevents reordering from applying to skip instructions, it becomes straightforward to implement skip instructions in a pipelined machine. The skip instruction simply invalidates its successor in the pipeline in order to cause the successor to be skipped.