14. The IBM System 360

Part of the 22C:122/55:132 Lecture Notes for Spring 2004
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

The Quintessential Mainframe

In 1964, IBM announced the IBM System 360. This announcement and the family of compatible machines that followed had effects that are still with us to this day.

The machine is covered in chapters 43 and 44 of Bell and Newell.

http://research.microsoft.com/~gbell/Computer_Structures__Readings_and_Examples/

Prior to 1965, the computer industry had largely split into providing two radically different kinds of machines. On the one side were decimal machines, with 6-bit character-addressable memory and digit-serial BCD arithmetic units operating on variable-length words. These machines were sold for business applications; most of the high-level language code was written in COBOL and RPG, and most of the applications centered on report generation and simple database management tasks, using magnetic tape and punched cards as the primary storage media.

The other widespread class consisted of word-parallel binary machines, mostly with 36-bit words but a few (notably those made by Burroughs and Control Data Corp) with 48-bit words. Some of these machines offered rudimentary or better support for 6-bit characters, but generally, they operated on full-word binary integers and on floating-point numbers. FORTRAN was developed in this environment, and the first Algol compilers were applied here as well.

There was also an extremely small number of very small computers: the CDC 160 and the DEC PDP-5, with 12-bit words, and the DEC PDP-1 and PDP-4, with 18-bit words. The impact of such small machines would explode only in the early 1970's.

In this era, IBM felt stressed by the costs of maintaining market leadership in both worlds, trying to upgrade and develop both character-sequential commercial machines (the 702, 1401, 1460 and 1620) and word-parallel scientific machines (the 701, 704, 709, 7040, 7090 and 7094), so they undertook a major program working to develop a universal computer architecture that would meet the demands of both markets. In doing this, they felt considerable pressure from Burroughs, because the B5000 and its descendants were clearly designed with similarly universal goals. While IBM did this, they also set out to correct a number of problems with their existing machines and incorporate all of the good ideas they were aware of into one architecture.

Among the most basic contributions of this effort:

8-bit characters. There was widespread agreement that the 6-bit codes that dominated the market were inadequate, and having a power of 2 bits per character made good sense.
32-bit words. Having a power-of-2 characters per word simplified many issues.
Instructions that operated on characters, halfwords, words and double words; this was a necessary consequence of trying to meet the requirements of both COBOL and FORTRAN compiler writers. The B5000 was similar in its support of both worldviews, but with the 360, IBM took this to the limit, the same limit we expect in today's machines.
An architecture defined independently of its implementation. Prior to the System 360, architectures were defined by a particular machine, and then, if they were successful, later machines were designed to offer compatibility. In the case of the 360, IBM planned, from the start, to offer both slow, inexpensive versions of the architecture, with microcoded implementations based on 8 or 16-bit internal data paths, and fast, expensive versions with 32-bit data paths and tightly optimized control units.

IBM combined a host of good ideas from older machines into the System 360 architecture. Among them:

Indexed addressing; IBM went beyond this, offering double indexing. Indexing was a British invention from the mid 1950s; IBM's previous 36-bit machines had supported simple indexed addressing, but the System 360 went far beyond this. The Burroughs model could be thought of as going far beyond what IBM did, but it did so at a cost.
General purpose registers; all registers (except the PC) could be used as either index registers or accumulators, and with an 8-bit byte and 32-bit word, it was natural to use 16 registers, an unusually large number for 1965. These were a British invention dating back to 1956.
Condition codes for conditional branches on the result of arithmetic operations. Previous machines had a carry bit, but other conditional branches typically tested the signs of registers or directly compared registers and memory.
Very limited direct addressing; Burroughs had demonstrated that real programs have little need for large displacements, so IBM, while designing in terms of a 24-bit physical address space (that was obviously expandable to 32 bits), used only 12 bits for their displacement and direct-addressing field.
Use of direct-memory access channels; IBM's channel processors were extremely complex, comparable to small general purpose computers, but they were not able to execute general purpose programs (channels did, however, have fetch execute cycles!). Control Data Corporation, at about the same time, used small general purpose computers as input-output processors on their great supercomputer design, the 6600.

The System 360 was innovative in another way: It was clearly a child of both the system designers and marketing! In looking at the start of the project, the people in marketing observed that there had been two previous generations of computers, vacuum tube machines and discrete transistor machines, and they declared that the 360 would be the first of a new generation of machines, integrated circuit computers. In this, they pushed a few limits and bent a few terms, because most early 360 models were based on hybrid integrated circuits, that is, ceramic substrates with multiple diode and transistor chips bonded to them.

Even the name 360 was carefully crafted by marketing: it was the third generation machine for the 1960's.

The System 360 was a huge success, with at least 20,000 built in the first 5 years of production. Fully 2/3 of these were the smallest (and slowest) two models, and 1/3 supported only a subset of the architecture, but while these built the market, the high-end machines pushed the limits of performance.

As the 1960's ended and the 1970's began, marketing demanded that the machine be renamed, so the name 370 was used. With this name change came the completion of the transition to monolithic integrated circuit technology, consolidation of a number of good ideas that had emerged in some 360 models, and increased sales.

In the 1970's, Gene Amdahl, the chief architect of the System 360 project, quit and formed his own company, Amdahl Corporation. Using VLSI technology, he developed the Amdahl 470, a fully compatible machine and the second clone faced by IBM. The RCA Spectra 70 was also a 360 clone, but its eventual role in the marketplace was less serious. It took antitrust lawsuits to force IBM to license their operating systems to run on the Amdahl 470. Digital Equipment Corporation also faced clones in this area.

Today, IBM's line of Enterprise Servers is fully compatible, at least at the application level, with the 360; the underlying hardware is completely different, and the input-output architecture is unrelated to the old 360 channel architecture, but applications compiled in the late 1960's can still be run, and in practice, many of these old applications are almost certainly still running.

In retrospect, the 360 architecture has many characteristics of RISC design, particularly when contrasted with the Burroughs stack architectures, which are clearly CISC designs. Nonetheless, in the early days of the 360, it was seen as a complex instruction set. Today's VLSI implementations compete very well with RISC processors, and like much of the RISC marketplace, they have proven to be excellent platforms for supporting Linux.

The Basic Instruction Format

Given 16 registers and a 32-bit word, it is not surprising to find that IBM chose to base their architecture on the following basic instruction format:

  _______________ _______ _______ _______ _______________________ 
 |_______________|_______|_______|_______|_______________________|
 |   opcode (8)  | R (4) | X (4) | B (4) |      disp (12)        |

Essentially all memory addressing was based on the following model:

ea = reg[X] + reg[B] + disp

That is, the effective memory address is the sum of the registers specified by the X and B fields of the instruction with the displacement taken from the instruction. Register zero was special; unlike the others, it was not formally general purpose, in that its contents were ignored in address computation, so if X or B was zero, the value zero was used instead of the contents of reg[0]. Both the X and B fields selected registers from the same set of 16 general purpose registers, but the intended use of these two was quite different:

X - the index register; this was expected to be used for array subscript computation.

B - the base register; this was expected to point to the base of a record or block of variables extending for up to 4096 bytes. Typical programs set aside one base register for the current code segment, one for the current activation record, and one for the global variable segment, but aside from being a natural use of the hardware, there was no imposed memory segment structure the way Burroughs had.
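
The format and address computation above can be sketched in Python. This is only an illustration: the function names are invented here, the field positions are read directly off the diagram, the sample instruction word is arbitrary, and the 24-bit mask assumes an early model that used only the low 24 address bits.

```python
ADDRESS_MASK = 0xFFFFFF  # assumption: early model with 24-bit addresses


def decode_basic(word):
    """Split a 32-bit instruction word into (opcode, r, x, b, disp)."""
    opcode = (word >> 24) & 0xFF   # 8-bit opcode
    r = (word >> 20) & 0xF         # 4-bit operand register
    x = (word >> 16) & 0xF         # 4-bit index register
    b = (word >> 12) & 0xF         # 4-bit base register
    disp = word & 0xFFF            # 12-bit displacement
    return opcode, r, x, b, disp


def effective_address(reg, x, b, disp):
    """ea = reg[X] + reg[B] + disp, with register 0 read as zero."""
    index = reg[x] if x != 0 else 0
    base = reg[b] if b != 0 else 0
    return (index + base + disp) & ADDRESS_MASK


reg = [0] * 16
reg[0] = 0x999     # ignored whenever register 0 is named as X or B
reg[2] = 0x100     # index, e.g. an array subscript in bytes
reg[3] = 0x2000    # base of a record or block of variables

opcode, r, x, b, disp = decode_basic(0x5A123010)
print(hex(effective_address(reg, x, b, disp)))   # 0x2110
print(hex(effective_address(reg, 0, b, disp)))   # 0x2010
```

Because a field value of zero means "no register", a program can address the first 4096 bytes of memory with no base register set up at all.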

Addresses were 32 bits (although in the first 360 releases, only the lower 24 bits were used), and addresses specified a byte. Initially, operands were required to be aligned (words must begin on a word boundary, halfwords must begin on a word or halfword boundary), but later, this constraint was relaxed. The operand size, 8, 16, 32 or 64 bits, depended on the opcode. 64-bit operands were largely the province of the floating point unit, which is not discussed here (and was optional on low-end machines).

This led naturally to the following instructions:

IC r,x,b,disp -- insert character (load byte), low 8 bits of reg[r] = memory[ea]
LH r,x,b,disp -- load halfword
L r,x,b,disp -- load (word)
STC r,x,b,disp -- store byte, memory[ea] = reg[r]
STH r,x,b,disp -- store halfword
ST r,x,b,disp -- store (word)

One more instruction was crucial:

LA r,x,b,disp -- load address, reg[r] = ea

This allowed loading immediate constants (with x and b equal to zero), and it allowed loading pointers to any operand that could be directly addressed. (The 360 assembly language used instruction mnemonics that varied from 1 letter to 4 or more, quite unlike the "TLA" or three-letter acronym model of assembly language favored on many other machines.)
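
A toy model may make the load, store and load-effective-address behavior concrete. This is a sketch under stated assumptions, not a faithful simulator: the function names are invented, memory is a small Python bytearray, addresses are masked to 24 bits as on early models, and alignment checks are omitted. It does follow the 360's big-endian byte order and the sign extension performed by load halfword.

```python
memory = bytearray(4096)   # byte-addressed main memory (size is arbitrary)
reg = [0] * 16             # sixteen 32-bit general purpose registers


def ea(x, b, disp):
    """Effective address; a register field of 0 contributes zero."""
    return ((reg[x] if x else 0) + (reg[b] if b else 0) + disp) & 0xFFFFFF


def load_word(r, x, b, disp):
    a = ea(x, b, disp)
    reg[r] = int.from_bytes(memory[a:a + 4], "big")


def load_halfword(r, x, b, disp):
    a = ea(x, b, disp)
    value = int.from_bytes(memory[a:a + 2], "big", signed=True)
    reg[r] = value & 0xFFFFFFFF      # sign-extended to 32 bits


def store_word(r, x, b, disp):
    a = ea(x, b, disp)
    memory[a:a + 4] = reg[r].to_bytes(4, "big")


def load_address(r, x, b, disp):
    reg[r] = ea(x, b, disp)          # no memory access at all


load_address(1, 0, 0, 100)     # immediate constant: reg[1] = 100
load_address(2, 0, 0, 0x200)   # pointer: reg[2] = 0x200
store_word(1, 0, 2, 8)         # memory[0x208..0x20B] = reg[1]
load_word(3, 0, 2, 8)          # reg[3] = 100
```

With a 12-bit displacement, the largest constant that can be loaded this way in one instruction is 4095.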

Arithmetic rested on memory reference instructions, so there were instructions like:

A r,x,b,disp -- add word, reg[r] = reg[r] + memory[ea]
AH r,x,b,disp -- add halfword
S r,x,b,disp -- subtract word, reg[r] = reg[r] - memory[ea]
SH r,x,b,disp -- subtract halfword

The instruction set was far from uniform! Bytes could be loaded and stored, but no add byte from memory was provided. Similar irregularities pervaded the instruction set.

Control transfers were based on the following:

BAL r,x,b,disp -- branch and link, reg[r] = pc; pc = ea
BC c,x,b,disp -- branch conditional, if condition(c), pc = ea

On calling a function, the return address is left in a register, so the function can return by using that register to compute the effective address of a branch; if the return address is discarded (in register 0?), this gives us a simple branch; if the function wishes to call other functions, it must save its linkage register in memory (easily done while saving other registers it may need to restore after the call).

The conditional branch instruction tested the condition code register; all arithmetic instructions set this register to indicate the result. By today's standards, the condition code register was rudimentary: it was only 2 bits wide, so condition code encodings were somewhat irregular compared to modern systems.
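
The interplay between the 2-bit condition code and the conditional branch can be sketched as follows. The function names are invented for illustration. The encoding shown is the one documented for the 360: after a signed add, the code is 0 for a zero result, 1 for negative, 2 for positive and 3 for overflow, and the 4-bit c field of the branch is a mask with one bit per condition code value.

```python
def to_signed(value):
    """Interpret a 32-bit pattern as a two's complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def add_with_cc(a, b):
    """Return (32-bit sum, condition code) for a signed add."""
    total = to_signed(a) + to_signed(b)
    result = total & 0xFFFFFFFF
    if not -(1 << 31) <= total < (1 << 31):
        cc = 3                      # overflow
    elif total == 0:
        cc = 0                      # zero
    elif total < 0:
        cc = 1                      # negative
    else:
        cc = 2                      # positive
    return result, cc


def branch_taken(mask, cc):
    """Mask bit 8 selects cc 0, bit 4 selects cc 1, 2 selects 2, 1 selects 3."""
    return bool(mask & (8 >> cc))


result, cc = add_with_cc(5, 0xFFFFFFFB)   # 5 + (-5)
print(result, cc)                         # 0 0
print(branch_taken(8, cc))                # True: "branch if zero"
print(branch_taken(15, cc))               # True: mask 15 always branches
```

A mask of 15 therefore gives an unconditional branch and a mask of 0 gives a no-op, which is part of why the encoding looks irregular next to the named flag bits of later machines.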

Short Instruction Formats

Many sequences of instructions operate entirely register-to-register, so they have no need for the long operand field of the basic memory reference format. To support this, the 360 family fetched instructions in increments of 16 bits, fetching the second 16 bits only if required to do so by the opcode. Many memory reference instructions were available in short form:

  _______________ _______ _______ 
 |_______________|_______|_______|
 |   opcode (8)  | R (4) | X (4) |

For these instructions, the effective address computation was similarly abbreviated:

ea = reg[X]

Among the instructions using this form were:

BALR r,x -- branch and link register, reg[r] = pc; pc = ea
BCR c,x -- branch conditional register, if condition(c), pc = ea

These provided simple register-indirect branches and calls, used for, among other things, returning from previous calls.
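
The call and return pattern built from these branches can be sketched with a minimal machine model. The class and method names are invented, the choice of register 14 as the linkage register is an assumption made for the example, and the program counter is assumed to have already advanced past the branch by the time the branch executes.

```python
class Machine:
    """Just enough state to show linkage; not a full simulator."""

    def __init__(self):
        self.reg = [0] * 16
        self.pc = 0   # assumed to point past the instruction being executed

    def branch_and_link(self, r, target):
        self.reg[r] = self.pc    # return address is left in a register
        self.pc = target

    def branch_register(self, x):
        self.pc = self.reg[x]    # unconditional register-indirect branch


m = Machine()
m.pc = 0x104                     # a 4-byte call at 0x100 has just been fetched
m.branch_and_link(14, 0x200)     # call the function at 0x200
m.branch_register(14)            # return: pc is 0x104 again
```

A function that makes further calls through the same register must first store the linkage register in memory, since the inner call overwrites it.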

Another important group of short instructions were register-to-register arithmetic instructions:

  _______________ _______ _______ 
 |_______________|_______|_______|
 |   opcode (8)  | R1(4) | R2(4) |

These included things like:

AR r1,r2 -- add register, reg[r1] = reg[r1] + reg[r2]
SR r1,r2 -- subtract register, reg[r1] = reg[r1] - reg[r2]
NR r1,r2 -- and register, reg[r1] = reg[r1] & reg[r2]
OR r1,r2 -- or register, reg[r1] = reg[r1] | reg[r2]

Note that the logical operations were not limited to this register-to-register form; memory-reference versions (N and O) were also provided in the basic format.
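
Fetching in 16-bit increments works because the opcode alone determines the instruction length. On the 360, the top two bits of the opcode give it: 00 for a 2-byte instruction, 01 or 10 for 4 bytes, and 11 for the 6-byte memory-to-memory forms. The sketch below uses invented function names and sample bytes chosen only for their opcode bits.

```python
def instruction_length(opcode):
    """Length in bytes, taken from the top two bits of the opcode."""
    return (2, 4, 4, 6)[(opcode >> 6) & 0b11]


def decode_short(halfword):
    """Split a 16-bit short-format instruction into (opcode, r1, r2)."""
    return (halfword >> 8) & 0xFF, (halfword >> 4) & 0xF, halfword & 0xF


def split_instructions(code):
    """Yield each instruction in a byte string as its own bytes object."""
    pc = 0
    while pc < len(code):
        length = instruction_length(code[pc])   # first byte is the opcode
        yield code[pc:pc + length]
        pc += length


code = bytes.fromhex("1A125A10301007FE")   # short, long, short
for instruction in split_instructions(code):
    print(instruction.hex())
```

Run on the sample bytes, this prints 1a12, 5a103010 and 07fe on separate lines, showing a 4-byte instruction sitting between two 2-byte ones.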

Other Instruction Formats

With 8 bits for the opcode, the 360 had plenty of room for odd things, and there were some immediate to memory instructions, character string instructions (memory to memory), and load and store multiple instructions, useful for loading and storing whole groups of registers. The most complex instruction in the standard instruction set was probably translate and test, which copied a character string from memory to memory, translating and testing each character through a lookup table along the way. Memory to memory variable-precision decimal arithmetic was similar in complexity.

The irregularity of the 360 architecture would vex architects for the next decade, leading to some remarkable architectures that tried to impose regularity on the chaos of the 360 instruction set. Microprogramming invites irregularity, as does the belief that logic is almost free, but programmers intent on learning the instruction set would much rather have a very regular and easy to memorize structure.