7. Bus Level Design

Part of the 22C:122/55:132 Lecture Notes for Spring 2004
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Introduction

Busses are not required in the design of a computer system. All digital systems that use busses can be built, in theory, using multiplexors and demultiplexors to serve the same function. Nonetheless, we use bus-based design in almost all modern systems, and it has been present, to some extent, even in the earliest digital computers.

Bus-oriented design has some important advantages:

Sidenote: The term bus in electronics comes from the term bus-bar in power-plant design; this term may date back to Edison. In a classical power plant, with many dynamos powered by piston engines, a pair of copper bars was used to combine the outputs of the dynamos and transfer the output to the various power transmission lines leading from the plant. These bars, taken together, were the bus for the power plant. The topology of the entire system was quite similar to the topology of a computer system with its own bus, except that the subsystems were dynamos and external transmission lines, and there were only two conductors in the first generation of these early busses.

Low-Level Issues

Physically, a typical bus consists of many wires. Some are ground or return wires, required to complete the electrical circuits of the underlying electronics, and some are signal lines.

Because the lines in a bus are typically relatively long, they must be driven using more power than the short lines carrying logic signals within a subsystem. Furthermore, long lines are more likely to act as antennas than short ones, picking up interference from the outside world. Because of this, we generally use bus receivers, special circuits with a better noise rejection characteristic than the common logic circuits. Aside from this distinction, the actual encoding of logical values on bus lines is typically similar to that used elsewhere in the digital system.

When a bus is long enough that the transmission time on the bus is significant compared to the speed of the logic used in the system, echoes from the ends of the bus must be controlled. This is done using bus terminators. Typically, these are resistor networks at the ends of the bus, matched to the characteristic impedance of the bus lines, and connected to either ground, the logic supply voltage, or a carefully selected neutral intermediate voltage.

Bus lines can be divided into two categories:

There are two kinds of drivers appropriate for a multiple-source bus line. Both are common in digital systems:

Bus Masters

Components that attach to a bus may be divided roughly into two groups: bus masters and bus slaves. The bus master or bus masters attached to some particular bus are the ones that control the transfer of data on that bus, while the slaves react, responding to instructions from the masters.

If we use the CPU-to-memory bus of a computer system as an example, the CPU is a bus master, while each memory module is a slave. The CPU initiates all bus transactions in this system, while the memory modules passively do what the CPU instructs.

Generally, bus-based systems may be divided into two groups depending on the number of bus masters:

In most desktop and larger computer systems, multi-master busses are quite common. Looking at the CPU-to-memory bus, for example, the CPU and the DMA controller are both bus masters, able to independently initiate memory read or write operations.

Multi-master busses are complex, so the busses in smaller computer systems are almost all single-master designs.

A Typical Single Master Bus

A typical single master bus might be designed as follows:

	 ----------
	|          |----------------------- Data  (tristate)
	|          |----------------------- Lines
	|          |
	|          |-\--------------------- Address (single source)
	|   BUS    |-/--------------------- Lines
	|  MASTER  |
	|          |->--------------------- Direction (read or write)
	|          |                        (single source)
	|          |->--------------------- Strobe or clock
	|          |
	 ----------

From the master's perspective, a bus read cycle involves first placing the desired address on the bus and setting the direction line to indicate a read cycle, and then putting a pulse out on the strobe or clock line. The slave will try to put data on the data lines for the entire duration of this pulse, and typically, the master will clock its flip-flops to consume this data at the end of the strobe pulse.

A write cycle on this bus is similar. The master begins by setting the desired address and by setting the direction signal to indicate a write cycle, and then forcing the desired data onto the data lines. Once this data is stable, the master outputs a strobe or clock pulse to indicate that the addressed slave should consume the data.
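
The two cycles just described can be sketched as a small simulation. This is only an illustrative model, not a description of any particular bus: the names Bus, RamSlave, master_read and master_write are invented for the example, the stand-in slave is a tiny memory, and a strobe pulse is reduced to a single function call. The direction line is assumed to be 0 for read and 1 for write.

```python
READ, WRITE = 0, 1  # assumed encoding of the direction line


class Bus:
    """Hypothetical model of the bus lines; None means 'no valid value'."""

    def __init__(self):
        self.address = None
        self.direction = READ
        self.data = None
        self.slaves = []

    def strobe(self):
        """One strobe pulse: every attached slave sees the current lines."""
        for slave in self.slaves:
            slave.on_strobe(self)


class RamSlave:
    """Stand-in slave: a small memory answering a range of addresses."""

    def __init__(self, base, size):
        self.base = base
        self.cells = [0] * size

    def on_strobe(self, bus):
        offset = bus.address - self.base
        if not 0 <= offset < len(self.cells):
            return  # not addressed, so stay off the data lines
        if bus.direction == READ:
            bus.data = self.cells[offset]  # slave drives the data lines
        else:
            self.cells[offset] = bus.data  # slave consumes the data


def master_read(bus, address):
    bus.address = address    # place the desired address on the bus
    bus.direction = READ     # indicate a read cycle
    bus.data = None          # the master does not drive the data lines
    bus.strobe()             # pulse the strobe line
    value = bus.data         # latch the data at the end of the pulse
    bus.data = None          # data lines go back to undriven
    return value


def master_write(bus, address, value):
    bus.address = address    # place the desired address on the bus
    bus.direction = WRITE    # indicate a write cycle
    bus.data = value         # data must be stable before the strobe
    bus.strobe()             # addressed slave consumes the data
    bus.data = None          # master releases the data lines


bus = Bus()
bus.slaves.append(RamSlave(base=0x100, size=16))
master_write(bus, 0x104, 42)
print(master_read(bus, 0x104))
```

A read from an address that no slave answers returns None in this model, standing in for the undefined value a real master would latch from undriven data lines.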

The following timing diagram illustrates a read and a write cycle:

	                read cycle       write cycle
                     (data from slave) (data from master)
	          |         ____            ________
	     data |---------____------------________----
                  |        :   :             :   :
	          |     __________        __________
	  address |-----__________--------__________----
                  |        :   :             :   :
                  |                    _________________
        direction |___________________|
                  |        :   :             :   :
                  |         ___               ___
           strobe |________|   |_____________|   |______
                  |            
                 -|--------------------------------------
                  |

In the above, the following notation is used:
	            
	          |                  ________________ 
	          |------------------________________---
	          | invalid or no    valid (each line
	          | defined value    in this group is
	          |                  either 0 or 1)

During a read cycle, we speak of the required delay between the time the address becomes valid and the strobe pulse that reads the data. This time interval is the read-time of the device. If the master provides an insufficient delay, the device will put invalid data on the bus during the strobe pulse. The receiver of the data being read only looks at the data during the strobe pulse. During a write cycle, we talk about the delay between data valid and the strobe pulse. This is the setup-time of the device. The total delay required by the device, typically the read-time plus the setup-time, must be shorter than the time the master allows from address valid to the significant edge of the clock pulse.
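
The timing constraint can be checked with simple arithmetic. The sketch below reads it as a budget: the device's read-time plus its setup-time must fit within the interval the master provides between address valid and the significant clock edge. The function name and the nanosecond figures are made up for illustration and do not describe any real part.

```python
def cycle_is_safe(read_time_ns, setup_time_ns, address_to_edge_ns):
    """True if the device's total delay fits within the master's timing.

    read_time_ns       -- device delay from address valid to usable data
    setup_time_ns      -- delay needed from data valid to the strobe
    address_to_edge_ns -- what the master actually provides, measured from
                          address valid to the significant clock edge
    """
    return read_time_ns + setup_time_ns < address_to_edge_ns


# Illustrative numbers only: a device needing 70 ns + 10 ns works with a
# master that waits 100 ns, but not with one that waits only 75 ns.
print(cycle_is_safe(70, 10, 100))
print(cycle_is_safe(70, 10, 75))
```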

Bus Slave Design

Consider a very simple component, a one-word register R that can be read or written at address X. This can be realized as follows in terms of the bus master outlined above:

	--------------------------------------------- Data
	-------------------------------   ----   --- Lines
	                               | |    | |
	-\-----------------------------| |----| |--- Address
	-/-----   ---------------------| |----| |--- Lines
	       | |                     | |    | |
	->-----| |--------o------------| |----| |------ Direction
	       | |        |            | |    | |
	->-----| |-----o--|------------| |----| |------ Strobe or clock
	     __\_/__   |  |            | |    | |
	    |  =X   |  |  |   ___      | |    / \
	    |_______|  |  o--|   |   __\_/__  | |
	        |      o--|--|AND|--|<-_  R | | |
	        o------|--|--|___|  |_______| | |
	        |      |  |   ___      | |    | |
	        |      |   -O|   |     | |    / \
	        |       -----|AND|-----| |---/___\
	         ------------|___|     | |____| |
                                       |________|
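
The gating logic in this figure can be written out as a short behavioral sketch. It assumes that a high direction line means write, and it treats the register as level-sensitive for simplicity, where real hardware would clock R on an edge of the gated strobe. The class and method names are invented for the example.

```python
class RegisterSlave:
    """Behavioral model of the one-word register R at address X."""

    def __init__(self, x):
        self.x = x   # the address this slave answers to
        self.r = 0   # contents of register R

    def clock(self, address, direction, strobe, data):
        """Upper AND gate: select AND strobe AND direction clocks R."""
        select = (address == self.x)  # the '=X' comparator
        if select and strobe and direction:
            self.r = data

    def drive(self, address, direction, strobe):
        """Lower AND gate: select AND strobe AND NOT direction enables
        the tristate driver. None models the high-impedance state."""
        select = (address == self.x)
        if select and strobe and not direction:
            return self.r
        return None


slave = RegisterSlave(x=0x3F)
slave.clock(0x3F, direction=True, strobe=True, data=0xA5)   # write cycle
print(slave.drive(0x3F, direction=False, strobe=True))      # read cycle
```

Because the driver is enabled only while the slave is selected, any number of such slaves can share the data lines, provided each answers to a different address X.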

The above trivial example can easily be converted into several different useful devices. For example, if we change only the bottom part of the figure, we get a very rudimentary parallel output port:

                    strobe               
             address   | direction   data in and out
	     __\_/__   |  |            | |    | |
	    |  =X   |  |  |   ___      | |    / \
	    |_______|  |  o--|   |   __\_/__  | |
	        |      o--|--|AND|--|<-_  R | | |
	        o------|--|--|___|  |_______| | |
	        |      |  |   ___      | |    | |
	        |      |   -O|   |     | |    / \
	        |       -----|AND|-----| |---/___\
	         ------------|___|     | |____| |
                                       |  ______|
                                       | |
                                     external
                                     connector

This parallel port allows the contents of the register R to be inspected by some device in the outside world. It also allows the host system to read the register back; this function is useful in hardware diagnostics (to verify that the register works, or to detect outputs that have been short-circuited to power or ground). With minor added hardware (series resistors between the register and the inputs), the port can be used for input as well as output. The IBM PC parallel port expands on this in one crucial way: there is a second register that controls output-only bits that are used, for example, to tell the external device when data is ready, and there are input-only bits that allow the external device to indicate when it is ready.

As another example, consider a memory module constructed as follows:

                    strobe               
             address   | direction   data in and out
	    ___|_|___  |  |            | |    | |
	   || |   | || |  |            | |    | |
	    | |   | |  |  |          __\_/__  | |
	    | |   |  --|--|--------\|       | | |
	  __\_/__  ----|--|--------/|addr   | | |
	 |  =X   |     |  |   ___   |  MEM  | / \
	 |_______|     |  o--|   |  |       | | |
	     |         o--|--|AND|--|<-_-   | | |
	     o---------|--|--|___|  |_______| | |
	     |         |  |   ___      | |    | |
	     |         |   -O|   |     | |    / \
	     |          -----|AND|-----| |---/___\
	      ---------------|___|     | |____| |
                                       |________|

Here, we have broken the address into two parts, using the most significant bits of the address for device selection and using the least significant bits as the memory address input to the memory module itself.
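
The address split is just a shift and a mask. The following sketch assumes a module holding 2^offset_bits words; the function name and the 12-bit example width are chosen only for illustration.

```python
def split_address(address, offset_bits):
    """Split a bus address into (device select field, memory address).

    The most significant bits go to the '=X' comparator, and the least
    significant offset_bits go to the address input of the memory.
    """
    select = address >> offset_bits
    offset = address & ((1 << offset_bits) - 1)
    return select, offset


def module_selected(address, x, offset_bits):
    """True if the module configured for select value x should respond."""
    select, _ = split_address(address, offset_bits)
    return select == x


# A hypothetical 4096-word module (12 low bits) configured with X = 3:
print(split_address(0x3A7F, 12))
print(module_selected(0x3A7F, x=3, offset_bits=12))
```

With this split, the module configured with X = 3 occupies addresses 0x3000 through 0x3FFF, and changing X moves the whole block without touching the memory itself.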

This is very close to the design used for small static RAM modules sold in the mid-1970s. DRAM systems are more complex because of the need for refresh logic. Typical memory subsystems of the 1970s and even the early 1980s used jumpers or miniature switches to set the address X to which the memory module responds.

Today's SIMM modules have added complexity because there are only, for example, 11 address input pins on a 72-pin SIMM with a capacity of 16 or 32 megabytes (4 to 8 million 32-bit words). The internal memory address register on the SIMM is 22 bits long, divided into two 11-bit sub-registers (referred to as page and word in page, or as row and column select registers). A memory cycle requires that the address be strobed into the chip in two steps before the data can be inspected or stored. But note! 22 bits only allows addressing 4 megawords. This SIMM, when used in the 8 megaword configuration, has separate strobes for the two banks on the one SIMM. The decision to clock data into the SIMM is made by an external memory controller subsystem.
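
The arithmetic behind these figures, and the way a memory controller might break up an address, can be sketched as follows. The sketch assumes that the high 11 bits of the 22-bit address form the row (page) and the low 11 bits form the column (word in page), and that one further bit above them picks the bank in the 8 megaword configuration. That ordering and the function name are assumptions made for illustration, not taken from any SIMM specification.

```python
PIN_BITS = 11                          # address input pins on the SIMM
WORD_BYTES = 4                         # 32-bit words

WORDS_PER_BANK = 1 << (2 * PIN_BITS)   # 2**22 = 4 megawords
BYTES_PER_BANK = WORDS_PER_BANK * WORD_BYTES


def simm_cycle(word_address):
    """Return (bank, row, column) for a word address.

    The row and column are presented on the same 11 pins in two
    successive steps; the bank selects which strobe gets pulsed.
    """
    mask = (1 << PIN_BITS) - 1
    column = word_address & mask
    row = (word_address >> PIN_BITS) & mask
    bank = word_address >> (2 * PIN_BITS)
    return bank, row, column


print(WORDS_PER_BANK)                  # words addressable with 22 bits
print(BYTES_PER_BANK // (1024 * 1024)) # megabytes per bank
print(simm_cycle(0x400801))
```

Two banks of 4 megawords each account for the 32 megabyte configuration, which is why the second bank needs its own strobe rather than another address pin.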