8. Arithmetic and Logic.

Part of 22C:60, Computer Organization Notes
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Digital Logic

19th century logicians, most notably George Boole, identified the basic operators of what we now know as Boolean algebra, the and, or and not operations. These operators are, of course, the familiar operators used in Boolean expressions in common programming languages as well as being among the basic operators of the logic that philosophers have studied for millennia. What logicians like Boole did was bring these into the realm of mathematics by demonstrating that these operators formed an algebra in just the way that addition and subtraction on the integers form an algebra.

Our goal is to examine enough Boolean algebra to understand how basic arithmetic operations can be performed using only Boolean operations. This allows us to study how the arithmetic unit of a computer works. We will not discuss how Boolean operations are implemented in hardware, but will accept, on faith, that this can be done. Thus, we will not have to deal with transistors or other devices.

Boolean logic operates on the values true and false, and it was George Boole who recognized that if true is represented by the value one and false is represented by the value zero, we can think of or as being similar to addition, where adding zero does nothing, and we can think of and as being similar to multiplication, where multiplying by zero always gives zero, and multiplying by one does nothing.

Based on these observations, Boole proposed that we use the symbols for arithmetic operators to stand for logic operators. So, in Boolean notation, a+b means a or b, and ab means a and b. This notation is still widely used, with only one minor change. Where Boole used the minus sign to mean not, modern logicians tend to use an overbar, so modern logicians write $\overline{a}$ instead of -a to mean not a. One other symbol has come into common use in Boolean algebra, the use of a ⊕ b to mean a exclusive-or b. This is true if a is true or b is true, but not both.

The field of Switching theory emerged before World War II, largely motivated by the use of complex systems of electromechanical relays in telephone exchanges. This field applies Boolean logic to practical engineering problems, and with the development of electronic calculators and then computers, switching theorists quickly moved into the center of certain aspects of computer design.

Engineers in general and electrical engineers in particular have a long tradition of using graphical notations to do their design work, and by the late 1960s, a standard graphical notation emerged for Boolean logic that is well suited to the problem of expressing the uses made of this logic formalism in the context of computer design. The basic elements of this notation are given below. In this notation, inputs are shown on the left and outputs are shown on the right. Lines connecting components are used to show the flow of data from outputs to inputs.

[Figure: Schematic notation for digital logic gates]

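Boolean logic operations
 inputs |              output c
 a   b  |  and    or    nand    nor    not a    xor
 0   0  |   0     0      1       1       1       0
 0   1  |   0     1      1       0       1       1
 1   0  |   0     1      1       0       0       1
 1   1  |   1     1      0       0       0       0
In algebraic notation, the output columns are a b (and), a + b (or), $\overline{a b}$ (nand), $\overline{a + b}$ (nor), $\overline{a}$ (not) and a ⊕ b (xor).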
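
As a rough illustration, the table above can be reproduced with the bitwise operators of C, using Boole's convention that 1 is true and 0 is false. The function names here are our own, chosen only for this sketch, and every input is assumed to be exactly 0 or 1.

```c
#include <stdio.h>

/* Each gate takes inputs that are 0 or 1 and returns 0 or 1. */
int and_gate(int a, int b)  { return a & b; }
int or_gate(int a, int b)   { return a | b; }
int not_gate(int a)         { return !a; }
int nand_gate(int a, int b) { return not_gate(and_gate(a, b)); }
int nor_gate(int a, int b)  { return not_gate(or_gate(a, b)); }
int xor_gate(int a, int b)  { return a ^ b; }

/* Print one row of the truth table for each combination of inputs. */
void print_truth_table(void) {
    printf("a b  and or nand nor not xor\n");
    for (int a = 0; a <= 1; a++) {
        for (int b = 0; b <= 1; b++) {
            printf("%d %d   %d   %d   %d    %d   %d   %d\n",
                   a, b, and_gate(a, b), or_gate(a, b),
                   nand_gate(a, b), nor_gate(a, b),
                   not_gate(a), xor_gate(a, b));
        }
    }
}
```

Calling print_truth_table() from a main program prints the four rows in the same order as the table.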

In addition to the conventional Boolean operators and, or and not, engineers quickly introduced two new operators, nand and nor. These correspond to and and or with their results inverted by a not operator, as indicated in the algebraic notation for Boolean logic by an overbar over the expression. These were introduced largely because the simplest amplifiers used in digital logic perform the not function, and a typical hardware implementation of the and or or function produces an output that must be amplified before it is used as an input to another logic operator.

The triangle symbol is an old symbol for an amplifier in electronic schematics, and it is still used in this way in some schematic diagrams for logic circuits, standing for a gate with inputs and outputs that are equal. The small circle on the output of the nand, nor and not gates indicates logical negation. Conventionally, this is placed on the output of the gate, but occasionally, you will see this little circle placed on an input in order to indicate the presence of an inverter or not gate built into that input.

The exclusive or operator is also given a special symbol in this schematic notation, but not because it is a primitive operator that is easy to build in hardware. In fact, the exclusive or operator is difficult to build, requiring hardware equivalent to 4 nand operators, but it is very common in arithmetic.

Electrical engineers refer to hardware devices that perform the functions of Boolean operators as logic gates. Each physical logic gate has connections for a power supply as well as connections for inputs and outputs. A schematic diagram shows the interconnection of inputs and outputs as lines between the symbols for the individual gates; these are frequently referred to as wires, even if the actual hardware involves conductive traces on silicon or on a printed circuit board. The following collection of simple gates, for example, is shown connected to perform the exclusive or function:

[Figure: Constructing exclusive-or from and, or and not]
$c = a \oplus b = (a\,\overline{b}) + (\overline{a}\,b)$

When explaining the exclusive or function, there are several reasonable explanations. For example: The exclusive or is true if one of the inputs is true and the other is false. The formulation given in the figure above, in both schematic and equational forms, corresponds to this English explanation.
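
The same formulation can be written in C as a small sketch. The function name is ours, and the inputs are assumed to be 0 or 1; one and term covers the case where a is true and b is false, the other covers the opposite case, and the or combines them.

```c
/* Exclusive or built only from and, or and not:
   c = (a and not b) or (not a and b). Inputs must be 0 or 1. */
int xor_from_and_or_not(int a, int b) {
    int not_a = !a;                    /* inverter on the a input */
    int not_b = !b;                    /* inverter on the b input */
    return (a & not_b) | (not_a & b);  /* two and gates feeding one or gate */
}
```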

Electrical engineers use some standard notational conventions in drawing schematic diagrams. First, inputs are usually on the top or left, and outputs are usually on the bottom or right. Thus, information within a logic circuit usually flows from top to bottom and from left to right. Second, where wires cross, this may be indicated by a schematic indication that one wire hops over the other, while where wires are connected, a dot (historically representing a screw) is added to emphasize the connection. To avoid ambiguity, connections are usually shown with lines meeting to form a T, while crossings form an X.

Historically, engineering students would buy templates to help draw the symbols for the logic gates, since freehand drawings can be quite sloppy. If you do draw them freehand, note that the key features that distinguish an or symbol from an and symbol are that the or has a curved input side and a slight point on the output side, while the input side to the and is straight and the output side is a perfect half-circle.

Exercises

a) Consider the logic diagram given above for constructing an exclusive or gate from and, or and not gates. What function does this compute if the 2 not gates are replaced by a single nand gate that combines the a and b inputs and passes its result to both of the inputs originally served by separate not gates?

b) Consider the problem of comparing two 1-bit binary numbers for equality. This can be done with an exclusive-or gate and one other gate to give an output of 1 if the numbers are equal and an output of zero if they are different. Draw the schematic diagram for a solution to this problem and give an equation in Boolean logic notation.

The Laws of Boolean Algebra

An algebra is characterized by a set of values and a set of functions over those values. The common numerical algebras start with a set of numbers and add the arithmetic operators. There are many numerical algebras, depending on whether we start with real numbers, complex numbers or something else. In Boolean algebra, we start with a set of just two values operated on by the Boolean operators.

The basic laws of this algebra are the laws that define the behavior of these operators. Many of these laws look very much like the laws of numerical algebra, but a few are different:

The basic laws of Boolean algebra
zero            a + 1 = 1                     a 0 = 0
identity        a + 0 = a                     a 1 = a
commutative     a + b = b + a                 a b = b a
associative     (a + b) + c = b + (a + c)     (a b) c = b (a c)
distributive    a (b + c) = a b + a c         a + b c = (a + b) (a + c)

The first and the last laws given above do not hold for arithmetic. Our numerical algebra has only one zero, but Boolean algebra has two values an algebraist would describe as zeros, because not only does anding anything with 0 give 0, but also, oring anything with 1 gives 1. In numerical algebra, multiplication distributes over addition, but addition does not distribute over multiplication. In Boolean algebra, and distributes over or and or distributes over and. We can prove this using a truth table:

Proof that or distributes over and
 a   b   c   bc   a+bc   a+b   a+c   (a+b)(a+c) 
0 0 0 0 0 0 0 0
0 0 1 0 0 0 1 0
0 1 0 0 0 1 0 0
0 1 1 1 1 1 1 1
1 0 0 0 1 1 1 1
1 0 1 0 1 1 1 1
1 1 0 0 1 1 1 1
1 1 1 1 1 1 1 1

This is a simple proof by exhausting all possible combinations of the three variables a, b and c. The column with the heading a+bc gives the left side of the distributive law in question, while the column labeled (a+b)(a+c) gives the right side. The fact that these two columns are identical demonstrates the truth of the law. The other columns show terms that were used to compute these two key columns.
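
Proof by exhaustion is easy to mechanize. The sketch below, with function names of our own invention, checks all 8 rows of the table using the C bitwise operators, and also shows that the corresponding law fails for ordinary integer arithmetic even when the variables are limited to 0 and 1.

```c
/* Returns 1 if a + bc = (a + b)(a + c) holds in Boolean algebra
   for all 8 combinations of a, b and c, and 0 otherwise. */
int or_distributes_over_and(void) {
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            for (int c = 0; c <= 1; c++) {
                int left  = a | (b & c);        /* column a+bc */
                int right = (a | b) & (a | c);  /* column (a+b)(a+c) */
                if (left != right) return 0;
            }
    return 1;
}

/* The same test with integer addition and multiplication.
   Returns 0 because, for example, 1 + 1*0 = 1 but (1+1)*(1+0) = 2. */
int addition_distributes_over_multiplication(void) {
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            for (int c = 0; c <= 1; c++)
                if (a + b * c != (a + b) * (a + c)) return 0;
    return 1;
}
```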

The exclusive-or operator is not covered by the basic laws because it is considered to be a derived operator. It is possible to prove from these laws that the following identities hold for it:

The laws of exclusive or
identity        a ⊕ 0 = a
inverse         $a \oplus 1 = \overline{a}$
commutative     a ⊕ b = b ⊕ a
associative     (a ⊕ b) ⊕ c = a ⊕ (b ⊕ c)

We can prove the above laws using algebra, starting with the algebraic definition of exclusive-or in terms of and, or and not. Students of algebra know that algebraic proofs can be hard. To prove the equivalence of two expressions, you have to find a path through a maze of algebraic substitutions. With only two values in the Boolean system, it is frequently easier to simply use a truth table.

There are two laws of Boolean algebra that have no precedent in numerical algebra, DeMorgan's laws. These complete the story of the symmetry of and and or that is apparent in the basic laws given above.

DeMorgan's laws
$\overline{a + b} = \overline{a}\,\overline{b}$
$\overline{a b} = \overline{a} + \overline{b}$

We can use DeMorgan's laws to create new symbols for some of the basic logic gates:

[Figure: DeMorgan's Laws in schematic form]

Remember, bubbles on the inputs and outputs of gate symbols mean not. Under DeMorgan's laws, an and gate with inverters on its inputs and outputs acts like an or gate, while an and gate with inverters on its inputs only acts like a nor gate. This allows us to redraw the schematic diagram already given for an exclusive-or gate, using three nand gates to replace the cascade of and and or gates in the original:

[Figure: DeMorgan's Laws applied to constructing an exclusive or gate]

Exercises

c) Construct the truth table to prove that and distributes over or.

d) Construct the truth table to prove both versions of DeMorgan's laws.

e) Write out the algebraic formula for a ⊕ b using only the nand and not operators.

f) Here is a fragment of C code. Rewrite it using DeMorgan's Laws to replace the or operator with an and operator. Express the result as simply and compactly as possible:

if ((a < 0) || (a > 9)) error( "out of bounds" );

Using Logic to Increment a Binary Number

Each bit of a binary number can obviously be associated with a Boolean value, and George Boole's convention that 1 is used to represent true and 0 is used to represent false applies naturally when doing this. Thus, we can think of the 32-bit binary number a as being composed of a vector of Boolean values a0 to a31, where the subscripts indicate the power of two in the place-value system.

Given numbers represented as arrays of bits, how can we build hardware to do arithmetic operations on them? To start with, how can we increment a number? It is useful to start with a review of how to increment a number by hand. We add one to the least significant digit, and if the result reaches the radix, we reset that digit to zero and carry by adding one to the next digit. Doing this may cause yet another carry. We have a local operation that we perform at each digit; this computes the output value for that digit and also the carry that is to be added to the next digit. In schematic terms, we have the following:

[Figure: Top level schematic view of an incrementor]

In the above diagram, the 4-bit addend a is incremented by c0 to produce the 4-bit sum s. The carry bits ci violate our standard convention for the direction of data-flow, left to right and top to bottom, because we have allowed our convention for writing numbers to force the least significant bit to appear to the right. The carry input to the entire incrementor is c0. If this is zero, the sum s will equal the addend a, while if it is one, s will usually equal a+1. The intermediate carry bits c1 to c3 hold the carry out of one stage of the incrementor and into the next. The final carry bit, c4, holds the carry out of the incrementor; if this is one, the input a must have been 1111 and the output s is 0000, an unsigned overflow.

The logical operation performed at each bit position in the above incrementor, within the box marked with a plus sign, is fairly simple. There are two inputs, ai and ci, and two outputs, si and ci+1, so the truth table for this function will have 4 rows and two output columns. Here it is:

Increment
inputs outputs
 ai   ci  ci+1  si 
0 0 0 0
0 1 0 1
1 0 0 1
1 1 1 0

The above truth table was filled in by summing the bits in the input row and then recording this sum, as a binary number, in the output row. The least significant bit of the output is the sum output for that bit position, while the most significant bit of the output is the carry into the next bit position. Having built the table, it is easy to compare each output column with the outputs of the various standard Boolean functions. On doing this, we see that ci+1 is just the logical and of ai and ci, and that si is the exclusive or of ai and ci. This leads to the following gate-level design for each digit of the incrementor:
[Figure: Gate-level schematic for incrementing a bit]
$c_{i+1} = a_i c_i$
$s_i = a_i \oplus c_i$
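
To see the carry ripple from stage to stage, here is a minimal C sketch of the 4-bit incrementor. The function names and the use of an unsigned integer to hold the 4 bits are our own choices for illustration; the per-bit logic is just the and and exclusive or identified above.

```c
/* One bit of the incrementor. All values are 0 or 1. */
void increment_bit(int a, int c_in, int *s, int *c_out) {
    *s = a ^ c_in;      /* sum bit: exclusive or of a_i and c_i */
    *c_out = a & c_in;  /* carry to the next stage: a_i and c_i */
}

/* A 4-bit incrementor. The low 4 bits of a hold a3..a0 and c0 is the
   carry input. Returns the 4-bit sum; *c4 receives the carry out. */
unsigned increment4(unsigned a, int c0, int *c4) {
    unsigned s = 0;
    int carry = c0;
    for (int i = 0; i < 4; i++) {       /* least significant bit first */
        int a_i = (int)((a >> i) & 1);
        int s_i, next_carry;
        increment_bit(a_i, carry, &s_i, &next_carry);
        s |= (unsigned)s_i << i;
        carry = next_carry;
    }
    *c4 = carry;
    return s;
}
```

With c0 set to one, an input of 0101 gives 0110 with no carry out, while an input of 1111 gives 0000 with a carry out of one.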

Exercises

g) Work out the truth table for decrementing a binary number, with inputs ai (a bit of the input) and bi (the borrow input to that bit) and outputs bi+1 and di (the decremented output). Follow through and give a gate-level schematic for one bit of a decrementor. Note: Assume bi is true when one is to be taken from ai and false when no change is to be made.

h) From algebra, a-1 is -((-a)+1), and in the two's complement system, $-x$ is $\overline{x}+1$. Combining these allows us to build a decrementor from an incrementor and some added inverters. Show the logic diagram for this.

Using Logic to Add Binary Numbers

We can attempt to design an adder using the same approach we used for an incrementor. As with the incrementor, addition involves propagating carries from one digit to the next, but instead of a single input, we have two inputs, the addend, which we will call a and the augend, which we will call b. Addition is commutative, so we could call them both addends; the terms addend and augend distinguish the number that is augmented from the number added to it. At the top level, a 4-bit adder will look like the following:
[Figure: Top level schematic view of an adder]

The truth table for one bit of an adder is very similar in form to that for one bit of the incrementor, but with 3 inputs instead of 2. As with the truth table for the incrementor, we can read each two-bit output row as a binary number giving the number of one bits in the corresponding inputs, but now, our sums run from zero to three instead of from zero to two. Here is the table for the adder:
 

The truth table for addition
inputs outputs
 ai   bi   ci  ci+1  si 
0 0 0 0 0
0 0 1 0 1
0 1 0 0 1
0 1 1 1 0
1 0 0 0 1
1 0 1 1 0
1 1 0 1 0
1 1 1 1 1

The outputs in the first 4 rows of the above table are the same as in the table for the increment function, but the whole table is messy enough that it is hard to see how to get the outputs from the inputs.

Focusing only on the sum output si, this output is one only when an odd number of ones appear in the corresponding input row; furthermore, so long as any input is zero, this output is the exclusive or of the other two inputs. In fact, $s_i = a_i \oplus b_i \oplus c_i$. We do not need to parenthesize here because exclusive-or is associative, so every order of evaluation gives the same result.

The carry output ci+1 is more complex, but it is not difficult to determine that $c_{i+1} = a_i b_i + a_i c_i + b_i c_i$.
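
These two equations are enough to build a working adder. The following C sketch, with names of our own choosing, applies them at each of 4 bit positions, rippling the carry from the least significant bit upward in the same way the incrementor did.

```c
/* One bit of the adder. All values are 0 or 1. */
void full_adder(int a, int b, int c_in, int *s, int *c_out) {
    *s = a ^ b ^ c_in;                           /* odd number of ones */
    *c_out = (a & b) | (a & c_in) | (b & c_in);  /* two or more ones */
}

/* A 4-bit adder. The low 4 bits of a and b hold the operands and c0 is
   the carry input. Returns the 4-bit sum; *c4 receives the carry out. */
unsigned add4(unsigned a, unsigned b, int c0, int *c4) {
    unsigned s = 0;
    int carry = c0;
    for (int i = 0; i < 4; i++) {
        int s_i, next_carry;
        full_adder((int)((a >> i) & 1), (int)((b >> i) & 1), carry,
                   &s_i, &next_carry);
        s |= (unsigned)s_i << i;
        carry = next_carry;
    }
    *c4 = carry;
    return s;
}
```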

Another ad-hoc reasoning path leads to the fact that an adder can be made from two incrementors plus an additional or to combine the carry outputs. Because of this, the increment circuit is sometimes called a half adder and the three-input adder is distinguished from the incrementor by calling it a full adder.

[Figure: A full adder made from two half adders]
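
A short C sketch of this construction follows. The wiring is an assumption on our part, since the text only says that two half adders and an or gate suffice: the first half adder combines a and b, the second adds the carry in to that partial sum, and the or gate merges the two carry outputs. The function names are ours.

```c
/* A half adder: the same logic as one bit of the incrementor. */
void half_adder(int a, int b, int *s, int *c) {
    *s = a ^ b;
    *c = a & b;
}

/* A full adder from two half adders plus one or gate. */
void full_adder_from_halves(int a, int b, int c_in, int *s, int *c_out) {
    int partial_sum, carry1, carry2;
    half_adder(a, b, &partial_sum, &carry1);    /* add a and b */
    half_adder(partial_sum, c_in, s, &carry2);  /* then add the carry in */
    /* If carry1 is 1 then partial_sum is 0, so carry2 is 0: the two
       carries are never both 1, and a simple or combines them. */
    *c_out = carry1 | carry2;
}
```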

Ad-hoc design can be slow and labor intensive unless you are lucky enough to have the right flash of inspiration. If we want to build interesting computers without waiting for inspiration, we need a systematic approach to designing logic circuits that implement arbitrary Boolean functions. Switching theory, a field that began to develop in the 1920s motivated by the development of the dial telephone system, is a field that combines this practical interest in engineering with the abstract mathematics of Boolean logic.

Switching theorists have come up with the following general methodology for converting any truth table to a logic circuit to compute the Boolean function described by that truth table:

  1. Arrange a column of and gates, one gate per row of the truth table but only for those rows where the output section of the truth table is not all zeros. Number these and gates using the row-numbers in the truth table, where the row number is obtained by reading the input values for that row as a binary number. The and gates have their inputs on the left and their outputs on the right.

  2. Arrange a row of or gates below and to the right of the and gates, using one or gate per output of the function. The inputs to these gates are on top, and their outputs are on the bottom. Label the outputs in the same order they are in the truth table.

  3. Arrange the inputs in a row above and to the left of the column of and gates, and for each input, add an inverter so that both the input and its inverse are available.

  4. For each of the and gates, wire its inputs according to the ones and zeros in the corresponding row of the input half of the truth table. The input half of the truth table is sometimes known as the and array because of this! If there is a one for some input in that row, that input should be connected directly as an input to that and gate. If there is a zero, the inverse of that input should be connected as an input to that gate. Each of these and gates will have an output of one only when the inputs are set to the binary code for that row.

  5. For each gate in the row of or gates, wire it according to the ones in the corresponding column of the output half of the truth table. The output half of the truth table is sometimes known as the or array because of this! For each one at the intersection of a row and column in the output half of the truth table, wire the output of the and gate for that row to the input of the or gate at the base of that column. This completes the circuit!
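
The five steps above can be sketched in C as a table-driven evaluator. This is only an illustration under our own naming; it uses the full-adder truth table as its example, with the row number formed from the inputs a, b and c read as a binary number, and it assumes every input is 0 or 1.

```c
#define NUM_INPUTS  3
#define NUM_ROWS    (1 << NUM_INPUTS)
#define NUM_OUTPUTS 2

/* The output half of the full-adder truth table (the or array).
   Column 0 is the carry out and column 1 is the sum. */
const int or_array[NUM_ROWS][NUM_OUTPUTS] = {
    {0, 0}, {0, 1}, {0, 1}, {1, 0},
    {0, 1}, {1, 0}, {1, 0}, {1, 1}
};

/* The and gate for one row: its output is 1 only when the inputs are
   set to the binary code for that row. inputs[0] is most significant. */
int and_gate_for_row(int row, const int inputs[NUM_INPUTS]) {
    int out = 1;
    for (int k = 0; k < NUM_INPUTS; k++) {
        int bit = (row >> (NUM_INPUTS - 1 - k)) & 1;
        /* a one in the row wires the input directly,
           a zero wires the inverse of the input */
        out &= bit ? inputs[k] : !inputs[k];
    }
    return out;
}

/* The or gate at the base of one output column: it combines the and
   gates of every row that has a one in that column. */
int evaluate_output(int column, const int inputs[NUM_INPUTS]) {
    int out = 0;
    for (int row = 0; row < NUM_ROWS; row++)
        if (or_array[row][column])
            out |= and_gate_for_row(row, inputs);
    return out;
}
```

Row 0 has no ones in its output half, so it contributes nothing, which matches the rule in step 1 that such rows need no and gate.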

Except in some trivial special cases, this methodology produces results that are optimal in speed, assuming that the delay through any logic gate is constant, but it does nothing to minimize the number of logic gates. It is worth noting, however, that the designs that emerge from this methodology are excellent starting points for optimization exercises. The circuit below illustrates the use of the above methodology to implement a full-adder:

[Figure: Gate-level schematic view of an adder. Left: gate-level schematic for adding two bits. Right: the and array, redrawn in long-winded style]

We have made a common notational abbreviation on the schematic to the left. Instead of cluttering the diagram with inverters, we used bubbles on the inputs to the and gates to indicate the need to wire those inputs to the inverse of the corresponding overall input. This allows us to read off the binary representation of the row number from the inputs of each gate, reading the bubble standing for an inverted input as a zero and the line standing for an un-inverted input as a one. The and array is redrawn on the right without abbreviation.

Once a circuit has been developed using this brute-force method, we can optimize it by combining rows when those rows have identical outputs and inputs that differ in exactly one bit position. Where this is true, one and gate can be substituted for two, where the new gate has one less input than the old one. If we combine all combinable rows that differ in the first bit position, and then combine all combinable rows that differ in the second bit position and so on, it is possible that this will produce an optimal circuit. We will not emphasize optimization here.

Exercises

i) Show the truth table for a subtractor that computes d=a-b, with borrow signals connecting the digit positions. The top-level schematic view of this subtractor should be identical to the top-level schematic view of the adder. Also, answer this auxiliary question: What function does it compute when the least significant borrow input is true?

j) Draw out the gate-level schematic view of an adder given above, using explicit not gates instead of bubbles on the inputs to the and gates. Note: This can be done with exactly 3 inverters, since no input needs inversion more than once.

The Condition Codes

We are ready to ask how the condition codes are derived from the outputs of an adder built as discussed above. The N condition code is the simplest. Because the Hawk is a two's complement machine, we can treat the most significant bit of the sum as the sign and copy it into the N condition code.

The Z condition code should be set if every bit in the sum is zero. Equivalently, Z should be reset if any bit in the sum is nonzero. The equivalence of these two follows from DeMorgan's laws. As a result, we can use a multi-input nor gate to combine all of the bits of the sum to produce the Z condition code.

The C bit is simply the carry out of the most significant bit of the adder, but V, the overflow bit, is more difficult. There are several equivalent ways to think about overflow detection. We could declare an overflow if the signs of the addends are the same and the sign of the result is different. This requires two sign comparisons. Alternatively, there is an overflow if the carry into the sign bit of the adder differs from the carry out of the sign bit. This is the form used here:

[Figure: Derivation of the condition codes]

This schematic diagram violates the standard conventions for drawing such diagrams. Inputs were supposed to be on the top and left and outputs on the bottom and right, so that data flows from left to right and top to bottom. Here, we have the carry input c0 on the right and all 4 condition codes as outputs on the left. We do this because all but one of the condition codes report on things that concern the most significant bit of the sum, while the carry input concerns the least significant bit, and we always write numbers with the least significant bit on the right.
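
The derivation of the four condition codes can be sketched in C for a 4-bit adder. The structure and function names below are hypothetical, introduced only for this example; the real Hawk adder is 32 bits wide, but the logic is the same with the sign bit moved from bit 3 to bit 31.

```c
/* The four condition codes, each 0 or 1. */
typedef struct { int n, z, v, c; } cond_codes;

/* Add two 4-bit values with carry in c0, storing the 4-bit sum in *sum
   and returning the condition codes. */
cond_codes add4_cc(unsigned a, unsigned b, int c0, unsigned *sum) {
    unsigned s = 0;
    int carry = c0;
    int carry_into_sign = 0;
    for (int i = 0; i < 4; i++) {
        int a_i = (int)((a >> i) & 1);
        int b_i = (int)((b >> i) & 1);
        if (i == 3) carry_into_sign = carry;  /* c3, carry into sign bit */
        s |= (unsigned)(a_i ^ b_i ^ carry) << i;
        carry = (a_i & b_i) | (a_i & carry) | (b_i & carry);
    }
    cond_codes cc;
    cc.n = (int)((s >> 3) & 1);     /* copy of the sign bit of the sum */
    cc.z = !(s & 0xF);              /* nor of all 4 bits of the sum */
    cc.c = carry;                   /* c4, carry out of the sign bit */
    cc.v = carry_into_sign ^ carry; /* carry in and out of sign differ */
    *sum = s;
    return cc;
}
```

For example, 0111 plus 0001 gives 1000 with N and V set, because adding two positive numbers produced a negative result, while 1111 plus 0001 gives 0000 with Z and C set but V clear.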

Exercises

k) The text mentions that there is an overflow when the signs of the operands are the same but the sign of the result is different. Express this as a truth table, assuming a 4-bit adder, so the inputs are a3, b3 and s3. Reduce this truth table to logic gates using the systematic brute-force method.

l) Draw the part of the schematic diagram for the derivation of the condition codes that pertains to the Z condition code, but using an and gate and, if necessary, some inverters.

Arithmetic Logic Units

We can use a simple adder for the arithmetic involved in indexed addressing, but what we really want at the heart of our central processing unit is a general purpose block of logic gates that can both add and subtract, as well as performing other operations. This is called an arithmetic logic unit or ALU because, usually, the operations of Boolean logic are included among the operations this unit can perform.

How do we take our adder and extend it so that it can perform other operations? First, note that addition sits at the heart of the two's complement subtraction operation. So, instead of building a special subtraction unit, we can subtract by adding the two's complement, which is the one's complement plus one. Furthermore, the carry in to the adder can be used to add that extra one, so we do our subtraction by adding the one's complement with a carry in of one.

Next, recall that $a \oplus 0 = a$ while $a \oplus 1 = \overline{a}$; this means that we can use the exclusive or operator to select between a logical value, a in this case, and its inverse $\overline{a}$. This leads to the following design for an adder-subtractor:

[Figure: An adder-subtractor]

Here, if the sub input is zero, the b inputs are not changed by the exclusive-or gates and c0, the carry in to the adder, is zero. As a result, the circuit computes the sum s=a+b+0 in this first case.

On the other hand, if the sub input is one, the b inputs are inverted by the exclusive-or gates and c0 is one, so the adder computes $s = a + \overline{b} + 1$. Given that we are using two's complement representations, so that $\overline{b} + 1 = -b$, we can conclude that the circuit computes the difference s=a-b in this second case.
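
Here is a minimal C sketch of a 4-bit adder-subtractor along these lines. The function name is our own, and the 4-bit width is chosen only to keep the example small. The sub input does double duty, controlling the exclusive-or gates on the b inputs and serving as the carry in c0.

```c
/* A 4-bit adder-subtractor. The low 4 bits of a and b hold the
   operands; sub is 0 to add and 1 to subtract. Returns the 4-bit
   result; *c4 receives the carry out of the adder. */
unsigned add_sub4(unsigned a, unsigned b, int sub, int *c4) {
    unsigned s = 0;
    int carry = sub;                      /* c0 is wired to sub */
    for (int i = 0; i < 4; i++) {
        int a_i = (int)((a >> i) & 1);
        /* the exclusive-or gate passes b_i when sub is 0
           and inverts it when sub is 1 */
        int b_i = (int)((b >> i) & 1) ^ sub;
        s |= (unsigned)(a_i ^ b_i ^ carry) << i;
        carry = (a_i & b_i) | (a_i & carry) | (b_i & carry);
    }
    *c4 = carry;
    return s;
}
```

With a = 1001 and b = 0011, sub = 0 gives 1100 (9 + 3 = 12) and sub = 1 gives 0110 (9 - 3 = 6). Subtracting 0101 from 0011 gives 1110, the 4-bit two's complement representation of -2.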

We need more operations. Looking at the set of operations supported by the C programming language, for example, we find not just addition and subtraction, but a much longer list that includes and, or and exclusive or as well as shift operators and more advanced arithmetic operators such as multiply and divide. We will ignore multiply, divide and floating point for now, and concentrate on the simpler of these operators.

Modern hardware designers build highly optimal arithmetic logic units to perform these, but we are not interested in optimal design here. We simply need a simple circuit that allows us to select between functions depending on a control input. In this case, we need to select between addition and subtraction, on the one hand, and the and, or and exclusive or logical operators on the other hand.

A logic circuit that has its output equal to exactly one of several data inputs depending on a control input is called a multiplexor, abbreviated MUX. In general, a multiplexor with n data inputs must have $\log_2 n$ control inputs. For our example arithmetic-logic unit, we need a multiplexor with 4 data inputs and 2 control inputs. The Hawk CPU, with 15 registers, must contain several 16-input multiplexors to select among these, each with 4 control inputs. We will not give full truth tables for these larger multiplexors because the tables are too big, but the truth table for a multiplexor with 2 data inputs d0 and d1 and one control input c is as follows:

2-input MUX
inputs   f  
  c    d0   d1  
0 0 0 0
0 0 1 0
0 1 0 1
0 1 1 1
1 0 0 0
1 0 1 1
1 1 0 0
1 1 1 1

The above truth table reduces to a simple logic circuit involving only 4 gates:
[Figure: A 2-input multiplexor. Left: gate level. Right: schematic symbol]
$f = (d_0\,\overline{c}) + (d_1\,c)$

 
The schematic symbol shown at the right is sometimes used for other functions that combine multiple inputs into one output, but if there is no label indicating another function, you can assume that it is a multiplexor. Here is the gate-level design for a 4-input multiplexor:
[Figure: A 4-input multiplexor. Left: gate level. Right: schematic notation]
$f = (d_0\,\overline{c_1}\,\overline{c_0}) + (d_1\,\overline{c_1}\,c_0) + (d_2\,c_1\,\overline{c_0}) + (d_3\,c_1\,c_0)$

Here, we have used an abbreviated notation for inverters. While there are 4 inversion bubbles in the figure, we would only use 2 inverters to build this circuit, one per control input. In the abbreviated notation to the right, the reader can infer that the labels 0, 1, 2 and 3 inside the multiplexor symbol correspond to 00, 01, 10, and 11 on the control lines, where c1 is the most significant bit and c0 is the least significant.
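
Both multiplexors translate directly into C. In this sketch the function names and argument order are our own, and all inputs are assumed to be 0 or 1. Each and term is enabled by exactly one combination of the control inputs, so exactly one data input reaches the or gate.

```c
/* A 2-input multiplexor: f = d0 when c is 0, and f = d1 when c is 1. */
int mux2(int c, int d0, int d1) {
    int not_c = !c;                    /* the single inverter */
    return (d0 & not_c) | (d1 & c);
}

/* A 4-input multiplexor. c1 is the most significant control bit, so
   control values 00, 01, 10 and 11 select d0, d1, d2 and d3. */
int mux4(int c1, int c0, int d0, int d1, int d2, int d3) {
    int n1 = !c1;                      /* one inverter per control input */
    int n0 = !c0;
    return (d0 & n1 & n0) | (d1 & n1 & c0)
         | (d2 & c1 & n0) | (d3 & c1 & c0);
}
```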

We now have the parts we need to build a simple arithmetic logic unit that combines an adder-subtractor with three simple logic gates using a multiplexor. Here, we have drawn just one bit of this unit, with data inputs ai, bi and ci, data outputs si and ci+1, an f2 input to control the add-subtract function of the adder, and two more control inputs f1 and f0 to control the multiplexor.

[Figure: A bit slice of an arithmetic logic unit]

We call this a bit slice of the arithmetic logic unit. The full arithmetic logic unit for a machine like the Hawk would require a total of 32 such bit slices.

It should be noted that a high-performance arithmetic logic unit would rarely be constructed as shown here. There are two major categories of optimization that are typically pursued. First, the carry path through the arithmetic logic unit is almost always carefully optimized, so that carry propagation from the carry in to the least significant bit to the carry out from the most significant bit is far faster than is suggested by the designs given here.

The second common optimization is to make the adder itself compute the exclusive or function instead of using a separate gate for this, and to replace the exclusive or gate used for the add-subtract function with a more complex bit of logic that takes both ai and bi as inputs, as well as two more control inputs, f0 and f1, thereby eliminating the multiplexor on the output of the adder.

Exercises

m) In the adder-subtractor, the carry output means carry, exactly, when the circuit is used for addition. When the circuit is used for subtraction, however, the situation is a bit muddy. What is the value of the carry output when the high bit of the subtractor outputs a borrow request and when the high bit of the subtractor does not output a borrow request?

n) Naive students frequently assume that the borrow signal should flow from more significant places in the subtractor to less significant places because when you borrow one, you take one from a more significant place and give it to a less significant place. Explain why this is wrong.

o) Redraw the 2-input multiplexor given above with only nand gates (this is an application of DeMorgan's laws).

p) Another way to build a 4-input multiplexor is with a binary tree of 2-input multiplexors. Draw the schematic diagram for this tree, and then estimate the number of and, or and not gates required to build this tree and compare this with the 7 gates required to build the 4-input multiplexor directly. Can you give corresponding figures for an 8-input multiplexor built using this binary tree approach compared to an 8-input multiplexor built directly?

q) The bit-slice of an arithmetic logic unit given above has 3 control inputs, f2, f1 and f0. Therefore, it can compute 8 distinct fuctions of its a and b inputs. List them, and then go throught the Hawk instruction set and figure out which register-to-register instructions on the Hawk use each of these functions. Ignore memory-reference instructions; they don't use the arithmetic logic unit; indexed addressing is done using a separate adder.

r) Show how to build an exclusive-or gate from a 2-input multiplexor plus an inverter. Use the trapezoid symbol to express your result.

s) Show how an adder can be built from two 8-input multiplexors, one to compute the sum and one to compute the carry. Hint: All of the data inputs should be constants, either 0 or 1.


Shifters

In addition to arithmetic and logic operations, most computers and many programming languages support shift operators. For example, languages descended from C, such as C++ and Java, include the operators >> meaning shift right and << meaning shift left. The effects of these operators are illustrated below, starting with the value 8:

Shift operations in C

expression    value
              binary      decimal
i             00001000      8
i >> 1        00000100      4
i >> 2        00000010      2
i >> 3        00000001      1
i << 1        00010000     16
i << 2        00100000     32
i << 3        01000000     64

Shifting is included in the instruction set for two reasons: First, it allows moving bits around, for example, to extract successive bits from a word for transmission over a serial data line, or to pack multiple 4-bit data items into a word. The second reason is apparent when you examine the decimal equivalent of the binary values in the example above. Shifting left c places multiplies by 2^c and shifting right c places divides by 2^c.
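Both uses of shifting can be sketched in C. The following is only an illustration; the function names are hypothetical, and the code assumes unsigned 32-bit values so that the multiply and divide interpretations hold exactly (apart from bits shifted off the end).

```c
#include <assert.h>
#include <stdint.h>

/* Multiply an unsigned value by 2^c with a left shift (c must be below 32). */
uint32_t times_pow2(uint32_t x, unsigned c) {
    assert(c < 32);
    return x << c;
}

/* Divide an unsigned value by 2^c with a right shift; the remainder is lost. */
uint32_t div_pow2(uint32_t x, unsigned c) {
    assert(c < 32);
    return x >> c;
}

/* Pack two 4-bit items into one byte: hi in bits 7..4, lo in bits 3..0. */
uint8_t pack_nibbles(uint8_t hi, uint8_t lo) {
    return (uint8_t)(((hi & 0xFu) << 4) | (lo & 0xFu));
}

/* Extract bit number i (0 = least significant), as for serial transmission. */
unsigned bit_at(uint32_t x, unsigned i) {
    assert(i < 32);
    return (x >> i) & 1u;
}
```

With i = 8 as in the table, times_pow2(8, 3) gives 64 and div_pow2(8, 3) gives 1.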

A hardware device that takes an n-bit number as input and shifts that number c places is called a shifter. A shifter for an n-bit word can be made from n multiplexors, where the number of inputs per multiplexor determines how many different values of the shift-count c the shifter can handle. Here is a 4-bit right shifter:

[Figure: A 4-bit right shifter]

We have made a notational compromise above. In order to save space, we have drawn the control inputs as if they passed under the multiplexors. This is commonly done in schematic diagrams, but only when the input signals in question are distributed identically to all of the parts to which they connect. That is the case here.

The shifter shown above computes s = d >> c; that is, it shifts its 4-bit input right from 0 to 3 places. This poses an interesting question: What value should be "shifted into" the output for bits that were not part of the input number? In the above, this value is provided by an auxiliary input, d4. For left-shift operations, the usual answer is to shift zeros in from the right, so that the result is the result you would expect for multiplication by 2^c.
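The behavior of this shifter can be modeled in C, one multiplexor per output bit. This is a sketch under assumptions: the names mux4, in_bit and shift_right4 are hypothetical, and the wiring (output bit i selects input bit i + c, with d4 supplying every position above bit 3) is inferred from the description rather than copied from the figure.

```c
#include <assert.h>

/* 4-input multiplexor: returns the data input selected by the 2-bit control c. */
static unsigned mux4(unsigned in0, unsigned in1, unsigned in2, unsigned in3,
                     unsigned c) {
    assert(c < 4);
    switch (c) {
    case 0:  return in0;
    case 1:  return in1;
    case 2:  return in2;
    default: return in3;
    }
}

/* Bit i of d for i = 0..3; any position above bit 3 reads the fill input d4. */
static unsigned in_bit(unsigned d, unsigned d4, unsigned i) {
    return i < 4 ? (d >> i) & 1u : d4 & 1u;
}

/* 4-bit right shifter: four multiplexors, all sharing the same control c. */
unsigned shift_right4(unsigned d, unsigned d4, unsigned c) {
    unsigned s = 0;
    for (unsigned i = 0; i < 4; i++) {
        unsigned bit = mux4(in_bit(d, d4, i), in_bit(d, d4, i + 1),
                            in_bit(d, d4, i + 2), in_bit(d, d4, i + 3), c);
        s |= bit << i;
    }
    return s;
}
```

In this model, shift_right4(0x8, 0, 3) yields 0x1, while setting d4 to 1 fills the vacated positions with ones instead.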

For right-shift, there are two useful answers. First, we could shift in zeros, as with left-shift; in this case, the result will be the correct result for division by a power of two, assuming that the input number was treated as an unsigned integer. This is called a logical right shift or an unsigned right shift. The other alternative is to shift in copies of the sign bit. For the 4-bit shifter shown above, this would involve connecting the d4 input to d3. In this case, if the input is a two's complement binary number, the output will be the correct two's complement result for division by a power of two, with one very important exception: If the quotient is not exact, that is, if there is a nonzero remainder after division, the result will be rounded to the next lower integer instead of being rounded toward zero. As a result, -5>>1 is -3, not -2. This is called an arithmetic right shift or a signed right shift.
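The difference between the two right shifts, and between an arithmetic shift and division, can be checked with a short C sketch. The function names are hypothetical. The arithmetic shift is written so that it never shifts a negative value, because the C standard leaves the result of doing that up to the compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Logical (unsigned) right shift: zeros are shifted in at the top. */
uint32_t shift_right_logical(uint32_t x, unsigned c) {
    assert(c < 32);
    return x >> c;
}

/* Arithmetic (signed) right shift: copies of the sign bit are shifted in.
   For negative x, ~x is non-negative, so shifting it is well defined;
   complementing again turns the zeros shifted in at the top into ones. */
int32_t shift_right_arithmetic(int32_t x, unsigned c) {
    assert(c < 32);
    if (x >= 0) {
        return x >> c;
    }
    return ~(~x >> c);
}
```

Here shift_right_arithmetic(-5, 1) gives -3, rounding toward the next lower integer, while the C expression -5 / 2 gives -2, rounding toward zero.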

Note that in C and C++, when you declare a variable to be of type int and then apply the >> operator to it, this implies a signed right shift on essentially all current compilers (strictly speaking, the C standard leaves the result for negative values implementation-defined), while if you declare the variable to be of type unsigned int and then apply the same operator, this implies an unsigned shift. In contrast, in Java, where the built-in integer type is signed and there is no way to declare unsigned integers, there are two distinct right-shift operators, >> and >>> for the signed and unsigned shifts.

The Hawk machine has both left and right shift operators, and the right shift operators are available in both signed and unsigned form. One feature of the Hawk machine that separates it from many other machines is that the shift operators are combined with a variant of the add instruction, so that, in one instruction, a number can be shifted and added to another number. The Hawk instructions for this are ADDSL, ADDSR and ADDSRU (the latter being the unsigned right shift). To accommodate these shifted versions of the add operation, there is a right-shifter on the output of the Hawk arithmetic logic unit and a left-shifter on one of the inputs of the Hawk arithmetic logic unit.

As with the discussion of arithmetic logic units, the design of shifters used in real computers is frequently more complex than that given here. The primary problem with the approach given here is that a 32-bit shifter with a 4-bit shift-count would take 32 multiplexors with 16 inputs each! In designing computers, it is common to avoid using extravagant amounts of silicon for rarely used functions, and in the case of shifters, the solution is to use a shift tree, also called a barrel shifter, where one stage of the shifter shifts only a short distance, say 0 to 3 bits using a 2-bit shift count, while the next stage shifts in multiples of 4, again using a 2-bit shift count. As a result, we can achieve exactly the same shift function with 64 4-input multiplexors instead of 32 16-input multiplexors.
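The two-stage idea can be sketched in C for a logical right shift. This is an illustration only, not the Hawk design: the names stage and shift_right_two_stage are hypothetical, and each call to stage stands in for one layer of 32 4-input multiplexors controlled by 2 bits of the shift count.

```c
#include <assert.h>
#include <stdint.h>

/* One stage: shift right by sel * step places, where sel is a 2-bit control. */
static uint32_t stage(uint32_t x, unsigned sel, unsigned step) {
    assert(sel < 4);
    return x >> (sel * step);
}

/* Logical right shift by c = 0..15 places, done as two cascaded stages. */
uint32_t shift_right_two_stage(uint32_t x, unsigned c) {
    assert(c < 16);
    /* Low 2 bits of c: shift 0, 1, 2 or 3 places. */
    uint32_t fine = stage(x, c & 3u, 1);
    /* High 2 bits of c: shift 0, 4, 8 or 12 places. */
    return stage(fine, (c >> 2) & 3u, 4);
}
```

For every count from 0 to 15, the cascade produces the same result as a single shift by c, because c equals the fine count plus 4 times the coarse count.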

Exercises

t) Draw the diagram for a 4-bit left shifter able to shift from zero to three places.

u) Draw the diagram for a 4-bit right shifter composed of a cascade of two levels using 2-input multiplexors at each level, where the top level shifts either 0 places or 2 places, while the second level shifts either 0 places or 1 place. Make sure to label the inputs and outputs so that they are exactly compatible with the 4-bit shifter illustrated above.

v) Explain how the Hawk EXTB instruction is implemented using eight 4-input multiplexors. What bits of what register(s) are used to control these multiplexors? What bits of what register(s) are used as inputs to these multiplexors? What bits of what register(s) take the outputs of these multiplexors?

 

The Cost of Arithmetic

We have shown that addition and subtraction operators can be built in hardware using a number of logic gates proportional to the word size. One bit of an adder can be built using about 12 and, or and not gates. Adding an exclusive-or gate to one input of this adder allows it to be used both to add and to subtract, at a cost of 4 more primitive gates to build the exclusive or. Since we also need to be able to do logic operations, each bit position also needs a second exclusive or, an and and an or. Of course, we also need a multiplexor in each bit position to select between the different functions. This costs us 7 more gates per bit position. Adding these up gives us the cost of one bit slice of an arithmetic logic unit, and multiplying by the 32 bits per word gives us an estimate for the number of gates in the Hawk arithmetic-logic unit.

Accounting for the cost of an arithmetic logic unit

Adder                                          12
exclusive or to make adder-subtractor           4
exclusive or, and, and or gates for logic       6
4-input multiplexor for function select       + 7
                                              ---
Total gates per bit                            29
word size                                    × 32
                                              ---
Estimated gate count for the Hawk ALU         928

This estimate, 928 gates, supports only the simple logic operations, add and subtract. To obtain the left and right shift operations, we must add two shift networks, one for left shift and one for right shift. The Hawk shift instructions take shift counts between 1 and 16, so each shift network can be made of either 32 16-input multiplexors or 64 4-input multiplexors. We will assume the latter.

Accounting for the cost of shift networks

4-input multiplexor                             7
word size                                    × 32
                                              ---
gates per 32-bit 4-way shifter                224
2 layers of shifters                          × 2
                                              ---
gates per 32-bit 16-way shifter               448
need both left and right shifts               × 2
                                              ---
Total gates for the Hawk shift network        896

Adding the 896 gates for the two shift networks to the 928 gates in the arithmetic logic unit gives us an estimate for the total complexity of the Hawk's arithmetic and shifting hardware: 1,824 and, or and not gates. This estimate ignores many possibilities for optimization, but most ALU optimizations focus on speed, not the total gate count.
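The accounting above is simple enough to restate as a short C sketch, which also makes it easy to try other word sizes or layer counts. The constant and function names are hypothetical; the per-component gate counts are the ones used in the tables.

```c
#include <assert.h>

/* Gate counts per bit position, taken from the accounting tables. */
enum {
    ADDER_GATES = 12,   /* one bit of an adder */
    XOR_GATES   = 4,    /* exclusive or that makes the adder-subtractor */
    LOGIC_GATES = 6,    /* exclusive or, and, and or for logic operations */
    MUX4_GATES  = 7     /* one 4-input multiplexor */
};

/* Gates in an ALU built from word_size identical bit slices. */
unsigned alu_gates(unsigned word_size) {
    assert(word_size > 0);
    return (ADDER_GATES + XOR_GATES + LOGIC_GATES + MUX4_GATES) * word_size;
}

/* Gates in shift networks made of layers of 4-input multiplexors, with one
   multiplexor per bit per layer, for the given number of shift directions. */
unsigned shift_network_gates(unsigned word_size, unsigned layers,
                             unsigned directions) {
    assert(word_size > 0);
    return MUX4_GATES * word_size * layers * directions;
}

/* Total for the ALU plus the shift networks. */
unsigned total_gates(unsigned word_size, unsigned layers, unsigned directions) {
    return alu_gates(word_size)
         + shift_network_gates(word_size, layers, directions);
}
```

For the Hawk figures used here, total_gates(32, 2, 2) reproduces the estimate of 1,824 gates.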

Exercises

w) Estimate the cost, in gates, of the ALU-shifter for a 16-bit digital signal processor chip (a DSP), assuming it supports only the following shift operations: byte-swap, left shift one place, right shift one place, or don't shift at all. These options apply to the output of the ALU and are applicable to each ALU operand.

x) The accounting above assumed the use of two layers of 4-way multiplexors to construct a shift network that could shift from 0 to 15 places. How does the result differ if you assume 4 layers of 2-way multiplexors to do this job?