Homework 7

22C:122/55:132, Spring 2001

Due Monday Mar 5, 2001, in class

Douglas W. Jones

Background:
To divide an n-bit divisor into a 2n-bit dividend, we can use the following sequential algorithm (using the same notational extensions to C as were used in the notes describing the classical sequential multiplication algorithm:
```
   divide( int divisor, long int dividend )
   {
      ac|mq = dividend

      repeat n times {
         ac|mq = ac|mq << 1;
         if (ac >= divisor) {
            ac = ac - divisor;
            mq = mq + 1;
         }
      }

      -- use one, the other or both of the following:
      return ac; -- returns the remainder
      return mq; -- returns the quotient
   }
```
The above is written for clarity of expression, but the following comes closer to the implementation used in real low-performance computers:
```
      repeat n times {
         #define t (ac|mq << 1)
         #define th high_half_of(t)
         #define tl low_half_of(t)
         #define tm th - divisor
         if (tm >= 0) {
            ac|mq = tm|tl + 1;
         } else {
            ac|mq = th|tl;
         }
      }
```
In the above rewrite of the loop body, the C (or C++) define construct has been used to name the results of various combinational operations, in order to emphasize that only one assignment is performed per iteration of the loop.
Note that computing A-B using A+notB+1 will produce a carry out if A>=B, and no carry out if A<B.
Part A: Draw a diagram of the data flow through this divide algorithm, showing the relationship between the AC, MQ, ALU, shifter and multiplexor. To the extent possible, follow the topology of the diagram shown near the start of the notes for Lecture 15, but give enough detail that it is easy to see where the control signal from the multiplexor comes from. Ideally, your diagram should describe a system that will perform an n-bit division using n-clock pulses, with no additional control circuitry required.
Part B: What characteristic or characteristics of the system you described in part A prevent a parallel realization of the divide circuit from computing the remainder and quotient in O(log n) time, using a scheme similar to that used to compute products in O(log n) time.
Background: Just as most intiger multiply operations in real programs have constant multiplicands, so do most divide operations have constant divisors. Furthermore, integer division is much less common in real programs than multiplication. Therefore, even if we build fast multiply hardware into a machine, it might be justifiable to omit divide hardware. We could justify this if we had a moderately good algorithm for computing reciprocals. Consider the following (for an n-bit word):
```
   reciprocal( i )
   {
      approx = 0;
      bit = 1 << (n - 1);
      one = 1 << n;
      repeat n times {
         guess = approx + bit;
         if ((guess * i) <= one) {
            approx = guess;
         }
         bit = bit >> 1;
      }
   }
```
This is a binary search for the fixed point binary fraction that is the reciprocal of the number.
Part A: As given here, this requires one multiplication per iteration. Note that, for each iteration, the number of ones in the guess increases by at most one, indicating that at most one new partial product will be added to the sum representing the current product of guess*i. Rewrite this code to take advantage of this fact.
Part B: Based on the exercise you did in part A, and on the exercises you did in problem one, how much simpler is the problem of computing reciprocals from the problem of computing generalized quotients.