4. Finite State Automata, A Breakneck Review
Part of
the 22C:122/55:132 Lecture Notes for Spring 2004

When a digital system contains feedback, the result is a sequential circuit. Fundamental mode sequential circuit design, in which the entire system is viewed in these terms, is very difficult and well will avoid discussing it in this course! A fundamental mode sequential circuit designer thinks in the following terms:
_______________   inputs  combinational  outputs  circuitry      _______________          
The RS flipflop presented in the previous lecture was analyzed as a fundamental mode design. Fundamental mode design is used to construct optimal implementations of type JK or type D edge triggered flipflops, but most digital system designers avoid it.
Most work in digital systems is done with clocked sequential circuits. This is a slightly higher higher approach to sequential circuits. It is only applicable to circuits that change their observable state when a clock edge appears on one of their inputs (either 0 to 1 for a leadingedge triggered clocked sequential circuit, or 1 to 0 for a trailingedge triggered clocked sequential circuit).
Clocked mode circuits are generally built using an edgetriggered register and a combinational circuit, as illustrated in the following:
_______________   inputs  combinational  outputs  circuitry      _______________         ______ state   clock >_______ register            
Here, the feedback from the output of the combinational circuitry to the input is stored in a state register. This register is made of one edgetriggered flipflop per bit of feedback, and all the flipflops share the same clock.
The design and analysis of a clocked mode sequential circuit is based on finite state automata theory, so we must review this before we continue with computer architecture.
The behavior of a sequential circuit can be described by a state diagram. In such a diagram, a circle represents the state of the machine, and an arc represents the change of state from one state to the next. Thus, we might diagram the behavior of a very simple system as follows:
> / \ (A) (B) \ / <
This system has 2 states, A and B. Whenever a clock pulse arrives, it changes state, so on successive clock pulses, it alternates between states A and B. Note that the arcs on the state diagram must be annotated with direction arrows!
The most trivial thing you can do when you examine a state diagram is count the states. If a state diagram has n states, the implementation of the state diagram will require ciel(log_{2}(n)) flipflops. Thus, our example with 2 states requires one flipflop.
The above example was for an automata with no inputs. More typically, we are interested in machines that respond to their inputs. In general, if a machine has k binary inputs, these may take on 2^{k} distinct values. As a result, each state may have as many as 2^{k} distinct successor states depending on the specific value of the inputs at the time of the state transition!
The following figure illustrates a 2state finite state machine with an input that may have 2 values, either x or y:
< >  / x \ / y \ / \ ( (A) (B) ) \ / \ y / \ x /  < >
This system has 2 states. Whenever a clock pulse arrives, if the input is y, the state will change, but if the input is x, the state will remain unchanged. We can also describe this machine with a state table:
input current  next state  state  x A  A y A  B x B  B y B  A
There are two ways of dealing with the outputs of a finite state machine. One approach is to associate the output with the state. In its simplest form, the output of this type of machine is its state! Moore was the first to clearly define this type of finite state automata, so we describe such machines as Moore machines.
An alternative is to associate the output with both the input and the current state. Mealy was the first to clearly define this alternative, so we describe such machines as Mealy machines. The following figure and state table define the same Mealy machine.
< >  / x/a \ / y/b \ / \ ( (A) (B) ) \ / \ y/d / \ x/c /  < > A Mealy Machine input current  next output state  state  x A  A a y A  B b x B  B c y B  A dIn the state diagram for a Mealy machine, each arc (state transition) is labeled with the input that enables that transition, a slash, and the output that is associated with that transition. The output is always after or under the slash!
It is a theorem of finite state automata theory that any Moore machine can be converted to a Mealy machine, and visa versa. Clearly, if a Moore machine has outputs that depend only on the state, and the machine has n distinct output values, it must have n states! As a result, translation from Mealy to Moore frequently results in adding states! The above machine is equivalent to the following Moore machine:
< > > / x \ / y \ / x \ ( (A1/a) <(B1/b) ) \ / / y / / > / / < / / y / / \ ( (A2/d)> (B2/c) ) \ x / \ y / \ x / < < > A Moore Machine input current  next state  state  x A1  A1 y A1  B1 state  output x A2  A1  y A2  B1 A1  a x B1  B2 A2  d y B1  A2 B1  b x B2  B2 B2  c y B2  A2
Note that the description of a Moore machine frequently has two tables, one for the state transitions, and one giving the output associated with each state. In the above, the states that were split have retained their old names, but gained a suffix to distinguish the two Moore states resulting from each Mealy state. The output associated with each state in a Moore machine is written inside the circle for that state, separated from the state name by a slash in the same way that the output of a Mealy machine was separated from the input by a slash.
The above description of Moore and Mealy machines used symbolic names for inputs and outputs. Before converting a finite state machine specification to a digital system, we need to assign encodings to the inputs, outputs and states. Sometimes these encodings will be imposed by the context, but sometimes, we can be arbitrary. For example, we could code the inputs and outputs used in the above examples as:
input  code output  code   x  0 a  00 y  1 b  01 c  10 d  11Similarly, we can encode state names:
Mealy  Moore  state  code state  code   A  0 A1  00 B  1 A2  01 B1  10 B2  11Having made these state assignments, we can convert the state tables and the output table previously given to give the following:
Mealy input current  next output state  state  0 0  0 00 1 0  1 01 0 1  1 10 1 1  0 11 Moore input current  next state  state  0 00  00 1 00  10 state  output 0 01  00  1 01  10 00  00 0 10  11 01  11 1 10  01 10  01 0 11  11 11  10 1 11  01
The assignments of binary codes in the above example was entirely arbitrary, and this is not always a good idea. For example, the the output table in the Moore machine maps 4 states to 4 outputs; if the state names had been encoded directly in terms of the expected output values, or alternatively, if the output values had been encoded to match the state encoding, the need for hardware to compute the state to output mapping could have been completely eliminated. Sometimes, it pays to tinker with these encodings before committing to one or the other of them.
Given the encoded state table for a Moore or Mealy machine, we can reduce these to a functional digital systems as follows:
First, treat the state table (and output table, if it's a Moore machine) as the specifications for Boolean functions. Build hardware to implement these functions following the outline given in the notes for the previous lecture.
Then, substitute the hardware you built into one of the following outlines:
_______________ ___    inputs  Mealy   outputs  nextstate    and output  next    function  state\  current    state  _______________           clock >__           ____________ next ________   state   inputs  Moore  /___  Moore   nextstate  o output  outputs  function  ofunction current       ________ state  ____________             clock >__          In comparing the Moore and Mealy machines, note that the number of bits in the nextstate output will usually be more for a Moore machine than for an equivalent Mealy machine, but that, where the Moore machine only needs the state register to store the current state, the Mealy machine needs to store bothe the current state and the current output.
This requirement of a register to hold the current output is imposed on Mealy machines to prevent the output from changing except when the clock edge arrives indicating that the machine should advance to the next state. If this were not done, the output of the Mealy machine could change whenever any input to the machine changes, and this is generally undesirable.
Of course, if the state assignment job is done carefully, the Moore machine may have a trivial output function, where the state itself encodes the desired output. The examples given above were deliberately clumsy in this regard!
Here is the Mealy machine we get from the state table given:
input _____________________ ___________      o o   __  __   \ /  \ /   O  O __       \     oo )Ooo  1  0  __/ 1 0 1      __          \     oo )Ooo  0  1  __/ 1 1 0      __          \     oo )Ooo  1  1  __/ 0 1 1  ____ ____ ____         \___/ \___/ \___/  O O O     _____    D Q output      lsb    >      _____     _____   D Q output      msb   o>     _____    _____  D Q      state  clock o>    _____  ______________________________________________________________
The above bruteforce circuit can be optimized considerably!
For both sequential and combinational circuitry, it is always possible to use the design methods described above, but it is sometimes very complex to think in such terms. For example, consider the problem of designing a 16 bit binary counter that simply increments its output modulo 2^{16} each time a clock pulse arrives.
It is easy to reduce this design to the following:
 ____________   N       next state    function      _____N+1_____   /16 /16  ____________  clock >____________  state   output (16 bit) /o 16When drawing a circuit with many parallel data lines, it is common to draw one line and then annotate it saying this is one of many by putting a slash through the line with the number of lines written next to it. The slash can be thought of as a schematic symbol for a string tied around the bundle of wires.
The trouble with the above design is that the combinational function required has 16 inputs and 16 outputs. As a result, the truth table for this function has 2^{16} rows, and writing it out takes an inordinate amount of time and paper!
Fortunately, this function has a very regular structure, so it is possible to simplify the design immensely by exploiting this structure. When we exploit the regularity in an nbit function by dividing it into n subfunctions operating on one bit each, possibly with additional horizontal interconnections between these 1bit subfunctions, we call the result a bitslice realization of the target function.
In the case of the 16 bit increment function, the bit slice design requires subfunctions with the following structure:
slice 15 ... slice 1 slice 0 A ... A A 15 1 0  C  C  1  __ 15  __ 1  ___ ___  ___  ___  a c  a c   a c  +  C  +    +  c_s__ 2 c_s__  c_s__ __  __  __     S ... S S 15 1 0In this case, the function of the smaller units is obvious from an elementary school understanding of arithmetic. Each small box takes the carry in to one digit position in the number, adds that carry to the digit input in that position to compute the digit output, and also computes the carry out from that position to the next higher position. The carry in to the lowest digit of the number is one in order to force the number to increment with each evaluation of the entire function.
Here, we have numbered the digits of the input from A_{0}, the least significant bit, to to A_{15}, the most significant bit, and we have numbered the outputs from S_{0}, to S_{15} correspondingly. The carry signals are similarly numbered, where C_{i} is the carry that will be added to S_{i}, so the carry out of the entire circuit (ignored in this context) would be labeled C_{16}.
The bitslice design for the entire counter can be completed by adding one bit of the register to each of the bitslices of the combinational function, giving the following structure for each bitslice of the original counter:
C i one bit slice of  next state function    ___  in + out  a c  a c  c s  +    c_s__ ____  0 0  0 0      0 1  0 1  D Qo 1 0  0 1     1 1  1 0 >C    ____  C S i+1 iThe implementation of this counter stage can be optimized locally, but that isn't the point of this course! Of more importance, the speed of this counter is limited by the time taken by the combinatorial circuitry to propagate information from the C_{i} input to the C_{i+1} output. This is a problem that has been addressed by the designers of digital systems since the mid 19th century. Charles Babbage, for example, put a considerable amount of effort into what he referred to as carry anticipation mechanisms in his later mechanical difference engines, and today, essentially all high performance computers rely on carry anticipation trees in their arithmetic logic units. We will look at this issue later.
The bitslice approach to designing systems that operate on long words is quite common, and it has also been used to manufacture computers. Bitslice computer systems were built in the late 1960's with one slice of the entire system on each on N circuit boards, for an Nbit computer. By the mid 1970's, bitslice CPU and ALU chips were commonly available, sometimes constructed with four bit slices per chip. Nowdays, it is reasonable to use bitslice design for many of the components found on VLSI implementations of computers.