1. Introduction
Part of 22C:40, Computer Organization and Hardware Notes
For most of their users, computers are very mysterious things, machines that can almost think, machines that can execute programs written in languages like Java, C++, Ada, Pascal, Basic, Algol, Cobol, or Fortran, and that can run applications that process what seems to be an infinite variety of data types.
Here, we assume that the reader understands how to program a computer in at least one of these many and varied programming languages, and we ask: what is really going on inside the box? How does the computer execute that program? How can any physical mechanism carry out instructions expressed in a language as complex as even the simplest of the languages from the programming language zoo mentioned in the previous paragraph?
The answer to this question is best grasped in stages, just as the execution of a program is generally handled in steps. First, the high-level language program is translated to a simpler language, machine language, by a compiler, and then a computer is built that can execute this machine language.
Our goal here is to study an example of such a machine language, and in the process, to study two other things. First, we will informally study the translation process from high-level programming languages to machine language, and second, we will study the computer that executes this machine language.
What is Architecture? The definition of architecture is most easily discussed in the context of its original meaning with respect to the design of buildings. The architecture of a building includes all aspects of the building's design that its occupants are aware of. This includes such details as the arrangement of rooms, location of doorways and windows, surface finishes and location of switches, lights, wall outlets and other conveniences.
Buildings, of course, are made using some technology, bricks, beams, plaster and stone, and these may be visible elements of the architecture. Brick exteriors and exposed beams are examples of engineering elements that are exposed to the occupant by the architecture. Note, however, that many engineering elements of the structure of a building may be hidden from the occupants, and that some exposed aspects of the engineering may be highly misleading. For example, a brick exterior does not indicate that the bearing walls of a building are brick; brick exteriors are commonly applied over wood frame walls! Similarly, not all the exposed beams in buildings are actually structural elements. Architects have been incorporating false beams into their structures since Roman times!
The buildings on the University of Iowa Pentacrest provide an excellent example of architecture and engineering. The oldest building on campus, the Old Capitol, is a brick building with a limestone exterior and wood porticos. Most people seeing the porticos think they are stone! Macbride and Schaeffer halls are also brick and stone construction, but with real stone pillars holding up their porticos. Maclean Hall is reinforced concrete construction, with a very thin stone facing, and Jessup Hall is a steel frame building, also with a thin stone facing. These buildings are "architecturally compatible" while being based on very different engineering.
Computers also have architecture and engineering. The architecture of a computer system includes all aspects of the computer visible to the programmer. For the programmer working at the machine language level, for example, the author of a compiler, the architecture and the machine language are so intimately intertwined that the machine language can be described as the primary manifestation of the architecture.
Every architecture must be built, somehow, using some underlying technology. Architects may draw plans on paper, and computer architects may design machine languages, but these are of little use unless they are actually built of brick and stone or of silicon and copper. When an architecture is actually built, some aspects of the underlying engineering may show through in the architecture while others are completely hidden.
The architect Louis Henri Sullivan said that "form follows function," and with many computer architectures, this has been true. On the other hand, just as builders frequently use modern materials to build structures that retain architectural forms that were dictated by ancient building technologies, so too, modern computers are frequently built to support architectures that were originally designed in terms of far different technologies.
For example, the architecture shared by the DEC PDP-5 and PDP-8 first appeared on the PDP-5, sold in 1963; these machines were built using core memory and discrete transistors. The PDP-8, introduced in 1965, was an architecturally compatible reimplementation of this architecture using a newer generation of discrete transistor technology. The PDP-8/I was a reimplementation using TTL integrated circuits; the PDP-8/F, in 1970, was another reimplementation using some MSI chips. The PDP-8/A, in 1974, used some LSI chips, and the VT/78 was a reimplementation based on an architecturally compatible microprocessor. The PDP-8 family ended with the end of the DECmate III+ production run in 1990. The range of size, price and, above all, technology represented by this example is immense, yet a programmer who learned to program on the PDP-5 would notice very little change in programming a DECmate III+, except, of course, that a system that used to occupy many 6-foot-high relay racks was now a small desktop machine, and the available software for this family of computers had grown considerably over the 27-year lifetime of the architecture.
Historical note: Digital Equipment Corporation was one of the most innovative developers of new computer architectures between 1960 and 1992. By the mid 1970's, it had grown to dominate the small computer market, and DEC's VAX series of 32 bit computers were the most widely used machines on the Internet in the mid 1980's.
We will be studying the Hawk architecture. This fictional architecture combines elements of many modern RISC architectures with historical features that date back to the very first computers.
Why a fiction? After all, the Intel 80x86/Pentium family of computers dominates the marketplace, and it is used in many assembly language texts. Unfortunately, this architecture is in some ways comparable to a modern building built in the colonial style. While, under the skin, it may be constructed of steel and concrete, this is hidden under a brick and plastic skin that follows the forms of the Greek revival style that was popular in the Georgian era. This form, in turn, incorporates architectural elements from classical Grecian temples, but these, in turn, were stone structures that were, in many cases, imitations of wooden post and beam structures from the bronze age.
The Intel architecture is the end product of an evolution from Intel's first microprocessors of the early 1970's. In these early microprocessors, form followed function very closely. Since the 1970's, however, Intel has been faced with the demand to offer compatible upgrades to older designs, and at each step, new technology has been carefully hidden behind a veneer that allowed programmers to ignore these technical changes. As such, the 80x86/Pentium family is saddled with immense accidental complexity, making it very poorly suited for teaching.
The Hawk architecture, while fictional, is designed within the RISC (Reduced Instruction Set Computer) framework that dominates much modern thinking about computer architecture. The Apple/Motorola/IBM Power architecture found in the Apple Power PC and the IBM RS/6000 family is in this class, but it is a complex, commercially viable architecture, while the fictional Hawk contains few features that are not motivated directly by the instructional context.
The Hawk architecture deliberately incorporates a few elements of older methodology, and as a result, those who have learned the Hawk architecture should not be surprised by elements of other architectures.
Most high level languages do their best to protect programmers from having to learn anything about the computer architecture that actually runs their programs. Users of Java, Ada or Pascal, for example, cannot even determine, from within the bounds of the standard language, whether the machine has bytes within its words! C++ and C programmers, in contrast, can explore the memory addressing model of the underlying machine, but this is usually a source of trouble and not a benefit of these languages.
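As a rough illustration of what "exploring the memory addressing model" can mean in C, the sketch below copies the bytes of a 32-bit word out in order of increasing memory address. The function names word_bytes and is_little_endian are hypothetical, chosen only for this example, and the code assumes a machine with 8-bit bytes. The result depends on the machine that runs it, which is exactly the kind of architectural detail that Java, Ada or Pascal keep out of reach.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the four bytes of a 32-bit word into
   out[0..3] in order of increasing memory address.  memcpy is used
   so the code does not rely on casts between unrelated pointer types. */
void word_bytes(uint32_t word, unsigned char out[4])
{
    assert(sizeof word == 4);   /* assumption: 8-bit bytes */
    memcpy(out, &word, sizeof word);
}

/* Hypothetical helper: return 1 if the least significant byte of a
   word is stored at the lowest address (little-endian), 0 otherwise. */
int is_little_endian(void)
{
    unsigned char b[4];
    word_bytes(0x04030201u, b);
    return b[0] == 0x01;
}
```

Calling word_bytes(0x04030201u, b) leaves either 01 02 03 04 or 04 03 02 01 in b on most current machines, depending on the byte order of the hardware. A program whose behavior depends on this order is not portable, which is one reason this capability tends to cause trouble.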
Assembly language is almost universally used to teach elementary computer architecture, and many compilers produce their output in assembly language instead of machine language. Assembly languages completely expose the computer architecture to the programmer, providing a convenient textual way for expressing programs for the machine while doing nothing to hide the actual behavior of the hardware. Each assembly language statement typically corresponds to exactly one machine language instruction, and the only difference is that the assembly language statements are written in textual form with space for commentary intended for a human reader, while machine languages are expressed in binary codes that are very difficult for human readers to interpret.
Assembly languages went through a great burst of creative development in the 1960's, but by the 1970's, it was clear that the majority of programmers would rarely need to know much assembly language. Today, aside from elementary assembly language and computer architecture classes, assembly languages are primarily used as the target languages for compilers. Thus, while most systems include assemblers, the code they assemble is usually written by other programs and not by humans. As a result, many modern assembly languages are not as well developed as the assembly languages of the mid 1970's that were designed to be read and written by human programmers.
It is worth noting that, while computer architectures are best studied at the assembly language level, assembly languages have only loose connections to the architectures they support. There are historically important examples of the use of assemblers designed to support one machine to assemble code for a completely different computer architecture. For example, all of the early code development for the DEC PDP-11 computer, a machine with a 16 bit word, was done on DEC PDP-10 computers, machines with 36 bit words, using DEC's MACRO-10 assembler for the PDP-10.
We will be using the SMAL assembly language. SMAL stands, creatively, for Symbolic Macro Assembly Language (a fact nobody needs to remember) and it is far richer than many of the assembly languages used today, particularly those provided with introductory assembly language texts.
SMAL includes well developed macro features and a syntax representative of some of the best assemblers of the 1970's. SMAL itself was developed in the early 1980's, predating the Hawk architecture by over a decade and even slightly predating the widespread recognition of RISC architectures. This does not have any impact on the utility of SMAL.
Before getting deeply involved in any specific machine language, we will focus on questions of data representation. In high level languages, we take for granted that the machine can represent data, whether it is in the form of numbers or text, but at the assembly language level, the programmer must take direct responsibility for all issues of representation. Conversion between number bases and questions of character coding will be at the center of this!
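To give a feel for the two representation questions just mentioned, here is a minimal sketch in C, a high-level language used only for illustration and not the notation of these notes. The function names to_base and char_code are hypothetical. to_base converts a number to a string of digits in a chosen base by repeated division, and char_code exposes the integer the machine uses to represent a character.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: write the digits of value in the given base
   (2 to 16) into buf as a NUL-terminated string, most significant
   digit first.  size is the capacity of buf in characters. */
void to_base(unsigned long value, unsigned base, char *buf, size_t size)
{
    static const char digits[] = "0123456789ABCDEF";
    char tmp[sizeof(unsigned long) * CHAR_BIT]; /* enough for base 2 */
    size_t n = 0;
    size_t i = 0;

    assert(base >= 2 && base <= 16);

    /* Repeated division yields the digits least significant first. */
    do {
        tmp[n++] = digits[value % base];
        value /= base;
    } while (value != 0);

    assert(n < size);   /* room for the digits plus the terminator */

    /* Reverse the digits into the caller's buffer. */
    while (n > 0) {
        buf[i++] = tmp[--n];
    }
    buf[i] = '\0';
}

/* Hypothetical helper: return the integer code that represents the
   character c in the character set of the machine running the code. */
unsigned char_code(char c)
{
    return (unsigned char)c;
}
```

On a machine that uses the ASCII character code, char_code('A') is 65, and to_base renders 65 as 1000001 in base 2 and as 41 in base 16. The same stored value is being displayed three ways; only the notation changes.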