Homework 4 - Midterm

22C:122, Spring 1998

Due Wednesday Mar 11, 1998, in class

Douglas W. Jones

This assignment is in lieu of a midterm exam!

In the textbook, Hennessy and Patterson introduce the DLX architecture in section 4.5. The Hawk architecture is defined in the web pages for this course; see:

  http://homepage.cs.uiowa.edu/~dwjones/arch/hawk/

Both of these architectures are deliberate composites of ideas borrowed from recent RISC designs, but they differ in several ways. Some of these differences are the result of subsetting (the Hawk has no floating point operations), while others are motivated by pedagogy (the Hawk has condition codes so that it can be used to teach condition codes to undergraduate CS students).

Study these architectures, noting that the textbook has an extended discussion of how the DLX can be implemented, and answer the following questions about them:

  1. Identify the dominant differences between these architectures, paying particular attention to those that lead to significantly different implementations.

  2. Consider the problem of a pipelined, 2-way superscalar, interlocked implementation of each of these architectures. Which instructions on each architecture pose challenges for pipelined implementation - that is, which don't fit the single-cycle model?

  3. What difference in code density would you expect between these two architectures? What kind of code would be more compact on the DLX, and what kind would be more compact on the Hawk?

  4. What impact on performance would you expect from these differences in code density? Assume equal-sized instruction caches for both machines, and assume both use fully-interlocked 2-way superscalar pipelines.

  5. Consider the problem of branch prediction on these architectures. Assume a small, fully associative branch prediction cache, with perhaps only 4 entries, storing the successors of the 4 most recently taken branches. Cache entries are deleted if they predict incorrectly, and replaced on a pure LRU basis. In use, the fetch stage of the pipeline uses the cache to compute the successors of instructions, and falls back on PC++ only when there is a cache miss. (A C sketch of this mechanism follows the question.)

    From the point of view of branch prediction, are there any significant differences between these architectures?
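
    To make the mechanism concrete, here is a minimal C sketch of the prediction cache described above. It is one possible reading of the question, not a reference implementation: the names (predict, resolve, entry_t) are invented for illustration, and the PC++ fall-through increments by 4 on the assumption of 32-bit instructions, as on the DLX.

      #include <stdint.h>
      #include <string.h>

      #define ENTRIES 4   /* fully associative, 4 entries */

      typedef struct {
          uint32_t branch_pc;  /* address of a taken branch       */
          uint32_t successor;  /* its predicted successor address */
          int      valid;
      } entry_t;

      /* The array is kept in LRU order: slot 0 is most recently
         used, slot ENTRIES-1 is the next victim. */
      static entry_t cache[ENTRIES];

      /* Move slot i to the most-recently-used position. */
      static void touch(int i)
      {
          entry_t e = cache[i];
          memmove(&cache[1], &cache[0], i * sizeof(entry_t));
          cache[0] = e;
      }

      /* Fetch-stage lookup: predict the successor of the
         instruction at pc, falling back on PC++ when the cache
         misses.  The increment of 4 assumes 32-bit instructions. */
      uint32_t predict(uint32_t pc)
      {
          int i;
          for (i = 0; i < ENTRIES; i++) {
              if (cache[i].valid && cache[i].branch_pc == pc) {
                  touch(i);
                  return cache[i].successor;
              }
          }
          return pc + 4;   /* miss: assume sequential execution */
      }

      /* Called when a branch at pc resolves.  Entries that
         predicted incorrectly are deleted; newly taken branches
         are installed, replacing the least recently used entry. */
      void resolve(uint32_t pc, int taken, uint32_t successor)
      {
          int i;
          for (i = 0; i < ENTRIES; i++) {
              if (cache[i].valid && cache[i].branch_pc == pc) {
                  if (!taken || cache[i].successor != successor)
                      cache[i].valid = 0;  /* delete on misprediction */
                  return;
              }
          }
          if (taken) {          /* install, replacing the LRU entry */
              touch(ENTRIES - 1);
              cache[0].branch_pc = pc;
              cache[0].successor = successor;
              cache[0].valid = 1;
          }
      }

    Keeping the array in LRU order makes the replacement policy explicit at the cost of moving entries on every hit; real hardware would instead compare all 4 tags in parallel and track recency with a few LRU bits.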