Homework 2 Solutions

22C:116, Spring 2002

Douglas W. Jones
  1. Background: Consider a machine that can execute 1,000,000 instructions per second (pathetic, yes, but it is a round number that is easy to work with). Reading a device register into a CPU register, testing or setting a bit in a status register, incrementing an index register, storing from a register to ram, and doing a conditional or unconditional branch each take one instruction.

    To read one byte from the parallel port, you must first wait for the status to indicate data valid, then read the byte, then set the acknowledge bit, then wait for data-not-valid, then reset the acknowledge bit.

    Part A: Work out the algorithm used to read from the parallel port, to a sufficient level of detail that you can count the machine instructions.

    waitvalid:
      test data valid status bit
      if not data valid, go to waitvalid
      read byte into r
      set acknowledge bit
      increment index register
      indexed store r to buffer
    waitinvalid:
      test data valid status bit
      if data valid, go to waitinvalid
      reset acknowledge bit

    The above code requires 9 machine instructions under the given instruction-set constraints.

    Part B: Estimate the peak data rate that could be sustained through this parallel port using your algorithm, in bytes per second. For this computer, any faster data rate will require use of DMA hardware.

    Sustained I/O transfers require embedding the above 9-instruction block in a loop, for example:

      load final value of index register for comparison
    looptop:
      ... block of code from part a
      compare index register to final value
      if not equal, go to looptop

    So, it takes 11 instructions per iteration, assuming that none of the polling loops iterate. As a result, we can transfer one byte ever 11 microseconds, giving a data rate of 90909 bytes per second.

    Part C: Would the use of interrupts allow a faster data rate? Explain.

    No. The same 9 instruction block of code outlined above would be at the core of the interrupt service routine, but instead of 2 additional instructions making up the loop required to transfer a block of data, the core would be surrounded by the overhead of interrupt entry and exit. In general, this requires saving and restoring registers along with similar expensive computations, and as a result, use of interrupts always leads to slower transfer rates than simple polled I/O.

  2. Problems from the Text: Do problems 8, 12 and 15 on page 68.

    Do problem 22 on page 69. This focuses on the specific mechanism used for system calls under UNIX, and forces you to think through something from Wednesday's lecture in more detail than was presented in class.

    Applicaton programs call Read, which in turn issues the system call read. In fact, the system call can have any name, so long as the library call has the name expected by application programmers. The library routine isolates applicaton programmers from any knowledge of the system call that is actually being issued. Furthermore, the actual mechanism of the system call, for example, a reserved instruction that causes a trap, isolates the library routine itself from the actual mechanism used inside the system, so even the library routine is unaware of the identifier used in the code of the system for the routine that receives control from the system call dispatch routine. Therefore, the programmer only needs to know about Read, and it is perfectly permissable for the author of the system to use some other name, perhaps katuv, for the corresponding function within the system. Nobody need ever know!