Homework 2 Solutions

22C:116, Spring 2002

Douglas W. Jones

Background: Consider a machine that can execute 1,000,000 instructions per second (pathetic, yes, but it is a round number that is easy to work with). Reading a device register into a CPU register, testing or setting a bit in a status register, incrementing an index register, storing from a register to ram, and doing a conditional or unconditional branch each take one instruction.
To read one byte from the parallel port, you must first wait for the status to indicate data valid, then read the byte, then set the acknowledge bit, then wait for data-not-valid, then reset the acknowledge bit.
Part A: Work out the algorithm used to read from the parallel port, to a sufficient level of detail that you can count the machine instructions.
waitvalid:
  test data valid status bit
  if not data valid, go to waitvalid
  read byte into r
  set acknowledge bit
  increment index register
  indexed store r to buffer
waitinvalid:
  test data valid status bit
  if data valid, go to waitinvalid
  reset acknowledge bit

The above code requires 9 machine instructions under the given instruction-set constraints.

Part B: Estimate the peak data rate that could be sustained through this parallel port using your algorithm, in bytes per second. For this computer, any faster data rate will require use of DMA hardware.
Sustained I/O transfers require embedding the above 9-instruction block in a loop, for example:
  load final value of index register for comparison
looptop:
  ... block of code from part a
  compare index register to final value
  if not equal, go to looptop
So, it takes 11 instructions per iteration, assuming that none of the polling loops iterate. As a result, we can transfer one byte ever 11 microseconds, giving a data rate of 90909 bytes per second.

Part C: Would the use of interrupts allow a faster data rate? Explain.
No. The same 9 instruction block of code outlined above would be at the core of the interrupt service routine, but instead of 2 additional instructions making up the loop required to transfer a block of data, the core would be surrounded by the overhead of interrupt entry and exit. In general, this requires saving and restoring registers along with similar expensive computations, and as a result, use of interrupts always leads to slower transfer rates than simple polled I/O.
Problems from the Text: Do problems 8, 12 and 15 on page 68.
- 8:
  a) disable interrupts -- if permitted in user mode, this would allow user code to take over the CPU for indefinite periods of time, preventing the scheduler or I/O from taking place.
  
  c) set time-of-day-clock -- if permitted in user mode, this would allow user code to make arbitrary and capricious changes to the system clock, throwing off real-time computations that might depend on the clock.
  
  d) change the memory map -- if permitted in user mode, a user program could address any physical memory resources it wanted, including any system data structures. The system would have no security.
- 12: The two approaches accomplish the same goals, although the values in the registers needed to accomplish these goals differ. The run-time cost of the two solutions are different. Comparing the virtual address with a limit register can be done in parallel with adding it to the base register, so the comparison has no run-time cost. Comparing the physical address with the limit register must be done after the addition but prior to a commitment to carry out a memory cycle. Given that addition and comparison take about the same amount of time using typical hardware, the run-time cost of the second approach is about twice that of the first approach.
- 15: The base register will be 40,000, and the limit register will be 10,000.
Do problem 22 on page 69. This focuses on the specific mechanism used for system calls under UNIX, and forces you to think through something from Wednesday's lecture in more detail than was presented in class.
Applicaton programs call Read, which in turn issues the system call read. In fact, the system call can have any name, so long as the library call has the name expected by application programmers. The library routine isolates applicaton programmers from any knowledge of the system call that is actually being issued. Furthermore, the actual mechanism of the system call, for example, a reserved instruction that causes a trap, isolates the library routine itself from the actual mechanism used inside the system, so even the library routine is unaware of the identifier used in the code of the system for the routine that receives control from the system call dispatch routine. Therefore, the programmer only needs to know about Read, and it is perfectly permissable for the author of the system to use some other name, perhaps katuv, for the corresponding function within the system. Nobody need ever know!