22C:116, Lecture 10, Spring 1997

Douglas W. Jones
University of Iowa Department of Computer Science

Input/Output
This lecture focuses on a review of three different aspects of input output:
- User level I/O requests
- Virtual devices -- the abstractions provided by the system
- I/O drivers -- the implementation of those abstractions
The term virtual device is not standard, but it is realistic! The device interface presented to applications running under an operating system is usually quite different from the actual device interface, at least as different as the virtual memory abstraction is from the actual physical memory resources, and therefore, it seems quite reasonable to use the adjective virtual to distinguish the view of the device supported by the system from the actual resources provided by the device.
Input/Output Abstractions
The Stream is a typical (but by no means universal) input/output abstraction. Typical stream operations are read(ch) and write(ch). These get a character from a stream or put it to a stream, where the streams these commands operate on are either passed explicitly to the access procedure, or they are implicitly passed as in the case of standard input and standard output.
These are abstractions in the sense that they don't necessarily map directly to the actual input/output operations on any particular physical device.
These abstractions do relate naturally to the character-at-a-time operations on some simple devices such as keyboards, asynchronous communications lines, and a few other devices.
The UNIX stream abstraction is somewhat more complex. In UNIX, the operations are:
```
	len = read( f, buffer, length )
	len = write( f, buffer, length )
```
Where f names the stream to be read or written, and data is transferred to or from a named buffer that is able to hold length characters. There is no requirement that the size of the buffer have any relationship to the size of the blocks transferred to or from the underlying device.
The read operation reads up to length consecutive characters from the stream and places them in the buffer, returning len, the number of characters actually read, while the write operation appends up to length characters from the buffer to the stream, returning len, the number of characters actually transferred.
Note that most C programmers know of other stream primitives, fputc(ch) and fgetc(); these are C abstractions that are not part of the UNIX user interface and are implemented in terms of the read and write kernel calls.
Abstractions
Another typical abstraction is block-oriented random-access I/O, where the typical primitives might be:
```
  Read( buffer, address )
  Write( buffer, address )
```
Here, the buffer is a block of consecutive memory locations in the caller's address space, and the address names a block of consecutive locations in the random-access I/O abstraction, perhaps a relative address in a disk file.
This abstraction maps naturally to disk devices in the case where the buffer length equals the sector size and the address is aligned on a disk sector boundary. It maps poorly to real devices in most other cases.
Aside: UNIX files are a generally successful attempt to generalize streams to include block I/O. The operations are:
```
  Read( f, buffer, length )
  Write( f, buffer, length )
  Seek( f, offset, base )
```
The seek() operation (or lseek(), where the l prefix indicates the use of a 32 bit return value), is used to position the read-write pointer in the indicated stream. The offset may be relative to the start of the stream, to the end of the stream, or to the current position in the stream. Obviously, seeking in some streams is meaningless (consider seeking relative to the end of the stream of characters typed from the keyboard or being read from a communications line), and seek operations are not always supported (seeking in the stream from the keyboard would require storing the entire history of what was typed, and this is not worth doing), but on those file types where seeking is useful and can be efficiently implemented, it works.
With a buffer size of one byte and no seek operations, this interface exactly emulates a pure stream. With a buffer size of one disk sector, with a seek prior to each read or write, this interface exactly emulates a conventional random-access model. If the sector size is not matched to the buffer size or if seeks are made to device addresses that are not sector aligned, intermediate buffers and possibly extra input-output transfers must be used to reblock the data between the user and the device.
Of course, if the underlying device is incapable of a seek operation, for example, if it is a printer or a keyboard, the seek operation presented to the user is meaningless.
Implementation of Streams
The following diagram emphasizes that there are two input/output interfaces that concern us, that between user program and system, and that between system and the actual hardware. The system must typically map between these two interfaces, and it typically also provides FIFO buffering to improve I/O performance.
```
       |                                |
    Read( ch ) <---  FIFO <---  Interrupt Driver
       |                                |
    Write( ch ) ---> FIFO  ---> Interrupt Driver
       |                                |
 user  |             system             |  hardware
```
On first generation machines, programs usually were limited to synchronous I/O, where the application program stopped to read input, and where input was lost if the program wasn't ready for it when it arrived.
The use of a FIFO queue to separate the application code from the interrupt service routine allows fully asynchronous operation, where computation and input/output overlap, and no data is lost except if the input queue fills up.
Echoing of Interactive Input
There are at least two different paths that may be followed by characters that a user types when they are echoed to the display:
```
    Read( ch ) <---  FIFO <---  Interrupt Driver
	 |                    /
	B|              ----- A
	 V            /
    Write( ch ) ---> FIFO  ---> Interrupt Driver
```
In path A the input interrupt routine enqueues character to be echoed in output queue. Things echo as soon as they are typed, but they get mixed randomly into the output if you type ahead of the prompt from the applications program.
In path B the read routine enqueues character to be echoed as it dequeues it from the input queue. The output always looks the same whether or not you type ahead of the prompt, but echoing is delayed.
Other paths are possible; for example, the hardware may handle the job of echoing input, or a third queue may be added between the input interrupt driver and the output interrupt driver (with the output driver giving priority to echoing). At the other extreme, application programs may handle echoing. On UNIX, the user of a terminal device may select the echoing mode using some of the less common system calls, stty and ioctl. These calls allow nonstandard and device dependent options to be set, and the set of options they allow is huge.
Block I/O Implementation
As with stream I/O, there are two interfaces that concern us, and the system provides a mapping between these, as well as providing for buffering to improve performance.
```
       |                                     |
    Write( buf, add )      Queue             |
       |   or         --->  of   ---> Interrupt Driver
    Read( buf, add )      Requests           |
       |           ( buf, add, sem, dir )    |
       |                                     |
 user  |                system               |  hardware
```
Read and Write operate on the same queue, placing requests for block I/O in it, instead of data. Each time the device issues an interrupt, it signals that it is ready to begin a new transfer, at which point, driver removes an entry from the queue and gives it to the device, starting the next transfer.
The queue entries consist of a buffer (main memory address) and a device address, along with a semaphore used by the interrupt service routine to signal when the transfer is complete and a code indicating the direction of the transfer.
If user-level I/O is synchronous (that is, if the user process always waits for I/O completion), the Read and Write routines take care of waiting on this semaphore, and only one semaphore is needed per process for this purpose.
If user-level I/O is asynchronous (that is, user processes may explicitly overlap computation and I/O), then the user process must have one semaphore allocated per buffer. This gives the user the option of starting multiple reads and writes in parallel, something the traditional UNIX interface, for example, does not allow.
Minimum Latencycraft
Disk Throughput depends on the order of requests in the request queue. A bad order can cause lots of long seeks A good order minimise the time spent seeking.
Latency is the time between the initiation of an operation and the actual start of the operation. Rotational latency is the delay caused by the rotation, for example, a delay waiting for a particular sector to arrive under the head of a disk. Recirculating shift register memories hava analogous latency problems that are usually described as rotational latency.
Consider the time taken to read all the sectors of one track. Typically, these can be read in one or two disk revolutions, but if the requests are ordered wrong, for example, by reading sector x-1 immediately after reading sector x, it may take almost a whole disk revolution to read each sector.
It is frequently impossible to read consecutive sectors of a disk. The reason is that the intersector gap may be too short to allow an interrupt service routine to process the completion of the previous sector and initiate reading the next. To avoid this problem, many systems arrange the sectors on a disk in interleaved order, so that logically consecutive sectors are separated by one or two others, and the sectors around a track are arranged in some order such as:
```
A G L B H M C I N D J O E K P F
```
(The above example shows 3-way interleaving on a 16 sector track, where sectors are addressed by consecutive letters of the alphabet.)
Head positioning latency is the delay caused by the need to move the disk heads from one track to another on a moving head device. If a group of reads are completed from one cylinder before any reads on an adjacent cylinder are attempted, the head motion latency will be less than would be expected by reading alternate sectors from alternate cylinders.
Disk schedulers can speed up the system by changing the order in which the pending I/O requests are handled. The most widely known scheduling algorithm is called the elevator algorithm.
The elevator algorithm is named after the algorithm used by an elevator in a tall building. Assume that each floor has only one button. The best service results in this case when the elevator sweeps up towards the top of the building, stopping at every floor where there is a pending service request. When it reaches the topmost floor where service was requested, it sweeps downward, again stopping at every floor where there is a pending request. This gives neither maximum throughput nor a minimum waiting time, but it gives a bounded value of the maximum wait along with decent throughput.
If the disk head position on a moving head disk is considered, the elevator algorithm can be used to schedule the movement of the head in and out from cylinder to cylinder.
A disk scheduler could maximize throughput by only serving a small subset of the cylinders where there was always work to do, but in this case, some request for service to another cylinder might wait forever, making the average wait infinite. The average wait can be minimized by a variant on the elevator algorithm, but this will not have optimal throughput if there is always available work on some closly bunched subset of the cylinders.