22C:116, Lecture 10, Fall 1999

Douglas W. Jones
University of Iowa Department of Computer Science

Input/Output
This lecture focuses on a review of three different aspects of input output:
- User level I/O requests
- Virtual devices -- the abstractions provided by the system
- I/O drivers -- the implementation of those abstractions
The term virtual device is not standard, but it is realistic! The device interface presented to applications running under an operating system is usually quite different from the actual device interface, at least as different as the virtual memory abstraction is from the actual physical memory resources, and therefore, it seems quite reasonable to use the adjective virtual to distinguish the view of the device supported by the system from the actual resources provided by the device.
Physical Device Interfaces
The physical device interface of a real system typically consists of a number of device registers. For example, on a typical IBM PC compatable, parallel port zero is addressed by the hardware as follows:
```
                __ __ __ __ __ __ __ __ 
  0378 data    |D7|D6|D5|D4|D3|D2|D1|D0|
               |__|__|__|__|__|__|__|__|
                __ __ __ __ __ __ __ __ 
  0379 status  |BS|AK|PE|OL|ER|x |x |x |
               |__|__|__|__|__|__|__|__|
	BS = 1 when printer not busy
	AK = 1 when data transfer in progress
	PE = 1 when printer is out of paper
	OL = 1 when printer is on line
	ER = 1 when there is no error
                __ __ __ __ __ __ __ __ 
  0379 control |x |x |x |IE|SL|IN|LF|ST|
               |__|__|__|__|__|__|__|__|
        IE = 1 to enable printer interrupt
        SL = 1 to select printer
        IN = 1 for normal operation (zero causes printer reset)
        LF = 1 to force printer to advance paper
        ST = 1 to transfer data into printer
	
```
The protocol for printing one character using this interface was established by Centronics Corp for their line printers in the early 1970's, significantly before the IBM PC came to market. To print one character, the software must load the character in the parallel port data register and then send out a positive pulse on the ST (strobe) line. The printer acknowledges the strobe pulse with a positive pulse on the AK (acknowledge) line and a negative pulse on the BS (busy) line. The contents of the data register must remain valid until the end of the AK pulse, and no new transfer is legal until the end of the BS pulse. Usually, these two pulses will be coincident.
If interrupts are enabled (IE = 1), the printer interface will request an interrupt when AK (acknowledge) goes from 1 to zero. The particular interrupt requested may depend on the settings of jumpers or option switches on the parallel port interface, but it is normally IRQ7, the lowest priority hardware interrupt supported on the PC family of machines.
The SL, IN, LF and ST lines from the control register correspond directly to corresponding output pins in the parallel port connector, and all of the status lines correspond to input pins in the parallel port connector. Aside from the IE (interrupt enable) bit and the interrupt logic attached to AK (acknowledge), the parallel port hardware makes no assumptions about the use made of the control, status and data lines. The meanings of those lines are established by the hardware connected to the printer port.
Virtual Device Interfaces
The Stream is a typical (but by no means universal) input/output abstraction. Typical stream operations are get(ch) and put(ch). These get a character from a stream or put it to a stream, where the streams these commands operate on are either passed explicitly to the access procedure, or they are implicitly passed as in the case of standard input and standard output.
These are abstractions in the sense that they don't necessarily map directly to the actual input/output operations on any particular physical device.
These abstractions do relate naturally to the character-at-a-time operations on some simple devices such as keyboards, asynchronous communications lines, and a few other devices.
The UNIX stream abstraction is somewhat more complex. In UNIX, the operations are:
```
	len = read( f, buffer, length )
	len = write( f, buffer, length )
```
Where f names the stream to be read or written, and data is transferred to or from a named buffer that is able to hold length characters. There is no requirement that the size of the buffer have any relationship to the size of the blocks transferred to or from the underlying device.
The read operation reads up to length consecutive characters from the stream and places them in the buffer, returning len, the number of characters actually read, while the write operation appends up to length characters from the buffer to the stream, returning len, the number of characters actually transferred.
Note that most C programmers know of other stream primitives, fputc(ch) and fgetc(); these are C abstractions that are not part of the UNIX user interface and are implemented in terms of the read and write kernel calls.
Also note that the there are undefined and device dependent characteristics associated with UNIX style streams: For example, under what circumstances will the read and write calls return values other than the user's specified byte count? The answer for a disk file is quite different than the answer for an asynchronous communications line!
It is important to emphasize that both the UNIX file and the C stream abstractions are conceptually objects with methods to operate on them. The fact that the UNIX system call interface and the C standard library interfaces are procedural can hide this, but logically, the operations on an open UNIX file should be thought of as references to methods of a file object. Thus, for example, it is constructive to think of read(f,buf,len) as f.read(buf,len).
Other Abstractions
Another typical abstraction is block-oriented random-access I/O, where the typical primitives might be:
```
  Read( memory_buffer, device_address )
  Write( memory_buffer, device_address )
```
Here, the buffer is a block of consecutive memory locations in the caller's address space, and the address names a block of consecutive locations in the random-access I/O abstraction, perhaps a relative address in a disk file. The block size in this scheme is fixed and therefore is not passed as an explicit parameter!
This abstraction maps naturally to disk devices in the case where the block size is one sector and the device_address is the sector number It maps poorly to almost all other real devices, but disks are so important that it is frequently used.
Aside: UNIX files are a generally successful attempt to generalize streams to include block I/O. The operations are:
```
  Read( f, buffer, length )
  Write( f, buffer, length )
  Seek( f, offset, base )
```
The seek() operation (or lseek(), where the l prefix indicates the use of a long int offset, typically at least 32 bits), is used to position the read-write pointer in the indicated stream. The offset may be relative to the start of the stream, to the end of the stream, or to the current position in the stream, depending on the value specified for the base. Obviously, seeking in some streams is meaningless (consider seeking relative to the end of the stream of characters being read from a communications line), and seek operations are not always supported (seeking in the stream from the keyboard would require storing the entire history of what was typed, and this is not worth doing), but on those file types where seeking is useful and can be efficiently implemented, for example, on disks and tapes, this interface works quite well.
With a buffer size of one byte and no seek operations, this interface exactly emulates a pure stream. With a buffer size of one disk sector, with a seek offset that is a multiple of the sector size prior to each read or write, this interface exactly emulates a conventional random-access model. If the sector size is not matched to the buffer size or if seeks are made to device addresses that are not sector aligned, intermediate buffers and possibly extra input-output transfers must be used to reblock the data between the user and the device.
Of course, if the underlying device is incapable of a seek operation, for example, if it is a printer or a keyboard, the seek operation in the user interface is meaningless.
Implementation of Streams
One of the very first uses of object oriented programming can be found in Christopher Strachey's stream implementation in his OS-6 project in the late 1960's. Strachey, the primary developer of the CPL programming language, naturally chose to write this system in BCPL (Basic CPL, the ancestor of C), but he wrote his code in clearly object oriented form, and today, we would say that he used a class hierarchy something like the following:
```
                Write Only Stream
                     write(c)
                        |
                Read/Write Stream
                      read(c)
                     write(c)
                        |
               Random-Access Stream
                      read(c)
                     write(c)
                      seek(c)
```
In fact, the whole notion of having the class hierarchy have any relationship to the implementation hierarchy fails for I/O, because, for some devices, random block access is the primitive, while stream access is a higher level abstraction. For other devices, the situation is reversed; stream access is primitive, and blocks must be introduced as a high level abstraction. Thus, one of the original inspirations for object oriented programming remains a good test of the methodology!
Implementation of Streams
A device driver is the component of an operating system that supports input or output operations to some device. Conceptually, this bridges the gap between the abstractions presented to the user, but in many systems, the front-end interface to user code (for example, the read(), write() and seek() kernel functions in UNIX) include standard code that applies to all devices, and the input/output drivers themselves have a different user interface, designed specifically to be used only from within the code of the standard front-end code and sometimes also used from within other points in the system.
Stream input/output provides an excellent example of the implementation of an input/output driver. The driver is usually subdivided into two halves, one half deals with the hardware interface and includes the interrupt service routines for the device, and the other half deals with the user interface. It is quite common to include a FIFO queue between the two in order to allow computation and input/output activity to overlap. The following diagram illustrates this overall structure you would expect for a parallel port printer handler under an operating system for the IBM PC:
```
       ||                              ||
    Write( ch ) ---> FIFO  ---> Interrupt Handler
       ||                              ||
 user  ||        device driver         ||  hardware
       ||  (in the operating system)   ||
```
Note that the user interface suggested here only includes the operation to write one character. This is the primitive for this character sequential output device, and any other primitives desired can clearly be implemented using sequences of write-character operations.
For input stream devices, such as the receiver for an asynchronous communications line, we can draw a similar picture:
```
       ||                              ||
     Read( ch ) <--- FIFO  <--- Interrupt Handler
       ||                              ||
 user  ||        device driver         ||  hardware
       ||  (in the operating system)   ||
```
The only difference is that the queue is now fed by the device and emptied by the user.
On first generation machines, programs usually were limited to synchronous I/O, where the application program stopped to read input, and where input was lost if the program wasn't ready for it when it arrived. Interrupts were not used, and the user interface routines directly manipulated the device registers to perform read and write operations.
The IBM PC BIOS and most of the MS/DOS operating systems are typical of first-generation systems, making little or no use of interrupts (although these do support a simplified interface to the interrupt hardware for applications that require it). Modern operating systems for PC compatables generally bypass the BIOS in order to use interrupts.
The use of a FIFO queue to separate the application code from the interrupt service routine allows fully asynchronous operation, where computation and input/output overlap, and no data is lost except in the case of input devices, when input data arrives so quickly that the input queue fills up.
Echoing of Interactive Input
There are at several different paths that may be followed by characters that a user types when they are echoed to the display:
```
    Read( ch ) <---  FIFO <---  Interrupt Driver
	 |                    /    |
	B|              ----- A  FIFO C
	 V            /            V
    Write( ch ) ---> FIFO  ---> Interrupt Driver
```
In path A the input interrupt routine enqueues character to be echoed in output queue. Things echo soon after they are typed, but they get mixed randomly into the output if you type ahead of the prompt from the applications program.
In path B the read routine enqueues character to be echoed as it dequeues it from the input queue. The output always looks the same whether or not you type ahead of the prompt, but echoing is delayed.
In path C an additional (short) queue is introduced. The output interrupt driver must check both queues, with the queue of characters to be echoed taking priority over normal output. This leads to the fastest possible echoing of input, but in addition to allowing input to be mixed arbitrarily with output, it also makes the output driver more complex.
Other paths are possible; for example, the hardware may handle the job of echoing input. At the other extreme, application programs may handle echoing. On UNIX, the user of a terminal device may select the echoing mode using some of the less common system calls, stty and ioctl. These calls allow nonstandard and device dependent options to be set, and the set of options they allow is huge.
Block I/O Implementation
As with stream I/O, there are two interfaces that concern us, and the system provides a mapping between these, as well as providing for buffering to improve performance.
```
       ||                                   ||
    Write( buf, add )      Queue            ||
       ||  or         --->  of   ---> Interrupt Driver
    Read( buf, add )      Requests          ||
       ||          ( buf, add, sem, dir )   ||
       ||                                   ||
 user  ||           device driver           ||  hardware
       ||     (in the operating system)     ||
```
Read and Write operate on the same queue, placing requests for block I/O in it, instead of data. Each time the device issues an interrupt, it signals that it is ready to begin a new transfer, at which point, driver removes an entry from the queue and gives it to the device, starting the next transfer.
The queue entries consist of a buffer (main memory address) and a device address, along with a semaphore used by the interrupt service routine to signal when the transfer is complete and a code indicating the direction of the transfer.
If user-level I/O is synchronous (that is, if the user process always waits for I/O completion), the Read and Write routines take care of waiting on this semaphore, and only one semaphore is needed per process for this purpose.
If user-level I/O is asynchronous (that is, user processes may explicitly overlap computation and I/O), then the user process must have one semaphore allocated per buffer. This gives the user the option of starting multiple reads and writes in parallel, something the traditional UNIX interface, for example, does not allow.
Minimum Latencycraft
Disk Throughput depends on the order of requests in the request queue. A bad order can cause lots of long seeks A good order minimise the time spent seeking.
Latency is the time between the initiation of an operation and the actual start of the operation. Rotational latency is the delay caused by the rotation, for example, a delay waiting for a particular sector to arrive under the head of a disk. Recirculating shift register memories hava analogous latency problems that are usually described as rotational latency.
Consider the time taken to read all the sectors of one track. Typically, these can be read in one or two disk revolutions, but if the requests are ordered wrong, for example, by reading sector x-1 immediately after reading sector x, it may take almost a whole disk revolution to read each sector.
It is frequently impossible to read consecutive sectors of a disk. The reason is that the intersector gap may be too short to allow an interrupt service routine to process the completion of the previous sector and initiate reading the next. To avoid this problem, many systems arrange the sectors on a disk in interleaved order, so that logically consecutive sectors are separated by one or two others, and the sectors around a track are arranged in some order such as:
```
A B C D E F G H I J K L M N O P  -- no interleaving
A I B J C K D L E M F N G O H P  -- 2-way interleaving, almost
A G L B H M C I N D J O E K P F  -- 3-way interleaving
A E I M B F J N C G K O D H L P  -- 4-way interleaving, almost
A N K H E B O L I F C P M J G D  -- 5-way interleaving
```
The above examples show interleaving patterns on a 16 sector track, where sectors are addressed by consecutive letters of the alphabet; the 2-way and 4-way interleaving patterns are imperfect because 2 and 4 evenly divide 16.
Head positioning latency is the delay caused by the need to move the disk heads from one track to another on a moving head device. If a group of reads are completed from one cylinder before any reads on an adjacent cylinder are attempted, the head motion latency will be less than would be expected by reading alternate sectors from alternate cylinders.
Disk schedulers can speed up the system by changing the order in which the pending I/O requests are handled. The most widely known scheduling algorithm is called the elevator algorithm.
The elevator algorithm is named after the algorithm used by an elevator in a tall building. Assume that each floor has only one button. The best service results in this case when the elevator sweeps up towards the top of the building, stopping at every floor where there is a pending service request. When it reaches the topmost floor where service was requested, it sweeps downward, again stopping at every floor where there is a pending request. This gives neither maximum throughput nor a minimum waiting time, but it gives a bounded value of the maximum wait along with decent throughput.
If the disk head position on a moving head disk is considered, the elevator algorithm can be used to schedule the movement of the head in and out from cylinder to cylinder.
A disk scheduler could maximize throughput by only serving a small subset of the cylinders where there was always work to do, but in this case, some request for service to another cylinder might wait forever, making the average wait infinite. The average wait can be minimized by a variant on the elevator algorithm, but this will not have optimal throughput if there is always available work on some closly bunched subset of the cylinders.