10. Input Output

Part of the 22C:116 Lecture Notes for Fall 2002
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Review

This lecture focuses on a review of three different aspects of input/output: the physical device interfaces presented by the hardware, the virtual device abstractions presented to applications, and the device drivers that implement those abstractions in terms of the physical interfaces.

The term virtual device is not standard, but it is realistic! The device interface presented to applications running under an operating system is usually quite different from the actual device interface, at least as different as the virtual memory abstraction is from the actual physical memory resources. It therefore seems quite reasonable to use the adjective virtual to distinguish the view of the device supported by the system from the actual resources provided by the device.

The Physical Device Interfaces

The physical device interface of a real system typically consists of a number of device registers. For example, on a typical IBM PC compatible, parallel port zero is presented to the software by the hardware as follows:

                __ __ __ __ __ __ __ __ 
  0378 data    |D7|D6|D5|D4|D3|D2|D1|D0|
               |__|__|__|__|__|__|__|__|
                __ __ __ __ __ __ __ __ 
  0379 status  |BS|AK|PE|OL|ER|x |x |x |
               |__|__|__|__|__|__|__|__|
	BS = 1 when printer not busy
	AK = 1 when data transfer in progress
	PE = 1 when printer is out of paper
	OL = 1 when printer is on line
	ER = 1 when there is no error
                __ __ __ __ __ __ __ __ 
  0379 control |x |x |x |IE|SL|IN|LF|ST|
               |__|__|__|__|__|__|__|__|
        IE = 1 to enable printer interrupt
        SL = 1 to select printer
        IN = 1 for normal operation (zero causes printer reset)
        LF = 1 to force printer to advance paper
        ST = 1 to transfer data into printer

The protocol for printing one character using this interface was established by Centronics Corp for their line printers in the early 1970's, significantly before the IBM PC came to market. To print one character, the software must load the character in the parallel port data register and then send out a positive pulse on the ST (strobe) line. The printer acknowledges the strobe pulse with a positive pulse on the AK (acknowledge) line and a negative pulse on the BS (busy) line. The contents of the data register must remain valid until the end of the AK pulse, and no new transfer is legal until the end of the BS pulse. Usually, these two pulses will be coincident.

If interrupts are enabled (IE = 1), the printer interface will request an interrupt when AK (acknowledge) goes from 1 to zero. The particular interrupt requested may depend on the settings of jumpers or option switches on the parallel port interface, but it is normally IRQ7, the lowest priority hardware interrupt supported on the PC family of machines.

The SL, IN, LF and ST lines from the control register correspond directly to output pins in the parallel port connector, and all of the status lines correspond to input pins in the connector. Some of these lines are inverted, so the voltage levels corresponding to 1 and 0 in the above documentation aren't always as expected. Aside from the IE (interrupt enable) bit and the interrupt logic attached to AK (acknowledge), the parallel port hardware makes no assumptions about the use made of the control, status and data lines. The meanings of those lines are established by the hardware connected to the printer port.
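
To make the protocol described above concrete, here is a minimal polled-output sketch in C. It assumes the register layout shown above and Linux-style inb() and outb() from <sys/io.h>, and it assumes the program has been granted port access (for example with ioperm()); a real driver would also bound the busy wait:

    #include <sys/io.h>

    #define LP_DATA    0x378             /* data register */
    #define LP_STATUS  0x379             /* status register: bit 7 is BS */
    #define LP_CONTROL 0x37a             /* control register: bit 0 is ST */

    void lp_putc(unsigned char c)
    {
        unsigned char ctrl;

        while ((inb(LP_STATUS) & 0x80) == 0)
            ;                            /* wait for BS = 1, printer not busy */
        outb(c, LP_DATA);                /* load the character */
        ctrl = inb(LP_CONTROL);
        outb(ctrl | 0x01, LP_CONTROL);   /* raise ST, starting the strobe pulse */
        outb(ctrl & ~0x01, LP_CONTROL);  /* drop ST, ending the pulse */
    }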

The PC parallel printer port interface is typical of the interfaces to low performance devices where each byte of input or output is transferred via explicit execution of instructions executed by the CPU. High performance devices frequently use direct-memory access interfaces. The simplest direct memory access interface must have at least the following device registers:

                _________________________________
   DMA address |                                 |
               |_________________________________|
                _________________________________
   DMA count   |                                 |
               |_________________________________|
                ________ __ __ ________
       control |        |IE|  |  COM   |
               |________|__|__|________|
        IE     interrupt enable on DMA transfer complete
        COM    command field
                ________ __ ___________ 
       status  |        |DN|           |
               |________|__|___________|
	DN     indication of transfer complete

The direct-memory access processor idles until the program running on the CPU loads the control register with a command requesting an input or output transfer. On receipt of this command, the DMA processor repeatedly transfers data between memory and the device, in a direction indicated by the command. Each transfer addresses the memory location pointed to by the DMA address register, and after each transfer, the DMA address register is incremented by the size of the transfer and the DMA count register is decremented, typically also by the size. Some direct memory access devices transfer one byte at a time, while others transfer larger blocks, up to the word size of memory. When the DMA count register reaches zero, the DMA processor sets the done bit in the status register; if the interrupt enable bit is set, this also results in an interrupt request to the CPU.
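
As a sketch of how software might drive such an interface, the following C fragment treats the registers as a memory-mapped structure and starts a transfer. The structure layout, bit positions and command codes are assumptions made for illustration, not those of any real controller:

    #include <stddef.h>
    #include <stdint.h>

    struct dma_regs {
        volatile uint32_t address;      /* DMA address register */
        volatile uint32_t count;        /* DMA count register */
        volatile uint32_t control;      /* COM command field plus IE bit */
        volatile uint32_t status;       /* DN (done) bit */
    };

    #define DMA_IE        (1u << 4)     /* interrupt enable (assumed position) */
    #define DMA_DN        (1u << 4)     /* transfer complete (assumed position) */
    #define DMA_COM_READ  0x1u          /* device to memory (assumed code) */
    #define DMA_COM_WRITE 0x2u          /* memory to device (assumed code) */

    /* start a transfer and return at once; completion is detected either by
       polling DN in the status register or, with IE set, by an interrupt */
    void dma_start(struct dma_regs *dev, void *buf, size_t len, uint32_t com)
    {
        dev->address = (uint32_t)(uintptr_t)buf;   /* where in memory */
        dev->count   = (uint32_t)len;              /* how much to move */
        dev->control = com | DMA_IE;               /* what to do, with interrupts */
    }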

The interface described above would be sufficient to support direct memory access transfers to simple devices such as the parallel port, but some devices require other device control registers. Consider, for example, a disk drive. Typical disks allow random access to any sector of any track of any cylinder of the disk, and we can speak of the specification of the sector, track and cylinder as the address, on the disk itself, of the data we are referencing. Before a transfer to or from disk, therefore, we must somehow tell the disk drive what disk address to use. More generally, for random-access devices, we speak of the device address and there must be a device address register somewhere in the interface for each such device:

                _________________________________
device address |                                 |
               |_________________________________|

Note that the device address register may have multiple fields, for example, one for the sector number, one for the track number and one for the cylinder number on disk. When these cannot all be packed into one register, the device may have separate registers for these fields.

Prior to any transfer to or from a random access device, the program running on the CPU must load the device address register appropriately. Floppy disks are low performance devices, so the data transfer to and from a floppy disk is frequently done without benefit of direct memory access mechanisms, but direct memory access is required when high-speed hard drives are used. Therefore, the software controlling a typical disk interface must generally load the device address register, the DMA address and count registers, and finally the device command register in order to initiate the reading or writing of one sector of data.
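
A sketch of that initiation sequence follows, using hypothetical register and field names; how the sector, track and cylinder numbers are packed varies from controller to controller:

    #include <stdint.h>

    #define SECTOR_SIZE   512u
    #define DISK_IE       (1u << 4)     /* assumed interrupt enable bit */
    #define DISK_COM_READ 0x1u          /* assumed read command code */

    struct disk_regs {                  /* hypothetical memory-mapped layout */
        volatile uint32_t dev_address;  /* cylinder, track and sector */
        volatile uint32_t dma_address;  /* memory buffer address */
        volatile uint32_t dma_count;    /* bytes to transfer */
        volatile uint32_t control;      /* command register */
    };

    void disk_read_sector(struct disk_regs *dev, void *buf,
                          unsigned cyl, unsigned trk, unsigned sec)
    {
        /* first, where on the device */
        dev->dev_address = (cyl << 16) | (trk << 8) | sec;
        /* then, where in memory and how much */
        dev->dma_address = (uint32_t)(uintptr_t)buf;
        dev->dma_count   = SECTOR_SIZE;
        /* finally, the command register, which starts the transfer */
        dev->control     = DISK_COM_READ | DISK_IE;
    }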

The Stream Abstraction

The stream is a typical (but by no means universal) input/output abstraction. Typical stream operations are get to read a character from a stream and put to write a character to a stream. In the C standard library, for example, the fgetc(s) function returns the next character read from the stream s, and the fputc(c,s) function writes the character c to the stream s.

Whether these are packaged in procedural or object oriented form, these functions are usually hidden under layers of middleware that allow reading and writing higher-level data types such as character strings or integers. We should consider the conversion routines between these higher level types and the character level stream to be attributes of those data types, so we will not discuss such conversions here.

Streams are abstractions in the sense that they don't necessarily map directly to the actual input/output operations on any particular physical device. Some devices are inherently stream oriented, for example, the simple parallel port discussed above, as well as asynchronous serial communications lines and keyboards, while others are not, for example, disks, high performance network interfaces and graphics display devices.

The UNIX stream abstraction is somewhat more complex than the simple put and get operations suggested above. In UNIX, the operations are:

	len = read( f, buf, buflen )
	len = write( f, buf, buflen )

In both of these, f names the operating-system level stream to be read or written, and data is transferred to or from the buffer buf. This buffer is of size buflen; on output, this many characters will be written unless there is a problem; on input, this many characters will be read unless some other condition terminates input sooner. Both calls return len, the number of characters actually read or written.

Note that read(f,&ch,1) is therefore equivalent, in a simpleminded way, to ch=fgetc(s). There are two differences worth noting: First, the parameter f to read is a file descriptor, the integer index of an open file in the open file table for the process, while the parameter s to fgetc is a pointer to a stream object (confusingly enough, this is of type FILE*). Second, each call to read involves a relatively expensive context switch across the user/kernel protection boundary, while the call to fgetc is a simple function call. If the time taken by a function call is too much, the standard C include files also provide getc, a macro that expands as inline code to extract a character from the stream. Typical C stream implementations include a stream buffer with a capacity of around 4K bytes, and only when this is empty does fgetc call read to refill the buffer.
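
The following is a minimal sketch of this kind of buffering, written against the Unix read() call; the structure shown is a simplification for illustration, not the real FILE object:

    #include <unistd.h>

    #define STREAM_BUF 4096

    struct my_stream {
        int  fd;                    /* the underlying open file descriptor */
        char buf[STREAM_BUF];       /* the stream buffer */
        int  pos, len;              /* next character to return, valid count */
    };

    int my_getc(struct my_stream *s)
    {
        if (s->pos >= s->len) {     /* buffer exhausted, refill it */
            s->len = (int)read(s->fd, s->buf, STREAM_BUF);
            s->pos = 0;
            if (s->len <= 0)
                return -1;          /* end of file or error */
        }
        return (unsigned char)s->buf[s->pos++];
    }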

Also note that there are undefined and device dependent characteristics associated with kernel-level Unix files: for example, in reading a stream attached to the keyboard, it is possible to configure the system to force read to terminate when the user types some key (typically return), and it is possible to configure it so that read terminates after a timeout if the user types nothing.

It is important to emphasize that both the Unix file and the C stream abstractions are conceptually objects with methods to operate on them. The fact that the UNIX system call interface and the C standard library interfaces are procedural can hide this, but logically, the operations on an open UNIX file should be thought of as references to methods of a file object. Thus, for example, it is constructive to think of read(f,buf,len) as entirely equivalent to f.read(buf,len).

Block Oriented Abstractions

A typical abstraction used for input/output supports block-oriented random access to input/output devices. In this case, typical primitives are:

	Read( device, memory_buffer, device_address )
	Write( device, memory_buffer, device_address )
Here, the buffer is a block of consecutive memory locations in the caller's address space, and the device address names a block of consecutive locations in the random-access I/O abstraction, perhaps a relative address in a disk drive or disk file. The block size in this scheme may be fixed system wide, fixed by the particular device, or passed as an explicit parameter.

This abstraction maps naturally to disk devices in the case where the block size is one sector and the device_address is the sequential sector number on the device. It maps poorly to almost all other real devices, but disks are so important that it is frequently used.

The file abstraction of the Unix kernel is a generally successful attempt to generalize streams to include both sequential and random-access block I/O. The operations are:

	len = read( f, buffer, length )
	len = write( f, buffer, length )
	lseek( f, offset, base )
The lseek() operation is used to position the read-write pointer in the indicated open file. The offset may be relative to the start of the stream, to the end of the stream, or to the current position in the stream, depending on the value given for base. Obviously, seeking in some streams is meaningless (consider seeking relative to the end of the stream of characters being read from a communications line), and seek operations are not always supported (seeking in the stream from the keyboard would require storing the entire history of what was typed, and this is not worth doing), but on those file types where seeking is useful and can be efficiently implemented, for example, on disks and tapes, this interface works quite well.

With a buffer size of one byte and no seek operations, this interface exactly emulates a pure stream. With a buffer size of one disk sector and a seek to a sector-aligned offset before each read or write, this interface exactly emulates a conventional random-access model. If the sector size is not matched to the buffer size or if seeks are made to device addresses that are not sector aligned, intermediate buffers and possibly extra input-output transfers must be used to reblock the data between the user and the device.
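
For example, the block-oriented Read of the previous section can be sketched on top of lseek() and read(), here assuming 512-byte sectors:

    #include <sys/types.h>
    #include <unistd.h>

    #define SECTOR 512

    /* read the sector numbered device_address into buffer */
    ssize_t read_block(int f, void *buffer, off_t device_address)
    {
        if (lseek(f, device_address * SECTOR, SEEK_SET) < 0)
            return -1;                      /* seek failed */
        return read(f, buffer, SECTOR);     /* transfer one sector */
    }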

Of course, if the underlying device is incapable of a seek operation, for example, if it is a printer or a keyboard, the seek operation in the user interface is meaningless.

Class Hierarchy for Streams?

One of the very first uses of object oriented programming can be found in Christopher Strachey's stream implementation in his OS-6 project in the late 1960's. Strachey, the primary developer of the CPL programming language, naturally chose to write this system in BCPL (Basic CPL, the ancestor of C), but he wrote his code in clearly object oriented form, and today, we would say that he used a class hierarchy something like the following:

                Write Only Stream
                     write(c)
                        |
                Read/Write Stream
                      read(c)
                     write(c)
                        |
               Random-Access Stream
                      read(c)
                     write(c)
                      seek(c)

In fact, the whole notion that the class hierarchy should bear any relationship to the implementation hierarchy fails for I/O because, for some devices, disks for example, random block access is the primitive, while stream access is a higher level abstraction. For other devices, such as asynchronous serial ports, the situation is reversed; stream access is primitive, and blocks must be introduced as a higher level abstraction. Thus, one of the original inspirations for object oriented programming remains a good test of the methodology!
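
In a language without classes, such a stream object can be built by hand as a record of function pointers, one per method, much as Strachey did in BCPL; a sketch in C, with illustrative names:

    struct stream {
        void (*put)(struct stream *s, char c);         /* write one character */
        int  (*get)(struct stream *s);                 /* read one character */
        void (*seek)(struct stream *s, long position); /* reposition the stream */
        void *state;                                   /* per-device data */
    };

    /* a write-only stream simply leaves get and seek as null pointers, or
       points them at routines that report an error */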

Implementation of Streams

A device driver is the component of an operating system that supports input or output operations to some device. Conceptually, this bridges the gap between the abstractions presented to the user and the concrete interface of the hardware itself, but in systems such as Unix, the front-end interface to user code, for example, the read(), write() and lseek() kernel functions of Unix, include standard code that applies to all devices and is therefore above the level of the input/output drivers.

Stream input/output provides an excellent example of the implementation of an input/output driver. The driver is usually subdivided into two halves, one half deals with the hardware interface and includes the interrupt service routines for the device, and the other half deals with the user interface. It is quite common to include a FIFO queue between the two in order to allow computation and input/output activity to overlap. The following diagram illustrates this overall structure for a parallel port printer handler under an operating system for the IBM PC:

            ||                                ||
     write(f,buf,len)  ---> FIFO  ---> Interrupt Handler
      when f refers         queue             ||
      to the parallel                         ||
           port                               ||
            ||                                ||
      user  ||         device driver          ||  hardware
            ||   (in the operating system)    ||
Note that the user interface suggested here is that of Unix, but this is only for the sake of example. What is important is that this user interface routine enqueues the successive characters passed by the user in the queue, waiting for space when the queue is full, but otherwise returning to the user as soon as possible. The interrupt handler, in turn, dequeues one character from the queue for each interrupt and transfers that character to the hardware.
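
A sketch of the two halves and the FIFO between them follows. The queue management is shown explicitly, but the blocking, wakeup and hardware operations are placeholder routines assumed to be provided elsewhere in the kernel:

    #define QSIZE 256

    /* placeholder kernel and hardware services, assumed to exist */
    extern void wait_for_space(void);
    extern void signal_space_available(void);
    extern void enable_printer_interrupt(void);
    extern void disable_printer_interrupt(void);
    extern void send_byte_to_printer(char c);

    static char queue[QSIZE];
    static int  head = 0, tail = 0, count = 0;

    void lp_write(const char *buf, int len)    /* upper half, called from write() */
    {
        int i;
        for (i = 0; i < len; i++) {
            while (count == QSIZE)
                wait_for_space();              /* block until the queue drains */
            queue[tail] = buf[i];
            tail = (tail + 1) % QSIZE;
            count++;
            enable_printer_interrupt();        /* make sure output is in progress */
        }
    }

    void lp_interrupt(void)                    /* lower half, one char per interrupt */
    {
        if (count > 0) {
            send_byte_to_printer(queue[head]); /* data register plus strobe pulse */
            head = (head + 1) % QSIZE;
            count--;
            signal_space_available();
        } else {
            disable_printer_interrupt();       /* nothing left to send */
        }
    }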

For input stream devices, for example, the input driver for a unidirectional input-only communication line interface, we can draw a similar picture:

            ||                              ||
     read(f,buf,len) <--- FIFO  <--- Interrupt Handler
      when f refers                         ||
      to this input                         ||
           port                             ||
            ||                              ||
      user  ||        device driver         ||  hardware
            ||  (in the operating system)   ||
The only difference between this example and the output example above is that the queue is now fed by the device and emptied by the user. When the user calls read, characters are dequeued from the queue for return to the user. With each dequeue, the user may be forced to wait until data arrives, but if data has arrived before the user calls read, the user does not wait.
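
The corresponding input sketch, with the roles of the two halves reversed (again, the blocking and wakeup routines are placeholders):

    #define IQSIZE 256

    extern char read_byte_from_keyboard(void); /* placeholder hardware access */
    extern void wait_for_data(void);           /* placeholder blocking routines */
    extern void signal_data_available(void);

    static char iqueue[IQSIZE];
    static int  ihead = 0, itail = 0, icount = 0;

    void kbd_interrupt(void)                   /* lower half, fills the queue */
    {
        if (icount < IQSIZE) {
            iqueue[itail] = read_byte_from_keyboard();
            itail = (itail + 1) % IQSIZE;
            icount++;
            signal_data_available();
        }                                      /* else the character is lost */
    }

    int kbd_read(char *buf, int len)           /* upper half, called from read() */
    {
        int i;
        for (i = 0; i < len; i++) {
            while (icount == 0)
                wait_for_data();               /* block until a key arrives */
            buf[i] = iqueue[ihead];
            ihead = (ihead + 1) % IQSIZE;
            icount--;
        }
        return len;
    }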

On first generation operating systems, whether on mainframes of the 1950's or on microcomputers of the 1970's, programs were limited to synchronous I/O, which is to say, neither interrupts nor queues were used in the I/O drivers, and user programs were therefore blocked until each I/O operation was completed. With no overlap of user computation with I/O, any input that arrived while the user program was busy was simply lost unless the hardware contained holding buffers.

The IBM PC BIOS and most of the old MS/DOS operating systems were the last widespread survivors of this first generation of operating systems. Modern operating systems for PC compatibles generally only use the BIOS routines to load the operating system and get descriptions of the devices for which drivers should be loaded. Once the system is running, the drivers in the system typically work directly with the hardware device interfaces.

Echoing of Keyboard Input

Keyboard input is somewhat more complex than input from other devices because users have come to expect everything they type to be echoed on some kind of display. Teletype terminals, based on hardware dating from the 1920's, always offered two operating modes, full duplex and half duplex. In full duplex mode, everything typed on the keyboard was simply transmitted over the asynchronous line, and the printer printed everything that was received. In effect, the Teletype acted as two independent devices, one input only, and one output only. In full duplex mode, it was up to the remote system to echo keypresses so that everything typed would appear on the printer. In half duplex mode, the printer was connected directly to the keyboard; everything typed was therefore printed locally as it was transmitted to the remote system. Half duplex operation simplified the design of modems and it reduced the complexity of remote software, but it was less flexible.

By 1970, most interactive computer use was done using full duplex Teletypes as terminals, but there were still several open options for echoing what the user typed. The responsibility for echoing could be left to the application, or it could be handled by the system, and if handled by the system, it could be done in several ways. These are illustrated below:

           ||                              ||
          Read <---  FIFO <---  Interrupt Handler
	   ||   |   ___________/    |      ||
           ||  A|  /     B        FIFO C   ||
	   ||   | |                 |      ||
	   ||   V V                 V      ||
          Write ---> FIFO  ---> Interrupt Handler
           ||                              ||
      user ||        device driver         ||  hardware

In path A above, the read routine enqueues each character to be echoed as it dequeues it from the input queue. To the interactive user, this is indistinguishable from echoing done by the application itself. As a result, input typed while the application is in the midst of generating output will not be mixed with that output but will instead wait in the input FIFO queue until it is read by the application.

In path B above, the input interrupt routine handles echoing. As a result, characters are echoed soon after they are typed, independent of when they are read by the application. Because echoed characters are enqueued in the same FIFO as other output, echoing may not be immediate when other output is in that queue. Most Unix interactive I/O drivers use this model.

In path C above, an additional (short) queue is introduced. The output interrupt driver must check both queues, with the new queue of characters to be echoed taking priority over the normal output queue. This leads to the fastest possible echoing of input, but in addition to allowing input to be mixed arbitrarily with output, it also makes the output driver more complex.

On Unix, an application connected to an interactive terminal may select the echoing mode using some of the less common system calls, stty() and ioctl(). These calls allow nonstandard and device dependent options to be set, and the set of options they allow for use with the terminal I/O driver is huge, including the ability to turn echoing on and off.
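
On a modern POSIX system the same options are usually reached through the termios interface that grew out of these calls; a minimal sketch of turning echoing off:

    #include <termios.h>
    #include <unistd.h>

    int disable_echo(int fd)
    {
        struct termios t;

        if (tcgetattr(fd, &t) < 0)            /* fetch the current settings */
            return -1;
        t.c_lflag &= ~ECHO;                   /* clear the echo flag */
        return tcsetattr(fd, TCSANOW, &t);    /* install the new settings */
    }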

Block I/O Implementation

As with stream I/O, there are two interfaces that concern us, and the system provides a mapping between these, as well as providing for buffering to improve performance.

            ||                                   ||
           Write             Queue               ||
            ||  or      --->  of   --->   Interrupt Handler
           Read            Requests              ||
            ||                                   ||
      user  ||           device driver           ||  hardware
            ||     (in the operating system)     ||

Read and Write operate on the same queue, placing requests for block I/O in it, instead of data. Each time the device issues an interrupt, it signals that it is ready to begin a new transfer, at which point the driver removes an entry from the queue and gives it to the device, starting the next transfer.

Each entry in the queue contains at least the following fields:

	the direction of the transfer (read or write)
	the address of the memory buffer to transfer to or from
	the device address (for example, the block or sector number)
	the synchronization information needed to signal completion

In a Unix-like system, the user interface assumes that the user wishes to wait for the I/O operation to complete. Therefore the read or write routine can only return to the user when the I/O transfer is actually done. This happens after the I/O request is dequeued by the interrupt service routine, when the interrupt comes in indicating the completion of the last transfer from the buffer. Therefore, the queue entry must contain the synchronization information needed to return control to the user.
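
As a sketch, a queue entry might be declared along the following lines; the names are illustrative, and the exercise below considers alternatives for the synchronization field:

    struct semaphore;                  /* assumed kernel synchronization type */

    struct io_request {
        int       is_write;            /* direction of the transfer */
        void     *buffer;              /* memory buffer to transfer to or from */
        unsigned  device_address;      /* block or sector number on the device */
        struct semaphore  *done;       /* signalled when the transfer completes */
        struct io_request *next;       /* next request in the queue */
    };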

Exercise: Consider the alternatives for solving this problem. You could put a pointer to the user process itself in the queue entry, so that the interrupt handler puts the process back in the ready list when it finishes the operation. Alternatively, you could put a flag in the queue entry; with this scheme, the user can test the flag to see if the I/O is done. Finally, you could put a semaphore in the queue entry; when the I/O is complete, the interrupt service routine signals this semaphore, and when the user wants to see if it is complete, the user can test this semaphore. Which of these schemes is easiest to use, which is easiest to implement, and which maintains the best isolation between the scheduler and the I/O code?

The simple structure suggested above assumes that each buffer provided by the user holds exactly one block of data to be transferred to or from disk. Most devices have a fixed hardware-defined block size, so this requires that the user code be written in terms of this block size. The Unix model allows random access to any byte in the stream, and it allows user buffers to be of any size. Some Unix device drivers can do I/O directly to or from the user buffer if the user happens to request exactly one disk block, but all Unix drivers allow non-aligned I/O transfers, where the user buffer contains more or less than one block and begins at any byte position. This is handled by an additional software layer between the interface described here and the user-level interface to the kernel.

Exercise: Design an appropriate intermediate layer allowing seeking to any byte in a file and reading or writing any number of bytes starting at that byte, assuming that all file I/O must be done in 512 byte blocks that are aligned on 512 byte boundaries. Is your solution efficient?

Input/Output Latency

For most input/output devices, we must recognize that some of the time taken to transfer data to or from the device is spent waiting. For example, with disk devices, we must move the heads to the appropriate track and then wait for the disk to rotate until the desired data sector is under the heads. During the time the heads are moving and while we wait for the right sector, no input/output is taking place.

In general, we use the term latency to describe the time between the initiation of an operation and the actual start of the operation. Latency is a general English word, so for example, we can speak of a latent infection during the time between the actual contact with some germ and the first symptoms of that infection.

For disks, CD-ROM devices and similar rotating memory devices, we use the term rotational latency to refer to the delay caused by the rotation of the storage medium. For example, if a disk rotates at 100 revolutions per second, we might have to wait up to 1/100 of a second before a particular bit of data on that disk is rotated under the heads. It is worth noting that we use the term rotational latency to refer not only to delays caused by physical rotation, but also to delays caused by logical rotation. Thus, we can speak of the rotational latency of a shift-register memory, where nothing physical rotates, but the data circulates around a shift register as if the data were on a rotating belt.

We also speak of head positioning latency on devices where there is only one head that may need to be moved to the appropriate storage track before data on that track can be read or written. Disks have been built with one head permanently mounted above each track, but most disks have only one head per recording surface, and this head must be moved to the appropriate track prior to any use of that track.

Rotational and head positioning latency are both examples of mechanical latency (unless we are speaking of the rotational latency of a shift-register memory device). Some devices and device interfaces also have latency characteristics that arise from electronic or logical requirements. We may speak, for example, of the head selection latency on a multi-head disk drive, although this is usually negligible. Some network protocols lead to a delay before a machine is allowed to send a message; we can describe these as network protocol latency.