Before discussing file systems, it's useful to spend some time on a simpler class of devices, timers. In a sense, a timer is the simplest of all I/O devices, because it performs no I/O. There are many kinds of timers; the simplest is the periodic timer. All it does is request a periodic interrupt.
For example, consider a timer that fires every millisecond. This might present the following interface to the software:
expired -- once a millisecond, the timer sets this single bit interface register. When this bit is set, the timer also requests an interrupt. The timer interrupt service software must reset this bit before returning from interrupt or re-enabling interrupts in order to turn off the interrup request.
This simple timer will be used for the remainder of this lecture, but note that far more complex timers are common. For example, interval timers contain a register that is decremented at a constant rate, for example, once per microsecond. The timer requests an interrupt when the register reaches zero, and the interrupt service routine may load a new count in the register to request the next interrupt.
If the software always loads the same count in the interval timer, the interval timer simulates a periodic timer. The following trivial interrupt service routine simulates an interval timer (that calls interval-complete when the interval expires) using a periodic timer:
periodic-interrupt-service-routine: reset( expired ) countdown := countdown - 1; if countdown = 0 call interval-complete return
Application programs rarely want an interval timer. What they typically want is a service such as the following:
wait( s ) -- when a process calls this operating system service, that process will be delayed s milliseconds (the particular time unit depends on the system).
On a system with only one process and an interval timer, this service would be trivial to implement. On a system with more than one process and a periodic timer, this service is far more interesting. In such a context, many processes may be awaiting different times, and there may easily be more than one process waiting for the same time instant.
Thus, there must be a list of waiting processes, with a record, for each process, of the time it is awaiting. When a timer expires, it must somehow search this list and wake up each process waiting for the current instant.
The need to block processes and then wake them up suggests that each timer record will have a semaphore on which the process blocks. This leads to the following data structure:
Timer record: link fields, as needed for list management delay -- the requested time delay sem -- semaphore used to block process Timer list: a linked list of timer recordsThe wait service seen by an application program would then be as follows"
wait( s ) get a timer record r initialize r.sem to a count of zero ) r.delay := s append( r, timer-list ) P( r.sem )A single timer record can be permanently attached to each process. The interrupt routine needed to support this service is as follows:
periodic-interrupt-service-routine: reset( expired ) for each timer record r in timer-list r.delay := r.delay - 1 if r.delay <= 0 then remove( r, timer-list ) V( r.sem ) endif endloop returnCode written along the above lines is actually used on some systems, but it begins to behave very badly on systems with large numbers of waiting processes. The problem is that interrupt service routines should never perform open ended computations such as scanning of linked lists. This work should be moved to an application process if at all possible.
Nonetheless, the above code implements what can properly be called a virtual interval timer available to each applications process.
The key to avoiding complex interrupt processing in the implementation of timer services is to rely on a sorted list of timer records, each holding the time being awaited. This requires the following data structure:
Timer record: link fields, as needed for list management time -- the requested time delay sem -- semaphore used to block process Timer queue: a priority queue of timer records, sorted by time Clock: the current timeIf the system is to run for long periods, the variables used to hold clock and the time fields of timer records must have sufficient precision that overflows are not likely during the life of the system. A year's worth of milliseconds will overflow a 32 bit counter, so realistic systems should 48 or 64 bit counters. Assuming this is taken care of, the wait service can be coded as follows:
wait( s ) get a timer record r initialize r.sem to a count of zero ) r.time := clock + s enqueue( r, timer-queue ) P( r.sem )The interrupt routine needed to support this service is as follows:
periodic-interrupt-service-routine: reset( expired ) clock := clock + 1 -- count time while head( timer-queue ).time <= clock r := dequeue( timer-queue ) P( r.sem ) end loop returnThis code rests on a priority queue. The simplest such data structure is a sorted linked list, where the enqueue service sorts new items into the list in time order, the head of the list is always available, and the dequeue service simply removes the head element from the list.
This simple implementation moves most of the work to the enqueue service, where it belongs, but if the list is long, the enqueue service may take some time. There are faster algorithms that use binary trees or heap-ordered trees to achieve O(log n) operation times on the queue. You can get more information on this subject from An Empirical Comparison of Priority Queue and Event Set Implementations in Communications of the ACM, 29, April 1986, pages 300-311. Code for interesting priority queue algoritnms is available to WWW users.