Real-time systems need to be able to measure time! For example, a stepping motor control program may contain the following bit of code:
loop until position = destination
   delay( steptime );
   step();
   position++;
end loop

On a centralized system, this bit of code requires a real-time clock with an interface that provides a separate delay timer for each process. Note that the delays are typically variable, depending on the current speed of the motor rotor. The initial and final steps will typically be long in order to accelerate and decelerate the load.
In a distributed real-time system, the problem arises of how to deal with time. A natural solution is to install a single real-time clock somewhere on the system. A clock server process could await messages from client processes that wish to be informed of the arrival of some particular time or the passage of some particular interval.
A typical message to the clock server process might have the meaning "send me a reply after one second" or "send me a message at ten AM". The clock server would maintain a data structure listing the outstanding requests, and when one of the requested times arrives, it would send a reply.
There are three problems with this approach. First, if communication delays are significant, the accuracy of the clock service will be degraded. Second, if the system is large, the clock server may become a bottleneck, limiting system performance. Third, if messages are lost, recovery is difficult.
All three of these objections lead to the desire to use multiple clock servers; the possibility of lost messages can only be effectively addressed if there is a local clock on each machine, and in that case, it is easy to put a local clock server on each machine.
The problem with this is that the clock servers must be synchronized! Absolute clock synchronization is impossible! The best that can be done is to guarantee that two clocks are within some error bound of each other. As a consequence, the relative times of two events on a system cannot be exactly compared. Either one event is clearly before the other, or the two events occurred within the error bound of the clocks used, in which case we cannot say which event occurred first, and in fact, different observers may draw different conclusions about which was first!
Real clocks never keep exact time. If you took an idealized clock that ticked exactly once per second, compared it with any real clock that is set up to tick once per second, and started both clocks ticking at exactly the same instant, some of the ticks from the real clock would come early or late.
If you graph the time of day shown on the real clock versus the time of day shown on the ideal clock, instead of the straight diagonal line you would hope for, you will see a slightly wavy line. This "clock jitter" is a natural consequence of the physics of real oscillators, and while it can be minimized, it is unavoidable!
Furthermore, the time shown on your real clock will always drift relative to the time shown on the ideal clock. The drift may include a systematic error, if your real clock is running fast or slow relative to the ideal time standard, and the drift will include a random component. As a result of the latter, even if the real clock is exactly calibrated to keep perfect time (an impossibility), it will tend to drift randomly, and if you plot the error as a function of time, the error will follow a random walk, slowly and unpredictably growing.
As a result, the times shown on any collection of independent real clocks will slowly diverge, even if they are all identical and perfectly calibrated.
It should be noted that truly independent clocks are sometimes hard to make! If two clocks are near each other, they frequently influence each other in unexpected ways. There's a well-known story of a 19th century observatory that had three pendulum clocks, and to the surprise of the timekeepers, the pendulums in all three clocks remained perfectly synchronized. It turned out that the clocks were so precisely tuned to the same frequency that once one clock was milliseconds away from a tick, the sound of the tick of another clock would jar its escapement free, forcing it to tick. As a result, if the clocks were started randomly, the natural random drift of the clocks would slowly bring them into near synchronization, and once they were synchronized, they remained exactly synchronized. Radio frequency coupling can similarly synchronize modern quartz clocks that are near each other.
If we have a central authority that is trusted, for example, the National Institute of Standards and Technology (NIST) atomic clocks in Fort Collins, Colorado, we can synchronize all clocks to that authority. The time from the NIST clocks is broadcast continuously on station WWV; for many years, Heathkit sold a WWV receiver with an RS-232 port on the back. This receiver even had a correction you could set to take into account the propagation delay of the time signal as it traveled from Fort Collins to wherever your receiver was located.
The interesting problem is, what do you do if the signal you get from WWV disagrees with your local clock? A simple assignment of a new value to the local time of day can cause chaos! If the time of day jerks forward suddenly, there may be no damage (but in real-time control systems, even this may cause loss of control). If the time of day jerks backward, some times of day will occur twice, and time will briefly appear to move backward, with the consequence that, for example, some timestamps will appear to have been made in the future. This can lead to severe problems even in non-realtime applications such as database systems.
The solution to this problem is to avoid direct assignments to the time of day. Instead, if the official time of day is ahead of the local realtime clock, the local clock can be run fast until the clocks agree, and if the official time of day is behind the local clock, the local clock can be run slower until the clocks agree.
For example, most modern UNIX systems support a system service called adjtime; this is used to make small adjustments to the system time by advancing or retarding the time by the time interval delta.
The adjustments to the speed of the clock made by these services do not involve changing the frequency of the oscillator inside the clock (typically a quartz crystal). Instead, the adjustments are done entirely in software.
For clock software based on a periodic clock interrupt, for example, one that gives a nominal 100 interrupts per second, the clock might maintain the time of day by counting milliseconds. Normally, the clock software would add 10 milliseconds to the time of day every time there is a clock interrupt. If the clock is instructed to "run fast" in order to gain time, the clock software might add 11 milliseconds for every clock interrupt, while if the clock has been instructed to "run slow", it might add only 9 milliseconds per clock interrupt.
The problem is a bit more complex when the underlying clock hardware is based on a programmable interval timer, but the solution is equivalent. If the next clock event is to be scheduled 100 milliseconds into the future and the clock has been instructed to make up time, it might set the timer to 90 milliseconds and then, when the timer interrupt arrives, it would update the time of day variable by 100 milliseconds. Similarly, if the clock has been instructed to lose time, it might set the timer to interrupt after 110 milliseconds.
Note that in real systems, the hardware clock is usually accurate to better than 1 part in 1000, and that when the clock is found to be in error, it is typically brought back to the correct value by adjustments to the clock speed on the order of 1 part in 100. The 1 part in 10 adjustments used in the above examples would be too abrupt for many applications.
To keep one clock in a network synchronized to another more authoritative clock on the same network, the following code must typically be called periodically:
send time request to remote server;
t1 = local time;
await t, the time sent by the remote server;
t2 = local time;
try to gain t - (t1+t2)/2 time units.

This code assumes that the remote server responds immediately when it gets a time request, and that the outgoing network delay is comparable to the incoming network delay. As a consequence, the midpoint of the round trip, t1 + (t2-t1)/2 = (t1+t2)/2, is an estimate of the local time at which the remote server inspected its clock in order to get the time t, so t - (t1+t2)/2 estimates the local clock's error.
If the interval t2-t1 is significantly greater than the shortest measured interval in previous transactions, it may be prudent to assume that there was contention for resources somewhere in the system, and thus that (t2-t1)/2 is not a good estimate, and thus, that the returned time should not be accepted. This leads to the following more robust version of the code:
send time request to remote server;
t1 = local time;
await t, the time sent by the remote server;
t2 = local time;
triptime = t2 - t1;
if triptime < mintriptime
   then mintriptime = triptime;
   else mintriptime = mintriptime + epsilon;
if triptime < (mintriptime + tolerance)
   then try to gain t - (t1+t2)/2 time units;
If there is no authoritative clock in the network, but only a group of less reliable clocks, the solution is to use a generalization of this algorithm on all the time servers in the system. If each of two clocks runs this algorithm, the drift of both clocks will be controlled.
Generalizing this algorithm to more clocks is straightforward. Each machine estimates its error relative to each of the other machines, and then computes an average error, possibly weighting this average by how much trust it places in each of the other machines and discarding those measurements from other machines that appear to be unacceptable, for example, because of trip times that are too long. After each cycle of estimating errors, this average is used to adjust the speed of the local clock.
Essentially this algorithm is used by the various national timekeeping authorities (NIST in the United States) to reconcile their clocks and arrive at the values for Coordinated Universal Time.