Porting an operating system to a symmetrical multiprocessor can be easy, if the operating system was constructed using "textbook methods". The textbook methods that matter are:
Violations of these rules are common in uniprocessor operating systems. If all kernel code is privileged, and if some critical section is very short, the system programmer on a uniprocessor will naturally use disable-interrupts to enter a critical section and enable-interrupts to exit that critical section. Code written this way will port very poorly to a multiprocessor.
The reason for this is: If there are multiple CPUs, disable interrupts does nothing to halt other CPU's, and if we care about performance, we don't want to halt other CPUs! When one CPU needs to update a shared data structure, it must only block those other CPUs that are trying to update the same data structure!
When porting uniprocessor code to a multiprocessor, wherever the uniprocessor code had:
disable interrupts .. critical section .. enable interruptsThe corresponding code on the multiprocessor will have this code:
disable interrupts -- prevent other activity on this CPU wait on spin lock -- block if other CPU in critical section .. critical section .. release spin lock -- allow other CPUs into critical section enable interrupts -- allow other activity on this CPUIn the worst case, any critical section guarded by disable-interrupts on the uniprocessor will wait on a global lock variable. The spin-lock can be implemented using any available locking primitive. In the worst case, somethign as crude as Dekker's solution might apply, but if the machine has instructions supporting better solutions, they should be used.
wait on spin lock (i) loop cc = test-and-set(i) until cc indicates that i was reset release spin lock (i) i = reset
wait on spin lock (i) register r = closed loop exchange-memory-and-register(r,i) until r = open release spin lock (i) i = open
r = load-locked(i) loads from memory location i into register r and sets auxiliary hardware in the CPU to work monitoring location i to see if any store operations to i are executed. This monitoring is called snooping on the memory bus.
cc = store-conditonal(r,i) stores the contents of register r into memory location i, but only if the snooping mechanism indicates that there has been no store to location i since the most recent load-locked, and only if the snooping mechanism is currently snooping for references to locaiton i. The success or failure of store-conditional is returned in some kind of condition code.
wait on spin lock (i) register r = closed register s loop s = load-locked(i) cc = store-conditional(r,i) until (r = open) and cc indicates success release spin lock (i) i = openIn the above, load-locked and store-conditional were used to simulate exchange-memory-and-register. In fact, they can be used for much more. For example, to atomically increment an integer in location i of memory, we can use:
atomic increment (i) register r loop s = load-locked(i) i = i + 1 cc = store-conditional(r,i) until cc indicates successIf no other process does a store in the indicated variable, this will run in only one more instruction than a normal load-increment-store sequence. Only in the unlikely case that another process touches the memory location during this very short critical section does the loop iterate. Most modern RISC CPU's support this model!
When porting uniprocessor code to a multiprocessor, the interrupt mechanism of the multiprocessor must be inspected! There are two common possibilities:
Within each I/O driver, critical sections involving data shared between the interrupt service routine and the user-side access routines for that driver must be identified and guarded by spin-locks. Ideally, a single global lock variable should be avoided, and instead, locks local to the driver should be used -- there is no need to block the ethernet driver when the SCSI interrupt service routine is exploring the disk schedule.
In this case, we need to audit the code for the interrupt handlers to check all references to global resources such as buffer pools. Any use of such a resource must be guarded by spin-locks. Again, it is better if there is a lock to guard each shared resource (such as the head of the linked list of free buffers) instead of one global lock.
Trap service routines always run on the same processor that executed the instruciton that caused the trap. Therefore, all CPU's execute trap service routines, and it is most productive if they are treated as user code running under the authority of the scheduler as part of a user process (but executing in the kernel).
When user processes must deal with shared data, in a uniprocessor, they will usually use semaphores or some equivalent mechanism supported by the kernel. Assuming correct kernel implementation, this will generally work correctly on the multiprocessor, so it requires no changes, however, changes to user code may allow better multiprocessor performance.
Inside trap-service routines, however, there may be places where the code running on behalf of a user process uses disable-interrupts to guard some critical section. Spin locks must be added to each such critical section!
In general, both user and kernel code running on behalf of users has two options to guard each critical section: spin-locks and semaphores (or equivalent mechanisms that involve the process manager). The choice between these is should be made as follows:
The key thing to note is that operations on semaphores involve
transfer of control to a scheduler routine, while operations on
spin locks involve only a few machine instructions if there is
no contention. Therefore, the overhead of spin locks is far
less if there is usually no contention.
The key thing to note is that waiting on a spin-lock monopolizes
the CPU on which the blocked process is running, while semaphores
allow other processes to run.
Therefore, use of semaphores will allows more useful work to be
done if the likelyhood of blocking is high.
These critical sections are usually long, and the probability that
a process will be blocked on entry to one of these sections is high.