31. Multicomputer and Distributed Systems
Part of
the 22C:116 Lecture Notes for Fall 2002
|
Given a network of machines and the decision to distribute an operating system over that network, there are a number of alternative ways to organize the system. These organization alternatives are mirrors, in some sense, of alternative physical organizations. At one extreme, all of the computers could be centralized in one place, perhaps all mounted in one rack, and treated as if the composite result was a single computer.
____________ | __________ | A Rackload || computer || \ of ||__________|| | Computers || computer || | ||__________|| | local network || computer || | ||__________|| | || computer || / ||__________|| || gateway ||--- connection to the world ||__________|| (including user terminals) |____________|
The department of computer science has several of these types of systems currently in room 303 MLH, the largest made of 24 generic Pentium III boxes.
At another extreme, each processor coult be placed, with the appropriate I/O devices on the desk of a particular user, or alternately, in a process control context, each cpu could be placed on or near one of the devices being controlled.
_________ _________ | display | | display | |_________| |_________| _____|____ _____|____ _________ _________ | computer | | computer | | display | | display | |__________| |__________| |_________| |_________| | | _____|____ _____|____ | | | computer | | computer | | | |__________| |__________| | | | | network =o=============o=============o=============o=
Each of these organizations suggests a software organization! On the centralized system, it would be unnatural to dedicate processors to particular clients, while on a distributed system, it is quite natural to imagine dedicating processors to specific jobs on behalf of specific clients.
Each of these organizations leads to different degrees of reliability! When a resource is centralized, whether it be disk storage, processors, power supplies, or communication, it becomes a potential reliability problem! Each organization leads to different degrees of flexibility! When resources are dedicated to specific applications, it becomes more difficult to allocate surplus resources to other applications.
The obvious way to manage resources in a distributed system is to place a single server process in charge of each resource. Thus, a distributed operating system would have one process manager, one file manager, and so on. A process wishing to create another process, for example, would send a message to the process manager. The manager would track the loading of all machines in the system and create the process on the appropriate machine, returning the network address of the created process to the creating process so they can communicate.
This allows processes running on any machine in the system to use resources located anywhere, but it is hard to make such a scheme fault tolerant, and the server for each resource becomes a bottleneck as the system is expanded.
As another example, consider a printer manager; this could automatically select one of a number of printers, depending on such things as the lengths of the various print queues, the locations of the printers and the particular requirements of the print job (is quality print required, is color required, and so on).
Centralized resource managers are the easiest to write. The manager may manage a distributed resource, for example, a pool of printers, where each printer is run by a print-spooler on a different machine. As a centralized manager, such a program can maintain a single database giving the status of each resource it manages, and it can use simple client-server protocols to deal with users.
As an alternative, consider distributing the code for each manager around the network. In this scheme, there may be no central "process manager" or "print queue manager", but instead, numerous cooperating managers that share the respoinsibility for managing each resource. This leads to a fault tolerant architecture, and it eliminates bottlenecks but it is much harder to write the code.
A distributed resource manager can be constructed by distributing the function of a centralized manager. When this function is distributed in the extreme, there is one agent on each machine and no central manager. A user wishing to use a resource contacts the local agent. The agent may assign a local resource to the user, or it may contact neighboring agents for help in finding an appropriate resource.
When this is done, the load on each agent is proportional to the number of local clients with which it deals plus the number of local resources it makes available to remote systems. This sum should be proportional to the capacity of the machine on which the agent runs, and the result scales very well with network size.
Furthermore, the loss of an agent because of a fault does little to the function of the other agents, so this approach is far more fault tolerant.
Chaotic management is not always possible, but when it can be used, the result is a self-organizing resource with no central or distributed authority. As a result, there is no bottleneck and it is fault tolerant, but sometimes, this can be a dangerous approach.
Consider the problem of managing processes in a distributed system. In a conventional manager, whether fully centralized or distributed, the decision over where to place new processes and any decisions about load balancing or process migration are initiated by the process manager and based on knowledge of the system state. Obtaining such knowledge is expensive.
In a chaotic process manager, there is no manager, as such! This scheme was invented in a science fiction novel, The Shockwave Rider, by Brunner, and refined by Shoch and Hupp (at Xerox); both used the term worm to describe such a system. An illustration of the danger of such schemes is given by the release of Morris's Internet worm.
A worm is a program that consists of multiple segments in a distributed computing environment. Each segment is a process, usually on a different computer from other segments. The worm segments typically communicate with each other to observe the number of segments in the worm, and if some segment is lost for any reason, the remaining segments find a machine to host a replacement. The key thing to note is that the operating system does not manage the processing resource! Instead, each worm seeks resources for itself.
The necessary primitives required to support worms allow a process to:
Additional useful primitives for the support of orderly and nondestructive worms allow a process to:
A typical example of the internal communication within a worm would involve a token passing scheme where each worm segment keeps a list of all the other worm segments it thinks are still alive. The token includes a copy of this list. When a worm segment receives the token, it compares the list in the token with its list and updates both to reflect the current status of the worm (perhaps checking with each process that has recently been added or deleted to verify their status).
Once a process updates the token's list, it forwards the list to the next process in sequence (the sequence being determined by the order of processes listed in the token, for example).
If a worm segment doesn't get a token after some interval, it must assume that the token has been lost and initiate the creation of a new token by communicating with the worm segments it knows of.
If a worm segment notices that the process list in the token is too short, it finds a remote machine and creates a new segment, adding that segment's address to its list and the token's list.
If a worm segment notices that the process list in the token is too long, it deletes its entry from the token, forwards the token, and then deletes itself.
The Xerox worms found willing machines by broadcasting a message and waiting for a machine to say it was willing. Machines were willing when their screen savers were running, and when a user hit a key (killing the screen saver), any worm segments on that machine were promptly killed.
Morris's worm had to work harder to find any machine, since a complete map of the internet does not exist. Morris's worm searched routing tables and things like that to find neighboring machines to infect. Morris's worm didn't ask for willing hosts, and once a host was infected, there was no effective provision for killing the worm segment.
Worms with practical applications must have some way for users to locate a worm segment so they can communicate with it. Either the worm segments must log their location in an agreed place, or there must be a broadcast of some kind to search for worm segments.
Finally, the self-replicating fault tolerant nature of the basic worm skeleton requires that the worm itself contain provisions for killing the worm. Worms that contain no such provisions are extremely difficult to exterminate!
The minimal worm is called an existential worm; it merely tries to maintain a fixed population of segments and performs no useful work. Morris's worm was intended to be an existential worm, but the code to control the population failed, and as a result, it replicated out of control on the Internet, bringing large parts of the net to a near standstill, and landing its creator in jail.
Once an extential worm framework is available, there are a number of useful applications that can be supported.
The first useful application at Xerox was the alarm clock worm. A user wishing to be awakened at a particular time would log the time of the desired wake-up call and a phone number. The alarm clock worm would manage this database, calling the indicated phone numbers at the appropriate times using an outgoing modem, and letting the phone ring twice exactly. The worm architecture made it possible to support this function without dedicating any machines or storage to the application. The database was maintained in the worm's memory, and the worm was responsible for finding idle modems to use for placing the phone calls.
The Xerox compute server worm could, in theory, host any computational job; the demonstration job they used was a real-time animation task. The processor displaying the images would request images from the worm in the desired order. Each worm segment would compute one image, deliver it to the host, and the host would assign it a new image in the sequence. The occasional death of a worm segment would cause, at most, one lost frame, and the background extential activity of the worm framework guaranteed that there were always enough active segments to keep the animation going -- so long as a sufficient number of machines on the network were running their screen savers.
A similar worm-based compute server could have been used to do such jobs as distributed prime factoring, cryptographic analysis, or any other intensive computation. The SETI@home system and its relatives are special purpose descendants of this, but it is easy to imagine a worm-enabled screen saver running on PCs that offered a general purpose computing resource to the net whenever the host machine was idle.
The problem of assuring each user that the remote computations performed by worms on their machine were not security problems is a serious one. In essence, we want a host operating system that offers a secure domain to the worm application where there is no way the worm can escape from that domain. At Xerox, they didn't bother with this -- security was not an issue in the internal research network they were using.