Background: Consider the problem of distributed deadlock detection in a system where processes communicate by messages over Demos style links. Each process has a list of links over which it may send messages, and each link designates a particular incoming message queue of a process. The only way a user process may block is by awaiting a message on one or more incoming message queues.
Assume that there are multiple host machines running user processes, and that each has a local operating system kernel. Assume that, when a kernel detects that a process is blocked, it may initiate a deadlock detection algorithm. The deadlock detection algorithm must not involve communication with user processes, but must only involve communication between the kernels involved. The only process state information used by the algorithm must be information available to the kernel.
Assume that message delivery is instantaneous (the only delays are computational) and that all links of a process and the contents of all messages in message queues are available to the local kernel. Assume that all system components are reliable.
The Problem, part A: Describe a distributed deadlock detection algorithm that would work in this context. This algorithm must not rely on timeout, nor may it use any data structures other than those described above! It will not be efficient!
The Problem, part B: What improvements to the above algorithm are possible if kernels maintain an inverted data structure -- that is, in addition to the link table of each process, the kernels maintain a table, with each incoming message queue, of which processes have links to that queue.
The Problem, part C: Discuss the additional kernel to kernel message traffic required to maintain this inverted data structure as links are manipulated by user programs.
The Problem: Explain how scatter-gather primitives in the DMA hardware used to send messages can significantly reduce the overhead imposed by a protocol hierarchy.
On such a system, multiple copies must be saved for any data that is to be preserved through any length of time, and these copies must constantly be monitored to assure that corrupted copies are discarded and replaced with replicas of surviving copies.
The problem: Propose a reasonable approach to implementing stable storage in such a setting.
Assume the primitive spawn(p,x,y...) creates a new thread running procedure p(x,y...), and that the primitive die() allows a thread to kill itself. Threads may synchronize using thread semaphores with the operations thP(s) and thV(s). If no memory is available, spawn(...) will block until there is enough memory to create the new thread.
The problem, part A: Write a procedure called nbsend(...) that allows a program to send a message without being blocked until that message is received. Consider the following example application:
repeat i = i+1; nbsend(i,...) foreverThis should send all of the integers to the recipient, and it should block only if the memory capacity of the sending system is filled with copies of the message and possibly with other ephemera.
The problem, part B: What assumptions do you have to make about the underlying system for your answer to part A to guarantee that the messages are received in the same order that they were sent?
The problem, part C: Describe a protocol that could be used on the underlying system to implement the blocking send and receive primitives.