Final Answers and Commentary

22C:116, Fall 1995

Douglas W. Jones

Grade Distribution

The exam was graded on a 30 point scale, 2 points for each part of each problem.

The average exam score was 14.5; the median was 13.5, and the scores were distributed as follows:

                            X
                        X   X X
                    X X X X X X       X   X X
__________X_____X___X_X_X_X_X_X_X_X_X_X___X_X___X_X_____
 0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 25

Answers

  1. Problem, part A: When a page fault occurs on a machine with a software managed TLB, it is possible that the page fault handler would have to replace an address translation cache entry. A modified clock replacement algorithm would be the best choice for this. How could the software detect use of a cache entry if there is no hardware maintained mark bit?

    A solution: The software can retain a duplicate copy of the access rights for each cache entry in 3 of the 5 spare bits of the cache entry. If the hardware copy and the duplicate copy of the rights agree, the cache entry is considered recently used. If they disagree, the cache entry is less recently used. As the clock hand sweeps the cache looking for a less recently used entry, it invalidates the hardware access rights fields of the entries examined. When a fault occurs, if the cache entry is marked as less recently used, the duplicate copy of the rights is copied into the hardware field, recording the recent use; this is a soft page fault.
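    The sweep and the soft fault described above can be sketched as follows. This is a hypothetical simulation, not real TLB-management code: the entry layout (a hardware rights field plus a duplicate in the spare bits) follows the solution, while the class and function names are assumptions, and valid rights are assumed to be nonzero.

```python
# Sketch of modified-clock replacement for a software-managed TLB.
# Agreement between the hardware rights field and its duplicate copy
# stands in for the missing hardware mark bit.

NO_ACCESS = 0  # assumes valid rights values are nonzero

class TLBEntry:
    def __init__(self, page, frame, rights):
        self.page = page
        self.frame = frame
        self.hw_rights = rights    # field the hardware checks
        self.dup_rights = rights   # duplicate kept in spare bits

    def recently_used(self):
        return self.hw_rights == self.dup_rights

def clock_replace(tlb, hand):
    """Sweep until an entry not recently used is found; invalidate the
    hardware rights of each entry passed over so that a later soft
    fault can record its reuse."""
    while tlb[hand].recently_used():
        tlb[hand].hw_rights = NO_ACCESS    # clear the implicit mark
        hand = (hand + 1) % len(tlb)
    return hand                            # index of the victim entry

def soft_fault(entry):
    """On a fault against an entry whose fields disagree, restore the
    hardware rights from the duplicate, recording the recent use."""
    entry.hw_rights = entry.dup_rights
```

    Note that the sweep always terminates: once it has cleared an entry's hardware rights, that entry no longer looks recently used, so at worst one full revolution of the hand finds a victim.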

    Comments: Surprisingly, only six did well here. Many suggested solutions to this problem that can be summarized as "change the hardware to add a mark bit."

    Problem, part B: Suppose the higher level system model of the virtual address space is organized as follows:

        _________ _________ ___________ 
       |_________|_________|___________|
         10 bits   10 bits    12 bits
    
         segment   page in    byte in
                   segment     page
    
    What data structures would the system maintain to describe the address space, and how would the software use this to compose the cache entry for a page?

    A solution: The data structures would be those conventional for paged segmented virtual addressing:

         Segment Table   Page Table
            ______         ______          Page
       --->|_____o------->|______|        ______
           |______|       |_____o------->|      |
           |_____o----    |______|       |      |
           |______|       |______|       |______|
           |______|       |______|      4096 bytes
         1024 entries   1024 entries
    
    The page and segment table entries would contain rights (3 bits) plus a 20 bit frame number. The frame referenced by a segment table entry would hold a page table, while the frame referenced by a page table entry would hold a page frame.

    To compose a cache entry for some page, the 20 bit page number from the virtual address (as seen by hardware) would be broken into a 10 bit segment number and a 10 bit page number (the software's view), and these would be used to select a segment table entry and a page table entry. The cache entry would then be composed using the 20 bit page number, the 20 bit frame number from the selected page table entry, and the access rights from the page table entry, perhaps ANDed with the rights from the segment table entry.
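    The composition step can be sketched as follows. The field widths come from the problem statement (10 bit segment, 10 bit page, 12 bit byte-in-page; 3 rights bits and a 20 bit frame number per table entry); the dictionary representation of the tables is an illustrative assumption.

```python
# Sketch: compose a TLB cache entry from the two-level table above.
# A 32-bit virtual address is 10 bits of segment number, 10 bits of
# page-in-segment, and 12 bits of byte-in-page.

def compose_entry(vaddr, segment_table, page_tables):
    page_number = vaddr >> 12     # 20-bit page number seen by hardware
    segment = page_number >> 10   # software's view: high 10 bits
    page = page_number & 0x3FF    #                  low 10 bits

    seg_rights, pt_frame = segment_table[segment]   # frame holds a page table
    page_rights, frame = page_tables[pt_frame][page]

    # rights from the page table entry, ANDed with the segment's rights
    rights = page_rights & seg_rights
    return (page_number, frame, rights)
```

    For example, virtual address 0x1123 has page number 1, which splits into segment 0 and page 1; the entry pairs that page number with the frame number and combined rights found in the tables.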

    Comments: Surprisingly, only four did well here, although five more received a reasonable amount of partial credit, and four more were at least able to give the textbook data structure.

  2. Problem, part A: How would user code interact with an Amulet disk scheduler? How would a user request a read or write, how would the data be communicated, and how would the user learn of the completion of the requested action?

    A Solution: Clearly, both the scheduler and the application running under the Amulet kernel must be composed of chars, so communication of requests to the disk scheduler would be in the form of char calls, and communication of results would take the same form. Consider the following interface:

        call( read, d-w, [buffer,disk_addr,return,d] )
        call( write, d-w, [buffer,disk_addr,return,d] )
        call( return, d, [] )
    
    In the above, read and write are the interface chars to the disk scheduler, d is the user's deadline, w is the expected worst-case time for the scheduler to perform a disk operation, and return is a char provided by the user that the disk scheduler calls to signal completion of the disk operation.

    Comments: Eight did well here, and six more provided sketches of the right kind of answer. Unfortunately, some either didn't understand Amulet (proposing solutions involving semaphores or blocking process-oriented primitives) or regurgitated pseudocode for the Amulet char scheduler itself instead of saying anything about the problem of disk I/O under Amulet.

    Problem, part B: How would you structure your disk scheduler? What internal chars would be involved, and how would they interact?

    A Solution: The read and write chars would add a record of the requested disk operation to a queue of pending disk I/O requests. Disk I/O interrupts would schedule calls to the disk_service char that would remove pending requests from the queue and signal completion of disk operations by calling the return chars provided by users. The disk scheduling policy would be the cooperative result of the interaction of the read, write and disk_service chars on the pending request queue.
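    As a rough sketch of that structure (in Python rather than Amulet chars, with the scheduling policy reduced to FIFO for brevity; all names here are assumptions):

```python
from collections import deque

# Sketch of the scheduler's internals: read/write enqueue a request
# record; disk_service (which would be scheduled by a disk interrupt)
# dequeues one, performs it, and calls the user's return char.

pending = deque()   # queue of pending disk I/O requests

def read(buffer, disk_addr, return_char, deadline):
    pending.append(('read', buffer, disk_addr, return_char, deadline))

def write(buffer, disk_addr, return_char, deadline):
    pending.append(('write', buffer, disk_addr, return_char, deadline))

def disk_service(disk):
    """Stand-in for the char run on a disk interrupt; a real policy
    would pick a request by deadline rather than FIFO order."""
    if pending:
        op, buffer, addr, return_char, _ = pending.popleft()
        if op == 'read':
            buffer[:] = disk[addr]      # copy the block to the user
        else:
            disk[addr] = bytes(buffer)  # copy the user's data out
        return_char()                   # signal completion
```

    The queue itself embodies the cooperative policy: read and write only add requests, while the service routine alone decides which pending request to honor next.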

    Comments: Only four did well here, and many got no credit at all. In addition to those who confused disk I/O scheduling with process scheduling, there were quite a few who were distracted by internal detail and attempted to invent real-time disk scheduling disciplines or regurgitate details of the elevator algorithm instead of addressing the question, giving answers that had no connection to the Amulet context.

  3. Problem, part A: If Peterson's mutual exclusion algorithm is run on a multiprocessor where the different CPUs had different width paths to memory, so one CPU operated, for example, on 128 bit words, while another used an 8 bit path, what would you have to guarantee about the allocation of the shared variables, and about the code used to fetch and store them, so that the algorithm would still work?

    A Solution: The variables in Peterson's algorithm need only one bit each, so to make things work, what must be addressed is the problem of making that one bit of each variable equally accessible from each machine. If all machines use 8-bit variables, then updates to an 8-bit field on a 128 bit machine will be made by first reading a full word, then modifying 8 bits, and then writing. Although this is not inherently atomic, if no assignments are made to other 8-bit fields of the same 128-bit word, it will be safe. Thus, one solution is to have all variables allocated as 8-bit words, but to pad them all out to 128 bits with unused fields.
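    The hazard, and the padding fix, can be demonstrated with a small deterministic simulation: memory is a byte array, the wide CPU updates one byte by a read-modify-write of the whole 16-byte word, and the interleaving shown is the worst case. This illustrates the argument only; it is not real multiprocessor code.

```python
WORD = 16  # the wide CPU moves 128 bits (16 bytes) at a time

def wide_cpu_store(mem, addr, value, interleaved_store=None):
    """The 128-bit CPU updates one byte by reading the containing
    16-byte word, modifying one byte, and writing the whole word back.
    interleaved_store models an 8-bit CPU storing in mid-sequence."""
    base = (addr // WORD) * WORD
    word = bytearray(mem[base:base + WORD])  # read the full word
    if interleaved_store:
        interleaved_store()                  # narrow CPU writes now
    word[addr - base] = value                # modify one byte
    mem[base:base + WORD] = word             # write the full word back

# Two one-byte flags sharing one 128-bit word: the narrow CPU's store
# to flag B is overwritten by the wide CPU's write-back.
mem = bytearray(32)
A, B = 0, 1                                  # same 16-byte word
wide_cpu_store(mem, A, 1, lambda: mem.__setitem__(B, 1))
shared_word_lost_update = (mem[B] == 0)

# Padded to a 16-byte stride: each flag has a word to itself, so the
# wide CPU's read-modify-write cannot clobber the other flag.
mem = bytearray(32)
A, B = 0, 16                                 # separate words
wide_cpu_store(mem, A, 1, lambda: mem.__setitem__(B, 1))
padded_update_survives = (mem[B] == 1)
```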

    Comments: Five did well here; ten more gave partial answers (usually proposing that if all variables are 8-bit, that will suffice). Too many students proposed solving the problem by relying on lower level mutual exclusion mechanisms, for example, by using spin-locks to guard each reference to a shared variable.

    Problem, part B: Suppose you had a system with two processes communicating by messages using a blocking receive primitive, a buffered nonblocking send, and the possibility of using multiple threads. Suppose these processes need to achieve mutual exclusion and a decision is made to use Peterson's algorithm. How does a process decide when it can enter the section, and how is the shared "turn" variable managed?

    A Solution: Every "I am interested" message must be answered with an immediate reply. If an "I am interested" message is received between the time an inquiry is sent and the reply is received, there is a conflict over entry to the critical section. Therefore, the reply must take one of two forms, either "No conflict" or "Conflict".

    Note, as an aside, that it is possible that when two processes compete, call them A and B, only one of them, call it A, will detect the conflict using the above rules; but in this case, the "Conflict" reply from A to B must pass the "I am interested" message from A to B in transit, so B knows there is a conflict before it even receives A's inquiry, and B can easily defer processing of the "Conflict" notice until it receives the inquiry.

    Since the variable turn cannot be shared, we must have duplicate copies of turn for each process. The problem, then, is to assure that these copies remain consistent with each other. In the shared memory version of Peterson's algorithm, a process wishing to enter the section sets turn to its own ID, so in the absence of a conflict, receipt of "I am interested" should set interested_other to true and turn to other; similarly, the receipt of "No conflict" should set turn to self.

    When "I am interested" is received and a collision is known to have occurred, either because of prior receipt of "Conflict" or because we are between an inquiry and its response, interested_other should be set to true and turn should be complemented. Entry to the critical section is legal using the normal rules of Peterson's algorithm once the reply is received.
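    The rules above admit the following sketch for two processes, numbered 0 and 1. This is one consistent reading, not the only one: the `complemented` flag (which ensures each process complements turn exactly once per conflict, even if it later receives a "Conflict" reply as well) is an added assumption needed to keep the two copies of turn consistent, and all names are illustrative.

```python
# Message-passing Peterson's for two processes, each holding local
# copies of interested_other and turn; turn is a process ID (0 or 1).

class Proc:
    def __init__(self, me, turn):
        self.me = me
        self.interested_other = False
        self.turn = turn              # local copy, initially consistent
        self.awaiting_reply = False
        self.complemented = False     # complement turn once per conflict

    def inquire(self):
        self.awaiting_reply = True
        return ('interested', self.me)

    def on_interested(self, sender):
        self.interested_other = True
        if self.awaiting_reply:           # collision detected
            self.turn = 1 - self.turn     # complement turn
            self.complemented = True
            return ('conflict', self.me)
        self.turn = sender                # no conflict: turn = other
        return ('no_conflict', self.me)

    def on_reply(self, kind):
        self.awaiting_reply = False
        if kind == 'no_conflict':
            self.turn = self.me           # turn = self
        elif not self.complemented:
            self.turn = 1 - self.turn     # conflict: complement once
        self.complemented = False

    def may_enter(self):
        # normal Peterson rule, applied once the reply is in hand
        return (not self.interested_other) or self.turn == self.me
```

    In the symmetric collision, both sides complement turn exactly once from the same starting value, so the copies stay equal and exactly one process finds turn naming itself.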

    Comments: Eight did well here, and four more did reasonably, suggesting such things as division of each process into an agent thread (responsible for sending replies) and a primary thread that actually gets to enter the critical section. Unfortunately nine didn't recognize that shared variables were implicitly precluded by the problem statement, so they left turn shared, while doing everything else by message passing.

  4. Problem, part A: What security risks are posed by this access control scheme:

    Message from client to server:
    Network address of server
    Network address of client
    Resource ID
    Client ID
    Operation
    Parameters to operation

    The server must check that the ACL for the given resource allows the specified client to perform the indicated operation before that operation is performed. The client-ID field of the message contains a globally unique identifier for the client's domain.

    A Solution: The fundamental problem is that the client-ID field is sent "in the clear" so that anyone may use any client ID they wish. Servers may easily collect client IDs and misuse them, and clients may easily guess randomly chosen IDs and try them.

    Comments: Fifteen did well here; some were distracted by the idea that the "set access rights" operation was unprotected. In fact, this is protected by an access control list entry, and it is only vulnerable to misuse if others can claim the client ID needed to use that entry.

    Problem, part B: Explain the consequences of having clients use a different Client ID to deal with each resource, for example H(<Client_ID, Resource_ID>), where H is a one-way hashing function computed by the client, so that the server never knows the true identity of the client.

    A Solution: Assuming that H is publicly known, but that the actual client ID is kept as a tightly held secret by the client, this actually overcomes the problems with servers collecting client IDs. Clients can still make trial and error attacks, but if the valid IDs are few and far between, this can be made arbitrarily difficult. The weakness of the proposed scheme is that, to grant access to a particular client, the granting process must know the actual client ID, thus requiring that this secret be divulged when rights are distributed. This is a serious flaw.
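    The per-resource ID scheme can be sketched with an ordinary one-way hash; SHA-256 here stands in for H, and the field layout is an assumption.

```python
import hashlib

def per_resource_id(client_id, resource_id):
    """H(<Client_ID, Resource_ID>): the ID presented to the server
    depends on the resource, so a server that collects these IDs
    learns nothing usable against other resources."""
    return hashlib.sha256(client_id + b'|' + resource_id).hexdigest()

secret = b'tightly-held client secret'   # never leaves the client
id_for_a = per_resource_id(secret, b'resource-A')
id_for_b = per_resource_id(secret, b'resource-B')
```

    The two derived IDs are unrelated, which is the scheme's strength; its flaw shows up equally clearly: whoever writes the ACL entry for resource A must be told the derived ID, and to compute it themselves they would need the client's secret.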

    Comments: Five did well here, another twelve recognized that this proposal didn't defeat random guessing, but too many either thought that it corrected all deficiencies in the original scheme, failed to understand the distinction between network address and client ID, or had other severe problems.

  5. Problem, part A: After a process has migrated from machine to machine over the whole network, visiting each machine at least once, and never sending or being sent any messages during that time, what is the topology of the directed graph composed of the forwarding addresses for that process? Subsidiary questions include such things as, will the graph contain cycles, will it be connected, and how will the structure relate to the current location of the process?

    A Solution: In short, the graph will be an acyclic directed graph wherein there is exactly one path from each vertex to the vertex that models the current location of the process. Formally, such a structure is a spanning tree of the graph, although all edges of the graph have a counterintuitive direction (they point towards the root, not away from it).

    Comments: Ten did well here, and eight more understood a good part of what was going on.

    Problem, part B: In graph-theoretic terms, what is the maximum possible impact on the directed graph of forwarding addresses when a single message is sent from some machine to the process in question.

    The Solutions: There are two reasonable readings of the problem statement. One leads to the conclusion that every forwarding address along the path from sender to receiver is updated to point directly to the destination, and the other leads to the conclusion that every address along the path is updated to leapfrog its successor (splitting the path into two, each half the length). In the extreme, the former turns a long stringy tree into a flat tree (all leaves being children of the root), while the latter turns the long stringy tree into a two-branched tree half as tall.
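    The two readings can be contrasted on a chain of forwarding addresses; `forward[i]` gives machine i's forwarding address (None at the process's current location), and both functions are illustrative sketches.

```python
def point_at_destination(forward, start):
    """First reading: after delivery, every node on the path from
    start points directly at the destination (a flat tree)."""
    path, node = [], start
    while forward[node] is not None:
        path.append(node)
        node = forward[node]
    for n in path:
        forward[n] = node      # all become children of the root
    return node

def leapfrog(forward, start):
    """Second reading: each node on the path is updated to point at
    its successor's successor, halving the path length."""
    node = start
    while forward[node] is not None:
        nxt = forward[node]
        if forward[nxt] is not None:
            forward[node] = forward[nxt]   # skip over the successor
        node = nxt
    return node
```

    On the chain 0 -> 1 -> 2 -> 3 -> 4, the first reading leaves every node pointing at 4, while the second splits it into the two chains 0 -> 2 -> 4 and 1 -> 3 -> 4, each half as long.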

    Comments: Ten did well here, while an equal number did very poorly. Many apparently misread the question, missing the key phrase "on the directed graph", and as a result, giving answers that focused on the economics of the transaction.

  6. Problem, part A: What additions to the idea of a probe protocol are needed to build a complete garbage collector? Explain the added components of the protocol, and explain the top-level algorithm that drives the protocol and results in collection of unreachable objects.

    A Solution: The probe protocol given is effectively the marking phase of a garbage collector. If the probe is initiated from all of the roots of the graph of reachable items, it will mark all reachable items. To build a complete mark-sweep garbage collector from this, three things are needed: First, the probe protocol given includes no mechanism for detecting termination. Second, a sweep protocol is needed -- for example, a broadcast message to all servers saying they should delete all unmarked items and erase the marks on all marked items. Finally, there must be a well defined set of roots.
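    The resulting top-level algorithm can be sketched as follows. Here `refs` maps each object to the objects its capabilities reach; termination detection is simply the end of the recursion, which is exactly the piece a distributed probe protocol would have to implement explicitly.

```python
def probe(refs, marked, obj):
    """Marking phase: a probe propagates along capabilities,
    marking every reachable object exactly once."""
    if obj in marked:
        return
    marked.add(obj)
    for target in refs[obj]:
        probe(refs, marked, target)

def collect(refs, roots):
    """Top level: probe from every root, then sweep -- delete all
    unmarked objects and erase the marks on the survivors."""
    marked = set()
    for root in roots:
        probe(refs, marked, root)
    return {obj: targets for obj, targets in refs.items()
            if obj in marked}
```

    Note that an unreachable object is collected even if it still holds capabilities for reachable ones, and an unreachable cycle is collected because no probe ever enters it.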

    Comments: Six did well here, while twice as many had serious problems.

    Problem, part B: What problems would you expect to encounter in Amoeba with the implementation of such a protocol? In your answer, emphasize high level problems that might prevent the effective use of garbage collection in Amoeba.

    A Solution: The single most serious problem with such a protocol on Amoeba is that every process on the whole system must be involved, including user processes. As a rule, file servers (and perhaps other kinds of servers as well) don't know which data in their files is just data, and which is capabilities. Only the users of those files (or other objects) know, and every user must faithfully implement the protocol, because any failure to do so will lead to reachable objects that are not marked and are therefore collected. Related to this is the problem of finding the roots from which the probes must begin, since many of the roots are inside user processes.

    Comments: This was very difficult, with twenty receiving no credit at all.

  7. Background: Consider a distributed system that includes many file servers and directory servers. This system is similar to Amoeba in the sense that users operate on files and directories using network capabilities. Directories are logically mappings from names to capabilities, so each directory entry may refer to either a file or another directory.

    Directory servers, of course, use files to store directories. Thus, when a user presents a capability to a directory server, it may have to access a file in order to make use of that capability.

    Recall, also, that the standard on UNIX is that every directory contains two special entries, . (dot), a directory entry referring to itself, and .. (dot-dot), a directory entry referring to the parent directory. With the exception of the above special directory entries, a UNIX directory structure is a tree, with every directory being referenced from exactly one other directory, and exactly one root.

    Problem, part A: Given that an Amoeba-like distributed system makes an effort to accurately mimic the UNIX file system, what atomic transactions would you expect to be involved in creating and deleting files? In creating or deleting directories?

    A Solution: The creation and deletion of both files and directories should appear atomic, so each of these is an atomic transaction.

    Comments: A remarkable number went on at great length to give this essentially trivial answer. About half the class did well here, while many of the others went on at such length that they somehow managed to lose track of the question.

    Problem, part B: Assuming two-phase locking, are these transactions prone to deadlock?

    A Solution: All of these transactions involve at least one directory server and two file servers, and directory creation and deletion involves two directory servers. In the case of file creation and deletion, deadlock is not possible because there is an implicit hierarchy -- the directory server must be involved in the transaction first because it, in turn, would need to lock the file server where the directory is stored. The file being created or deleted can be locked independently of these, and will not conflict with them.

    In the case of directory creation, things are more complex. As with file creation and destruction, directories must inherently be claimed before the files used to represent them, but there is no necessary order of claiming the directories. Thus, in the extreme case, if two users try to delete the same directory, but one claims the parent first, and then the child, while the other claims the child first, and then the parent, a deadlock would result. This can be trivially eliminated by imposing the obvious locking order -- directories must be locked in hierarchical order, parent before child.
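    The ordering rule can be sketched as follows: every transaction sorts the directories it needs by depth before claiming locks, so two transactions over the same parent/child pair always claim in the same order and the circular wait needed for deadlock cannot arise. Names and the depth map are illustrative assumptions.

```python
def lock_order(directories, depth):
    """Sort the directories a transaction must lock so that every
    parent is claimed before any of its children; with all
    transactions obeying this order, no cycle of waits can form."""
    return sorted(directories, key=lambda d: depth[d])

depth = {'/': 0, '/a': 1, '/a/b': 2}          # hierarchical depth
t1 = lock_order(['/a/b', '/a'], depth)        # one deletion of /a/b
t2 = lock_order(['/a', '/a/b'], depth)        # a competing transaction
```

    Both transactions end up claiming /a before /a/b, so whichever gets /a first simply runs to completion while the other waits.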

    Comments: Only four did well here, with five more suggesting an overkill order (lock the entire tree from the root down to the directory in which creation or deletion takes place), and a surprising number claiming that, because there was an ordering that avoided deadlock, that deadlock was not a problem -- note that the fact that a problem can be solved does not imply that no problem existed!

    Problem, part C: To what extent do the transactions you have identified span multiple servers in the distributed environment described here? What kinds of commit protocols and deadlock detection protocols are implied by this?

    A Solution: All of these transactions span multiple servers. As a result, two-phase commit protocols are appropriate, and if dynamic deadlock detection is needed, a probe protocol for a single-resource model would be appropriate.
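    A minimal sketch of the two-phase commit this implies: the coordinator asks every participating server to prepare, and commits only if all vote yes. Server behavior is simulated here, not real Amoeba code, and all names are assumptions.

```python
def two_phase_commit(servers):
    """Phase 1: ask every server to prepare and collect votes.
    Phase 2: commit everywhere only if all voted yes, else abort."""
    votes = [server.prepare() for server in servers]   # phase 1
    if all(votes):
        for server in servers:
            server.commit()                            # phase 2
        return 'committed'
    for server in servers:
        server.abort()
    return 'aborted'

class FakeServer:
    """Stand-in for a file or directory server in the transaction."""
    def __init__(self, vote):
        self.vote, self.state = vote, 'ready'
    def prepare(self):
        self.state = 'prepared'
        return self.vote
    def commit(self):
        self.state = 'committed'
    def abort(self):
        self.state = 'aborted'
```

    A single server voting no is enough to abort the whole transaction, which is why one commit protocol must span the directory and file servers rather than letting each commit independently.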

    Comments: A surprising number suggested that this was an and-model problem!