Final Comments and Solutions

22C:116, Spring 1999

Douglas W. Jones

Exam Grade Distribution

Mean   = 14.0
Median = 14.0
                 X     X                 X
    _________X___X_X_X_X___X___X___X_____X_X_______X_X_X_X_______X____
    . 7 . 8 . 9 . 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22.
Exam Grades: C   +|- - B   B   B   B + +|- - A   A   A   A + +

Exam Solutions

  1. Thunks may be used to implement pass-by-reference semantics in the context of parameters to an RPC as follows:

    Each reference parameter is passed as a pair of thunks, one available to use for examining the referenced variable, and one available to use for assignment to the referenced variable. The thunks reside in the same address space as the referenced variable, and thus, they are in a different address space from the remote procedure to which they are passed. As a result, RPC mechanisms are used for each call to a thunk.
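
    As a purely local illustration of the idea (the names read_thunk, write_thunk, and remote_increment below are made up for this sketch, and the RPC machinery that would actually carry each thunk call back to the caller's machine is omitted), the pair of thunks for one reference parameter might look like this:

    #include <stdio.h>

    static int x;                              /* the referenced variable */

    static int  read_thunk( void )   { return x; }   /* examine x   */
    static void write_thunk( int v ) { x = v; }      /* assign to x */

    /* The "remote" procedure receives the two thunks in place of a pointer.
       In a real RPC, each call through these pointers would itself be an
       RPC back into the caller's address space. */
    static void remote_increment( int (*rd)(void), void (*wr)(int) )
    {
        wr( rd() + 1 );
    }

    int main( void )
    {
        x = 41;
        remote_increment( read_thunk, write_thunk );
        printf( "%d\n", x );                   /* prints 42 */
        return 0;
    }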

    5 gave good answers. 5 more gave similar wrong answers involving passing the actual pointer to the remote procedure, which would then pass the pointer back to some function on the calling machine that the answers mis-described as a thunk. Since this function was not itself passed as a parameter, it is not a thunk, but this solution would work, so this common wrong answer was given significant credit. Vague, confused, or entirely wrong answers made up the remainder of the responses; this was particularly disappointing because this question was taken almost literally from the study questions distributed a week in advance.

  2. Part A: If an Amoeba client wishes to use thunks to pass parameters to an Amoeba server, the client must create a thunk-server. This thunk server must be a new thread of the calling process. If it were a distinct process, it would not share the same address space as the caller, and therefore, it would be unable to address the actual parameter. The thunk server cannot be part of the calling thread because the calling thread is blocked by trans() for the duration of the RPC.
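
    As a rough sketch of this structure (POSIX pthread_create is used here for concreteness in place of Amoeba's own thread-creation call, and the body of the request loop is only described in comments), the caller might set things up as follows:

    #include <pthread.h>

    static int actual_parameter;       /* lives in the caller's address space */

    /* The thunk server: a separate thread of the calling process.  It can
       address actual_parameter directly because it shares the caller's
       address space; a separate process could not. */
    static void* thunk_server_loop( void* arg )
    {
        (void) arg;
        for (;;) {
            /* getreq(): wait for a thunk request from the remote server;
               examine or assign actual_parameter as requested;
               putrep(): send the reply.  (Amoeba calls elided.) */
        }
        return NULL;
    }

    void do_rpc_with_thunks( void )
    {
        pthread_t server;
        pthread_create( &server, NULL, thunk_server_loop, NULL );

        /* trans() now blocks this thread for the duration of the RPC, which
           is why the thunk server must be a separate thread. */
    }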

    4 gave good answers; again, the number was disappointingly small because this question was directly based on a study question. 7 gave faulty reasons for rejecting use of a new thread, 4 gave faulty reasons involving use of a separate process, and 2 gave faulty reasons involving use of part of the calling thread. 9 went on, sometimes at great length, about various irrelevant issues.

    Part B: When a client thread wishes to pass a thunk to an Amoeba server, the actual parameter is a capability. The capability is a reference to the thunk, so the recipient uses it to make a call to trans() in order to perform the RPC to the thunk.
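
    In outline (the header layout and the trans() signature below are simplified stand-ins declared locally for this sketch, not the exact Amoeba interface), the recipient's use of the capability looks like this:

    /* Simplified, assumed shapes of the Amoeba types and transaction call. */
    typedef struct { unsigned char bytes[6]; } port;
    typedef struct { port h_port; long h_object; long h_command; } header;
    extern int trans( header* req, char* reqbuf, int reqlen,
                      header* rep, char* repbuf, int replen );

    /* Run on the server side whenever the thunk must be evaluated; cap_port
       and cap_object are taken from the capability that was passed as the
       actual parameter. */
    void call_thunk( port cap_port, long cap_object, char* value, int len )
    {
        header req, rep;
        req.h_port    = cap_port;       /* put-port of the thunk server      */
        req.h_object  = cap_object;     /* which thunk of this RPC to invoke */
        req.h_command = 0;              /* e.g. "examine"; an assignment
                                           would carry the new value in the
                                           request buffer instead            */
        trans( &req, NULL, 0, &rep, value, len );
    }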

    There was only one good answer given to this part; 5 more gave vague answers that were at least somewhat relevant. Most earned no credit by giving answers that went into sometimes considerable detail about Amoeba capabilities without saying anything about thunks or anything else specific to the question at hand.

    Part C: When a client thread wishes to pass a thunk to an Amoeba server, the actual parameter is a capability. The client creates the thunk server, so of course, the client creates all capabilities for the services of the thunk server. Specifically, in creating the thunk server itself, the client assigns the get-port number for the server, and therefore, it can compute the put-port field needed to create the capability. The object ID field of the capability can be used to identify each particular thunk that is relevant to this RPC, and the check field can be assigned to prevent unauthorized use of the thunks by other users.
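
    A sketch of how the client might fill in such a capability is given below; the field names, the capability layout, and the one_way() function that derives a put-port from a get-port are illustrative stand-ins, not the actual Amoeba definitions.

    typedef struct { unsigned char bytes[6]; } port;

    typedef struct {
        port          cap_port;      /* put-port of the thunk server         */
        unsigned long cap_object;    /* identifies one thunk of this RPC     */
        unsigned long cap_check;     /* guards against unauthorized use      */
    } capability;

    extern port one_way( port get_port );   /* assumed: get-port -> put-port */

    /* The client created the thunk server, so it knows the get-port and can
       build every capability itself -- no "create" RPC is needed. */
    capability make_thunk_cap( port get_port, unsigned long thunk_id,
                               unsigned long secret )
    {
        capability cap;
        cap.cap_port   = one_way( get_port );
        cap.cap_object = thunk_id;
        cap.cap_check  = secret;     /* later verified by the thunk server   */
        return cap;
    }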

    2 gave good answers; 5 suggested that the capability could be obtained by a call to the create operation of the thunk server. This amounts to circular reasoning! The caller of the create operation is the very same function that created the thunk server in the first place, so having the thunk-server support a create operation solves no problems at all! Most answers given were entirely wrong.

  3. Part A: The top-level design of the code given on the exam contained a major error! Specifically, if there are no other ready threads at the time thread_read() is called, the thread package will terminate with an error message when thread_read() calls thread_wait().

    7 gave essentially this answer. 2 earned partial credit for noticing that the variable d was overloaded -- this was an unintentional error on my part, and although it was clearly not a top-level design error, I gave credit for it because once found, it could easily prevent other errors from being noticed. 7 students earned no credit.

    Part B: The structure done-rec was added to the given code because simply passing the semaphore as a parameter forces the delayed character count dcc to be referenced as a global variable. If this is done, two concurrent or overlapping calls to thread_read() could lead to one I/O completion corrupting the value of dcc intended for use by the other I/O operation.
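
    A minimal sketch of such a record, assuming the semaphore type and field names (these are not the exam's actual declarations):

    struct thread_semaphore;                 /* from the thread package       */

    /* One of these per call to thread_read(), passed through to the I/O
       completion callback, so that two overlapping reads each keep a private
       delayed character count. */
    struct done_rec {
        struct thread_semaphore* done;       /* signalled on I/O completion   */
        int                      dcc;        /* delayed character count       */
    };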

    8 earned full credit here; 3 gave garbled or vague answers that seemed worthy of partial credit. The remainder earned no credit.

    Part C: The implementation of nb_read() for a disk file must include two distinct calls to the done() callback. One of these belongs in the code for completing an I/O operation by reading from the in-memory sector buffer, and the other belongs in the code in the disk interrupt service routine for completing a read by performing actual I/O. No matter how many sectors are found in the cache and how many require disk I/O, all reads must complete in one of these two ways, so there are no other cases to consider.
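
    In outline (cache_lookup, queue_disk_read, finished_request, and struct request are hypothetical names used only for this sketch), the two call sites look like this:

    struct request { void (*done)( void* ); void* arg; };
    extern int  cache_lookup( int sector, char* buf );
    extern void queue_disk_read( int sector, char* buf,
                                 void (*done)( void* ), void* arg );
    extern struct request* finished_request( void );

    /* Call site 1: the sector is already in the in-memory cache, so the read
       completes immediately in the caller's context. */
    void nb_read( int sector, char* buf, void (*done)( void* ), void* arg )
    {
        if (cache_lookup( sector, buf ))
            done( arg );
        else
            queue_disk_read( sector, buf, done, arg );
    }

    /* Call site 2: in the disk interrupt service routine, when the queued
       transfer finishes. */
    void disk_isr( void )
    {
        struct request* r = finished_request();
        r->done( r->arg );
    }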

    2 gave good answers here, 4 more gave vague answers worthy of partial credit, and 5 got half credit for answers that missed the possibility of completing a read using data from the sector cache in memory, despite the fact that this cache was explicitly mentioned in the question. The remaining 7 received no credit.

  4. Part A: When two elections are initiated concurrently, the extra election can be eliminated by incorporating the unique ID number of the originator of each election into the messages concerning that election. When a machine begins participating in an election, it remembers this number until that election terminates. When a machine receives a second election notice during the conduct of an election, if the second notice originated on a machine with a winning number, the recipient begins participating in the new election and forgets about the old one. If the second notice originated on a losing machine, the second notice is discarded. As a result, only one election will run to completion, and all machines will participate in that election.
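
    The rule for resolving concurrent elections can be stated in a few lines of code; current_originator and accept_notice are hypothetical names, and the assumption here is that the lower originator ID wins:

    static long current_originator = -1;     /* -1: not in any election      */

    /* Called when an election notice arrives.  Returns nonzero if the
       recipient should (begin to) participate in that election, zero if the
       notice should simply be discarded. */
    int accept_notice( long originator )
    {
        if (current_originator == -1 || originator < current_originator) {
            current_originator = originator; /* forget the losing election   */
            return 1;
        }
        return 0;                            /* notice from a loser: drop it */
    }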

    13 gave good answers here -- this was to be expected, as this question was based on a study question. Two more suggested using timestamps instead of process or machine ID numbers; this is incorrect, but because all other details were correct, these received good partial credit.

    Part B: The election algorithm described did not survive many faults. Specifically, if any machine fails between the time it first begins participating in an election and the time that election terminates, the algorithm will block indefinitely! The solution to this is to have each participant, including the root, start a timer when the election begins, and if this timer expires, begin a new election. Anything short of starting a new election is vulnerable to failures in the root, so less extreme recovery actions are merely optimizations.
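
    A minimal sketch of that recovery rule, with start_timer, cancel_timer, start_new_election, and the timeout value all hypothetical:

    #define ELECTION_TIMEOUT 5               /* seconds; arbitrary choice    */

    extern void start_timer( int seconds );
    extern void cancel_timer( void );
    extern void start_new_election( void );

    /* Every participant, including the root, arms a timer when it joins an
       election; if the election has not terminated when the timer fires, it
       starts a fresh election from scratch. */
    void on_join_election( void )  { start_timer( ELECTION_TIMEOUT ); }
    void on_election_done( void )  { cancel_timer(); }
    void on_timer_expired( void )  { start_new_election(); }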

    13 gave valid examples of failures the given algorithm could not survive, and 6 gave complete fixes to make the algorithm fault tolerant. 5 stopped after saying "use a timeout" and either said nothing about the action to take after the timeout or gave an incorrect response. 3 more gave solutions that omitted timeouts. 3 received no credit.

  5. Part A: If users A and B wish to share a binary tree or a linked list in page X, they must contend with the fact that page X may be in different virtual addresses in the two address spaces, and thus, the meanings of pointers may differ. Specifically, user A may see pointer p as a reference to object q, while user B sees the same pointer as referring to an entirely different object.

    6 gave good answers. 9 received no credit. Many wrong answers somehow failed to recognize that the actual memory was shared between users A and B.

    Part B: This problem can be solved by storing relative pointers in the data structure. If user A stores the structure at address y and user B stores the structure at address z, a pointer p extracted from this structure would be correctly interpreted if A used the value p+y and B used the value p+z.
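
    A small self-contained demonstration of the relative-pointer idea (the node layout and the 4096-byte buffer standing in for page X are invented for this sketch):

    #include <stdio.h>
    #include <stdlib.h>
    #include <stddef.h>

    /* A list node whose "next" field is a byte offset from the start of the
       shared page rather than an absolute virtual address. */
    struct node { int value; ptrdiff_t next; };      /* next < 0: end of list */

    int main( void )
    {
        char* page = malloc( 4096 );                 /* stands in for page X  */
        struct node* a = (struct node*) (page + 0);
        struct node* b = (struct node*) (page + 64);

        a->value = 1;  a->next = 64;                 /* byte offset of b      */
        b->value = 2;  b->next = -1;

        /* Each user adds offsets to wherever the page happens to be mapped
           in its own address space; here the variable page plays that role. */
        struct node* p = (struct node*) page;
        for (;;) {
            printf( "%d\n", p->value );
            if (p->next < 0) break;
            p = (struct node*) (page + p->next);
        }
        free( page );
        return 0;
    }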

    2 gave this answer, 2 gave answers that required the kernel to interpret all pointers in shared structures (very strange) while 4 somehow got distracted with questions of distributed shared memory in Mach. The remainder gave answers that were even harder to classify.

    Part C: Normally, the frame-table entry for a page would indicate what address space contained the page and what address in that space. In the case of a shared page, the entry must indicate all address spaces involved, and for each address space, what address in that space. This clearly requires that the frame-table entry contain, for shared pages, a pointer to a list of entries, one for each user that shares the page.
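
    A sketch of such an entry (the names are illustrative, not taken from any particular kernel):

    /* One of these per address space that shares the frame. */
    struct mapping {
        int             address_space;       /* which address space          */
        unsigned long   virtual_address;     /* where the page sits in it    */
        struct mapping* next;
    };

    struct frame_table_entry {
        int             in_use;
        struct mapping* mappings;            /* list of all sharers; a single
                                                node for an unshared page    */
    };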

    Full credit was rare, but 7 understood the need for back-references to each address space sharing the page, while omitting reference to the need for a list or other dynamically-sized data structure. 5 gave "textbook" answers about the structure of a generic frame table, with no reference to the impact of sharing.

  6. The most effective experiments for determining the size of the available cache inside the server would involve timed read operations on a file. The first read for any sector of a file brings that sector into cache. If the second read of the same sector is fast, the sector is still in cache; if the second read is slow, the sector must have been flushed from cache. Therefore, read a file of N sectors once to load the cache, then read it again, one sector at a time, measuring the time taken to read each sector. If N is smaller than the cache size, in sectors, all times will be fast; if N is larger, some reads will be slow. Doing this for successively larger values of N will determine the cache size.
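
    One way the probe might be coded, assuming an ordinary POSIX read interface to the file, a 512-byte sector, and a placeholder file name and time threshold:

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <time.h>

    #define SECTOR 512

    /* Time one read of sector i, in seconds. */
    static double timed_read( int fd, long i, char* buf )
    {
        struct timespec t0, t1;
        lseek( fd, i * SECTOR, SEEK_SET );
        clock_gettime( CLOCK_MONOTONIC, &t0 );
        read( fd, buf, SECTOR );
        clock_gettime( CLOCK_MONOTONIC, &t1 );
        return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
    }

    int main( void )
    {
        char buf[SECTOR];
        int  fd = open( "probe_file", O_RDONLY );     /* placeholder name     */
        for (long n = 1; n <= 4096; n *= 2) {         /* successively larger N */
            long slow = 0;
            for (long i = 0; i < n; i++) timed_read( fd, i, buf );   /* fill   */
            for (long i = 0; i < n; i++)                             /* reread */
                if (timed_read( fd, i, buf ) > 0.001) slow++;  /* > 1 ms: slow */
            printf( "N = %5ld sectors: %ld slow re-reads\n", n, slow );
        }
        close( fd );
        return 0;
    }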

    6 proposed essentially this scheme. 1 more forgot to pre-fill the cache, and 3 earned partial credit with an interesting but incorrect method based on the ratio of the times taken for cached and non-cached reads.

  7. Part A: If a user wishes to write a user-level page-fault handler for this system, the only reasonable page replacement policy is FIFO.

    11 said FIFO, and 7 said random, for no loss of credit. In fact, it is difficult to make random work here because the only easy way to find out if a page is already mapped is to unmap it, and this can't be done until after the page is written to disk. 5 suggested clock replacement, 7 suggested LRU replacement, and 2 suggested Belady's optimal policy; none of these is applicable because all of these rely on information that is unavailable using the kernel interface that was given.

    Part B: Here is complete code (ignoring the issue of page replacement policy) for the user-level page-fault handler.

    /* somewhere in the initialization code */
    on_fault( handler );
    
    /* The handler, called with the faulting virtual address */
    void handler( void* va )
    {
        void* a;
    
        /* evict pages until a frame is free and va can be mapped */
        while (map( va ) == false) {
            a = page_to_replace();                /* victim page (FIFO)       */
            write( backing_file, a, page_size );  /* save victim's contents   */
            (void) unmap( a );                    /* release its frame        */
        }
    
        /* the page at va is now mapped; fill it from backing store */
        read( backing_file, va, page_size );
        remember_to_replace_later( va );          /* add va to the FIFO queue */
        return;
    }
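
    One possible FIFO implementation of the two helper routines used above, consistent with the Part A conclusion; the fixed queue size is an arbitrary choice for this sketch:

    #define MAX_RESIDENT 256                   /* arbitrary for this sketch    */

    static void* fifo_queue[MAX_RESIDENT];
    static int   head = 0, tail = 0;           /* oldest resident page at head */

    /* Record a page that has just been mapped in, for later eviction. */
    void remember_to_replace_later( void* va )
    {
        fifo_queue[tail] = va;
        tail = (tail + 1) % MAX_RESIDENT;
    }

    /* Choose the victim: the page that has been resident the longest. */
    void* page_to_replace( void )
    {
        void* va = fifo_queue[head];
        head = (head + 1) % MAX_RESIDENT;
        return va;
    }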
    

    3 gave good answers, and 5 gave good answers except that they omitted copying the pages to and from backing store. Note that the while loop in the answer was not required for full credit, although it is, in fact, necessary; the need for it is subtle. 4 gave answers that hinted at fragmentary understanding of the problem, earning half credit, while 3 gave essentially textbook code for a kernel-level fault handler, for minimal credit.