Midterm II Solutions and Comments

Part of materials for 22C:50, Fall 2003
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Grade Distributions

Exam II Scores

Mean   = 3.8   X
Median = 3.6   X
             X X
             X X   X
             X X   X   X   X
             X X X X X X X X
   ____X_____X_X_X_X_X_X_X_X_X_X___X___________
    0 . 1 . 2 . 3 . 4 . 5 . 6 . 7 . 8 . 9 . 10

Total Scores: 2 Exams plus Homeworks 1-9 plus Machine Problems 1-3

Mean   = 47.4                                          X
Median = 47                                  X X X   X X X
                                           X X X X X X X X X X
   __________________________________X_X_X_X_X_X_X_X_X_X_X_X_X_X_X__
    0 . 4 . 8 . 12. 16. 20. 24. 28. 32. 36. 40. 44. 48. 52. 56. 60.
A very tentative grade scale            D      C      B      A

The Exam

The standard Unix/Linux shells usually treat the first item on each line as the name of a file from which an application is to be launched. When they launch an application, they create a new environment for that application, including standard input, standard output, and many other attributes (here the term environment is used in its broad sense). Explain why all Unix/Linux shells must have some built-in commands that are executed directly and do not involve launching applications. Give some example built-in commands that justify this reasoning. (2 points)

Commands that influence the environment of later commands must be executed as built-in commands to the shell because if you launch an application under Unix, any changes it makes to its environment are local to that application and have no effect after it terminates. Examples of such commands are cd and setenv (or any other command that sets shell variables). The most extreme example is exit, which terminates the current shell. Termination of an application launched by the shell cannot terminate the shell itself.
2 did well here, 8 had 3/4 credit and half had at least 1/4 credit. A popular incorrect answer that earned partial credit was to focus on input/output redirection using the < and > operators. These are not commands, they are operators, so while they are built-in and cannot be done by launching applications, they aren't really good examples.
Many listed ls, mkdir and rm as built-ins. These are not built in! All you needed to do to learn this was to try them under the example shell from the machine problem and look at that shell and notice that they aren't built in.
In a C program running on a Unix or Linux system, the code contains ch=getc(stdin), and some time later, the user hits a key on the keyboard. Getting the character from the keyboard to the program involves:
- entry to an interrupt service routine,
- return from the interrupt service routine,
- a read() call,
- return from the read() call,
- enqueue on the input or typeahead queue,
- dequeue from the input queue,
- a polling loop waiting for a non-empty queue, and
- transfer of data from a device register.
List these elements in the order in which they happen; for each, where is it in the layers of software between user and hardware. (2 points)

getc(), a middleware routine, called by user
a read() call, interface from middleware to kernel
a polling loop waiting for a non-empty queue, in read
The user hits a key on the keyboard
entry to an interrupt service routine, interface hardware to kernel
transfer of data from a device register, by the interrupt routine
enqueue on the input or typeahead queue, by the interrupt routine
return from the interrupt service routine
dequeue from the input queue, by read
return from the read() call, kernel to middleware
getc(), a middleware routine, returns to the user

Because so few gave the system layer involved, that part of the question was not graded. Among those who gave layers, the solutions were poor, so this aspect of the issue will be covered again on the final!
3 did well, 3 had only one item one place out of order, and half the class earned more than half credit. Errors were graded in terms of how far out of order an item was.
Many had answers that failed simple sanity tests: for example, entry to the keyboard interrupt service routine must be caused by hitting a key on the keyboard; answers that put these two out of order were clearly wrong.
Many virtual memory systems associate page numbers in the virtual address space with sector numbers on the backing store. In these systems, the concept of page frame applies only to main memory. The Berkeley Timesharing System, developed in the late 1960's, the backing store was divided into page frames just like main memory. Only when the decision was made to evict a page from main memory was a frame for that page allocated on the backing store, and that space was deallocated as soon as the page was brought back into main memory.
a) Why could the Berkeley scheme outperform systems that used fixed association of pages to locations in the backing store? (1 point)

This allows minimization of the latency delays involved in writing the page out to the backing store. In contrast, a fixed association would lead to an essentially random latency for these writes.
Only 1 gave a really good answer here, 3 more recognized latency issues. 13 got partial credit for noticing that the Berkeley scheme reduces the total memory requirement by avoiding the storage of duplicate copies of pages. While true, this has no useful effect on performance. Others got distracted by search algorithms and gave wrong answers, for example, asserting that a fixed association was computationally expensive compared to the Berkeley scheme, while in fact, fixed associations are almost free, while the Berkeley scheme, with its variable association, requires the expensive search.

b) Where would you save the backing store address of each page while that page is not in main memory? (1 point)

The page-frame field of the page-table entry for a page that is not in main memory is available, and is the obvious place to put the disk address of that page when it is in the backing store.
6 gave this answer, while 2 more said "in a page table", suggesting that, perhaps, there would be other page tables for this purpose, earning good partial credit. 5 said "in the frame table", an odd proposal, requiring complex search algorithms, but workable, and therefore worth half credit.
Suppose user program, running under a virtual memory operating system, calls the Unix/Linux kernel call write(d,buf,len). The size of the buffer is exactly (as luck would have it) the same as the size of the sector on the disk, and the current file position of file d is exactly aligned on a sector boundary of the file. The interface hardware supports direct memory access data transfers.
a) Assume the simplest DMA interface. What might force the operating system to make a memory-to-memory copy of the user's buffer prior to doing the DMA transfer? (1 point)

The obvious problem is, what happens if the user's buffer is split across non-contiguous page-frames.
Only 1 earned full credit on this, while 3 earned almost full credit for suggesting that the problem was pages that were not in main memory. (Why this should involve a memory-to-memory transfer is odd; just forcing a page fault on each page of the buffer before initiating the transfer would suffice).
Half credit was offered for the suggestion that doing the I/O from a system copy of the buffer would protect against user changes to the buffer after the call to write; while this could be an issue, it has nothing to do with all the constraints in the problem statement about buffer size, alignment and virtual memory. In sum, only half the class got a nonzero score here.

b) A DMA interface can handle scatter-read gather-write transfers if the controller takes not just one buffer address and one count, but a list of buffer addresses, each with a count, and then processes the entire list in one transfer. How does this help? (1 point)

If the physical memory holding the parts of the buffer is scattered across many page frames, a gather-write can write these to disk in one transfer without use of an intermediate buffer.
2 gave this explanation, and one more observed that this could be done without connecting the answer to part a. Only 6 more earned partial credit. The median score for the entire question (parts a and b) was 0.3 points, with only 2 scoring above 1.5 points.
Your boss tells you he wants to eliminate the inefficiencies in the Unix two-level file system (where the high level maps file names map to i-node numbers and then the low level maps i-node numbers to actual space on disk). His idea is to put the information Unix keeps in the i-node directly into the directory entry for each file.
a) What aspects of Unix file system semantics are not supported by this proposal? (1 point)

Links to files (that is, one file with multiple textual names) will not work with this alternative system, and the Unix approach to rebuilding the directory using redundancy in the directory/i-node scheme will be weakened if i-nodes are eliminated.
5 earned full credit mentioning links, while 1 earned full credit mentioning redundancy. Partial credit was available for some interesting wrong answers, many involved misunderstanding of the distinction between i-nodes and index sectors. Nothing about this proposal involved the elimination of index sectors. 12 students had nonzero scores on this part of the problem.

b) Given that new disk drives and even new storage technologies arrive on the market fairly frequently, why is this proposal likely to lead to higher development and maintenance costs? (1 point)

This proposal makes the directory structure depend on the type of drive used, so each new technology will require a rewrite of the entire directory manager instead of just the i-node interpreter, as in the Unix file system.
9 students said this well, and 9 more gave poor explanations that might have meant something like this, good for partial credit. The median score for this problem, parts a and b combined, was 1 point, and 2 students got full credit.