Exam 2: Final

Solutions and Commentary

Part of the homework for 22C:112, Spring 2009
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Grade Distributions

Final Exam Distribution

Median = 7.5            X
Mean   = 8.33           X X
                        X X
      X       X   X   X X X   X X     X   X                 X
____X_X___X___X___X_X_X_X_X___X_X_____X_X_X_______X_________X___________X___
 . 2 . 3 . 4 . 5 . 6 . 7 . 8 . 9 . 10. 11. 12. 13. 14. 15. 16. 17. 18. 19 .

Midterm plus Final Distribution

Median = 11.3           X
Mean   = 12.63        X X
                  X   X X   X
                X X   X X X X X X X
 _____________X_X_X_X_X_X_X_X_X_X_X___X_X_____X_X_X___X___________
   0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30

Machine Problems 1 to 5 Distribution

Median = 18.5                                       X
Mean   = 18.15                      X   X   X       X
                                  X X   X   X     X X
                                  X X   X   X   X X X
 _____________X_____X_X_____X_X_X_X_X_X_X_X_X___X_X_X_____________
   0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30

Top 10 out of 12 Homework Distribution

                                                          X X  
                                                          X X  
Median = 27.8                                             X X  
Mean   = 26.48                                          X X X  
                                                        X X X X
                                                  X     X X X X
                                                  X     X X X X
 _____________________________________X___X_X___X_X___X_X_X_X_X___
   0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30

Total Score Distribution

Median = 57.6                       X X     X  
Mean   = 57.27                      X X     X  
                                    X X     X X
                          X         X X X   X X           X
 _____________X_____X___X_X_X_______X_X_X_X_X_X___X_X_____X_X_X___
   20. 24. 28. 32. 36. 40. 44. 48. 52. 56. 60. 64. 68. 72. 76. 80
Grade Scale      D    +   -   C   +   -   B   +   -    A    +

Exam Quesitons and Solutions

Background: In an ideal world, when a Unix program calls read(fd,buf,len) bytes read from the device would be directly transferred to buf.
a) Suppose the user buffer is in demand paged virtual memory and the I/O driver directly moves data from the device to the user's buffer. Discuss the likely consequences of this in the event that some page of the user's buffer is not currently in main memory.

Note that the interrupt handler is the part of the driver that typically moves the data. This means that the interrupt handler must now recognize and handle page faults -- a time consuming operation that involves initiating and awaiting completion of at least one disk read and possibly a preliminary disk write.
1/4 of the class gave good answers. 1/8 earned no credit. Typical answers earning partial credit suggested that the likely problem was data loss. Data loss would be a possible consequence, but without an explanation of why data might be lost, this answer is best classified as jumping to an unwarranted conclusion. (There is a way to eliminate page faults in the interrupt handler, but nobody came close to suggesting it.)

b) Suppose the user buffer is in demand paged virtual memory. Explain how the problems you identified in part a) are alleviated by having the interrupt service routine transfer the data to an intermediate buffer before transfer to the user's address space.

Because the intermediate buffer is guaranteed to be in main memory, the interrupt handler no-longer needs to worry about causing page faults. The user-side service routine would be responsible for copying data between the user address space and the intermediate buffer. This copy operation is far easier to interrupt to handle a page fault.
Only 2 did well here, but half the class got partial credit for interesting but incompletely thought-out answers. Common answers held that the use of an intermediate buffer eliminated the problem of page faults. It does not, it only moves where they are encountered.

c) Suppose the user's buffer is smaller than a disk sector and fd refers to a disk file. Explain why the interrupt service routine for the device will probably transfer the data to an intermediate buffer before transfer to the user's address space.

Since disk drivers typically transfer entire sectors to or from disk, an intermediate buffer must be use to read the entire sector from disk first, and then the desired fragment of this buffer must be copied to the user's buffer. Things are more complex if the user's desired read spans two disk sectors.
1/4 did well here. 1/4 more earned partial credit.

d) Can the buffers used in parts b) and c) be the same buffer, or is it necessary to copy each data item from one intermediate buffer to another before delivery to the user? Justify your answer.

These can be the same buffer. The natural size for this buffer is one disk sector, and it can be guaranteed to be in main memory. The interrupt service routine reads the buffer from disk, while the user-side code in the driver does the copy of the desired data to the user's buffer.
1/4 did well here. 1/3 earned partial credit, typically by saying yes without offering meaningful justificaitons for their answer.
Background: Consider a system with two input devices, a mouse and a keyboard. The mouse driver presents mouse input as a sequence of mouse events (move, button down, button up). The keyboard driver presents keyboard input as a sequence of keyboard events (key down, key up). Both drivers include event queues.
Programmers writing applications with a Window-Mouse-Icon-Pointer (WIMP) user interface need a single event queue that mixes mouse events with keyboard events. What is needed, therefore, is a "Y" connection to merge the keyboard event queue with the mouse event queue to make a combined event queue.
Assume that the three queues use compatable event records and assume that appropriate enqueue and dequeue operations exist for operating on queues of such event records. Each queue holds a sequence of event records, and each also incorporates several semaphores, and the enqueue and dequeue operations do all necessary waiting and signalling on these semaphores.
a) Consider this design choice for the "Y" connection: Consider using two processes, one to copy events from the keyboard queue to the combined queue, and one to copy events from the mouse queue to the combined queue. Outline the process that copies from the mouse queue, focusing on the control structure.
```
for (;;) { /* an infinite loop */
        combined_queue.enqueue( mouse_queue.dequeue() );
}
```
2 did well. 2/3 earned partial credit. Most solutions were far too complex, attempting to do too much. In addition, many solutions included no loop, so they only copied one item between the two queues. Misuse of semaphores was common, for example, writing code that began if semaphore ==, although this is not done in any normative code using semaphores.
b) Outline the code for the enqueue and dequeue operations, focusing on the wait and signal operations on the queue's semaphores. You may omit all details having to do with actually managing the queues of event records.
```
enqueue(queue q; event_record e){
        q->space.wait(); /* wait for free space in queue */
        q->mutex.wait(); /* begin critical section */
        /* do whatever is involved with enqueueing e in q */
        q->mutex.signal(); /* end critical section */
        q->data.signal(); /* signal that the queue now holds some data */
}
event_record enqueue(queue q){
        event_record e
        q->data.wait(); /* wait for data in queue */
        q->mutex.wait(); /* begin critical section */
        /* do whatever is involved with dequeueing the e from q */
        q->mutex.signal(); /* end critical section */
        q->space.signal(); /* signal that there is now some free space */
        return e;
}
```
This is pure cut-and-paste work from the notes. 1/6 of the class did well. 1/3 of the class earned partial credit, typically by offering mutual exclusion or a solution to the producer-consumer problem but not both.
c) What are the initial values of the different semaphores associated with each queue.
For each queue:
- data is zero;
- space is the available free space;
- mutex is one;

1/5 of the class did well here. 1/4 offered some correct initial values.

d) The 3 queues, each with an enqueue a dequeue, plus two processes to merge the queues, make a total of 8 software components. Just one of these components actually handles the central vertex of the "Y" and actually merges the two data streams. Identify it.

This is essentially a multiple choice question with all options presented in the problem statement. Of these options, the best answer is that the enqueue operation on the combined queue actually merges the two data streams.
1/2 did well here. A few earned partial credit for mentioning the combined queue (not distinguishing between enqueue and dequeue) or for mentioning the two merge processes (both of which call the enqueue routine on the combined queue).
Background: In the Unix file system, a directory is a file holding a sequence of directory entries, where each directory entry consists of an i-node number and the textual name of the file (relative to that directory). I-node numbers are the indices of i-nodes in the i-table, where i-nodes describe where on disk the file is stored and other information about the file. Where on the disk the file is stored is recorded using an irregular but wide tree structure rooted at the i-node.
The access rights associated with each i-node indicate whether the owner, members of the group or members of the public have read, write, and or execute access (a total of 9 bits). For directories, execute access means the right to read a directory entry in order to open a file without the right to read the whole directory in order to see what's there.
a) Why do i-nodes contain a count of the number of directory entries that refer to them?

So that the system can note when this count goes to zero, allowing the file to be deleted and the i-node to be marked as unused, available for reuse to describe some other file.
1/4 did well here. About 1/2 earned partial credit, typically for limiting their answers to the deletion of directories or for discussing error recovery in the file system.

b) Suppose a user calls open("/Users/jones/file", O_WRONLY, 0 ). What is the minimum number of primitive disk read operations required to determine if this open operation is legal? Describe, for each disk read operation, what is being read. Assume here that the operating system does not maintain any internal caches.

6 orr 7 disk reads:
- 0 - read the i-node for / (the root of the file system).
- 1 - read the data from the directory /
- 2 - read the i-node for /Users
- 3 - read the data from the directory /Users
- 4 - read the i-node for /Users/jones
- 5 - read the data from the directory /Users/jones
- 6 - read the i-node for /Users/jones/file
Only one got full credit and 2 more got generous partial credit. 1/4 listed only the 3 directories involved or only the 3 i-nodes. 1/2 listed only one i-node or one i-node plus one directory.

c) Consider the sequence of kernel calls write(f,buf,512); close(f). If disk sectors are 512 bytes, this will obviously cause one sector of data to be written to the the file. What other disk write is certain to take place, and what other disk write(s) might take place, depending on the number of previous writes to the file.

At the very least, closing the file requires the i-node to be written back to disk, updated with the new file length, the sector added to the file, and the date of last modification. If the file was large, it also requires updating one or more index sectors.
1 did well. Among those earning partial credit, 1/3 updated the i-node attributes, 1/6 updated the wrong i-node attributes, and 1/10 mentioned a sector pointer, vaguely.

d) Consider a sequence of write(f,"x",1) kernel calls. All Unix systems cache i-nodes for currently open files in main memory. What would be the performance penalty, for this example, if this were not done?

If i-nodes are not cached, each write of one byte would involve updating the file length on disk. This probably would involve both reading and writing the i-node.
2 did well. 1/3 earned partitial credit, mostly by saying that the i-node would be re-read once per write, without explaining why or noting the need to also re-write the i-node after each write.
Background: Most computers include a global interrupt-enable bit inside the CPU, and most devices include a local interrupt-enable bit in the device. A particular device can only request an interrupt if both enable bits are set and if the device is actually requesting an interrupt.
a) Why does entry to an interrupt service routine typically turn off the global interrupt-enable bit?

Between entry to the interrupt service routine and whatever action causes the device to retract its interrupt request, interrupts must be disabled. This is typically done by automatically turning off the global interrupt enable bit on entry to the service routine.
2 did well. 1/3 earned partial credit by saying this was to protect a critical section. 1/2 said the reason was to protect other interrupts, when in fact, this is not necessarily a problem.

b) Suppose the device's interrupt service routine is slow, and there are other devices that need fast response. How does the interrupt service routine go about allowing other devices to interrupt it while avoiding the problems you described above?

First disable the device specific interrupt for its own device (and possibly for other lower priority devices) and then enable interrupts globally.
1/6 got full credit. Only a few earned partial credit.

c) What part of the device driver will routinely turn on the device's local interrupt-enable bit? Please exclude any manipulation of the device's interrupt enable bit discussed in part b).

The user-side of the driver, after performing the requisite queue operaiton (dequeue from an input queue, enqueue in an output queue).
3 did well. 1/4 earned partial credit, largely for answers that were variously vague or confused.

d) Under what circumstances should the device driver turn off the device's local interrupt-enable bit? Please exclude any manipulation of the device's interrupt enable bit discussed in part b).

If there is no pending input/output, as indicated by the contents of the queue connecting the user-side of the driver with the interrupt handler, the appropriate thing to do is to disable the device's interrupt enable bit.
1/6 did well. 1/5 earned partial credit. In addition to vague answers, common problems involved misuse of terminology.
An open file in Unix is accessed using a file descriptor. Conceptually, this is a handle on an open file object, but actually, it is a small integer index into a table of open files. For example, when the user calls read(fd,buf,len), the system uses file_table[fd] to actually access the open file.
For this discussion, assume a machine with virtual memory, with one address space per user process and an MMU that is turned off whenever a trap-service routine or interrupt-service routine is called.
a) Why doesn't open() just return a pointer to the file object, so that the system can avoid the need for the open file table?

If the user, programing in a language like C had direct access to the pointer, the fact that C (and many other programming languages) are not type safe would lead to the potential for the user to corrupt the pointer, leading to the potential to corrupt arbitrary data within the operating system.
1/5 did well, 1/3 earned partial credit, mostly for vague answers, but in a few cases, for answers that were specific to standard input and standard output. In the latter case, note that the way Unix deals with standard input and standard output is a consequence of the use of file descriptors, not the cause.

b) Given that each process has an open file table, the system can use it to do something useful when the process calls exit(). What, and why would this be difficult without such a table?

It makes it easy to find all files a process has open, in order to close all of them when the process terminates.
1/2 got this right. 1/3 earned partial credit, usually for saying that it allows the system to track the files a process has open without suggesting any reason why this might be useful.

c) When the user calls read(fd,buf,len), why can't the system's read routine transfer data to the user's buffer by directly accessing the array buf[]?

The array buf[] is defined in terms of the virtual address space of the user process. This cannot be directly referenced from code that runs with the MMU turned off. (The system must therefore translate the addresses in the buffer to physical addresses using software.)
1/3 got full credit. Only 1 earned partial credit.

d) When the user calls read(), control eventually transfers to a routine called read() inside the operating system kernel. How does it get there?

The call to read() typically calls a user-side stub routine in the user's address space. The body of this routine may be as small as a single instruction, forcing a trap. The trap handler in the system identifies this trap as an attempt to call read() and transfers control to the system's read() routine. (This, in turn, identifies the device being read and calls the appropriate device dependent routine.)
3 did well. under 1/2 got partial credit for vague answers or statements about calling a system call.
Background: Consider the notation (A;B) meaning do A and B in the order given. In C, if A and B are statements, we can do this with {A;B}.
Consider the notation (A,B) meaing do A and B in any order or in parallel. In C on a Unix system, where fork() returns the process ID of the new process to its parent and wait() returns the process ID of the process that just exited, we can do this with:
```
    if (fork() == 0) { /* child */
            C;
            exit();
    } else { /* parent */
            D;
            wait();
    }
```
What we want is the following:

(A;((B;(C,D);E),(F;(G,H);I));J)

If we do the steps in the order ABCDEFGHIJ, we have satisfied the above order, but many other orders also satisfy this constraint, such as AFHGIBDCEJ and ABFDHCGEIJ. Our first hack at writing code to execute the statements in the correct order in C for a Unix system is:
```
    A;
    if (fork() == 0) { /* child */
            B;
            if (fork() == 0) { /* child */
                    C;
                    exit();
            } else { /* parent */
                    D;
                    wait();
            }
            E;
            exit();
    } else { /* parent */
            F;
            if (fork() == 0) { /* child */
                    G;
                    exit();
            } else { /* parent */
                    H;
                    wait();
            }
            I;
            wait();
    }
    J;
```
a) The above code, as it turns out, is wrong. When you run it, the statements are executed in the order AFHBDCEIGJ, an order that violates our rules. What is wrong with this and how did it happen?

The subsequence EIG is wrong. (I should not occur before G.) It happened because the exit() following E sent a signal that was accepted by the wait() following H.
1/6 did well here. 1/5 earned partial credit, mostly by correctly identifying that step I had occurred too soon.

b) Speaking broadly, that is, not trying to write detailed code, what do we need to do to fix the above code and make it conform to our spec?

The problem is that wait() continues after any child process terminates, and when a process has multiple children, we may need a more specific way to wait.

c) Suggest a rewrite to this code that incorporates what you outlined in part b) and provides a better foundation for a solution to the larger problem.
```
    if (fork() == 0) { /* child */
            exit();
    } else { /* parent */
            wait();
    }
```
```
process_id pid = fork();
if (pid == 0) { /* child */
        exit();
} else { /* parent */
        do { /* nothing */ } while (wait() != pid);
}
```
2 gave answers equivalent to the above, and 2 more gave slightly less general frameworks. A few gave complex and wrong answers that were worth partial credit, but most solutions were both very complex and hopelessly wrong.
Background: A modern supercomputer will typically consist of a network tens to hundreds of computer modules, where each computer in the network is itself a multiprocessor or multicore processor with 4 to 16 CPUs. The network connecting the computer modules has interfaces allowing very fast memory to memory block transfers between computer modules, and the CPUs in each computer module are able to share the local memory of the module. Assume that each CPU has its own MMU.
Recall that Unix includes services such as shmat() and mmap() that deal with creating and using shared memory regions.
a) If we decide to run Unix on this machine, when a process calls fork(), what constraints govern the placement of the new process relative to its parent?

The need to share memory has a major impact on placement. If two processes share memory, they will run more efficiently if they are on the same computer module.
2 did well, about 1/2 earned partial credit, either giving trivial constraints, suggesting that all processes needed to share the same MMU or (the most popular answer) that all processes should run on the same computer module.

b) Is there any hope for providing appliction programs with the ability to use mmap() to share memory between processes running on different computer modules?

Yes, if we use something like demand paging to shuffle memory blocks between computer modules as different processes attempt to access them. (In point of fact, several current commercial supercomputers are based on exactly this architecture.)
1 did well, while half earned partial credit by saying "yes" without any justification. Wrong answers included both unsupported statements about the system being slow and statements starting with "no".