Exam 2: Final

Solutions and Commentary

Part of the homework for 22C:60, Fall 2007
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Grade Distributions

Final Exam Distribution

Median = 10      X
             X   X         X
             X X X     X   X     X
___________X_X_X_X___X_X_X_X_____X_X_X_____X___________
  0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24.

Exam Distribution -- Final plus Midterm

Mean   = 15.03     X                         X
Median = 14.6      X   X     X X             X X
_____________X_____X_X_X_X_X_X_X_X_____X_X___X_X___________X_____
  0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30

Machine Problem Distribution (MP1-MP6)

Mean   = 24.76
Median = 25.9
                          X     X     X X
            X     X       X   X X   X X X
____________X_____X_______X___X_X_X_X_X_X_X_
 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30

Homework Distribution (top 10 scores from HW1-HW12)

                                        X
                                        X
                                      X X
                                      X X
Mean   = 28.15                      X X X
Median = 28.5                       X X X
                              X     X X X
______________________________X_X_X_X_X_X___
 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30

Overall Distribution

Mean   =  67.94
Median =  68.3                            X
                        X           X X   X               X
__X_____X___________X_X_X_____X_____X_X___X___________X_X_X_X___X_X___________X_
 48. 50. 52. 54. 56. 58. 60. 62. 64. 66. 68. 70. 72. 74. 76. 78. 80. 82. 84. 86
 C      + + + +|- - - -        B        + + + +|- - - -        A        + + + +|

Exam and Solutions

  1. Background: Consider a computer system where the memory management unit is turned on while application programs are executing, and is off during the execution of trap handlers and interrupt service routines. The operating system kernel services are executed as trap service routines. Application programs use demand-paged virtual memory, so page-fault service is important on this machine. Pages are 4K bytes, and so are disk sectors.

    For this question, assume that a particular application program calls read(f,buf,len), where read() is a kernel call with the usual Unix semantics.

    a) It is possible that the buffer pointed to by buf includes (or is included in) a page that is not marked as valid in the caller's page table. Therefore, there may be page faults associated with the execution of the system service. Explain briefly how the need for such page fault service is detected, focusing on the role of hardware and software.

    Parameter validation software within the operating system kernel must check each virtual address provided by the user and, if valid, translate this virtual address to a physical address so that the kernel can reference the desired memory location. This address validation and translation software follows the same algorithm as is used by the memory management unit, and in the event that this software finds an invalid address, it calls the same subroutine as is called by the page-fault trap service routine.
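    A minimal sketch of what such validation and translation software might look like, assuming a simple one-level page table; every name here (pte_t, page_table, page_fault_handler, report_efault) is hypothetical, standing in for whatever the kernel actually provides.

    #define PAGE_SIZE  4096
    #define PAGE_SHIFT 12

    typedef struct {
            int      valid;     /* page is resident in main memory */
            int      writable;  /* page may be written */
            unsigned frame;     /* physical frame number */
    } pte_t;

    extern pte_t page_table[];             /* the caller's page table */
    extern void  page_fault_handler( unsigned page );
    extern void  report_efault( void );    /* see part b) below */

    /* validate and translate one user virtual address; this follows
       the same algorithm the MMU would use if it were turned on */
    unsigned kernel_translate( unsigned vaddr, int for_write ) {
            unsigned page   = vaddr >> PAGE_SHIFT;
            unsigned offset = vaddr & (PAGE_SIZE - 1);

            if (!page_table[page].valid)
                    page_fault_handler( page ); /* the same routine the
                                       page-fault trap service calls */
            if (for_write && !page_table[page].writable)
                    report_efault();
            return (page_table[page].frame << PAGE_SHIFT) | offset;
    }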

    This was a hard question; only 2 did well, 2 more had answers that were confused but basically right, and 3 more earned partial credit. The vast majority discussed how the memory management unit is used to translate virtual addresses, but in the context given for this question, the MMU is turned off! Careful reading of the question would have helped. It is noteworthy that many students turned in the exam early.

    b) It is possible that the buffer pointed to by buf involves a page that is marked as read-only in the caller's page table. What should the system do if such a page is involved?

    Since read() will need to write data into the buffer from whatever file is being read, a read-only buffer is illegal. Read should therefore raise an exception of some kind. (In the official Unix definition, it sets errno to EFAULT, but such a specific answer was not expected here.)

    4 did well here, 4 earned partial credit. Very careless reading of the question is the only possible explanation of the difficulty people had here. A remarkable number of students suggested that some kind of page fault could convert a read-only page into a read-write page allowing read() to continue, or that because it was a read() command, there was no problem with a read-only buffer. This was intended as a give-away question!

    c) Suppose len tells us that the buffer is a mere 2 bytes long. How many page faults of the type described in part a) might be required to complete the read() operation?

    Two faults, if the two-byte buffer spans a page boundary and neither page is currently resident in main memory.
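    The arithmetic behind this answer fits in a few lines of C; the following sketch (with hypothetical names) counts the pages a buffer touches, and in the worst case each such page costs one fault.

    #define PAGE_SIZE 4096

    unsigned pages_spanned( unsigned buf, unsigned len ) {
            unsigned first = buf / PAGE_SIZE;             /* first page used */
            unsigned last  = (buf + len - 1) / PAGE_SIZE; /* last page used  */
            return (last - first) + 1;
    }

    /* pages_spanned( 4095, 2 ) is 2: bytes 4095 and 4096 lie in
       different pages, so even a 2-byte buffer can cost two faults */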

    10 did well here, 1 more earned partial credit. Many of the wrong answers assumed that the programmer wanted to read an entire disk sector and focused on how many consecutive read() commands would be required to read the sector with a 2-byte buffer. This misreading of the question is bizarre.

  2. Background: A classic direct-memory-access interface is able to copy a single block of data between a device and a contiguous block of physical memory. Some interfaces allow more complex operations. An interface that supports scatter-read gather-write, for example, can read into or write from a discontinuous buffer. In this case, the buffer is specified by (typically) an array of ordered pairs, where each pair specifies the starting address and byte count of one chunk of the buffer.
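    In C, the buffer description just given might look like the following sketch; the names here are invented (the closest real Unix counterpart is struct iovec, as used by readv() and writev()).

    /* one chunk of a discontiguous buffer */
    typedef struct {
            char     *addr;    /* starting address of this chunk */
            unsigned  count;   /* number of bytes in this chunk  */
    } chunk;

    /* a complete scatter-read or gather-write request */
    typedef struct {
            chunk    *chunks;  /* array of (address, count) pairs */
            unsigned  nchunks; /* number of pairs in the array    */
    } sg_request;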

    For this question, assume that a particular application program calls read(f,buf,len), where read() is a kernel call with the usual Unix semantics. Suppose pages are 4K bytes, and so are disk sectors.

    There would be no need for scatter-read gather-write under the base assumptions that the file position and buf are both divisible by 4K and that len is exactly 4K.

    a) Suppose we change our base assumptions by using a value of buf that is not divisible by 4K. Explain if and how scatter-read gather-write can be of use.

    In this case, the buffer will span two consecutive pages of the caller's virtual address space. It is unlikely that the page frames holding these pages are consecutive in physical memory, so we can construct a scatter-read buffer description that allows one read into the two halves of this discontiguous buffer. (We do this, of course, after making sure both pages are in main memory, but this extra detail is not an expected part of the answer.)
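    A sketch of the resulting two-chunk buffer description, reusing the hypothetical chunk type from the sketch in the background above; phys_addr() is also hypothetical, standing in for the kernel's virtual-to-physical translation.

    #define PAGE_SIZE 4096

    extern char *phys_addr( char *vaddr );  /* hypothetical translation */

    /* describe a 4K buffer that straddles a page boundary */
    void describe_split_buffer( char *buf, chunk desc[2] ) {
            unsigned in_first = PAGE_SIZE - ((unsigned) buf % PAGE_SIZE);

            desc[0].addr  = phys_addr( buf );            /* tail of page 1 */
            desc[0].count = in_first;
            desc[1].addr  = phys_addr( buf + in_first ); /* head of page 2 */
            desc[1].count = PAGE_SIZE - in_first;
    }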

    3 did well here, 6 earned half credit or better, saying that, yes, scatter read can be used, but giving faulty or empty explanations. 9 more said yes, and then gave reasons that were extremely wrong.

    b) Suppose we change our base assumptions by using a file position that is not divisible by 4K. Explain if and how scatter-read gather-write can be of use.

    The first answer that comes to mind is that it is of no use. However, that turns out to be incorrect, so that answer is worth partial credit. The system must read from two different disk sectors, and part of each sector must be placed in part of the user's buffer, while the other part must be discarded. For each of the two sectors to be read, a scatter-read buffer description can be constructed that directs the needed part of each disk sector to the correct part of the user's buffer while discarding the remainder by directing it to a scratch memory region.
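    Sketched with the same hypothetical chunk type, the description for the first of the two sectors might look like this, where pos is the file position modulo 4K and scratch is a kernel-owned page that serves as a trash can.

    extern char scratch[PAGE_SIZE];  /* hypothetical kernel scratch page */

    /* first sector: discard the first pos bytes, keep the remainder */
    void describe_first_sector( char *buf, unsigned pos, chunk desc[2] ) {
            desc[0].addr  = scratch;           /* unwanted prefix  */
            desc[0].count = pos;
            desc[1].addr  = buf;               /* wanted remainder */
            desc[1].count = PAGE_SIZE - pos;
    }

    /* the second sector is symmetric: its first pos bytes go to
       buf + (PAGE_SIZE - pos), and its unwanted tail goes to scratch */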

    3 gave the no-use answer for partial credit. 2 said scatter-read could be used, but gave no useful reason. For better credit, 4 more gave halfway decent explanations of how scatter-read might be useful, but 2 gave wrong reasons.

  3. Background: If the address space is sufficiently large, each shared segment can be given a unique permanent virtual address. As a result, all programs that use a given shared segment will see it at the same virtual address, so the segment may contain absolute addresses in pointers from one part of the segment to another, and if one shared segment depends on another, it may use absolute addresses to reference the other.

    In contrast, if the virtual address space is small, shared memory segments must be dynamically assigned to virtual addresses as they are inserted into user address spaces. As a result, it will frequently be the case that a given shared object will appear at different virtual addresses for each process that uses it.

    The Unix/Linux mmap() function is not committed to either of the above models. It takes advice from the caller about where to place the shared segment, but it need not follow that advice. Whatever it does, it returns a pointer to the shared segment, wherever it was mapped into memory.

    a) Two virtual-addressing models are described above. Under one of these models, mmap() would generally ignore the user's advice. Which one, and why?

    In the large address space model, by definition in the problem statement, each segment is given a unique permanent virtual address. Therefore, in this case, the mmap() function must ignore the user's advice about what virtual address to use.

    4 gave good answers, 5 more gave muddled explanations of the correct answer. 4 more earned partial credit by explaining, correctly, why in the small address space model address conflicts would sometimes force the mmap() service to ignore the user's advice.

    b) Suppose a shared segment contains executable code, for example, a shared library. Explain how code within that segment can call or branch to other code in the same shared library, even if the shared segment is in different virtual addresses in different address spaces.

    It must use position-independent, relative, or self-relocating forms of addressing to refer to other parts of itself.

    3 did well here, 5 earned partial credit. This problem was intended as a give-away, since it was a re-run of Homework 7 problem 3a.

    c) In the small address space model described above, it is difficult for code in one shared segment to call code in another. Why?

    Suppose segment A contains a call to code in segment B, but the addresses of A and B may differ from one address space to the other. Since both are shared segments, neither of them can be changed to reflect this difference. Since the relationship of A to B may vary, relative addressing cannot be used. As a result, the best that can be done is to require that the program that inserted B in the address space maintain the address of B and pass it to A.
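    A sketch of this best-that-can-be-done scheme, with invented names: the program that mapped B fills in a linkage record, and code in A reaches B only through it.

    /* linkage record for segment B, filled in by whatever program
       inserted B into this particular address space */
    typedef struct {
            void (*b_entry)( void );   /* address of B's entry point */
    } b_linkage;

    /* code in shared segment A calls B indirectly, so the call works
       no matter where B was mapped in this address space */
    void a_calls_b( b_linkage *b ) {
            (*b->b_entry)();
    }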

    4 did well. 10 earned partial credit. Typical problems revolved around attempting to give a solution to the problem instead of explaining what problem needed to be solved.

    d) In the large address space model described above, why do we need a separate page table for each process?

    We need some way to set each process's access rights so that, when that process is running, it has access only to the segments it is permitted to use.

    3 did well, 2 earned partial credit by addressing the right issue poorly, and 5 earned partial credit with good explanations of the wrong issues. The most common error was to focus on the need a process has for private memory. Privacy (a matter of access rights) does not imply that the private segments do not have unique addresses in the large model.

  4. Background: Kernel threads impose the overhead of a kernel call on each thread operation. User-level threads interact poorly with any kernel calls that block the calling process, for example, a call to read() on the Unix/Linux system. A compromise is to design kernel calls that are able to cooperate with a user-level thread package.

    In Unix/Linux, read() always returns the number of bytes actually read; if a file is opened with the O_NONBLOCK option, read operations on that file will not block waiting for input but will only read those bytes already available from the device. If no bytes are available, the global variable errno is set to EAGAIN.

    Consider the problem of writing a th_read() middleware routine that, from the user's perspective, behaves like a normal call to read() but is part of a user-level thread package. All files opened by users of this thread package are assumed to be opened with the O_NONBLOCK option set.

    a) What should th_read() do when read() returns with errno set to EAGAIN? Note that it may do two different things, depending on whether or not there are other ready threads.

    If there is another ready thread, th_read() should relinquish to that other thread, and when control eventually returns to it, it should try again. If there is no other ready thread, th_read() should block until data is available. A poor solution would involve busy waiting. The answer to part b) discusses something better. (Note that this answer may be split between the two parts in a range of different ways.)

    10 did well here, 7 earned partial credit. Wrong answers typically involved assumptions that th_read() should have, from the user's perspective, behavior noticeably different from read(), for example, by returning an error or raising an exception if the user doesn't type fast enough.

    b) The design of Unix is such that the decision in read() to be blocking or nonblocking depends on how the file was opened. Consider an alternative, where the decision to block or not depends on the version of read() (or on a parameter to read()). Under what circumstance would it be better for th_read() to call the blocking version of read()? (The answer is strongly connected to part a.)

    It would be nice to code th_read() so it first checks to see if there are any ready threads. If so, it would operate as described in part a) above, relinquishing if a non-blocking read() indicated that it would block and then, when control returns, trying again. If there were no ready threads, th_read() would simply call the blocking read().
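    Pulling parts a) and b) together, th_read() might be sketched as follows; the thread-package routines threads_ready() and relinquish(), and the alternative blocking_read(), are assumptions of this sketch, not real Unix calls.

    #include <errno.h>
    #include <unistd.h>

    extern int  threads_ready( void );  /* are other threads ready? */
    extern void relinquish( void );     /* yield to a ready thread  */
    extern int  blocking_read( int f, char *buf, int len );  /* hypothetical */

    int th_read( int f, char *buf, int len ) {
            for (;;) {
                    int got;

                    if (!threads_ready())        /* nothing else to run */
                            return blocking_read( f, buf, len );

                    got = read( f, buf, len );   /* opened O_NONBLOCK */
                    if (got >= 0) return got;
                    if (errno != EAGAIN) return -1;  /* a real error */

                    relinquish();    /* let another thread run, then
                                        come back and try again */
            }
    }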

    7 did well here, but only 1 earned partial credit. The most common error was to confuse blocking with something having to do with mutual exclusion. Technically, the problem here is a producer-consumer problem. The user typing at the keyboard is the producer, and the thread calling th_read() is the consumer. A related misreading involved the assumption that all of the threads of the process were trying to read the same file. This seems like a very unlikely way to organize an application and therefore an odd and misleading focus for an answer.

  5. Background: When a program on a Unix-like system needs to call a function that is implemented by launching an application, it calls fork(); the child process immediately calls execve() while the parent uses wait() to await the completion of the application.
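    For reference, the calling pattern just described looks like this in C; the path and argument vectors are placeholders.

    #include <sys/wait.h>
    #include <unistd.h>

    void run_application( char *path, char *argv[], char *envp[] ) {
            int status;

            if (fork() == 0) {        /* child process                */
                    execve( path, argv, envp );
                    _exit( 1 );       /* reached only if execve fails */
            }
            wait( &status );          /* parent awaits completion     */
    }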

    a) Suppose the calling process has multiple active user-level threads. Which of the kernel calls mentioned above causes problems for the thread manager, and why?

    The parent, with multiple active user-level threads, will block all of them when it calls wait().

    5 did well. 6 earned partial credit with a focus on wait() but with a poor explanation. Several wrong answers focused on allegations that there might be a deadlock problem here.

    b) Suppose the calling process has multiple active kernel-level threads. Which of the kernel calls mentioned above may lead to undesirable semantics, and why?

    When a process with multiple active kernel-level threads forks, it seems natural that all of those threads should be replicated by the call to fork(). If this is done, the other threads, the ones that don't do the execve() and wait(), could cause trouble.

    4 did well, 5 focused on fork() but with a poor explanation. 2 more earned partial credit for good discussions of potential problems with wait() that were, ultimately, wrong.

  6. Background: Consider the following bit of code from an implementation of FIFO queues developed for a multithreaded environment where relinquish() transfers control to any ready thread:
    enqueue( char ch ) {
            while (tail == head) relinquish();  /* 1 */
            buffer[tail] = ch;                  /* 2 */
            tail = (tail + 1) % buffer_size;    /* 3 */
    }
    

    The above code worked just fine on a uniprocessor with cooperative user-level threads, but things began to fail when it was moved to a system with preemptive kernel threads on a multicore chip, though only when there were multiple threads producing data for the queue.

    a) What potential damage, if any, is caused by a preemption between lines 1 and 2?

    Another thread could call enqueue() and, finding tail and head unequal, could store a character in the exact same buffer location that this thread is about to use. The data the other thread enqueues will be lost when this thread executes line 2, and the effect will be as if the other thread enqueued an uninitialized value in the immediately following queue location.
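    One such damaging interleaving, written out as a trace; P1 and P2 are both producers, tail does not equal head at the start, and P2 loses the processor before its own line 3 (on a multicore, it may simply not have reached it yet).

    /* P1: line 1 -- tail != head, so P1 falls through
           P1 is preempted here, between lines 1 and 2
       P2: line 1 -- tail != head, so P2 falls through
       P2: line 2 -- buffer[tail] = 'x'
           P2 loses the processor here, before its line 3
       P1: line 2 -- buffer[tail] = 'y', overwriting P2's 'x'
       P1: line 3 -- tail advances once
       P2: line 3 -- tail advances again; the slot stepped over
                     was never written and holds stale data      */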

    7 did well here, 3 earned partial credit with answers suggesting that there could be a write operation to a full queue.

    b) What potential damage, if any, is caused by a preemption between lines 2 and 3?

    Another thread could call enqueue() and, finding tail and head equal, could store a character in the exact same buffer location that this thread has just used. The data the other thread enqueues will replace the data this thread enqueued and it will be as if this thread enqueued an uninitialized value in the immediately following queue location.

    7 did well here, 6 more got partial credit for partial descriptions of the problem.

    c) What potential damage, if any, is caused by a preemption within line 3?

    Exactly the same damage as in part b, except that the immediately following queue location will not be used because the other thread's attempt to increment tail will be undone when this thread completes its own increment operation.

    6 did well, 6 more gave answers that identified tail as the focus of the problem but then gave incorrect descriptions of the problem.

    d) Why is this code safe if there is only a single producer?

    Because only the producer increments tail, and all of the problems described above involved conflicts with other producers.

    10 did well here, while 5 more earned partial credit.

    e) Is this code safe on a multicore CPU if thread scheduling is not preemptive?

    No, because the presence of other CPUs allows instruction-level interleaving of execution.

    4 did well here, while one earned partial credit. The typical wrong answer said that the lack of preemption solved the problem, which is simply wrong if there are multiple CPUs.

  7. Consider this rewritten version of the code from the previous problem:
    enqueue( char ch ) {
            wait( spacesem );                   /* 1 */
            wait( mutex );                      /* 2 */
            buffer[tail] = ch;                  /* 3 */
            tail = (tail + 1) % buffer_size;    /* 4 */
            signal( mutex );                    /* 5 */
            signal( datasem );                  /* 6 */
    }
    

    a) Which semaphores above must be general semaphores and which are binary semaphores?

    mutex can be binary. The others must be general.

    12 did well here, and 5 more earned 2/3 credit, having failed to mention datasem. 3 earned partial credit saying that they must all be general semaphores (in fact, mutex can be binary), and 3 more earned a little credit for answers that had just one of the semaphores correctly identified.

    b) Which lines above are unnecessary if there is only one producer process or thread?

    Lines 2 and 5 can be omitted.

    11 did well here. 5 suggested omitting all semaphores for half credit, one suggested omitting lines 2, 5 and 6, a puzzling answer. No partial credit was given for the few who suggested omitting lines 1 and 6.

    c) What are the initial values of all of the semaphores?

    mutex should be one. datasem should be zero. spacesem should be buffer_size.
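    For concreteness, the initialization and the matching dequeue() might be sketched as follows, under the same conventions as the exam code; init_semaphore() stands for whatever initialization primitive the semaphore package provides and is hypothetical.

    void queue_init() {
            init_semaphore( mutex,    1 );           /* one thread at a time
                                                        in the critical
                                                        section             */
            init_semaphore( spacesem, buffer_size ); /* all slots start free */
            init_semaphore( datasem,  0 );           /* no data yet          */
    }

    char dequeue() {
            char ch;
            wait( datasem );                    /* wait for data      */
            wait( mutex );
            ch = buffer[head];
            head = (head + 1) % buffer_size;
            signal( mutex );
            signal( spacesem );                 /* one more free slot */
            return ch;
    }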

    8 did well here, 4 more forgot to mention datasem but initialized the others correctly. 5 suggested that all should be zero, getting datasem right in the process. 6 correctly initialized mutex but not the others.

    d) Which line in the above code replaces the busy waiting code in the previous version of enqueue (that is, replaces line 1 of the code given in problem 6)?

    Line 1.

    14 got this right. There was no partial credit. 5 suggested line 2 and another 2 suggested line 5.

    e) Would it be safe to move line 6 earlier in this code? If so, what is the earliest line in the code that it must follow?

    Line 6 can be trivially moved between lines 4 and 5. Curiously, it can also be moved between lines 3 and 4. This signal can unblock a consumer, and the consumer can safely consume the data as soon as it is placed in the buffer by line 3; the consumer does not care if tail has been incremented, since only producers use that variable.

    2 got this right. 9 earned half credit, moving line 6 between lines 4 and 5. 3 thought, somehow, that it would be safe to move line 6 before line 3. 2 more thought that the answer depended on how many producer processes were involved (it does not). No credit was given to the 6 who wanted to move line 6 before line 2 or to the 4 who held that no change was possible.

  8. The Amoeba and Demos systems allowed processes to use remote procedure call semantics to access methods of objects maintained by server processes. Object handles in these systems were represented as capabilities (they were called links on Demos), and on Demos, each process had a well-defined capability list (the link table) holding capabilities for all of the remote resources (mostly objects) that process could manipulate.

    a) The link table in Demos is analogous to the open file table in Unix-based operating systems. What can a Demos process do with links (and what can an Amoeba process do with capabilities) that a Unix process cannot do with open file descriptors?

    Demos and Amoeba are able to send links or capabilities in messages. Unix has no tools for sending open files as the contents of messages to arbitrary other processes.

    4 did well. 7 earned partial credit, typically by saying things that were relevant but not clearly distinguishing what Demos and Amoeba allow from what is possible in Unix. For example, Unix file descriptors can be shared (they are shared as a result of fork), and Unix does allow data to be copied from the address space of one process to another, by way of a pipe.

    b) In a virtual memory system that associates a page table with each process, page table entries are analogous, in some sense, to Demos links or Amoeba capabilities. What are the methods that apply to the objects referenced by a page-table entry?

    Pages can be read, instructions can be fetched from pages, and pages can be written.

    1 did well here. 5 earned partial credit. The most common reason for partial credit was that the student referred to read(), write() and exec() in a way that made it unclear whether these were operations on bytes in a page or the names of Unix kernel calls. They are potentially both, and at least one answer to this question earned no credit by making it clear that it referred to the Unix kernel calls, not to the operations a program may implicitly apply to the bytes in a page.