Exam 2: Final

Solutions and Commentary

Part of the homework for 22C:112, Spring 2010
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Grade Distributions

Exams

Final Exam

Mean   = 10.15       X
Median =  8.1    X   X
                 X X X X X X         X
   __________X_X_X_X_X_X_X_X___X___X_X_______X__________
      0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 24.

Total Exam Scores

Mean   = 15.91
Median = 13.8     X   X   X
                  X X X   X X
 _________________X_X_X_X_X_X_X_X_____X_____X_X___X_X_____________X________
   2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30. 32. 34. 36

Total of Machine Problems 1 to 5

Mean   = 17.02                          X         X
Median = 16.0             X             X         X
                          X X X X       X         X
 _____X_______X_X_________X_X_X_X_______X___X___X_X___
   2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26

Total of Best 10 out of 11 Homework Scores

Mean   = 18.50
Median = 17.0                                           X
                          X X X   X   X                 X
 _____________X_____X___X_X_X_X_X_X_X_X_X___X_______X_X_X_____
   2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30

Grand Total Score

Mean   = 51.43                        E -- Engineering student
Median = 46.8         E               G -- Grad student
                      X
                X     X       X E                               X
           _____X_____X_X___G_E_E___X_X___X_X_E_X___________X_G_X_______
             28. 32. 36. 40. 44. 48. 52. 56. 60. 64. 68. 72. 76. 80. 82
Grade scale:    +|- -   C C   + +|- -   B B   + +|- -   A A   + +|

Solutions and Commentary

Background: Consider an attempt to write a new Unix command, myset that takes two arguments, as follows:
```
     myset var val
```
This sets the environment variable var to the value val. There are two options for implementing Unix commands. You can write it as an application, comparalbe to /bin/echo or /bin/ls, or you can modify the source code of one or more Unix shells to make the command a built-in command of that (or those) shell(s), such as exit.
A question One of the approaches to implementing myset will not work. Which one, and explain why it cannot work.

It must be implemented by modifying the source code of a shell. It cannot be implemented by an applicton launched from the shell because launching an application always gives that application a copy of the environment. Changes the application makes to this copy do not have any effect on the original.
4 did well, 5 more said "modify the shell" but gave bad explanations. 7 said vague but interesting things about changing the environment, but did not draw useful conclusions. 7 earned no credit.
Background: Compilers for a Unix system generally have two choices when it comes to placing constants in memory. Constants can be embedded in the code segment, or they can be embedded in the static segment.
On some machines, the code segment is execute only. Instruction fetch from the code segment is legal, but load and store are illegal.
a) Consider such a machine. What constants can the compiler put in the code segment?

Immediate constants that are encoded as part of a machine instruction. Any constant that must be accessed by a load instruction will have to be in the static segment.
5 did well. 13 earned no credit. Partial credit went to those who declared, for example, that "int and char constants go in the code segment" without explaining why.

b) For constants that cannot be put in the code segment, what problems does the system have with enforcing the semantic rules of the programming language.

The system cannot prevent modification of constants stored in the static segment. In a language like C, with weak typing, this means that a program may change the value of a constant.
7 did well, 14 earned no credit.
Background: Consider the problem of text output to a character sequential printer.
1. A C program uses the C-standard library to output a character
2. The standard library uses a system call to output a block of text
3. the system uses the I/O driver to output the text
4. the interrupt service routine in the driver outputs a character
A problem Give one line of C code clearly illustrating each step above. You'll probably need to make some assumptions. Document them.
1. putchar( "x" );
2. write( stdout, ch, 1 );
3. enqueue( printerqueue, ch );
4. out( printer_data_register, dequeue( printeerqueue) );

19 got part 1, 7 got part 2, 3 got part 3, and 1 got part 4. 4 or 5 got partial credit on parts 3 and 4. Over 10 got no credit on parts 2, 3 and 4.
Background: Consider a user program on a computer running Unix. The program executes the following two lines of code:
fd = open("/file", O_RDONLY);
read( fd, &ch, 1);
Assume the following: No files have been opened in the last few minutes, but some application has been doing intensive disk I/O. As a result, all RAM devoted to a cache of recently read disk sectors is taken up by material related to that applicaton. Nothing about / or about /file is currently in RAM. You can assume a classical Unix file system with inodes, and you can assume that both / and /file are small.
a) Show the sequence of disk reads required to complete the two statements above. Document each read with a brief explnation of what is being read.

1) read inode of file system root /
2) read first sector of root directory to lookup entry for /file
3) read inode of file named /file
4) read first sector of file named /file to get the first byte
7 did well, 3 earned no credit. The most common reads that people forgot were the read of the inode of the root and the read of the actual data from the file. Several students were penalized for proposing several additional reads.

b) Complete the following sentence: "The integer fd is an ... which points to ... holds the address of the first sector of /file. Your answer will be a run-on sentence, and it will trace the complete path from the integer fd to the disk sector.

The integer fd is an index into the process's table of open files, each entry of points to an open file description; the indexed entry points to an open file description data structure that references the inode for the file; the inode holds the address of the first sector of /file.
Nobody got the whole chain right. 8 missed the fact that fd is an integer index into an array of pointers, not a pointer. 9 missed the fact that it is an array. 10 had the pointers point to the i-node, while in fact, the data structure describing an open file contains much more than the i-node -- things like the current file position and the current buffer being used for blocking and deblocking. 5 did not mention the inode. 6 earned no credit at all.
Background: Consider the problem of building a disk request queue. Each disk request record is a queueable object, with a field called next used to link to the next request in the queue. Disk requests records also contain the physical address of a buffer in RAM, the disk address of a sector, and a transfer direction.
A disk scheduler can be constructed using a collection of FIFO queues, one for each cylinder in which any requests are pending. Assume each FIFO queue is implemented as a record of two pointers, head amd tail, where new items are enqueued at the tail, and the next dequeue will return the head item.
a) Suggest an implementation for the dequeue operation. Make sure you correctly handle dequeueing from an empty queue (which should return NULL and dequeueing the last item in the queue.
```
node * dequeue( queue * q );
    node * t = q->head;
    if (q->head != NULL) q->head = q->head->next;
    return t;
}
```
This particular version assumes that an empty queue is always indicated by q->head being null. Therefore, enqueue must check the head even though most of its work involves the tail.
3 did well. 12 earned no credit. The most common problem leading to partial credit was trouble with dequeueing from an empty queue.
b) Why is there one FIFO queue per cylinder instead of just one queue for the whole disk?
The elevator algorithm relies on the user side of the driver to sort disk requests into queues, one per cylinder, so that all pending requests in one cylinder can be handled before moving to the next cylinder.

9 did well here. 8 earned no dredit. Partial credit was offered for some confused but not entirely wrong explanations.

c) In the context of a disk scheduler, something like a semaphore can be used to count the number of enqueued items. Should there be one such counter per queue, or should there be one such counting semaphore per disk? Explain your answer.

The interrupt-server side of the queue must wait when there are no pending requests anywhere on the disk. If the current queue is empty, but there are pending requests in other queues, there is no wait. Yes, in this case, the disk will seek, but from the driver's perspective, this just means that the next disk I/O operation will take longer.
5 did well. 12 earned no credit. Partial credit was given for poor reasoning supporting the right answer.

d) Why did the above question say for "something like a semaphore" instead of just "a semaphore"? The answer has something to do with the fact that the consumer of data from the disk queue or queues is an interrupt service routine.

An interrupt handler cannot block, waiting for a signal operation on a semaphore. Instead, it must return from interrupt leaving the interrupt disabled, and the corresponding signal operation will involve enabling the interrupt.
2 did well. 4 more were confused but on the right track. The remainder earned no credit.
Background: The Modcomp IV had a 16 bit word and a 16-bit virtual address, with an 8 bit page field and an 8-bit word-in-page field. Each entry in the page table was 16 bits. 10 bits for the page-frame, 2 bits for access rights, and the remainder unused. The access rights are null (no access), read-only, read-execute and read-write-execute. A reserved opcode, REX, causes an unimplemented instruction trap.
In general, an operating system supporting parallel processes may be built in three distinct ways:
i) All processes and the operating system share one virtual address space.
ii) Each process has its own address space, and the system is mapped into all address spaces to simplify addressing of parameters to system calls.
iii) Each process has its own address space, distinct from the system address space.
a) How big is the virtual address space of this machine, in words? How big is the physical address space?

The virtual address space is 2⁸ or 256 pages of 2⁸ or 256 words each, which comes to a total of 2¹⁶ or 64K (65,536) words.
The physical address space is 2¹⁰ or 1024 page frames of 2⁸ or 256 words each, which comes to a total of 2¹⁸ or 256K words.
Only 3 got this correct. Partial credit was offered to the 4 who got the 64K virtual address space, and the 1 who got the 256K physical address space. The rest got no credit. Wrong answers generally involved extraordinary amounts of memory, common figures being in the hundreds of megabytes, but some were in the gigabytes. It is difficult to understand how so many managed to propose huge virtual address spaces based on a 16-bit word.

b) Which of the operating system addressing models discussed above would be best for this system, and why?

Model iii). Because the physical address space is bigger than the virtual address space, we can only use the physical memory effectively if we pack multiple processes into memory. We could not do this if all processes shared one virtual address space as in model i). The virtual address space is so small that we cannot afford to turn over part of each processes address space to the system, as in model ii).
3 did well, and only 4 earned no credit. Many picked the right model and then offered poor reasons in support of it.

c) Suppose you implement Unix on this machine. When the user calls read(f,buf,len), the system must use the pointer buf to address the user buffer. How? The answer to this depends on your answer to part b.

The system must translate the address buf from a virtual address to a physical address.
9 did well, and an equal number earned no credit. Full credit was given to those who said "ask the MMU to map the address". Curiously, the real MODCOMP IV included privileged operating system instructions to do precisely this. Many modern system designs don't do this, leaving it entirely to software to emulate the MMU in doing this translation.

d) Would demand-paged virtual memory be useful on this system, or would the primary use of the virtual memory be to protect processes from each other? If your answer depends on circumstances, explain the circumstances.

When the aggregate memory allocated to different processes exceeds the physical memory, demand paging would be useful. A single process running alone is unlikely to profit by this, but it is easy to see that the need can arise when there are more than 4 processes, each using its full 64K address space.
Only 2 got this, while 15 earned no credit. A few got partial credit, waffling on the edge of a coherent reason. As an aside, note that the system developers at MODCOMP didn't expect demand paging to be used on this machine.
Background: Consider the following alternatives:
-- A file mapped into the virtual address space with mmap() or shm_open().
-- Reading and writing a file, where the system keeps a cache of recently accessed disk sectors in RAM.
Assume that the application process is doing something more complex than reading or writing the file sequentially. Assume that each system call requires a trap and a return from trap.
a) Which approach to using the file is likely to require fewer traps? Document your assumptions.

Mapping the file will use fewer traps if we assume that read() and write() operations on the file use buffer sizes comparable to the page size. With mapped file access, we get one trap per page fault, while with read() and write(), we get one trap per call. We can assume that the pool of buffers in the disk cache and the pool of page frames backing up the mapped file access are managed using similar approximations of LRU, to the big difference between the two approaches is whether the traps happen conceptually between the user and the LRU mechanism, with read() and write(), or between the LRU mechanism and the disk, with mapped file access.
5 did well, while 7 more earned partial credit with poor explanations for the right conclusion.

b) What computations that are required by one approach are offloaded into special hardware by the other approach?

With mapped file access, the MMU maps user access to the RAM pages that hold a cache of recently used disk sectors. With read() and write() to a disk cache, this mapping is done entirely by software.
6 did well, while 5 more earned partial credit by claiming that mapped file access offloaded something, but not clearly stating what was offloaded or where it was offloaded to.
Background: Real-time operating systems associate a priority with each process in the system. At any instant, the current running process (on a uniprocessor) always has a priority higher than or equal to all of the ready processes, and when a group of competing processes are waiting for a resource, the highest priority process always gets that resource first.
To convert a conventional operating system to a real-time system, you would add a priority to each process and then modify some of the system routines.
a) What change must you make to the ready queue?

Change it to a priority queue, sorted on the priority of the processes enqueued.
14 did well, and 3 more got partial credit for complex or confused answers.

b) What change must you make to the implementation of semaphores?

There are two changes: First, the queue of processes waiting on the semaphore must be changed to a priority queue. Second, when a signal operation dequeues a process with a higher priority than the current running process, there must be an immediate context switch to the new process.
Nobody got both issues. 4 got partial credit for the change to priority queues. 5 got partial credit for statements about honoring process priority with no hint about the mechanism. 1 got partial credit for the possible need for a context switch. Numerous answers suggested very confused understanding of semaphores.
Background: The RC-6000 operating system, developed by Per Brinch Hansen for the Regencentralen 6000 computer in the late 1960s, had an interesting structure for interrupt service routines.
Each interrupt service routine on this machine did just one thing. Signal an associated semaphore and disable interrupts for the associated device. So, the interrupt service routine for the keyboard would be something like this:
```
        kbd_isr:
            isr_signal( KBD_sem );
            isr_control &= ~KBD_IE; /* disable IE bit for keyboard */
            return_from_interrupt;
```
All the rest of the function you might expect to find in an interrupt service routine is moved to an associated process. The interrupt service routine does no actual input-output. Instead, the associated process waits for the interrupt by waiting on KBD_sem, and then it does the actual input-output.
a) Why isr_signal() instead of just calling the regular signal() operation on the semaphore? Be specific, how would you expect isr_signal() to differ from signal()?

The answer is related to problem 5 d). isr_signal() cannot disable or enable interrupts, since it is inside an interrupt handler.
3 did well. 4 more earned partial credit with answers that addressed issues that were at least marginally relevant.

b) Assume a bounded buffer input queue. Give code for the body of the process that does the rest of the work of a conventional keyboard input interrupt service routine. Be sure to include all code needed to synchronize with the user side of the keyboard I/O driver. (a good answer takes about 7 lines of code).
```
for (;;) {
    wait( KBD_sem );
    enqueue( KBD_queue, KBD_data_register );
    isr_control |= KBD_IE; /* enable keyboard interrupt */
}
```
In the above, it is assumed that any semaphores used to mediate producer-consumer relationships are packaged in the queue data type and are appropriately operated on by the enqueue operation.
There were no perfect solutions. Common errors involved leaving out the loop or adding extra signal operations. 10 left this problem blank, 4 more had answers so poor that they earned no credit.
c) What is the benefit of the RC-6000 approach to writing a device driver? Hint: Your answer to part B should give some guidance.

The body of the driver can use normal interprocess communication, including blocking on semaphores whenever appropriate. Use of special semaphore operations is confined to calls to isr_signal() within the small interrupt handlers.
1 did well. 5 earned partial credit, 6 left it blank.
Background: Consider writing a heap manager using boundary-tag storage allocaiton. You have two options, a lazy manager, that merges free blocks only when it can't find a big enough free block to make a new allocation, and a prompt manager that always merges adjacent free blocks as soon as possible.
a) Which is likely to run faster, on average? Which is likely to run faster, in the worst case? Why?

The lazy heap will run faster on the average, at the expense of occasional long pauses to merge free blocks and reorganize the free list.
13 did well, and only 3 earned no credit.

b) While both options can be implemented with doubly-linked free lists, one of them lends itself to a simplified singly-linked free list. Why?

A lazy heap manager can merge free blocks by a scan of the main boundary tag list, rebuilding the free list from scratch as it does so.
12 did well. 8 earned no credit. Missing or poor explanations earned partial credit.
Background: Consideer writing a heap manager to support the C standard malloc() and free() services, using the binary buddy system. Assume 32-bit pointers, with a 4-gigabyte virtual address space.
a) The user interface free(b) simply calls an internal routine buddy_system_free(). Write C-like code for this call, showing the parameters. Feel free to use integer arithmetic on pointers. Document your assumptions.
```
void free (block * b) {
    buddy_system_free( b - sizeof( int ), log2( *(int *)b ) );
}
```
This assumes that the second parameter to the buddy system free routine is the integer index of the free list to use, and it assumes that the first word of each block is used to hold the block size.
Only 1 got this, and 1 more got partial credit. Far too many students simply regurgitated the entire binary buddy system implementation from their notes, without noticing that the question was about the interface between the free() routine and the buddy_system_free() routine, which requires one more parameter. This issue was discussed in class, and several students asked questions about it.
b) Given the above, what block sizes (in bytes) requested by user programs minimize internal fragmentation and therefore optimize memory use? Explain.

In general, the optimal block sizes will be 2ⁱ-sizeof(int), for some i. If our machine is word-addressed, the optimal sizes would be 1, 3, 7, 15, 31, etc. This is because one word of each block must be reserved to hold the block size.
2 did well, 1 failed to offer an explanation.