Midterm Exam, Fall 2002

Part of 22C:116, fall 2002
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Note: This is an open-book, open-notes exam. Please write your answers in the exam booklet supplied. Leave blank lines between your answers to different questions, and leave a margin down the side of each page. Please write your name in the form it appears on your university ID card; write your name in the blank provided on the front cover. This exam is worth 20 points and has 10 parts! Allocate about 5 minutes per part.

General Background: Several problems on this exam refer to ECOS, the Excessively Complex Operating System introduced in the study questions distributed for this exam. Here are some new facts about ECOS:

ECOS uses the binary buddy system to manage the heap used to store compressed images of less recently used pages in main memory. This means that the expected loss to internal fragmentation is about 25% for a random distribution of allocation requests and the maximum internal fragment size is about 50% of the allocated block.

The page size used by ECOS is 4K bytes, and the maximum physical memory is 4 gigabytes. There are 8 bits in the page table entry reserved for access rights, status, MMU control, etc. Page table entries are 32 bits.

All new versions of the ECOS disk drivers use the cyclic elevator algorithm for disk scheduling.

ECDBMS (the Excessively Complex DataBase Management System) is a general purpose database management system. ECDBMS operates as a single-threaded application.

Background: The total memory requirement for ECDBMS is 100 megabytes; the working set size for ECDBMS has been measured empirically to be 10 megabytes. On the average, the pages of ECDBMS compress to 1/3 of their raw size; Half of all pages compress by a factor of 1/2 to 1/4, while one in 4 compress to a smaller size and 1 in 4 compress to a larger size.
A Problem: If ECDBMS is the primary application you will be running, how much RAM should you purchase? Explain your answer! (Ignore the RAM requirements of other applications or the system itself.)
Background: When a page is compressed, the page table entry for that page is marked as invalid, and the frame number field, plus any other spare bits in the page table, are used to hold the memory address of the compressed version of the page.
Part A: What is the smallest memory block that can be allocated to hold a compressed page. Explain.
Part B: If each process has its own virtual address map, does this make it difficult to implement shared pages? Why, and why does your answer change when you segment the address map and share entire segments?
Background: The internal representation of an ECOS semaphore has a count field and a list of waiting processes, each with a field indicating the amount by which they want to decrement the count. The wait operation, in a critical section, checks the count against the desired decrement. If the count can be decremented without going negative, it is decremented and the process does not block. If it cannot be decremented, the process is enqueued and blocked.
Part A: What invariant relationship holds between the count field of the semaphore and the desired decrement fields of all the processes blocked on that semaphore.
Part B: Briefly what computations are performed by the signal operation? Focus on those that differ from the equivalent conventional semaphore where the decrement is always 1.
Part C: An ECOS programmer notices that the signal() operation is sometimes very slow and at other times very fast. On further experiments, the programmer notices that replacing signal(s,n) with n repititions of signal(s,1) seems to lead to even greater variation in speed. Why is the ECOS signal() operation subject to such variations in speed?
Background: Recall that the the ECOS I/O operations are nonblocking, and if a process wants to await completion of an operation, it must explicitly block by waiting on the ECOS semaphore that was passed to the I/O operation. This allows a process to post multiple I/O operations before waiting for any of them to complete. The ECDBMS database manager makes extensive use of this feature.
Part A: When ECDBMS was ported from ECOS to Unix, the performance was seriously degraded. Explain why; for full credit, you must connect issues in ECDBMS with differences in the ECOS and Unix kernel interfaces and relevant things you know about the device drivers.
Part B: After posting two I/O operations A and B, you want to do this() as soon as A finishes, and that() as soon as B finishes. Can you do this in ECOS? Could you do this (perhaps inefficiently) if ECOS contained a new kernel primitive wouldblock(s,n) that returns true of wait(s,n) would block? What efficiency concern does your solution raise?
Background: One part of ECDBMS allocates a huge buffer and then issues one a single read to fill the buffer from a disk file. Under a much older version of ECOS, the software that processes this buffer was able to do most of the analysis before the read finished by waiting for a few bytes at a time, processing those bytes, and then waiting for more. After an upgrade to ECOS, the software was no longer able to scan this buffer as quickly. The disk driver and the file system developers insisted that they had provided a performance upgrade, and indeed, almost all other parts of ECDBMS and other applications performed better after these upgrades.
Part A: What major feature must have been added to the I/O drivers of ECOS to allow this performance degradation? What evidence do you have that allows you to infer the presence of this feature?
Part B: Can you determine from this information whether ECOS files are required to be stored contiguously on disk? What evidence supports this conclusion?