Assignment 6, Solutions
Part of
the homework for 22C:112, Spring 2011

a) The need to set aside the queue before serving it is the result of what could otherwise happen if disk requests were enqueued in the queue for the cylinder currently being served. What is the problem and why is it solved by setting the queue aside before beginning to serve it? (0.5 points)
If the interrupt service did not set aside the queue for the current cylinder before working on it, and if a user kept enqueueing I/O requests for the current cylinder as fast as the interrupt service routine processed requests, the interrupt service routine would never service requests to other cylinders. This would violate our requirement that the scheduler be fair.
By setting aside the queue before starting to service it, new requests for the same cylinder fill a new queue. This queue will be serviced only after all other cylinders have been given their chance, therefore guaranteeing fair service to all cylinders.
b) Consider this alternative approach: The queue is maintained as a binary search tree, sorted by linearized disk address (of the various BST organizations, a splaytree works best, but this is not relevant to the problem). The left child is for all requests less than or equal to a node. A unidirectional elevator algorithm is used, traversing the tree from left to right and deleting each node as it is served. Does this approach suffer from the problem identified above? Explain why or why not. (0.5 points)
No. The key is "the left child is for all requests less than or equal to the current node" combined with "left to right traversal". Because of this combination, new requests for the same disk address will always be placed "behind the elevator" as it moves forward, so it will always make forward progress. No pattern of enqueues can prevent fairness.
If left to right traversal were used with the rule that "the right child is for all requests greater than or equal to the current node" the situation would be different. In that case, new requests for the same disk address could prevent the scheduling policy from ever advancing to a different disk address.
One way to organize the sectors of a file on disk is to use this 64bit sector number as the key in a search tree, where the leaves are the disk sectors of files, and the internal nodes of the tree are index sectors used to locate the correct leaf. The Btree data structure is particularly appealing for this purpose (see the Wikipedia entry on Btrees).
Assume that disk sectors are 512 bytes, and that linearized sector numbers are 64 bits long.
Some preliminary work. Index sectors in this Btree will need to hold pairs of the form [filesector,disksector] which are 128 bits long, that is 16 bytes. As a result, index sectors will hold at most 32 entries. A quick check of the definition of Btrees given on Wikipedia shows that in a Btree, the average population of an index node is about 3/4 of the maximum, so we expect the Btree to be approximately a 24ary tree.
For reference, here are the powers of 24:
24, 576, 13,824, 331,776, 7,962,624.
a) If the aggregate content of all data in all disk files occupies 1 gigabyte (that is 2^{30} bytes), how many disk sectors would be occupied by the Btree needed to provide access to this data. (0.5 points)
First, 2^{30} bytes of data will at least 2^{21} sectors of 512 (2^{9}) bytes each. That is about 2 million sectors.The number of index sectors will be about 2^{21}/24 which is 87,382 (plus 1/24th of this, 3641, plus 1/24th of that, 152, plus 1/24th more 7) which is exactly 91,182. Giving an exact answer is stupid in this case, since the exact answer depends on exactly how many index nodes there are, and this depends on the history of insertions and deletions in the Btree, which we don't know. The best answer is "on the order of 100,000 index nodes."
b) If the aggregate content of all data in all disk files occupies 1 gigabyte, How many disk accesses would be required to find the location of a randomly selected disk sector. (0.5 points)
The height of this tree would be expected to be at least log_{24}2^{21}, which (rounding up to the next integer) is 5, determined from the table of powers of 24 given above. This is the answer. It is probably exact, since the height of the tree is very stable over huge variations in how things are sliced and diced between the different index sectors.
a) If the file system uses some kind of index sectors to locate the data sectors of a file, how would you modify the above scheme to allow files to be read as quickly as possible? (0.5 points)
I would store the index sector that references each block of successive data sectors right before those data sectors, since it will need to be read before those data sectors are read. Aside from this, I would not change the sequential order suggested above.
b) Suppose your application has multiple files open for output and it occasionally adds sectors to each of them. Suggest an allocation policy that tries to appoximate an optimal allocation pattern without knowing in advance which files will end up being large files and which will stay small. (0.5 points)
Until disk space is very sparse, always store each file starting on a separate track. Basically, you allocate the whole track to the file assuming it will continue growing to fill that track, and only after the file is closed do you return the unused part of the track to the freespace pool. If an application opens several files for output, preferentially locate their initial tracks in the same cylinder, some cylinder that has enough free tracks available to start all of the files.