Assignment 9 Solutions

Part of the homework for 22C:50, Summer 2004
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Standard Unix shells have a search path that is contained in the variable $PATH, so if you type echo $PATH in just about any Unix shell, it will output the search path as a colon-separated string of directory names. Using this notation, what is the search path equivalent to the one used by the example command language interpreter from Machine Problem 4?

:/bin:/usr/bin is formally correct, but .:/bin:/usr/bin would be OK.
Once upon a time, long long ago, Digital Equipment Corporation sold a fixed-head disk drive, the DF32, that had no sector structure. The DF32 allowed DMA transfers to begin with any word within the 32K words on the disk, with a single transfer addressing anywhere from one word to 4K words. (The computer to which it was attached had 4K words of RAM, so 32K seemed modestly large; also, a word was only 12 bits on that machine.)
Assume you've been given the job of designing a file system for a similar (but modern, and therefore much larger) disk drive to be attached to modern computer; the exact details of the capacity are irrelevant, but assume it's large compared to RAM and assume there is no sector structure.
a) Suggest why a programer given this job would be tempted to use a fixed sector size, imposed by software, even though the hardware does not require this.

Storage management on disk is easier if we can allocate sectors to files instead of allocating arbitrary sized chunks.

b) Suggest a representation for files stored on disk that allows the file to be composed of numerous different-sized segments.

Where UNIX i-nodes had the addresses of the disk sectors used to hold the file or to hold index information in the file, we must now have <s,l> pairs, where s gives the word address on disk of the start of a chunk of the file or of a chunk holding index information, and l gives the length of that chunk. Finding the file chunk that holds data at a particular offset from the start of the file would involve computing the sum of all the chunk sizes up to the chunk holding that data. This is expensive enough that we'd probably want to record the byte offset in the file of the first chunk of the file in any index chunk (an index chunk is a chunk of disk space holding the descriptions of chunks of the file, analogous to an index sector).

c) Evaluate the Unix read(), write(), lseek() file abstraction as a candidate for the user-side interface of this file system, assuming that you actually implement files using numerous different-sized segments.

Nothing in the Unix stream abstraction would care whether the file was composed of fixed-size sectors or variable sized chunks. On Unix, users may hope for improved efficiency if the user's buffer is equal in size to one disk sector. With a file system using random-sized chunks, it becomes highly unlikely that the user will be able to do anything analogous.
A RAID is a redundant array of inexpensive disks. RAIDs are constructed so that many small disks can be accessed as a single large disk, with error correction achieved by storing each data sector in redundant form on multiple drives.
Assume you have two disk drives, and you want to organize them as a RAID in software. Each disk records a checksum for every sector written and checks that checksum on read. Explain how your raid_write and raid_read operations would use these checksums to allow the data to survive the failure of either disk drive.

My raid_write operation would write the buffer to corresponding sectors of the two drives. My raid_read operation would read drive 1, and if it got a checksum error, read the same sector of drive 2. Only if both drives failed would the read fail.