Homework 4

22C:116, Fall 1995

Due Friday Sept. 15, 1995, in class

Douglas W. Jones
  1. Background: The optimal sector size to be used in disk transfers is that sector size where
    Tt = Tp
    
    where Tt is the disk transfer time,
    Tt = Tl + Tm
    
    the sum of the rotational latency, Tl, and the time taken to actually move the data, Tm; the latter is itself a function of the sector size and the disk speed. Tp is the time taken to produce or consume the data in one sector. Again, this depends on the sector size, but it also depends on the algorithm being used to process the data.

    Given an empirically determined average processing time per byte, an expected rotational latency, and a disk transfer rate in bytes per second, it is possible to determine the optimal sector size analytically.
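
    As an illustration of this calculation, the sketch below solves Tt = Tp for the sector size in a few lines of C. The latency, transfer rate, and per-byte processing time used here are illustrative assumptions, not values supplied by the assignment.

        /* A minimal sketch, under assumed parameter values, of the
           analytic determination described above.                  */
        #include <stdio.h>

        int main(void)
        {
            double Tl = 0.0083;   /* assumed rotational latency, seconds
                                     (half a revolution at 3600 RPM)     */
            double R  = 500000.0; /* assumed transfer rate, bytes/second */
            double c  = 0.00001;  /* assumed processing time, sec/byte   */
            double S;

            /* Tt = Tl + S/R is the transfer time for an S-byte sector;
               Tp = c * S is the time to produce or consume S bytes.
               Setting Tt = Tp and solving gives S = Tl / (c - 1/R),
               which is meaningful only when c > 1/R, that is, when the
               program processes data more slowly than the disk moves it. */
            if (c > 1.0 / R) {
                S = Tl / (c - 1.0 / R);
                printf("optimal sector size: about %.0f bytes\n", S);
            } else {
                printf("processing outruns the disk; "
                       "larger sectors always help\n");
            }
            return 0;
        }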

    The problem, part A: Given a fixed CPU running a fixed mix of application programs, where both hard disks and diskettes are in use, describe the expected benefit of using different sector sizes on the hard disks and the diskettes. Estimate the order of magnitude of this benefit, and state the order of magnitude and sign of the difference in sector sizes that would yield the largest expected benefit.

    The problem, part B: The optimal sector size is difficult to determine analytically, but it can be measured empirically! Given that you are the author of the software used to read and write sequential disk files, you can make the software read and write multi-sector blocks to gain the effect of larger sectors. If your software dynamically adjusts the number of sectors per block, it can approximate the optimal sector size for each particular application program it serves. Propose the variables the software could measure in the course of normal operation in order to tune the block size for optimal performance.

  2. Background: The classical disk scheduling and disk storage allocation algorithms require knowledge of the formula used to compute sector, surface and cylinder from disk address. Many modern SCSI disks hide this information from the host computer, presenting a disk model that is merely a linear array of disk sectors, divided somewhat artificially into sector, surface and cylinder numbers that have little to do with the physical reality inside the drive. Many SCSI disks also have data caches that hold copies of recently referenced physical tracks, further complicating the relationship between disk drive and system software.
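
    For reference, the classical decomposition alluded to above can be sketched as follows; the geometry constants are illustrative assumptions, and the numbers a host computes this way for a modern SCSI drive may bear little relation to the physical layout inside it.

        /* A minimal sketch of the classical formula for recovering
           cylinder, surface and sector numbers from a linear disk
           address.  The geometry below is an illustrative assumption;
           a real SCSI drive need not have any such fixed geometry,
           which is exactly the point of this problem.                 */
        #include <stdio.h>

        #define SECTORS_PER_TRACK 17  /* assumed sectors per track  */
        #define SURFACES           4  /* assumed recording surfaces */

        void decompose(long addr, long *cyl, long *surf, long *sect)
        {
            *sect =  addr % SECTORS_PER_TRACK;
            *surf = (addr / SECTORS_PER_TRACK) % SURFACES;
            *cyl  =  addr / (SECTORS_PER_TRACK * SURFACES);
        }

        int main(void)
        {
            long cyl, surf, sect;
            decompose(1000L, &cyl, &surf, &sect);
            printf("linear sector 1000 -> cylinder %ld, surface %ld, "
                   "sector %ld\n", cyl, surf, sect);
            return 0;
        }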

    The problem, part A: What effect would you expect this isolation to have on the system software, assuming that the software uses a typical effective disk scheduling algorithm such as the elevator algorithm, and assuming that sectors of files are allocated preferentially in the same cylinder? You must answer this question in order to know whether it is worthwhile to attempt to measure the actual layout of a SCSI disk by empirical means.

    The problem, part B: Propose (and describe) a set of empirical measurements you could perform, using software to measure the actual layout of sectors on a modern SCSI disk. For the sake of this part of the problem, assume that the disk in question has no on-board cache holding copies of recently referenced tracks.

    The problem, part C: Now assume the SCSI disk incorporates an on-board cache of recently referenced disk tracks. Can you propose modifications to your experiments that would still allow you to determine the physical layout of the actual disk?