22C:116, Lecture 18, Spring 2002

Douglas W. Jones
University of Iowa Department of Computer Science

File Systems -- A user view
An open file is an object! If we use the UNIX/C model, the object has the following methods:

read(buf, len)
write(buf, len)
transfer one buffer of data (any size) from or to the file, advancing the current file position by the number of bytes read or written. Return the number of characters read or written.

seek(pos, base)
move the current file position to pos, relative to the given base. The base may specify the start of the file, the end of the file, or the current position in the file. Return the resulting position in the file, relative to the start.

close()
the object destructor. This does not destroy the device or the storage allocated on a secondary memory device, it only destroys the file object through which the user gains access to that resource.

Of course, in UNIX, because the system interface is not itself object oriented, we add a first parameter to each of these, so instead of calling f.read(buf,len), we call read(fd,buf,len). In addition, because UNIX was originally a 16-bit system, seek(fd,pos,base) was defined as taking a 16-bit position parameter, and when UNIX was extended to allow longer files in the early 1970's, a new service, lseek, was introduced, allowing a 32-bit position to be passed. This involves no change in the abstraction.
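The methods above can be tried out from Python, whose os module is a thin wrapper over the UNIX calls. This is only a sketch: the function name demo_file_methods is made up for illustration, tempfile.mkstemp is used just to obtain a scratch file and its descriptor, and Python's os.read takes a length and returns the bytes instead of filling a caller-supplied buffer.

```python
import os
import tempfile

def demo_file_methods():
    """Exercise write, lseek, read and close on one open-file object."""
    fd, path = tempfile.mkstemp()          # scratch file plus its descriptor
    try:
        written = os.write(fd, b"hello, world")   # position advances by 12
        end = os.lseek(fd, 0, os.SEEK_CUR)        # report the current position
        os.lseek(fd, 7, os.SEEK_SET)              # 7 bytes from the start
        tail = os.read(fd, 100)                   # returns only what remains
        back = os.lseek(fd, -5, os.SEEK_END)      # 5 bytes before the end
        return written, end, tail, back
    finally:
        os.close(fd)       # destroys the open-file object, not the file
        os.unlink(path)    # removing the file itself is a separate step
```

Calling demo_file_methods() returns (12, 12, b"world", 7). Note that every call takes fd as its first argument, and that the value returned by lseek is always relative to the start of the file, whatever base was given.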
In UNIX, the open-file object is a member of a polymorphic class. You can open a disk file, but you can also open a network socket, an interprocess pipe, or a physical device such as the parallel port or the floppy disk. When you open a physical device, you do so with no file system on that device, so if you read the floppy disk, your file position is relative to the raw hardware, with file position zero being track 0, sector 0, surface 0.
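One way to see the polymorphism is to write a routine that uses nothing but read and hand it descriptors of different kinds. The sketch below is illustrative only; read_all, from_disk_file and from_pipe are hypothetical names, and the payload is assumed to be small enough to fit in the pipe's kernel buffer.

```python
import os
import tempfile

def read_all(fd, chunk=4096):
    """Drain any readable descriptor using nothing but read()."""
    pieces = []
    while True:
        data = os.read(fd, chunk)
        if not data:              # an empty result signals end of file
            break
        pieces.append(data)
    return b"".join(pieces)

def from_disk_file(payload):
    """Store payload in a disk file, then read it back with read_all."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.lseek(fd, 0, os.SEEK_SET)   # rewind before reading
        return read_all(fd)
    finally:
        os.close(fd)
        os.unlink(path)

def from_pipe(payload):
    """Push payload through an interprocess pipe, then read it with read_all."""
    r, w = os.pipe()
    try:
        os.write(w, payload)           # assumes payload fits in the pipe buffer
    finally:
        os.close(w)                    # closing the write end yields end of file
    try:
        return read_all(r)
    finally:
        os.close(r)
```

Both from_disk_file(b"abc") and from_pipe(b"abc") return b"abc"; read_all neither knows nor cares which subclass of open file it was given.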
Some subclasses of the device class do not fully support the interface given here. You can't seek or write on a mouse or keyboard. You can't read from or seek on a printer. Other subclasses need extra operations; for example, on an asynchronous communications line, it is possible to set the baud rate, byte size, number of stop bits and parity. Because the UNIX interface is not object oriented, a subclass cannot add methods of its own; instead, all devices support one additional catch-all operation:

ioctl(request, argp)
apply request, a device specific command, to this file object, passing a pointer to the argument list using argp. Request is an integer, with a meaning that depends on the device, and the structure of argp is also interpreted by the device.
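As a small illustration, the sketch below issues one ioctl request through Python's fcntl module. It assumes a Linux or BSD-derived system where the FIONREAD request ("how many bytes are waiting to be read?") is supported on pipes; bytes_waiting and demo_ioctl are hypothetical names, and the 4-byte buffer plays the role of argp.

```python
import fcntl
import os
import struct
import termios

def bytes_waiting(fd):
    """Ask the driver behind fd how many bytes can be read right now."""
    # FIONREAD is the integer request code; the driver fills in the buffer.
    raw = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]

def demo_ioctl():
    """Write five bytes into a pipe and ask how many are pending."""
    r, w = os.pipe()
    try:
        os.write(w, b"12345")
        return bytes_waiting(r)
    finally:
        os.close(r)
        os.close(w)
```

On such a system demo_ioctl() returns 5. The same request number sent to a different kind of device could mean something else entirely, which is exactly the looseness the text describes.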

In addition, because file handles are stored in a special address space, the open file table of the calling process, indexed by integers, we cannot manipulate file handles using normal memory addressing tools. Therefore, the system interface includes a special tool:

fcntl(cmd, arg)
This masquerades as a method of the open-file object, and some of the things it does are indeed obscure operations on open files. Other things it does are connected to manipulating the handle itself! The actual operation requested is connected to cmd, and any parameter required by that operation is encoded in arg; both are integers. The meaning of the return value depends on cmd.
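The two faces of fcntl can be seen side by side in a short sketch. The name demo_fcntl is made up for illustration; the commands F_GETFL, F_SETFL and F_DUPFD are standard, and a pipe is used only because it is an easy open file to obtain.

```python
import fcntl
import os

def demo_fcntl():
    r, w = os.pipe()
    try:
        # An operation on the open file: read and change its status flags.
        flags = fcntl.fcntl(r, fcntl.F_GETFL)
        fcntl.fcntl(r, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        nonblocking = bool(fcntl.fcntl(r, fcntl.F_GETFL) & os.O_NONBLOCK)

        # An operation on the handle: duplicate it into the lowest free
        # slot of the open file table numbered 10 or above.
        dup = fcntl.fcntl(r, fcntl.F_DUPFD, 10)
        try:
            os.write(w, b"x")
            shared = os.read(dup, 1)   # the duplicate names the same open file
        finally:
            os.close(dup)
        return nonblocking, dup >= 10, shared
    finally:
        os.close(r)
        os.close(w)
```

demo_fcntl() returns (True, True, b"x"). In each call the meaning of the third argument and of the return value depends entirely on cmd.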

For each object, we expect there to be an object constructor, and there is such a constructor in UNIX, the open service.

env.open(filename,flags,mode)
search for the named file in the directory structure of the system and return an open-file object allowing access to that file.
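The flags and mode parameters can be illustrated with a short sketch. The names create_exclusively and demo_open are hypothetical, and the particular flag combination is just one example: flags select the access mode and what to do if the file is missing or present, while mode supplies permission bits that matter only when a new file is created.

```python
import os
import tempfile

def create_exclusively(path):
    """Create path for writing, failing if it already exists."""
    return os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)

def demo_open():
    scratch = tempfile.mkdtemp()
    path = os.path.join(scratch, "notes.txt")
    fd = create_exclusively(path)        # constructs the open-file object
    os.write(fd, b"first")
    os.close(fd)
    try:
        os.close(create_exclusively(path))
        second_open_failed = False
    except FileExistsError:              # O_EXCL refuses an existing name
        second_open_failed = True
    fd = os.open(path, os.O_RDONLY)      # a second object for the same file
    data = os.read(fd, 100)
    os.close(fd)
    os.unlink(path)
    os.rmdir(scratch)
    return second_open_failed, data
```

demo_open() returns (True, b"first").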

Here, UNIX differs in a big way from MS-DOS and most other systems. In those systems, there may be multiple file systems, one per device. Therefore, open is relative to a device on those systems. Under DOS, however, the file name includes the textual encoding of the device name, further obscuring this.
In fact, open in UNIX is interpreted in the context of the environment of the current process. From the point of view of the open system call, that environment contains a pair of hidden objects, the current working directory of the process and the root directory of the process. Usually, the root is the root of the entire file system, but it need not be (the environment supports a method, chroot, that changes the process's root).
Directories -- A user view
A UNIX directory is just a kind of file; from the object oriented viewpoint, an open directory is a kind of open file, and the current process has, at any instant, one open directory for the root of the file system, and one open directory for the current working directory. These environment attributes may be manipulated by the following:

env.cd(filename)
search for the named file in the directory structure of the system and, if it is a directory, set the current working directory of the environment to that file.

env.chroot(filename)
search for the named file in the directory structure of the system and, if it is a directory, set the root directory of the environment to that file. The file name given must be reachable from the current root without following up-links in the directory structure. As a result, chroot can confine a process to a subtree of the directory structure, and once confined, the process cannot escape. To complete this confinement, the process also needs to change the current directory to the new root!

The behavior of chroot and cd is not entirely rational and includes features that are afterthoughts. The descriptions above are a bit idealized.
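In the real interface, cd is spelled chdir. The sketch below shows both operations from Python; confine and demo_chdir are hypothetical names. confine is shown for shape only and is not called here, since chroot normally requires superuser privilege.

```python
import os
import tempfile

def confine(directory):
    """Confine the calling process to directory (needs superuser privilege)."""
    os.chroot(directory)   # change the process's root directory
    os.chdir("/")          # then move the working directory inside the new root

def demo_chdir():
    """Show that a relative name is resolved from the working directory."""
    scratch = os.path.realpath(tempfile.mkdtemp())
    target = os.path.join(scratch, "here.txt")
    with open(target, "w") as f:
        f.write("found")
    saved = os.getcwd()
    try:
        os.chdir(scratch)
        with open("here.txt") as f:    # no leading /, so the search starts at cwd
            found = f.read()
        now = os.getcwd()
    finally:
        os.chdir(saved)                # restore the environment
        os.unlink(target)
        os.rmdir(scratch)
    return found, now == scratch
```

demo_chdir() returns ("found", True). The second step in confine is the one the text warns about: without it, the working directory would still refer to a directory outside the new root.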
A directory is just a kind of disk file, and it can be read by the same read routine used to read a normal disk file. The following operations write directories:

dir.makelink(finalname,pathname)
create a link to the file named by pathname, putting it in dir with the name dir/finalname. Files may be linked from many directories.

dir.deletelink(finalname)
delete the directory entry dir/finalname from the directory dir. The file itself will be deleted when the last link to it is deleted.

These primitives are not visible to the user! Instead, there is a link kernel primitive (used to implement the ln shell command) that takes two full file names. It uses the first name as the name of the file being linked to, and it splits the second name into a final file name and a directory name, opens that directory, and uses the makelink primitive on that directory to create a link to the first name.
The unlink kernel primitive (used to implement the rm and rmdir shell commands) parses the given file name into a directory name and a file name in that directory, opens the directory and then applies the deletelink primitive to the named directory entry.
The rename kernel primitive (used to implement the mv shell command) is like link followed by unlink (or ln followed by rm in the shell), except that it is executed in a critical section, so that it appears atomic.
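The link, unlink and rename primitives are all reachable from Python's os module, so their effect on a file's link count can be observed directly. This is a sketch under the assumption that the scratch directory lives on a file system that supports hard links; link_count and demo_links are hypothetical names.

```python
import os
import tempfile

def link_count(path):
    """Number of directory entries that currently refer to the file."""
    return os.stat(path).st_nlink

def demo_links():
    scratch = tempfile.mkdtemp()
    a = os.path.join(scratch, "a")
    b = os.path.join(scratch, "b")
    c = os.path.join(scratch, "c")
    with open(a, "w") as f:
        f.write("payload")
    os.link(a, b)                    # second directory entry for the same file
    after_link = link_count(a)
    os.unlink(a)                     # remove one entry; the file survives via b
    after_unlink = link_count(b)
    os.rename(b, c)                  # behaves like link then unlink, atomically
    with open(c) as f:
        data = f.read()
    names = sorted(os.listdir(scratch))
    os.unlink(c)                     # last link gone, so the file is deleted
    os.rmdir(scratch)
    return after_link, after_unlink, data, names
```

demo_links() returns (2, 1, "payload", ["c"]): the contents outlive the name under which they were created, and disappear only with the final unlink.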
Opening a file is done by opening each directory along the path to that file, reading the directory to find the named directory entry, and opening that, until the named entry is the final name in the pathname provided. The initial directory used in this process is either the process's root or the process's current working directory, depending on whether the first character of the file name is / or not.
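The path walk just described can be imitated in user code by opening one component at a time, each relative to the directory opened before it. This is only a sketch of the idea, not how any kernel is written: open_by_walking is a hypothetical name, and it relies on the dir_fd parameter of os.open (the openat call), which is assumed to be available on the platform.

```python
import os
import tempfile

def open_by_walking(path, flags=os.O_RDONLY):
    """Open path one directory entry at a time and return the final descriptor."""
    # A leading / selects the process's root; otherwise start at the cwd.
    start = "/" if path.startswith("/") else "."
    dir_fd = os.open(start, os.O_RDONLY)
    parts = [p for p in path.split("/") if p]
    try:
        for i, name in enumerate(parts):
            last = (i == len(parts) - 1)
            # Look up name in the directory we currently hold open.
            next_fd = os.open(name, flags if last else os.O_RDONLY, dir_fd=dir_fd)
            os.close(dir_fd)
            dir_fd = next_fd
        return dir_fd
    except OSError:
        os.close(dir_fd)             # release the directory we were searching
        raise
```

For any readable file, open_by_walking(path) yields a descriptor equivalent to the one os.open(path, os.O_RDONLY) would return; the loop simply makes the intermediate directory opens visible.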
File subclasses -- A user view
The following items may be associated with a directory entry, and therefore, each is a subclass of the class open-file-object, since it is legal to open any directory entry:
- Disk files, including directories, which are a subclass of disk files.
- Mounted file systems. Each file system fills a device of subclass disk, so a mounted file system includes a reference to a disk device.
- Devices. Each device is characterized by a major and a minor device number. The major device number names a particular I/O driver. The minor device number identifies a particular device associated with that driver (many drivers are written to serve any of a number of identical devices).
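The major and minor numbers of a device entry can be read back with stat. A minimal sketch, with describe_device as a hypothetical name; the actual numbers differ from system to system, so none are shown here.

```python
import os
import stat

def describe_device(path):
    """Return (major, minor) for a device entry, or None for anything else."""
    info = os.stat(path)
    is_device = stat.S_ISCHR(info.st_mode) or stat.S_ISBLK(info.st_mode)
    if not is_device:
        return None
    # st_rdev packs both numbers; os.major and os.minor unpack them.
    return os.major(info.st_rdev), os.minor(info.st_rdev)
```

describe_device("/dev/null") returns a pair of integers, while describe_device("/") returns None because a directory is not a device.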
By convention, the directory /dev contains one entry for each device on the system. Frequently, system administrators leave large numbers of nonexistent devices here, since the default system installation includes huge numbers of devices that nobody has. Here is a listing of the /dev directory of a real system:
```
bpf0      disk0s7   ptyp5     ptyq2     ptyqf     stdout    ttypa     ttyq7
bpf1      disk0s8   ptyp6     ptyq3     rdisk0    tty       ttypb     ttyq8
bpf2      disk0s9   ptyp7     ptyq4     rdisk0s1  tty.modem ttypc     ttyq9
bpf3      fd        ptyp8     ptyq5     rdisk0s2  ttyp0     ttypd     ttyqa
console   klog      ptyp9     ptyq6     rdisk0s3  ttyp1     ttype     ttyqb
cu.modem  kmem      ptypa     ptyq7     rdisk0s4  ttyp2     ttypf     ttyqc
disk0     mem       ptypb     ptyq8     rdisk0s5  ttyp3     ttyq0     ttyqd
disk0s1   null      ptypc     ptyq9     rdisk0s6  ttyp4     ttyq1     ttyqe
disk0s2   ptyp0     ptypd     ptyqa     rdisk0s7  ttyp5     ttyq2     ttyqf
disk0s3   ptyp1     ptype     ptyqb     rdisk0s8  ttyp6     ttyq3     zero
disk0s4   ptyp2     ptypf     ptyqc     rdisk0s9  ttyp7     ttyq4
disk0s5   ptyp3     ptyq0     ptyqd     stderr    ttyp8     ttyq5
disk0s6   ptyp4     ptyq1     ptyqe     stdin     ttyp9     ttyq6
```
Some of these devices are obvious: /dev/disk0 is the disk. Others are less obvious than they look: on BSD-derived systems such as this one, /dev/fd is usually not a floppy disk but a directory naming the open file descriptors of the current process, alongside the related entries stdin, stdout and stderr. In general, whether the hardware behind an entry actually exists isn't apparent from the /dev directory! Some are curious: /dev/null references a driver that returns EOF for all read attempts and ignores all write attempts. If a program insists on producing huge amounts of output that nobody wants, redirecting it to /dev/null is a convenient way to suppress that output. /dev/kmem is a driver that allows access to the kernel memory of the machine as if it were an I/O device -- this is useful as a debugging tool. /dev/mem is similar, providing access to the physical memory of the machine.
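The behavior of /dev/null is easy to confirm with the ordinary read and write calls. A small sketch, with probe_null as a hypothetical name:

```python
import os

def probe_null():
    """Write to and read from /dev/null through a single descriptor."""
    fd = os.open("/dev/null", os.O_RDWR)
    try:
        accepted = os.write(fd, b"unwanted output")   # reported as fully written
        got = os.read(fd, 100)                        # immediate end of file
        return accepted, got
    finally:
        os.close(fd)
```

probe_null() returns (15, b""): the driver claims to have written all 15 bytes and then reports end of file on the first read.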
The huge array of ttyp and ptyp devices refers to pseudo-terminal drivers that exist for network interprocess communication. These look like interactive terminals to processes that read or write them, but instead of being attached to real hardware, each ttyXX is paired with a ptyXX held open by another program, such as a network login server, so they may be used to communicate with remote systems. They also get used to communicate locally, so each open window, for example, is typically connected to one of these by the window manager.
Open disk files have an interface that is extremely similar to open disks! In fact, it is proper to think in terms of the same I/O driver used for both, except that there is a virtual disk address to physical disk address translator between the disk file interface and the physical disk. This is similar, in an abstract sense, to the function of the MMU, but it is done entirely in software within the implementation of the file system!
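The translation step can be pictured with a toy model in which each file carries a table listing, in order, the physical blocks that hold its data. Everything here is an assumption made for illustration: the flat table, the 512-byte block size, the block numbers and the name translate. Real file systems use more elaborate structures, but the arithmetic has the same shape as address translation in an MMU.

```python
BLOCK_SIZE = 512   # assumed block size, in bytes

def translate(block_map, offset, block_size=BLOCK_SIZE):
    """Map a byte offset within a file to (physical block, offset in block)."""
    if offset < 0:
        raise ValueError("negative file offset")
    # The high part of the offset selects a table entry, the low part
    # passes through unchanged, just as with a page number and page offset.
    virtual_block, within = divmod(offset, block_size)
    if virtual_block >= len(block_map):
        raise ValueError("offset lies outside the file's allocated blocks")
    return block_map[virtual_block], within
```

With the made-up table [907, 12, 4410], translate([907, 12, 4410], 600) returns (12, 88): byte 600 of the file lies in the file's second block, which this table places at physical block 12.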