22C:116, Lecture Notes, Jan. 20, 1995

Douglas W. Jones
University of Iowa Department of Computer Science

  1. Operating Systems, A User Perspective

    Users rely on operating systems to provide the following:

     * File Systems
     * Command Languages
     * Resource Sharing
     * Resource Protection
    

  2. Focus on File Systems

    The word file is frequently used rather carelessly to refer to any of the following items, even though they are quite different things:

     * An object to which device independent I/O operations are directed.
     * A named object in a directory structure.
     * A region on a disk or similar device.
    

    These three meanings of the word "file" are usually used without any clarifying comment, and you have to figure out, from context, which meaning is intended.

    Generally, there is some operation, such as open(f,name), that takes a file name and interprets it in a directory structure to deliver an open file object. This object may refer to a file on disk, or it may refer to something else such as a terminal or communications line.
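
    As a concrete sketch in C (using the POSIX form of open rather than the open(f,name) notation above, and with /dev/tty chosen arbitrarily as the name), the descriptor that comes back can stand for a disk file, a terminal, or another device:

        #include <fcntl.h>      /* open */
        #include <unistd.h>     /* read, write, close */

        int main(void)
        {
            char buf[512];
            ssize_t n;

            /* the name is resolved in the directory structure; the
               descriptor returned hides whether a disk file, a
               terminal, or something else lies behind it */
            int fd = open("/dev/tty", O_RDONLY);
            if (fd < 0)
                return 1;

            /* device independent I/O: read works the same either way */
            n = read(fd, buf, sizeof buf);
            if (n > 0)
                write(1, buf, (size_t) n);

            close(fd);
            return 0;
        }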

    It should be noted that on some systems, a file on disk need not have any name in any directory structure.
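
    On UNIX, for instance, a program can create a file, remove its only directory entry, and keep using the nameless file; a minimal sketch, with the scratch file name invented for the example:

        #include <fcntl.h>
        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
            /* create the file, then remove its only directory entry;
               the open descriptor keeps the disk storage alive */
            int fd = open("scratch.tmp", O_RDWR | O_CREAT, 0600);
            if (fd < 0) { perror("open"); return 1; }
            unlink("scratch.tmp");

            /* the now-nameless file is still fully usable through fd */
            write(fd, "temporary data\n", 15);

            close(fd);  /* only now is the disk space reclaimed */
            return 0;
        }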

  3. Focus on Command Languages

    Before UNIX, command language interpreters were generally integral parts of the operating system, and many people quite naturally but mistakenly assumed that each command language operation was identical to a system call.

    In UNIX and most later systems, the command language interpreter, called a shell on UNIX, is merely an application program. If you don't like it, you can write your own. Most commands in the command language are merely names of programs, and the user of the command language can view each command as a parameterized procedure call, where the program itself is the procedure.
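
    To make this concrete, here is a toy shell written as an ordinary C program; it treats each command as the name of a program, runs it, and waits for it to "return" like a procedure. (Argument parsing, search-path subtleties and error recovery are all omitted from the sketch.)

        #include <stdio.h>
        #include <string.h>
        #include <sys/wait.h>
        #include <unistd.h>

        int main(void)
        {
            char line[256];

            for (;;) {
                fputs("% ", stdout);               /* prompt */
                fflush(stdout);
                if (fgets(line, sizeof line, stdin) == NULL)
                    break;                         /* end of input */
                line[strcspn(line, "\n")] = '\0';  /* strip the newline */
                if (line[0] == '\0')
                    continue;

                if (fork() == 0) {
                    /* child: the command is just the name of a program */
                    execlp(line, line, (char *) NULL);
                    perror(line);                  /* reached only on failure */
                    _exit(1);
                }
                wait(NULL);         /* parent: await the "procedure return" */
            }
            return 0;
        }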

    Most users view the command language interpreter as part of the system, but for most purposes, the system views the command language interpreter as if it were a user program.

    As an aside, note that in later versions of UNIX, the system has a special relationship to the Bourne shell through the system command, but this is not a system call; it is just a user library routine that passes its argument to the Bourne shell.
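
    In outline, that routine can be written in a few lines of user code. This is only a sketch (the real library routine also deals with signals and various error cases), and the name my_system is invented to avoid suggesting that this is the actual library source:

        #include <sys/wait.h>
        #include <unistd.h>

        /* sketch of the C library's system() routine: no system call
           of its own, just fork, an exec of the Bourne shell, and a wait */
        int my_system(const char *command)
        {
            int status;
            pid_t pid = fork();

            if (pid < 0)
                return -1;
            if (pid == 0) {
                /* child: hand the whole command string to the Bourne shell */
                execl("/bin/sh", "sh", "-c", command, (char *) NULL);
                _exit(127);                /* exec itself failed */
            }
            waitpid(pid, &status, 0);      /* parent: wait for the shell */
            return status;
        }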

  4. Focus on Resource Sharing -- Memory

    The unit of memory allocation may be the page, the contiguous block of memory, or the segment, depending both on the system in question and on the client to which memory is allocated.

    Clients may be user programs or other parts of the operating system. Allocation to user programs need not use the same mechanism as allocation to other parts of the system, although a proliferation of mechanisms clearly adds complexity to the system.
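
    As one illustration, allocation of contiguous blocks is often done by a first-fit search of a free list. The sketch below uses invented names and omits block splitting and coalescing:

        #include <stddef.h>

        /* first-fit allocation from a singly linked list of free
           regions; purely illustrative, with no splitting of blocks */
        struct free_block {
            size_t size;                  /* bytes in this free region */
            struct free_block *next;
        };

        static struct free_block *free_list;

        void *alloc_block(size_t want)
        {
            struct free_block **p;

            for (p = &free_list; *p != NULL; p = &(*p)->next) {
                if ((*p)->size >= want) { /* first region big enough */
                    struct free_block *b = *p;
                    *p = b->next;         /* unlink it from the list */
                    return (void *) b;
                }
            }
            return NULL;                  /* no region large enough */
        }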

  5. Focus on Resource Sharing -- Time

    On systems allowing more processes than processors, CPU time must be allocated to competing processes, some of which may belong to different end users of the system.

    Since 1966, with the Berkeley Timesharing System, many systems have allowed users to run more than one process in parallel, so not only do users compete with each other for CPU time, but the processes of one user also compete with each other.
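
    The classic mechanism for this is round-robin scheduling: ready processes wait in a queue, and at the end of each time slice the running process goes back to the tail while the process at the head runs next. A sketch, with the names and the fixed queue size invented:

        /* round-robin sketch: a circular queue of process IDs */
        #define NPROC 64

        static int ready[NPROC];     /* queue of ready process IDs */
        static int head, tail, count;

        void make_ready(int pid)     /* append a process to the tail */
        {
            ready[tail] = pid;
            tail = (tail + 1) % NPROC;
            count++;
        }

        int next_to_run(void)        /* called at the end of a time slice */
        {
            int pid;

            if (count == 0)
                return -1;           /* nothing ready; run the idle loop */
            pid = ready[head];       /* take the process at the head */
            head = (head + 1) % NPROC;
            count--;
            make_ready(pid);         /* put it back at the tail */
            return pid;              /* and run it for one time slice */
        }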

    Aside: Surveys of UNIX users done many years ago showed that a single user at a remote terminal typically had 4 processes, one of which was interactive, one of which was waiting for the former process to terminate, and two of which may have been waiting or running in the background.

    On window-oriented systems, it is common to see one process behind each window on the display screen.

  6. Focus on Resource Sharing -- Input/Output

    Operating systems typically allocate I/O to processes competing for the limited bandwidth of the available channels. This may involve code to schedule I/O transactions that is as complex as the code used to schedule CPU time, or on low performance personal computers, the system may ignore this issue.
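
    For disks, one classic example of such scheduling is the elevator algorithm: the arm sweeps across the cylinders, serving the nearest pending request in its direction of travel, then reverses. A sketch, with all names invented:

        /* elevator (SCAN) sketch: pick the pending request nearest
           the arm in the current direction of travel */
        #define NREQ 32

        static int pending[NREQ]; /* cylinders with pending requests */
        static int npending;
        static int arm;           /* current arm position */
        static int up = 1;        /* nonzero while sweeping upward */

        int next_request(void)    /* returns the cylinder to seek to */
        {
            int i, best;

            if (npending == 0)
                return -1;        /* nothing to do */
            for (;;) {
                best = -1;
                for (i = 0; i < npending; i++)
                    if (up ? pending[i] >= arm : pending[i] <= arm)
                        if (best < 0 ||
                            (up ? pending[i] < pending[best]
                                : pending[i] > pending[best]))
                            best = i;
                if (best >= 0)
                    break;
                up = !up;         /* end of a sweep: reverse direction */
            }
            arm = pending[best];                 /* seek there */
            pending[best] = pending[--npending]; /* drop from the queue */
            return arm;
        }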

  7. Focus on Resource Sharing -- Secondary Memory

    From the user perspective, the central function of a file system, as opposed to the I/O subsystem, is the allocation of space on disk for competing files. Allocation of disk space usually involves algorithms quite different from those used for main memory, both because the user model of disk files is not the same as the user model of main memory, and because of the different characteristics of the disks typically used for secondary storage.
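
    For instance, many file systems track free disk space with a bitmap holding one bit per block, quite unlike the free lists typical of main memory. A sketch, with the block count and names invented:

        /* disk block allocation from a free-space bitmap */
        #define NBLOCKS 8192

        static unsigned char bitmap[NBLOCKS / 8];   /* 1 = in use */

        int alloc_disk_block(void)   /* returns a free block number */
        {
            int b;

            for (b = 0; b < NBLOCKS; b++) {
                if (!(bitmap[b / 8] & (1 << (b % 8)))) {
                    bitmap[b / 8] |= 1 << (b % 8);  /* mark it used */
                    return b;
                }
            }
            return -1;                              /* disk is full */
        }

        void free_disk_block(int b)
        {
            bitmap[b / 8] &= ~(1 << (b % 8));       /* mark it free */
        }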

  8. Focus on Resource Protection -- The System View

    Protection is necessary for security or reliability, but many small computer systems do without, even though the technology needed to build secure systems has been well known since the mid 1960's.

    Security demands that system resources be protected from malicious attack.

    Reliability demands that system resources be protected from accidental damage.

    Aside: As a general rule, anything that a malicious user might intentionally do to damage a system might also be done by a careless user. Traditionally, security has been concerned with protection against malicious users, while carelessness was viewed as the concern of fault tolerance. The recognition that malice and carelessness frequently produce the same result has led to an appreciation of the close relationship between fault tolerance and secure systems.

  9. Focus on Resource Protection -- The User View

    When a system involves multiple users, there is usually a demand that the system provide some degree of protection of resources allocated to one user from accidental damage or malicious attack by other users.

    Segments of main memory, files on secondary memory, and terminals allocated to a user all need such protection. There are uniform access control and protection mechanisms that can control access to all of these on the same basis, but on most modern systems, each class of resource is protected using ad-hoc mechanisms that are generally different from those used to protect other resources.
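
    A uniform mechanism can be as simple as attaching an access control list to every resource, whatever its class, and funneling every access through one check. The sketch below uses invented types:

        /* uniform access control sketch: every resource, whether a
           segment, a file or a terminal, carries an access list */
        struct acl_entry {
            int user;             /* user ID */
            unsigned rights;      /* bit mask: 1 = read, 2 = write */
        };

        struct resource {
            struct acl_entry *acl;
            int nentries;
        };

        /* one check serves every class of resource alike */
        int may_access(struct resource *r, int user, unsigned want)
        {
            int i;

            for (i = 0; i < r->nentries; i++)
                if (r->acl[i].user == user)
                    return (r->acl[i].rights & want) == want;
            return 0;             /* no entry means access denied */
        }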

  10. Focus on Resource Sharing

    When there are multiple users or multiple processes, it is frequently essential that they be able to share access to some resources.

    For example, some files, windows or memory segments might be shared while others are private to one user or to one process.

    Early timesharing systems viewed resource sharing in terms of allocating free resources to one or another user, exclusively. If this is done, the system is effectively creating one isolated personal computer for each user, and this is of only limited utility.

    With the advent of systems such as Multics and the Berkeley Timesharing System (around 1966), the focus shifted to the management of information that was actively shared between users. Thus, each user might have some memory regions that are entirely private, and others that are shared with some, but not all, other users. Management of this shared data, whether on disk or in main memory, poses a significant challenge.

    This challenge was largely ignored for much of the decade between 1975 and 1985 as personal computers displaced timesharing systems from growing numbers of applications, but it has re-emerged with a vengeance with the advent of computer networks.

  11. Focus on Reliability

    On early systems, it was generally assumed that any failure in one part of a system would lead to the failure of the entire system, but as technology has advanced, we have learned to build systems where failures lead to incremental degradation of performance.

    Early examples of this include memory managers which detect memory errors and remove the bad memory addresses from the pool of memory available for allocation to various system applications. This is now routinely done both with disk systems and with main memory.

    More modern examples include multiprocessors where, when a processor fails, the remaining processors continue to operate.

    Aside 1: With centralized systems, the assumption that system failure was an all-or-nothing thing was natural. In fact, this was never true, but partial failures of a centralized system were rare enough that they became the subject of folklore -- for example, the timesharing system where the floating point unit failed. The operating system didn't use floating point, so none of the system managers knew anything was wrong until users started complaining about strange results.

    Aside 2: With large-scale distributed systems, partial system failures are the norm! Many networks today are large enough that the normal state of the system is that one or more machines are down at any instant. As an exercise, consider the Internet, with around one million machines. Assuming each machine has a mean time between failures of one month, what is the rate of failures, per unit time?
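
    A rough answer, under those assumptions: one million machines, each failing about once a month, suffer about 1,000,000 failures per month in aggregate. Taking a month as 30 x 24 x 60 = 43,200 minutes, that comes to roughly 1,000,000 / 43,200, or about 23 failures per minute -- somewhere on the net, a machine goes down every two or three seconds.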