Processes and Threads

Part of 22C:112, Operating Systems Notes
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Origins

In the early 1960's, three different ideas emerged, all of which led in the same direction:

First, timesharing systems such as the Dartmouth Basic system (the origin of the Basic programming language) pioneered the idea of providing each user with a virtual personal computer. From the user's perspective, each user could edit and run Basic programs as if the machine were theirs alone. In fact, the computer system held all of each user's work in its memory (RAM and slower drum or disk memory), taking turns executing instructions on behalf of different users. In effect, each user had a share in the large expensive mainframe computer system.

By the later 1960's, there were timesharing service bureaus that sold timesharing services commercially. Instead of confining users to a single language, these systems gave many users the appearance of simultaneous access and the freedom to program in languages as diverse as Fortran, LISP, BASIC and assembly language.

Second, batch systems found that multiprogramming could be used to improve system performance. In a batch system, users submitted jobs, typically on punched cards, to a clerk who formed the jobs into batches for processing on the mainframe at the computer center. On a uniprogrammed system, each job was run to completion before the next job began. This meant that the CPU was idle while waiting for I/O to finish, and I/O devices were idle while the job was doing CPU-bound computation.

In a multiprogrammed system, multiple jobs were run at once, so that when one job had to stop to wait for I/O to finish, another job could start running, and while one job was doing pure computation, there was a good chance that another job's I/O was in progress. This kept the very expensive CPU and I/O devices of the mainframe computer system well utilized, lowering the average price per job. By the mid 1960's, multiprogrammed batch systems were commonplace.

Third, multiprocessors were developed. These are the direct predecessors of today's multicore computer systems. The first solid proposals for such systems came in papers connected with the Burroughs Corporation, notably M. E. Conway, A Multiprocessor System Design, in Proc. 1963 Fall Joint Computer Conference (FJCC), pages 139-146, AFIPS Press, 1963. The idea was to connect two or more processors to two or more main memory modules through a fast switch. This permitted what we now call multithreaded applications to run.

Burroughs had significant success with this idea in the 1960's while the rest of the industry largely ignored it. IBM reported poor speedups with multiprocessors in this era, reporting that adding an extra CPU doubled the cost of a machine while the performance was increased by a much smaller amount.

In the 1960's, Gordon Bell of Carnegie-Mellon University noted that minicomputers offered much better performance per dollar than mainframes, and he asked: why not use multiple minicomputers to do the job instead of one mainframe? This led him to develop a system called C.mmp, with 16 minicomputer CPUs sharing access to 16 memory modules through a special memory switch. The switch cost as much as the processors, but the result worked well enough to inspire followups. In the 1980's, commercial vendors such as Sequent and Encore sold multi-microprocessors with up to 16 or more CPUs sharing access to memory. These machines worked well -- the University of Iowa had two of them. Today, the software developed for those machines provides a solid foundation for supporting today's multicore processors.

A User Perspective

Conway recognized, from the start, that the division of an application into parallel threads depended on the application, not on the number of processors available. He therefore proposed a model of process execution that allowed the system to handle the assignment of threads to processors. From the user perspective, it is useful to consider a very simple case, the set of operations A, B, C and D -- where each operation may be a complex block of code or even a subroutine call. Consider the problem of writing a program where A must be executed first, followed by B and C, which may run in parallel, and D, which may not start until both B and C have completed. We might write this as a flow chart as follows:

          A
         / \
        B   C
         \ /
          D

Flow-chart notation is not very useful in a textual programming language, so people proposed notations such as:

          A;                         A;
          cobegin                    fork
             B;                         B;
             C;                         C;
          coend                      join
          D;                         D;

Here, the cobegin and coend keywords indicated that the statements between them could be executed in parallel. The terms fork and join referred to the forking and joining of lines representing flow of control in the flowchart.
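
For comparison with current practice (this sketch is not part of the historical notation, and the placeholder A, B, C and D functions are invented for illustration), the cobegin/coend structure can be written almost directly using POSIX threads, with one thread per parallel statement and a join on both before D:

          #include <pthread.h>
          #include <stdio.h>

          /* placeholders standing in for the operations A, B, C and D */
          void A(void) { printf( "A\n" ); }
          void D(void) { printf( "D\n" ); }
          void *B( void *arg ) { printf( "B\n" ); return NULL; }
          void *C( void *arg ) { printf( "C\n" ); return NULL; }

          int main(void) {
             pthread_t tb, tc;
             A();
             pthread_create( &tb, NULL, B, NULL );    /* cobegin */
             pthread_create( &tc, NULL, C, NULL );
             pthread_join( tb, NULL );                /* coend: wait for both */
             pthread_join( tc, NULL );
             D();
             return 0;
          }

Compiled with cc -pthread, this prints A, then B and C in either order, then D.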

Conway recognized, from the start, that the symmetry inherent in the above presentations of parallelism is misleading. In a practical implementation, one thread of control will flow from A to D through one of the statements -- arbitrarily, let's select B -- while the other statement, C, will be run by a second thread. This led Conway to suggest the following basic interface:

          A;
          fork child;                child:
          B;                            C;
          wait X;                       exit X;
          D;

Conway's fork statement took a label, the address of the first instruction of a new thread or process. That child process was launched to run in parallel with the parent, running until it executed an exit statement. The exit statement used the variable X to signal that the child had exited. The wait statement waited until this variable indicated that the child had exited.

Others proposed that the fork operation should call a subroutine, running it in a different thread. Yet others suggested other models. The Unix process manager interface is one variant. Here is the Unix variant of the above:

         A;
         if (fork()) { /* parent */
            B;
            wait();
         } else { /* child */
            C;
            exit();
         }
         D;

In Unix, fork, wait and exit are all defined as system calls (section 2 of the programmer's reference manual). The fork function causes the calling process to be duplicated -- the code segment is shared, so really, this just means that the stack and static segments are copied. The return value of fork is different in the parent (or original copy) and the child (the new copy). In the parent, fork returns the process ID of the child. In the child, fork returns zero.

Wait waits for any child of the calling process to terminate. It returns the process ID of the child that terminated, so if the parent wants to wait for one particular child, it must call wait repeatedly until the right ID is returned.

Exit terminates the calling process. The parameter to exit is passed back to the parent through wait, but the way it does this is awful: wait packs the child's exit status into part of an integer that it returns through a pointer parameter, leaving only one byte for the status itself.
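
Putting the three calls together, here is a small self-contained sketch (the printf calls merely stand in for the operations A, B, C and D from the earlier example) showing the actual signatures: fork returns a process ID, wait returns the terminated child's ID and fills in a status word through a pointer, and the argument to exit comes back packed inside that status word:

         #include <stdio.h>
         #include <stdlib.h>
         #include <unistd.h>
         #include <sys/wait.h>

         int main(void) {
            printf( "A\n" );                    /* A */
            pid_t child = fork();
            if (child != 0) {                   /* parent: fork returned the child's PID */
               int status;
               printf( "B\n" );                 /* B */
               pid_t who = wait( &status );     /* blocks until some child exits */
               printf( "child %d exited with status %d\n",
                       (int) who, WEXITSTATUS( status ) );
            } else {                            /* child: fork returned zero */
               printf( "C\n" );                 /* C */
               exit( 3 );                       /* this value comes back through wait */
            }
            printf( "D\n" );                    /* D */
            return 0;
         }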

It is noteworthy that all files that were open at the time of the fork remain open in both the parent and the child. If, before the fork, the parent creates a pipe, the parent and child can communicate through the pipe. A pipe is a first-in-first-out buffer, with a file descriptor for the write end of the pipe and a file descriptor for the read end of the pipe.
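
As a rough sketch of that pattern (error checking omitted), the following fragment creates a pipe before forking and then sends a short message from the child to the parent; pipefd[0] is the read end and pipefd[1] is the write end:

         #include <stdio.h>
         #include <stdlib.h>
         #include <unistd.h>
         #include <sys/wait.h>

         int main(void) {
            int pipefd[2];
            char buf[32];

            pipe( pipefd );                /* both ends are open in parent and child */
            if (fork() != 0) {             /* parent */
               close( pipefd[1] );         /* parent only reads */
               ssize_t n = read( pipefd[0], buf, sizeof( buf ) - 1 );
               if (n > 0) {
                  buf[n] = '\0';
                  printf( "parent got: %s\n", buf );
               }
               wait( NULL );
            } else {                       /* child */
               close( pipefd[0] );         /* child only writes */
               write( pipefd[1], "hello", 5 );
               exit( 0 );
            }
            return 0;
         }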

Most modern Unix variants support shared memory segments. The system service shmat attaches a shared segment to a process. Once such a segment is attached, it will stay attached through a fork, so both parent and child can use it for access to shared variables. Files that are mapped into memory with mmap may also be shared as a result of a fork.
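
A minimal sketch of the shmat route, assuming a System V shared segment big enough to hold one integer that the child stores and the parent reads after waiting:

         #include <stdio.h>
         #include <stdlib.h>
         #include <unistd.h>
         #include <sys/ipc.h>
         #include <sys/shm.h>
         #include <sys/wait.h>

         int main(void) {
            /* create and attach an anonymous shared segment */
            int shmid = shmget( IPC_PRIVATE, sizeof( int ), IPC_CREAT | 0600 );
            int *shared = (int *) shmat( shmid, NULL, 0 );

            *shared = 0;
            if (fork() != 0) {             /* parent */
               wait( NULL );
               printf( "child stored %d\n", *shared );
               shmdt( shared );            /* detach and delete the segment */
               shmctl( shmid, IPC_RMID, NULL );
            } else {                       /* child: the segment stays attached across the fork */
               *shared = 42;
               exit( 0 );
            }
            return 0;
         }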

When the Unix shell is used to launch an application, what it actually does is fork:

         A;
         if (fork()) { /* parent */
            B;
            wait();
         } else { /* child */
            execve( commandname, argv, envp );
            exit( 1 );   /* reached only if execve fails */
         }
         D;

This relies on the executed program to terminate with an exit, allowing the shell to continue.
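
As a concrete, if simplified, sketch of what the shell does for a single command (the command name, arguments and empty environment here are made up for illustration):

         #include <stdio.h>
         #include <stdlib.h>
         #include <unistd.h>
         #include <sys/wait.h>

         int main(void) {
            char *argv[] = { "/bin/echo", "hello", "world", NULL };
            char *envp[] = { NULL };

            if (fork() != 0) {             /* parent: the "shell" */
               int status;
               wait( &status );            /* resumes when the command exits */
               printf( "command finished, status %d\n", WEXITSTATUS( status ) );
            } else {                       /* child */
               execve( argv[0], argv, envp );
               perror( "execve" );         /* reached only if execve fails */
               exit( 1 );
            }
            return 0;
         }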