Homework 3 Solutions

22C:116, Fall 1999

Douglas W. Jones

  1. Part A: If the array A initially contains SIZE distinct values randomly arranged in its cells, as the sorter threads operate, why is it possible for values to be lost?
    The sorter thread contains a critical section in the code that exchanges two consecutive items in the array. Because this critical section is not properly protected, two sorter threads attempting to exchange overlapping pairs of items may lose one of the items and duplicate another, as illustrated by the interleaving sketched below.
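
    For example, consider a hypothetical three-element case where the array initially holds 3, 2, 1, thread X is exchanging a[0] and a[1], and thread Y is exchanging a[1] and a[2] (each thread has its own local t):

    	X: t = a[0];     -- X saves the value 3
    	X: a[0] = a[1];  -- a = { 2, 2, 1 }
    	Y: t = a[1];     -- Y saves the value 2
    	Y: a[1] = a[2];  -- a = { 2, 1, 1 }
    	Y: a[2] = t;     -- a = { 2, 1, 2 }
    	X: a[1] = t;     -- a = { 2, 3, 2 }

    The value 1 has been lost and the value 2 has been duplicated.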

    Part B: Eliminate as few of the calls to may() as possible in order to eliminate the problem you identified in Part A:

    	if (a[n] > a[n1]) {
    		int t;
    		may();
    		t = a[n];
    		-- may eliminated
    		a[n] = a[n1];
    		-- may eliminated
    		a[n1] = t;
    		may();
    	}
    
    The above code fragment contains all of the changes; the remainder of the code is untouched. Note that eliminating exactly two calls to may() solves the data-loss problem, but that, if we eliminate one more (the first of the calls shown above), additional problems are solved as well, for example, extra exchanges caused by interference between the comparison and the exchange step.

    Part C: Give a solution to the problem you identified in Part A that uses semaphores and that maximizes the potential for parallelism in this program.

    	-- add a new global data structure
    	semaphore asem[SIZE];
    
    	-- initialize all the semaphores in main()
    	for (i = 0; i < SIZE; i++)
    		thread_semaphore_init( &asem[i], 1 );
    
    	-- use the semaphores for mutual exclusion in sorter_thread()
    	if (a[n] > a[n1]) {
    		int t;
    		thread_wait( &asem[n] );
    		may();
    		t = a[n];
    		thread_wait( &asem[n1] );
    		may();
    		a[n] = a[n1];
    		thread_signal( &asem[n] );
    		may();
    		a[n1] = t;
    		thread_signal( &asem[n1] );
    		may();
    	}
    
    The above code is deadlock free but does not maximize concurrency. It is possible for all but one of the threads to each hold the semaphore on one element of its pair while awaiting the semaphore on the other; only the thread working on the pair with the highest subscripts can then run, and the others run, one by one, only as the semaphores they are waiting for are released. A higher degree of concurrency can be obtained by the following version, which varies the order in which the two semaphores are claimed with the parity of the subscript:
    	if (a[n] > a[n1]) {
    		int t;
    		if (n & 1) {
    			thread_wait( &asem[n] );
    			thread_wait( &asem[n1] );
    		} else {
    			thread_wait( &asem[n1] );
    			thread_wait( &asem[n] );
    		}
    		may();
    		t = a[n];
    		may();
    		a[n] = a[n1];
    		thread_signal( &asem[n] );
    		may();
    		a[n1] = t;
    		thread_signal( &asem[n1] );
    		may();
    	}
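
    For concreteness, the following is a minimal stand-alone sketch of the same per-element-semaphore scheme, written against POSIX threads and unnamed semaphores rather than the thread package used in this course; the array size, the number of passes per thread, and the final check are arbitrary choices made for the demonstration. Note that the comparison is done while both semaphores are held, which also addresses the comparison-interference problem mentioned in Part B.

    	#include <pthread.h>
    	#include <semaphore.h>
    	#include <stdio.h>

    	#define SIZE   16        -- number of array elements (arbitrary)
    	#define PASSES 10000     -- compare-exchanges per thread (arbitrary)

    	int   a[SIZE];
    	sem_t asem[SIZE];        /* one semaphore guards each array element */

    	void *sorter(void *arg)
    	{
    		int n  = *(int *) arg;  /* this thread works on a[n] and a[n+1] */
    		int n1 = n + 1;
    		int pass;

    		for (pass = 0; pass < PASSES; pass++) {
    			/* claim the two semaphores in alternating order, as above */
    			if (n & 1) {
    				sem_wait( &asem[n] );
    				sem_wait( &asem[n1] );
    			} else {
    				sem_wait( &asem[n1] );
    				sem_wait( &asem[n] );
    			}
    			if (a[n] > a[n1]) {  /* compare and exchange holding both */
    				int t = a[n];
    				a[n] = a[n1];
    				a[n1] = t;
    			}
    			sem_post( &asem[n] );
    			sem_post( &asem[n1] );
    		}
    		return NULL;
    	}

    	int main(void)
    	{
    		pthread_t th[SIZE - 1];
    		int       arg[SIZE - 1];
    		int       i, count;

    		for (i = 0; i < SIZE; i++) {
    			a[i] = SIZE - i;             /* distinct values, reverse order */
    			sem_init( &asem[i], 0, 1 );  /* every element initially free */
    		}
    		for (i = 0; i < SIZE - 1; i++) {
    			arg[i] = i;
    			pthread_create( &th[i], NULL, sorter, &arg[i] );
    		}
    		for (i = 0; i < SIZE - 1; i++)
    			pthread_join( th[i], NULL );

    		/* the invariant the semaphores protect: no value lost or duplicated */
    		for (i = 1; i <= SIZE; i++) {
    			int j;
    			count = 0;
    			for (j = 0; j < SIZE; j++)
    				if (a[j] == i) count++;
    			if (count != 1)
    				printf( "value %d occurs %d times!\n", i, count );
    		}
    		return 0;
    	}

    With this many passes the array will normally end up sorted as well, but the final loop checks only the property the semaphores are responsible for: that no value is lost or duplicated, no matter how the threads interleave.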
    

  2. Part A: Which would give fewer page faults in a virtual memory environment, a boundary tag heap manager that searches the heap from the start whenever it needs a free block, or the same heap manager, but always searching from the point where it most recently made an allocation?
    The idea of searching from the point of the last allocation leads to a relatively uniform distribution of free blocks and used blocks throughout the heap, while searching always from the start leads to a high likelihood that the free-space fragments near the start will be very small, with the big fragments accumulating at the other end of the heap. As a result, searching from one end tends to maintain a very compact heap, with the allocated data items clustered near the start.

    The compaction of allocated data that results from starting always at the head of the heap leads to the allocated data occupying fewer pages of memory, and this, in turn, leads to fewer page faults during the execution of the user program.
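
    The two search policies correspond to the classic first-fit and next-fit (roving pointer) strategies. The following is a rough sketch of the difference, assuming a simplified boundary-tag header that holds only a size and a used flag; splitting, coalescing, and header-size accounting are omitted, and the heap image in main() is built by hand purely to illustrate where each search ends up:

    	#include <stdio.h>
    	#include <stddef.h>

    	#define HEAP_SIZE 65536

    	typedef struct header {
    		size_t size;           /* total size of this block, in bytes */
    		int    used;           /* nonzero if the block is allocated */
    	} header;

    	static union {                 /* the heap, aligned for headers */
    		char   bytes[HEAP_SIZE];
    		header align;
    	} heap_u;
    	#define heap (heap_u.bytes)

    	static header *rover = (header *) heap;  /* where next fit resumes */

    	/* first fit: always scan from the start of the heap */
    	header *first_fit(size_t size)
    	{
    		header *p = (header *) heap;
    		while ((char *) p < heap + HEAP_SIZE) {
    			if (!p->used && p->size >= size) return p;
    			p = (header *)((char *) p + p->size);
    		}
    		return NULL;
    	}

    	/* next fit: resume scanning where the last search stopped,
    	   wrapping around to the start of the heap at most once */
    	header *next_fit(size_t size)
    	{
    		header *p = rover;
    		do {
    			if (!p->used && p->size >= size) { rover = p; return p; }
    			p = (header *)((char *) p + p->size);
    			if ((char *) p >= heap + HEAP_SIZE) p = (header *) heap;
    		} while (p != rover);
    		return NULL;
    	}

    	int main(void)
    	{
    		/* hand-built heap image: used 128 | free 64 | used 256 | free rest */
    		header *p = (header *) heap;
    		p->size = 128;  p->used = 1;  p = (header *)((char *) p + p->size);
    		p->size = 64;   p->used = 0;  p = (header *)((char *) p + p->size);
    		p->size = 256;  p->used = 1;  p = (header *)((char *) p + p->size);
    		p->size = HEAP_SIZE - (128 + 64 + 256);  p->used = 0;
    		rover = p;                   /* as if the last allocation ended here */

    		printf( "first fit finds the block at offset %ld\n",
    			(long)((char *) first_fit(48) - heap) );  /* the small one, 128 */
    		printf( "next fit finds the block at offset %ld\n",
    			(long)((char *) next_fit(48) - heap) );   /* the big one, 448 */
    		return 0;
    	}

    Because the roving pointer tends to sit deep in the heap, next fit satisfies later requests from the large fragments toward the end rather than repeatedly carving up the small fragments near the start, which is exactly the behavior described above.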

    Part B: In general, what is the relationship between fragmentation in the heap and the rate at which the application program suffers page faults?

    In general, fragmentation causes the number of pages occupied by user data structures to expand, and this increases the number of page faults required to run the user program.
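
    For example, if a program has 100K bytes of live data and fragmentation leaves the blocks holding that data only half used on average, the data is spread over roughly 200K bytes of heap; with 4K-byte pages that is about 50 pages instead of 25, roughly doubling the number of pages competing for page frames.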