Exam 1: Midterm

Solutions and Commentary

Part of the homework for 22C:112, Spring 2009
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Grade Distributions

Midterm Exam

Mean   = 6.63
Median = 6.0              X X     X
                  X     X X X     X
____________X___X_X_X_X_X_X_X_X___X___X___X_X___________X_X________
   0 . 1 . 2 . 3 . 4 . 5 . 6 . 7 . 8 . 9 . 10. 11. 12. 13. 14. 15

Machine Problem 1

                    X
                    X
Mean   = 3.65       X
Median = 4.0        X
                X   X
                X   X
    X           X   X   X
    X           X   X   X
____X___X_____X_X_X_X___X__
   0 . 1 . 2 . 3 . 4 . 5

Homework 1 to 5

Mean   = 7.10               X
Median = 6.1            X X X
                        X X X     X           X X
____X_____________X_X_X_X_X_X_X___X_____X___X_X_X__________________
   0 . 1 . 2 . 3 . 4 . 5 . 6 . 7 . 8 . 9 . 10. 11. 12. 13. 14. 15

Overall Total Score

                                  X
Mean   = 16.60            X       X
Median = 15.7             X       X
                          X     X X                     X
________X_____________X___X___X_X_X___X_X_X_X_X___X_____X_X_______________
   0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30. 32. 34
Grade           D|C           C|B           B|A           A|

Solutions

Background: Consider this shell script:
```
1    #!/bin/tcsh
2    #  examscript -- a shell script for the exam
3    @ number = 1
4    @ argc = $#argv
5    while   ( $number <= $argc )
6            echo \$argv\[$number\] = $argv[$number]
7            @ number = $number + 1
8    end
```
Note that, in the context of the shell, the phrase "dollar sign substitution" refers to the substitution of values of parameters or variables for their textual names, all of which begin with a dollar sign.
a) Why must the @ shell command be built in. Equivalently, why can't a launched shell appliction do what this command does?

If @ was the name of a normal command, an executable file, the shell would launch it with execve() and pass it a copy of the environment. Any changes it made to the environment would be changes to that copy, and therefore, none of those changes would outlast the execution of the @ command.
1/6 of the class did well, 1/4 at least mentioned problems with variables but didn't make the root of the problem clear.

b) One line of the text of the above script shows that dollar sign substitution is more complex than simply making a single pass through the text doing string substitution. Which line, and what is the complexity?

The echo command where it outputs $argv[$number] because, here, the substitution for $number must be completed first, before $argv[...] can be evaluated.
A few did well, while half the class left it blank. 1/6 earned good partial credit for noting the backslash quoting on line 6 as a problem (it is not nearly as interesting as the array indexing) and 1/6 earned some credit for identifying line 6 without explanation.

c) Why so many spaces? For example, why not write @number=$number+1?

The shell uses spaces to separate items on the command lines. Without the spaces, @number=$number+1 would be seen as $argv[0], the command name. (Actually, the string after $number has been replaced with its value).
1/3 did well here, an equal number left it blank. Partial credit was given for weak explanations.
Background: Programs running on Unix-like systems typically have three segments: j read-execute only code segment, a read-write static segment and a read-write stack segment. Consider the following code:
```
int f( int i ) {
        char * x = "abcdefg";
	if ( (i < 0) || (i > 6) ) return "-";
	return x[i];
}
```
a) The identifiers in the above code are f, i and x. For each identifier, name the segment that ends up holding the associated value. One of these is a bit tricky. For full credit, identify the tricky issue.

f refers to code in the code segment.
i refers to a variable in the stack segment.
x refers to a pointer in the stack segment.
x is a bit tricky because the pointer refers to a constant in either the static or code segments.
1/8 did well. x was difficult, either mis-identified or poorly explained. Curiously, i was also difficult. This problem was supposed to be almost trivial since most of the content except for the C syntax was straight out of 22C:60 (assembly language and computer architecture).

b) The constants in the above code are "abcdefg", 0 6 and "-". For each constant, identify the segment that ends up holding the value. Clearly describe any tricky issues that arise from alternative ways the compiler might deal with these identifiers.

"abcdefg" and "-" are string constants and should probably go in the code segment but they might be treated as the initial values of string variables in the static segment.
0 and 6 are the kind of small constants that are typically encoded as immediate instructions, but even if they are not inline in the code, they would most likely still go in the code segment.
1/8 did well. 1/5 did not put "-" in the same segment as "abcdefg", losing a small amount of credit. Only a few pointed out the error in the code -- either the function should return a character pointer or the quotes on the dash should have been single.

c) Which of the addresses of variables or constants discussed in parts a and b would require relocation when the object file containing this code fragment is linked with other object files to make the final executable program.

In short, all the addresses of things in the code and static segments except for addresses of immediate constants or any PC-relative addressing used within the code segment. References to items in the stack segment cannot be relocated at linkage time since those items are only allocated at run time. In the above, references to f, "abcdefg" and "-" are likely to need relocation.
1/6 did well, an equal number gave very wrong answers.
Background: Consider your keyboard input device. A C program reads a character using fgetc( stdin ) which calls read( 0, &ch, 1 ) which calls dequeue( kbd_queue ) which ultimately gets data from in( kbd_data_register ) a) Identify the system layer to which each of the above function calls forms part of the interface. Each of these system layers may, of course, have other interface functions. fgetc( stdin ) is an interface to the standard library, part of the middleware. read( 0, &ch, 1 ) is an interface to the operating system. dequeue( kbd_queue ) is an interface to all or part of the I/O driver. in( kbd_data_register ) is an interface to the device hardware. Only a few did well, 1/5 gave entirely wrong answers, and 1/3 left the problem blank. This should have been an easy question. b) Identify the system layer that calls each of the above functions, that is, the layer that is the user of the layer you discussed in part (a). fgetc( stdin ) is called by user code. read( 0, &ch, 1 ) is called by middleware (but user code is allowed to call it). dequeue( kbd_queue ) is called by the operating system or by the OS side of the device driver in( kbd_data_register ) is called by the driver's interrupt service routine. 1/8 did well, 1/4 gave entirely wrong answers, and 1/3 left the problem blank. Again, this should have been easy.
Background: Consider a computer system with no memory protection, but with the Unix system call model for input/output. read(fd,buf,len) uses fd to locate the device to be used. Eventually, the user's call to read() leads to a call to a system routine read_devclass(dev,buf,len) that is specific to a specific device class and applied to a specific device. a) Explain how the system maps fd to a specific device dev and a read routine of the appropriate device class. The system maintains an array for each process that maps file descriptors (small integers) to open file or device descriptors. Each descriptor contains pointers to the methods that operate on devices of that class, so the read call translates, inside the operating system, to something like: filetable[fd].device->read_method(filetable[fd].device,buf,len) 1/6 did well, another 1/6 made small errors, 2/5 earned no credit. b) The function read_devclass() is device specific, so why pass the parameter dev to it? There may be many identical devices. This code allows one method to operate on identical devices, distinguishing between them using data in the device or open file descriptor. Over 1/2 did well here, 1/5 gave vague answers for partial credit. c) Suppose fd refers to an open disk file. Where is the specific information for the disk file (what disk is it on, where is the file on that disk) stored. Focus on integrating this information with what you have said above. (This question is not an invitation to regurgitate everything you know about i-nodes!) The device descriptor or open file descriptor holds this information in the form of a pointer to a copy of the file's i-node in RAM. A few did well here, 1/5 gave vague answers for partial credit. 1/2 gave answers that missed the point of the question and 1/4 left it blank. Many students wrote dissertations on i-node management (the question warned about this!) or discussed how files are opened (irrelevant to this question).
Background: Consider the two sides of a FIFO queue used to support a character sequential output device. Recall that the % operator in C is the mod operator (the remainder after division). Assume a single-threaded application. The central queue routines are: void enqueue(queue * q, char ch) { q->tail = (q->tail + 1) % q->size; q->buffer[q->tail] = ch; } char dequeue(queue * q) { q->head = (q->head + 1) % q->size; return q->buffer[q->head]; } a) Given a queue pointer q, give boolean expressions for the queue full and queue empty conditions that are compatible with the above code. empty: q->head == q->tail full: q->head == ((q->tail + 1) % q->size) 1/5 did well, while most earned partial credit. Most did well with the test for empty, while the test for full posed problems. 1/4 had trouble getting the parentheses right for the modular arithmetic. b) Are there any critical sections in this code where interrupts should be disabled in order to prevent trouble? If so, identify them. If not, explain why not. Not necessarily. The fact that the head and tail are each only changed by one routine and the fact that the application is single threaded makes it possible that there are none. (However, poorly written checks for full and empty queues can create trouble.) 1/3 did well, a few thoughtfully identified potential problems, and only 1/8 earned no credit. c) When should the interrupt handler turn off the device interrupt enable bit? The handler should leave interrupts disabled when it discovers that the queue is empty. 1/4 got this right, 1/2 earned no credit. The most common error was to turn the question on its head and answer it for an input queue (possibly cribbing from the same previous exam that the question was cribbed from, but not noticing changes in the question.) d) When should the user-side routine of the driver turn on the device interrupt enable bit? The user side should enable interrupts after every enqueue. 1/5 got this right, 1/8 gave near-correct answers and, 1/2 earned no credit. As with the previous question, the common error was to answer for an input queue.