Exam 1: Midterm

Solutions and Commentary

Part of the homework for 22C:112, Fall 2012
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Grade Distributions

Exam I

                     X
Mean   = 6.72        X                       X
Median = 7.0         X         X             X
                 X   X         X             X
_______________X_X_X_X_X_______X_X_X_X_X___X_X____________________
  0 . 1 . 2 . 3 . 4 . 5 . 6 . 7 . 8 . 9 . 10. 11. 12. 13. 14. 15

Homeworks 1-6

Mean   =  9.55
Median = 10.5              X         X           X
                           X         X X     X X X   X X
_________X___________X_X___X_________X_X___X_X_X_X___X_X___X______________
  0 . 1 . 2 . 3 . 4 . 5 . 6 . 7 . 8 . 9 . 10. 11. 12. 13. 14. 15. 18. 18

Machine Problems 1-2

                                         X
                                         X
                                         X
                                         X
                                         X
Mean   = 8.84                            X X
Median = 9.5                         X X X X
                                     X X X X
___________________X___X_____X___X___X_X_X_X__
  0 . 1 . 2 . 3 . 4 . 5 . 6 . 7 . 8 . 9 . 10

Total Scores

Mean   = 24.18
Median = 25.7
                                                       X
                       X           X         X   X     X     X
_______X_____________X_X___X_______X_X_X_X_X_X_X_X_X___X_X___X_X______
  4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30. 32. 34. 38
    |     F     |     D     |     C     |     B     |     A     |

Exam Solutions and Commentary

Background: The main program for the posted solution to MP3 was as follows, with comments omitted:
```
int main() {
        for (;;) {
                getcommand();
                parseargv();
                dollarsubs();
                launch();
        }
}
```
The dollarsubs() routine looks for strings in argv that begin with a dollar sign and substitutes the appropriate value from the environment. In the standard Unix shells, dollar substitution interacts with quotes as follows (actual output, cut and pasted):
```
[dpmlh058:~] jones% echo '$SHELL' = "$SHELL for" "user $USER".
$SHELL = /bin/tcsh for user jones.
```
This is quite different from what the output would be under the posted solution to MP3.
a) What would MP3 do with the first quoted argument, '$SHELL'?

It would replace it with /bin/tcsh
Note that the posted solution gets rid of the quote marks first, and only then checks the text of each parameter to see if it starts with a dollar sign.
10 got this correct, 8 got the expected wrong answer (saying that quote marks suppress environment variable substitution) and a few had creative wrong answers that fit no clear pattern.

b) What would MP3 do with the second quoted argument, "$SHELL for"?

It would complain that the environment variable $SHELL for is undefined.
This is because it uses the entire text of any parameter that starts with a dollar sign as a variable name, according to what is given above.
Only 1 student got this -- that student got the entire question correct. Most had the expected wrong answer, /bin/tcsh for, but 3 had creative wrong answers.

c) What would MP3 do with the final quoted argument?

It would output user $USER verbatim.
Only the first character of each parameter is checked for a dollar sign.
8 got this right and most of the rest had the expected wrong answer, user jones. 2 gave creative wrong answers.
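
The three answers above all follow from one rule: after the quotes are stripped, only an argument whose first character is a dollar sign is substituted, and the whole remainder of the argument is taken as the variable name. The following is a minimal sketch of that rule, not the posted MP3 code; the function name mp3_dollarsub is hypothetical, and it returns NULL where MP3 would complain about an undefined variable.

```c
#include <stdlib.h>

/* Hypothetical helper, not taken from the posted solution.
 * arg is one argument with its quote marks already removed.
 * Returns the text to use in its place, or NULL if arg names
 * an environment variable that is not defined. */
const char *mp3_dollarsub(const char *arg) {
    if (arg[0] != '$') {
        /* no leading dollar sign, so pass the text through verbatim,
         * even if a dollar sign appears later in the argument */
        return arg;
    }
    /* everything after the '$' is treated as the variable name,
     * blanks and all */
    return getenv(arg + 1);
}
```

Under this sketch, $SHELL is looked up, $SHELL for is looked up as a variable whose name contains a blank (and is almost certainly undefined), and user $USER is returned unchanged.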

Background: When you launch an application using execve( ..., argv, ... ), the main program of the application is eventually called, main( char * argv, ... ). In both cases, argv is documented as being a pointer to the first of an array of pointers to null-terminated character strings. Note that execve() abandons the program that called it and loads the new program, probably from disk, and note that the strings pointed to by the argv passed to execve() may be anywhere in memory.
a) Is the argv pointer simply passed from caller to called program or is something more done? If so, what?

The argument strings must be copied into the called program's address space.
Note that the argument strings may be anywhere in the caller's address space -- in the code segment, the stack segment or the static segment. Since the entire address space is abandoned (or at least completely reorganized), the strings must be copied.
11 did well. Of the remainder, 4 dodged the question, 2 suggested that the pointer is simply passed, and the rest gave uninterpretable answers.

b) Given that typical Unix programs have 3 segments, a code segment, a static segment and a stack segment, where does the argv data structure go? (Suggestion: Thinking about where it could not go may be useful).

The base of the stack is the most viable place to put them.
They can't go in the code segment, because it is fully occupied by code. If the arguments were of predictable size, they could go in a slot reserved for them in the static segment. Or they could go at the end of the static segment, but the easy place to put them is on the new program's stack, pushed under the activation record for the main program. That segment naturally holds unpredictable-sized objects.
10 did well. Of the remainder, 4 suggested the code segment and the rest suggested the static segment, generally with wrong reasoning to back this up.
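
To make the copying step concrete, here is a rough sketch of gathering a scattered argv into one contiguous block, with the pointer array first and the string text packed behind it. This is only an illustration: the name pack_argv is hypothetical, and malloc() stands in for the space that a real kernel would reserve at the base of the new program's stack.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: copy a NULL-terminated argv into one block.
 * Layout: [ptr 0] ... [ptr argc-1] [NULL] [string 0] [string 1] ...
 * Returns the new argv, or NULL if memory could not be allocated.
 * The caller releases the whole block with a single free(). */
char **pack_argv(char *const argv[]) {
    size_t argc = 0;
    size_t bytes = 0;
    while (argv[argc] != NULL) {
        bytes += strlen(argv[argc]) + 1;   /* text plus terminator */
        argc++;
    }

    char **block = malloc((argc + 1) * sizeof(char *) + bytes);
    if (block == NULL) {
        return NULL;
    }

    char *text = (char *)(block + argc + 1);  /* strings follow the pointers */
    for (size_t i = 0; i < argc; i++) {
        size_t len = strlen(argv[i]) + 1;
        memcpy(text, argv[i], len);
        block[i] = text;                      /* point at the copy, not the original */
        text += len;
    }
    block[argc] = NULL;
    return block;
}
```

Note that the total size is not known until all of the strings have been measured, which is why a segment that naturally holds objects of unpredictable size is the convenient home for the result.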

Background: Consider this C code for a proposed implementation of setenv, an application you could call from the MUSH shell, or any other shell:
```
 1   /* setenv arg1 arg2 */
 2   #include <stdio.h>
 3   #include <stdlib.h>
 4   void main( int argc, char * argv[] ){
 5           if (argc != 3) {
 6                   fputs( "incorrect argument count\n", stderr );
 7                   exit( EXIT_FAILURE );
 8           }
 9           if (setenv( argv[1], argv[2], 1 ) < 0){
10                   fputs( "could not set variable\n", stderr );
11                   exit( EXIT_FAILURE );
12           }
13           exit( EXIT_SUCCESS );
14   }
```
The following is an un-edited transcript of the result of compiling and testing the above code on one of our departmental Linux servers:
```
 a   [jones@serv16 ~]$ cc -o setenv setenv.c
 b   [jones@serv16 ~]$ ./setenv
 c   incorrect argument count
 d   [jones@serv16 ~]$ ./setenv myvariable myvalue
 e   [jones@serv16 ~]$ /bin/echo $myvariable
 f   myvariable: Undefined variable.
```
a) What was the value of argc as a result of line c of the above test transcript?

argc = 1
Note that a correction was issued during the exam: the reference to line c should have been to line b (the input line that caused the error message to be output on line c).
14 got this; the most popular wrong answer was 0. A few gave creative answers, one of which got partial credit.

b) Did the setenv() routine in the C standard library actually make any changes to the environment during the execution of the setenv program launched from line d of the above test transcript?

Yes, it defined myvariable to have the value myvalue.
13 got this; the most popular wrong answer, given by 5, was that no change was made to the environment. A few answers were difficult to interpret but earned partial credit.

c) Why did line e of the above test transcript produce the error message on line f? Your answer must take into account your conclusions above!

The changes made to the environment were discarded when the setenv program terminated, so they were never visible to the calling shell.
8 did well here; the most popular wrong answer was that myvalue is undefined, in effect, dodging the question of why. Again, a few gave answers that were difficult to interpret.

d) Is the error message on line f an output from the /bin/echo application? If not, what code produced that error message?

The shell never called the echo application because it found that $myvariable was undefined when it tried to do the variable substitution.
10 did well; the most popular wrong answer, given by 6, was that /bin/echo complained. A few dodged the question, and a few had terminology problems; one of the latter earned partial credit.
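
The key point in parts b and c can be checked directly: a child process gets its own copy of the environment, so a successful setenv() in the child leaves the parent's environment untouched. The following is a small sketch of such a check, using fork() in place of a full shell; the function name child_change_visible is hypothetical.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical experiment: set a variable in a child process and
 * report whether the parent can see it afterward.
 * Returns 1 if visible, 0 if not, -1 if fork() failed. */
int child_change_visible(const char *name, const char *value) {
    unsetenv(name);                 /* start with the variable undefined */

    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* child: this changes only the child's copy of the environment */
        int rc = setenv(name, value, 1);
        _exit(rc == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
    }

    /* parent: wait for the child, then look in our own environment */
    int status;
    waitpid(pid, &status, 0);
    return getenv(name) != NULL;
}
```

A call such as child_change_visible( "myvariable", "myvalue" ) should return 0, which is the same effect seen in the transcript. This is also why shells implement commands like setenv as built-ins rather than as separate applications.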

Background: Consider the C and C++ stream model, used with variables of type FILE * (pointer to file). The type FILE is typically declared something like the following (deliberately oversimplified):
```
struct FILE {
    int fd;     /* file descriptor to use for this file */
    int size;   /* buffer size */
    char * buf; /* pointer to the buffer */
    int pos;    /* position in buffer */
};
```
When fopen() is used to open a stream, it uses open() to open the file on the underlying operating system, stores the returned file descriptor in fd, queries the newly opened file to find the best buffer size for I/O to that file, allocates a buffer and initializes the buffer position appropriately. In the context of the above, fputc() might be implemented as follows:
```
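char fputc( char ch, FILE * s ) {
    if (s->pos >= s->size) {
        /* buffer full: write it out in one call and start over */
        write( s->fd, s->buf, s->size );
        s->pos = 0;
    }
    s->buf[ s->pos ] = ch;
    s->pos = s->pos + 1;
    return ch; /* the function is declared to return a value */
}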
```
a) Why this complexity? Why not just call write( s->fd, &ch, 1 ) to output each character?

Calling write() with a full buffer of the device's preferred size is more efficient than using it to transfer one character at a time.
Note that the efficiency issue is the cost of the write operation itself, not the cost of disk or other device access. Note also that the blocking of output into a buffer is quite distinct from queueing data for later output.
9 did well, one earned partial credit; the remainder of the wrong answers showed no common patterns.

b) What is the data type of the fd component of each stream in a Unix or Linux system?

Integer.
The integer is an index into a small table of open files maintained on behalf of each process. Each table entry contains a pointer to the open file data structure.
13 got this right, 2 earned partial credit; the wrong answers all boiled down to some kind of pointer to an open file data structure.
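
One thing the oversimplified fputc() leaves out is what happens to a partly filled buffer. The sketch below adds a flush operation, roughly what fflush() or fclose() must do. It uses a hypothetical struct stream with the same fields as the simplified FILE above, so that it does not collide with the real stdio declarations, and it treats a short write as an error, which a real library would handle more carefully.

```c
#include <unistd.h>

/* Hypothetical stand-in for the simplified FILE structure above. */
struct stream {
    int fd;      /* file descriptor to use for this stream */
    int size;    /* buffer size */
    char *buf;   /* pointer to the buffer */
    int pos;     /* position in buffer */
};

/* Write out whatever is in the buffer. Returns 0 on success, -1 on error. */
int stream_flush(struct stream *s) {
    if (s->pos > 0) {
        if (write(s->fd, s->buf, (size_t)s->pos) != s->pos) {
            return -1;   /* error or short write; not retried in this sketch */
        }
        s->pos = 0;
    }
    return 0;
}

/* Buffer one character, emptying the buffer first if it is full.
 * Returns the character written, or -1 on error. */
int stream_putc(int ch, struct stream *s) {
    if (s->pos >= s->size) {
        if (stream_flush(s) < 0) {
            return -1;
        }
    }
    s->buf[s->pos] = (char)ch;
    s->pos = s->pos + 1;
    return ch;
}
```

With a 4-byte buffer, 5 calls to stream_putc() cause exactly one write() of 4 bytes; the fifth character stays in the buffer until stream_flush() is called.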

Background: The interrupt handler for an output data stream might look like this, written in C, assuming a little bit of assembly language is used to catch the interrupt, save registers, call this code, and then restore registers and return from interrupt:
```
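void streamhandler(){
        if (streamqueue->empty( streamqueue )) {
                /* nothing left to send: turn off this device's interrupt */
                int c;
                c = in( CONTROL_REGISTER);
                out( c & ~ENABLE_BIT, CONTROL_REGISTER );
        } else {
                char ch;
                ch = streamqueue->dequeue( streamqueue );
                /* assumed missing step: send ch to the device;
                   DATA_REGISTER is a hypothetical name */
                out( ch, DATA_REGISTER );
        }
}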
```
a) Twice, this code does things like x->f(x). What's going on here?

This is the closest you can come in C to calling the method f() of the object x.
9 did well, and 2 more correctly described what the code does in a manner that indicated understanding without connecting to object oriented programming. The remainder of the answers either dodged the question or suggested no connection with the dominant programming paradigm of the last 30 years.
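
For those who did not make the connection, here is a sketch of how an object like streamqueue could be declared so that x->f(x) works. The structure holds pointers to its own methods alongside its data, and each method takes the object as an explicit first parameter. The field names other than empty and dequeue, the capacity, and queue_init are assumptions made for illustration.

```c
#define QUEUE_CAPACITY 16   /* assumed size, for illustration only */

struct queue {
    /* methods: function pointers stored in the object itself */
    int  (*empty)(struct queue *self);
    int  (*enqueue)(struct queue *self, char ch);
    char (*dequeue)(struct queue *self);
    /* data: a small circular buffer */
    char data[QUEUE_CAPACITY];
    int head;
    int count;
};

static int queue_empty(struct queue *self) {
    return self->count == 0;
}

/* Returns 0 on success, -1 if the queue is full. */
static int queue_enqueue(struct queue *self, char ch) {
    if (self->count == QUEUE_CAPACITY) {
        return -1;
    }
    self->data[(self->head + self->count) % QUEUE_CAPACITY] = ch;
    self->count++;
    return 0;
}

/* The caller must check empty() first. */
static char queue_dequeue(struct queue *self) {
    char ch = self->data[self->head];
    self->head = (self->head + 1) % QUEUE_CAPACITY;
    self->count--;
    return ch;
}

/* Plays the role of a constructor: binds the methods to the object. */
void queue_init(struct queue *self) {
    self->empty = queue_empty;
    self->enqueue = queue_enqueue;
    self->dequeue = queue_dequeue;
    self->head = 0;
    self->count = 0;
}
```

After queue_init( q ), the call q->dequeue( q ) selects the method through the object and then passes the object to it, which is what an object-oriented language does behind the scenes for q.dequeue().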

b) This code resets the enable bit in the device control register. What code is responsible for setting the enable bit?

The code that enqueues in streamqueue; this is usually in the user interface of the I/O driver.
5 did well. Among those earning partial credit, 5 noted (correctly) that interrupts are enabled on return. That, however, involves a different interrupt enable bit; 2 more said the user re-enables interrupts. Among wrong answers, a surprising number simply gave code (usually incorrect) for enabling interrupts, without indicating where this code goes.

c) The code to reset the enable bit first reads the control register and then writes it. Why not just write a constant to the control register to reset the bit?

The other bits in the control register should not be changed when resetting the bit.
6 did well, while 2 more earned partial credit. It was hard to find a pattern in wrong answers.

d) On a uniprocessor, are there problems with critical sections in this code? Why or why not?

It is in an interrupt service routine, so (by default) interrupts are already disabled. Therefore, no problems.
1 did well. 8 more earned partial credit for saying no problem and then giving incorrect reasons. Among wrong answers, many wandered off into discussion of things like threads or recursion, while other answers indicated a clear misunderstanding of interrupt service routines.
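
Parts b and c can be tied together with a small simulation. In the sketch below, the device control register is just a variable, a counter stands in for streamqueue, and the value of ENABLE_BIT is an assumption; all of the function names are hypothetical. The point is that the enqueue side sets the enable bit, the handler clears it when there is nothing left to send, and both use read-modify-write so that the other bits survive.

```c
#define ENABLE_BIT 0x40u            /* assumed bit position */

static unsigned simulated_control;  /* stands in for the device control register */
static int pending;                 /* stands in for the number of queued characters */

unsigned in_control(void) { return simulated_control; }
void out_control(unsigned value) { simulated_control = value; }

/* Read-modify-write: only the enable bit changes. */
void device_interrupt_enable(void) {
    out_control(in_control() | ENABLE_BIT);
}

void device_interrupt_disable(void) {
    out_control(in_control() & ~ENABLE_BIT);
}

/* User side of the driver: queue one character, then make sure the
 * device is allowed to interrupt. Unlike the handler, this code does
 * not run with interrupts disabled, so a real driver would likely
 * need to protect the queue update as a critical section. */
void user_side_enqueue(void) {
    pending++;
    device_interrupt_enable();
}

/* Interrupt side: mirrors the structure of streamhandler() above. */
void simulated_handler(void) {
    if (pending == 0) {
        device_interrupt_disable();
    } else {
        pending--;   /* a real handler would pass the character to the device */
    }
}
```

If the control register starts out holding 0x05, one enqueue leaves it at 0x45, and it returns to 0x05 only when the handler runs and finds nothing pending. Writing a constant instead would have destroyed the 0x05.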