Exam 1: Midterm

Solutions and Commentary

Part of the assignments for 22C:169, Spring 2010
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Grade Distributions

Midterm Exam

Mean   = 10.33
Median = 10.7       X                 X X
                    X X         X     X X       X   X   X X
   _______X_____X___X_X_X_X_X___X_X_X_X_X_____X_X___X___X_X_X_X___X_____
     2 . 3 . 4 . 5 . 6 . 7 . 8 . 9 . 10. 11. 12. 13. 14. 15. 16. 17. 18

Homeworks 1 to 5

                                                    X
                                                    X
                                                    X
Mean   = 20.34                                      X
Median = 23.1                                       X X
                                                    X X
                                                    X X
                                                    X X
                            X       X     X   X X X X X
   _________X_______________X___X___X_X_X_X___X_X_X_X_X___
     0 . 2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24.

Total of Exams and Homework

Mean   = 30.67                                              X
Median = 32.4                                   X           X
                                  X     X X     X     X   X X     X X
   _______X___________________X_X_X_X_X_X_X___X_X_X___X_X_X_X_X___X_X_X_X___
     8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30. 32. 34. 36. 38. 40. 42
                  D C               C B               B A               A

Solutions and Commentary

Background: Consider the problem of testing a program prog that reads all of its input from standard input. Consider the following test script, under a Unix system:
```
 > fuzztest | prog
```
The program fuzztest outputs an infinite string of random 8-bit bytes. On the average, each possible byte value occurs 1/256 of the time. The technical term for this type of test is a fuzz test.
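A generator like fuzztest might be sketched in C as follows. This is only an illustration: the names fuzz_fill and fuzz_emit are hypothetical, and rand() is used for brevity even though it is not a high-quality source of random bytes.

```c
#include <stdio.h>
#include <stdlib.h>

/* Fill buf with n pseudo-random bytes.  rand() guarantees at least 15
   random bits, so bits 7..14 are used to form each byte. */
void fuzz_fill(unsigned char *buf, size_t n) {
    for (size_t i = 0; i < n; i++)
        buf[i] = (unsigned char)((rand() >> 7) & 0xFF);
}

/* Write random bytes to out until a write fails.  In the pipeline
   above, fuzztest may instead be terminated by SIGPIPE when prog
   exits, which serves the same purpose. */
void fuzz_emit(FILE *out) {
    unsigned char buf[4096];
    do {
        fuzz_fill(buf, sizeof buf);
    } while (fwrite(buf, 1, sizeof buf, out) == sizeof buf);
}
```

A main program would call srand() with a seed that it logs, then call fuzz_emit(stdout). Logging the seed makes a failing run repeatable.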
a) Suppose prog is a compiler for a programming language as complex as C or Java. Would you expect fuzz testing to discover a large fraction of the errors in the application? (1 point).

Fuzz tests won't be likely to generate complex syntax, so they won't be likely to expose errors that can only be evoked with such syntax. So, fuzz testing will only evoke the simplest of errors.
1/4 did well. 1/6 focused excessively on buffer overflows (fuzz tests are more general than that) while 1/4 spoke of needing to run a long time to detect most errors, without stating clearly that long, in this case, is astronomically long (as in, the lifetime of the universe might not be long enough). Only 1/6 earned no credit.
(Fuzz testing, when applied to real production software, has been demonstrated to be very valuable -- most commercial software subjected to fuzz testing has failed!)

b) Suppose you have an application that uses a window-mouse-icon user interface. How would you modify the idea of fuzz testing to test such an application? (1 point).

Your fuzz test needs to emit a random series of mixed mouse and keyboard events.
1/5 did well. 1/2 focused on mouse clicks alone, not suggesting mixing in keypresses in the stream. Only 1/5 earned no credit.
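As a rough sketch of such an event generator, assuming a hypothetical event record (the struct, enum and function names below are invented for illustration and belong to no real window system):

```c
#include <stdlib.h>

enum ev_kind { EV_KEY, EV_MOVE, EV_CLICK };

struct fuzz_event {
    enum ev_kind kind;
    int x, y;   /* pointer position, meaningful for EV_MOVE and EV_CLICK */
    int code;   /* key code for EV_KEY, button number otherwise */
};

/* Pick one random event for a window of the given size.
   Both width and height must be greater than zero. */
struct fuzz_event fuzz_next_event(int width, int height) {
    struct fuzz_event e;
    e.kind = (enum ev_kind)(rand() % 3);
    e.x = rand() % width;
    e.y = rand() % height;
    e.code = (e.kind == EV_KEY) ? rand() % 256 : rand() % 3;
    return e;
}
```

A test driver would call fuzz_next_event() in a loop and hand each event to whatever injection mechanism the window system provides, so that keypresses and mouse activity arrive interleaved.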

c) Give criteria by which you would judge that an application has passed a fuzz test. (1 point)

The application must either terminate normally or terminate with an error message. It must not go into an infinite loop, trap, or die of an unhandled exception.
Half did well. 1/4 gave ambiguous answers saying the equivalent of "there must be no error" or focused on anticipating the correct error messages for random input (very difficult to do). Only 1/8 earned no credit.
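On a Unix system, these criteria could be checked from the status that wait() reports for the program under test. The following is a sketch under that assumption; the function name fuzz_passed and the timed_out flag are hypothetical, and the time limit used to presume an infinite loop is left to the test harness.

```c
#include <sys/wait.h>

/* status is the value filled in by wait() or waitpid(); timed_out is
   nonzero if the harness gave up waiting and had to kill the program. */
int fuzz_passed(int status, int timed_out) {
    if (timed_out) return 0;            /* presumed infinite loop */
    if (WIFSIGNALED(status)) return 0;  /* trap or other fatal signal */
    return WIFEXITED(status);           /* normal exit, with or without error */
}
```

A language runtime that turns an unhandled exception into an ordinary exit would not be caught by this check; detecting that case would take an extra test, such as inspecting the exit code or the error output.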
Background: One way to counter buffer overflow attacks is to have the loader randomize the addresses chosen for the code segment, static segment and stack segment each time a program is loaded.
a) Does this really prevent buffer overflow attacks? If not, what does it prevent? (1 point)

No. It merely makes it very unlikely that an attacker will be able to control the consequences by controlling the value of a return address.
Half did well. 1/5 earned partial credit.

b) Describe a potential vulnerability that this defense does not address. (1 point)

You can still attack values of local variables in the stack frame containing the buffer that overflows and you can still cause total failure by injecting a random return address.
1/3 did well. 1/3 earned partial credit.
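The attack on local variables can be illustrated with a small model in which a struct stands in for a stack frame. The names struct frame, authorized and careless_copy are hypothetical, and the copy is clamped to the size of the struct only so that the demonstration stays within defined C behavior. Note that nothing in the attack depends on knowing an absolute address.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for a stack frame: a buffer followed by another local. */
struct frame {
    char buf[8];
    int authorized;
};

/* Copies n bytes starting at buf with no check against sizeof f->buf.
   The clamp to the whole struct keeps this demo well defined. */
void careless_copy(struct frame *f, const char *src, size_t n) {
    if (n > sizeof *f) n = sizeof *f;
    memcpy((char *)f, src, n);
}
```

If the input is longer than 8 bytes, the excess lands in authorized, wherever the loader happened to place the stack.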
Background: One way to counter buffer overflow attacks is to split the stack segment into two stacks, with one stack used only to store return addresses and the other used for local variables.
a) Does this really prevent buffer overflow attacks? If not, what does it prevent? (1 point)

This makes an attack on the return addresses extremely unlikely, but it still permits a buffer overflow.
1/3 did well. 1/3 earned partial credit.

b) The defense suggested in the previous question only required localized changes to the loader. This defense requires large changes to another major category of system software. What category and what kind of changes? (1 point)

This requires a rewrite of the compiler to change the subroutine calling sequence.
1/6 did well. 1/5 focused on the operating system, while in fact, most of the operating system doesn't care how the application manages its stack. 1/5 were even more vague, answering that some unspecified system routines must be changed.

c) Describe a class of buffer overflow attacks that could still succeed despite this defensive measure. (1 point)

As in the previous problem, attacks on local variables are still possible.
1/4 did well. The most common partial credit answers involved elaborate schemes to use gigantic buffer overflows to cross the gap between the local variable stack and the return address stack in order to corrupt return addresses. 1/2 earned no credit.
Background: Consider a computer system where system calls are handled as follows:
- The bottom half of the virtual address space of each application is reserved for system use. When the application is running, the bottom half is always marked "illegal".
- On entry to a system call, the operating system's trap handler maps the system's code and data segments into the bottom half of the user's address space and then turns the MMU back on.
- The code to return from a system call turns off the MMU, removes the system segments from the address space, and then does a return from trap to return control to the user with the MMU turned back on.
- Users pass parameters to system calls by pushing them on the user stack.
- System calls use the user's stack for their own activation records.
a) Does the system need to do any address translation in order to interpret user parameters? (1 point)

No.
2/3 got this right. There was no partial credit.

b) Suppose the user calls the read(f,buf,len) system service and the system does not do any pointer validity checking. How could the user attack the system? (1 point)

Provide a value of buf that points to system memory, thereby having the system overwrite its own memory when it does the read.
1/4 did well. 1/4 said something about pointers, but were not explicit about which parameter was the vulnerable one. Common errors involved funny values for f, which is, in fact, not seriously vulnerable. 2/5 earned no credit.

c) There are two distinct and disjoint regions of the address space that the user could attack. One is fairly obvious, the other is a bit subtle, but is clearly hinted at in the description given above. Describe both of them informally. (1 point)

The low half of memory (system memory) and any stack frames being used by the system.
1/6 did well. 1/2 got half credit for the low half of memory. 1/3 gave answers vague enough to earn reduced credit. 1/6 earned no credit.

d) Write informal code appropriate for parameter validity checking for the arguments to the above quoted read() system call. (1 point)
```
	if ((any of buf[0..len-1] in system memory) || (any of buf[0..len-1] in system stack frames)) error;
```
1/4 did well. 2/3 earned no credit.
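A more concrete version of this check might look like the sketch below. It rests on several assumptions that the problem does not state: system memory is taken to be the addresses below sys_end, the stack is taken to grow downward so that the system's frames occupy the addresses from sys_sp up to (but not including) user_sp, and all of the names are hypothetical. The whole range from buf to buf+len-1 is tested, not just its first byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Does the len-byte range starting at a overlap [lo, hi)?
   The caller guarantees that the range does not wrap around. */
static int overlaps(uintptr_t a, size_t len, uintptr_t lo, uintptr_t hi) {
    if (len == 0) return 0;
    uintptr_t last = a + (uintptr_t)(len - 1);
    return a < hi && last >= lo;
}

/* Return 1 if read() may safely store len bytes at buf. */
int read_buf_ok(uintptr_t buf, size_t len,
                uintptr_t sys_end, uintptr_t sys_sp, uintptr_t user_sp) {
    if (len != 0 && buf + (uintptr_t)(len - 1) < buf)
        return 0;                                    /* range wraps around */
    if (overlaps(buf, len, 0, sys_end)) return 0;    /* system memory */
    if (overlaps(buf, len, sys_sp, user_sp)) return 0; /* system frames */
    return 1;
}
```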
e) Informally describe how the trap handler would go about mapping the system's memory segments into the virtual address space on entry to a system call. (1 point)

On entry, the trap handler would edit the page table to make the system's memory addressable (marking those page table entries as valid where they had previously been invalid).
1/8 did well. 1/4 earned partial credit by giving answers that implied large scale copying of the contents of the address space, as opposed to its description. 1/2 earned no credit.
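The difference between editing the description of the address space and copying its contents can be seen in a toy model of a page table. The struct pte layout and the function name are hypothetical; real MMUs differ, and some would switch a segment-table pointer instead of touching individual entries.

```c
#include <stddef.h>

struct pte {
    unsigned frame;  /* physical frame number */
    unsigned valid;  /* nonzero if the MMU may use this entry */
};

/* Mark the bottom half of an npages-entry table valid or invalid.
   The frame numbers are assumed to be in place already, so no page
   contents are copied; only the valid bits change. */
void set_system_mapping(struct pte *table, size_t npages, unsigned valid) {
    for (size_t i = 0; i < npages / 2; i++)
        table[i].valid = valid;
}
```

The trap handler would call this with valid set to 1 on entry to a system call, and the return path would call it with 0.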
Background: Because the standard Unix shells are unable to defend themselves against injection attacks, the set UID bit does not work when launching shell scripts. Note that the Unix model has two user IDs per process: The real UID and the effective UID. The set UID bit causes the effective UID to change on exec, and it is always legal for a process to use the setuid() system call to set the effective UID to the real UID.
a) Suppose the system permitted the set UID bit to operate when launching a shell, but you, as the author of the shell, know that it is fatally flawed and inherently unable to defend itself against shell-script injection attacks. Suggest how you could modify the shell so that it voluntarily undoes the effect of the set UID bit if someone should happen to set it on the file holding a shell script. (1 point)

The shell should begin by setting the effective user ID to the real user ID.
Only 1/8 did well. 1/8 earned partial credit. Most gave wrong answers. The most common wrong answers involved doing things that required root access.
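In C, the first thing such a shell does might be sketched as follows. The function name drop_set_uid is hypothetical; setuid(), getuid() and geteuid() are the standard calls.

```c
#include <unistd.h>

/* Undo the effect of the set UID bit.  Returns 0 if the process is
   left with its effective UID equal to its real UID, -1 otherwise. */
int drop_set_uid(void) {
    if (setuid(getuid()) != 0)
        return -1;
    return (geteuid() == getuid()) ? 0 : -1;
}
```

This needs no special privilege, which is the point of the question. On systems that also keep a saved set-user-ID, a non-root process may be able to switch back later, so a careful shell might need a stronger call; that detail goes beyond the two-UID model described above.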

b) Because Unix refuses to honor the set UID bit on shell scripts, but honors it on executable machine code files, users who need to use the set UID bit on a shell script sometimes write "wrapper" programs. Consider a shell script named script and a script wrapper named scriptwrapper. Outline briefly the minimal code of scriptwrapper. (1 point)
```
        /* Wrapper for a shell script.
           The set UID bit can be set on the wrapper's executable file. */
        #include <unistd.h>
        int main(int argc, char *argv[], char *envp[]) {
                execve( "script", argv, envp );
                return 1; /* reached only if execve fails */
        }
```
1/4 did well, 1/4 forgot to pass the arguments! 1/3 left this problem blank.
c) Describe the kind of code you would add to scriptwrapper to help defend script against shell injection attacks. Your answer should include at least an example of the kind of thing your wrapper would need to detect and how it should react on detecting that. (1 point)

If a text argument contains an embedded semicolon or unbalanced parentheses, it could exit with an error condition instead of exec-ing the script. The actual tests on each argument depend on the use the script will make of that argument.
3 did well. 1/3 forgot to say what the wrapper should do if it rejects an argument. 1/5 did not give an example, but merely said what it should do if it finds a bad argument. 1/3 left the problem blank.
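The example test given above (embedded semicolons and unbalanced parentheses) could be coded roughly as follows. The function names are hypothetical, and a real wrapper would need tests tailored to what its script does with each argument.

```c
/* Return 1 if s has no semicolon and its parentheses are balanced. */
int arg_looks_safe(const char *s) {
    int depth = 0;
    for (; *s != '\0'; s++) {
        if (*s == ';') return 0;
        if (*s == '(') depth++;
        if (*s == ')' && --depth < 0) return 0;
    }
    return depth == 0;
}

/* Check every argument after the program name. */
int all_args_safe(int argc, char *argv[]) {
    for (int i = 1; i < argc; i++)
        if (!arg_looks_safe(argv[i])) return 0;
    return 1;
}
```

The wrapper's main program would call all_args_safe(argc, argv) before execve() and exit with an error status, without running the script, if it returns 0.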
Background: A blind path in Unix-like systems is the path-name of a file, such as a/b/c where a user has access to the file, but where the user's only rights to some of the directories along the path are --x or -wx. If I tell you the name of a file with a blind path, you can try to open that file, but you cannot explore the directory structure to find its name without being told or without making a very lucky guess.
Consider the convention that Unix users could use, where each user has a directory ~/InBox. If Alice wants to give Bob a file, Alice puts a link to that file in ~/bob/InBox.
Assume that Carol is an adversary and must not read the file that Alice is giving Bob. Explain how to use a blind path to solve this problem. Specifically:
a) What access rights should be used for ~/bob/InBox? Assume the file has owner ID and group ID set to bob. (1 point)

Using question marks to indicate "does not matter", rw????-w?.
2/3 did well. 1/6 earned no credit.

b) Assume Alice's file has alice as its owner ID and group ID. What access rights should she put on this file before linking to it from ~/bob/InBox? (1 point)

Using question marks, as above, ??????r??
2/3 did well. 1/5 earned no credit.

c) Does Alice need to tell Bob the file name, or can Bob find the file himself? (1 point)

Bob can find it himself.
5/6 did well. 1/10 earned no credit.
Background: Suppose I have an application I want to pass to your application so that you can run it. In Unix, the only way to do so is to pass your application the name of the file holding my application. This means your code could save the name, so I cannot keep the identity of my application a secret.
One solution to this problem is as follows. If I have an application, myapp, and you have an application, yourapp, what I do is create a temporary link to myapp. My shell code to do this would look something like this:
```
	# the following code replaces: yourapp myapp
	ln myapp tempname
	yourapp tempname
	rm tempname
```
A Last Problem: Assume yourapp is also a shell script. What could yourapp do to defeat my defense? Its goal is to retain and deliver to you any kind of handle on my application. Can I defend myself against your improved attack? How? (1 point)
yourapp can attack the above by doing ln tempname yourname to keep a handle on the file.
1/4 earned half credit by getting this far.
The best defense I can mount is as follows:
```
	cp myapp tempname
	yourapp tempname
	echo byebye > tempname
	rm tempname
	
```
Here, you are working with a copy. If you keep a link to that copy, I destroy its value by overwriting the copy with junk. In any case, I delete my link to the copy, deleting the copy if you didn't keep a link, and leaving you with junk if you did.
Only 1/10 got this, and one more earned full credit by explaining why their solution that didn't overwrite the copy failed. 2/3 earned no credit for their work on this problem.
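The last two lines of that script work because overwriting a file in place changes the one underlying file that every link names. The same steps might be sketched in C as follows; the function name scrub_and_remove is hypothetical, while open(), write(), close() and unlink() are the standard calls.

```c
#include <fcntl.h>
#include <unistd.h>

/* Overwrite the file in place, then remove this name for it.
   O_TRUNC reuses the existing file instead of creating a new one,
   so any other link to the file now leads to the junk. */
int scrub_and_remove(const char *path) {
    static const char junk[] = "byebye\n";
    int fd = open(path, O_WRONLY | O_TRUNC);
    if (fd < 0) return -1;
    ssize_t n = write(fd, junk, sizeof junk - 1);
    if (close(fd) != 0 || n != (ssize_t)(sizeof junk - 1)) return -1;
    return unlink(path);
}
```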