Exam 1: Midterm

Solutions and Commentary

Part of the assignments for 22C:169, Spring 2007
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Grade Distributions

Exam Scores

Mean   = 8.36    X                   X
Median = 8.0     X X X     X   X     X X
                 X X X     X X X     X X X                 X
_________X___X_X_X_X_X_X___X_X_X_X___X_X_X___X_X_____X___X_X_____X______
  1 . 2 . 3 . 4 . 5 . 6 . 7 . 8 . 9 . 10. 11. 12. 13. 14. 15. 16. 17. 18

Homework Scores (assignments 1 - 5)

Mean   = 18.06                       X     X   X
                                     X     X   X           X   X
                           X     X   X     X X X           X X X   X
_____X___  ____X_____X___X_X_X___X_X_X_____X_X_X___X_X_X_X_X_X_X_X_X____
  2 . 3     10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24

Total Scores (Midterm plus Homework)

All Students

                           X
Mean   = 26.42             X
Median = 26.5              X
                   X     X X     X   X
         X       X X   X X X X   X X X
_____X_X_X___X_X_X_X_X_X_X_X_X_X_X_X_X_X_X___X_____X_X________
  14. 16. 18. 20. 22. 24. 26. 28. 30. 32. 34. 36. 38. 40. 42
   C    C    +    -    B    B    +    -    A    A    +

Graduate Students Only

Mean   = 30.49
                                 X
_______X_____________________X___X___X_X_X___X_____X__________
  14. 16. 18. 20. 22. 24. 26. 28. 30. 32. 34. 36. 38. 40. 42
   C    C    +    -    B    B    +    -    A    A    +

Solutions and Commentary

Background: Consider the problem of testing a program prog that reads all of its input from standard input. Consider the following test script, under a Unix system:
```
 > fuzztest | prog
```
The program fuzztest outputs an infinite string of random 8-bit bytes. On average, each possible byte value occurs 1/256 of the time. The technical term for this type of test is a fuzz test.
a) What class of security problems is fuzz testing most likely to detect? (1 point).

Fuzz testing will detect buffer overflow problems with relative ease.
Most students did well here. Some used vague terms such as "attack the return address", for partial credit.

b) What kinds of security problems could theoretically be detected, but would be unlikely to be detected unless a huge number of trials were run? Explain both why detection is possible and why it is unlikely. (1 point)

In theory, fuzz testing can detect vulnerabilities to injection attacks, but it would take many trials for random strings to accidentally include syntactic elements that expose such a vulnerability.
Results were divided, with approximately equal numbers doing well, earning partial credit, and earning no credit.
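A rough bound shows why detection is possible but unlikely. The sketch below assumes that an injection needs one specific k-byte token to appear somewhere in n uniformly random bytes; the function name is hypothetical, and the union bound it computes slightly overestimates the true probability.

```python
def injection_token_bound(n: int, k: int) -> float:
    """Upper bound on P(a fixed k-byte token appears in n random bytes)."""
    # The token can start at any of n - k + 1 positions, and each
    # position matches with probability 256**-k (union bound).
    positions = max(n - k + 1, 0)
    return min(1.0, positions / 256 ** k)
```

Under these assumptions, a megabyte of fuzz has less than a one in a million chance of containing a given 5-byte token, and a real injection usually needs several tokens in the right places.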

c) For a single fuzz testing trial, what are the symptoms of success or failure? (1 point)

A program passes a fuzz test if it terminates with an error message or a properly handled exception. In the rare event that the fuzz test produces syntactically correct input, the program may also terminate normally. If it goes into an infinite loop, raises an unhandled exception, or terminates with a trap, the program fails the test.
There were many vague answers here. Raising an exception, for example, is not necessarily bad or good. (There were debates about what success and failure meant -- is success passing the test, or is success finding a flaw? It doesn't matter, so long as the meaning is documented.)
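The pass and fail symptoms above can be sketched as a small test harness. This is only an illustration under stated assumptions: `fuzz_once` is a hypothetical name, a finite block of random bytes stands in for the infinite stream from fuzztest, and a timeout stands in for detecting an infinite loop. An exit status alone cannot distinguish a handled error from an unhandled exception that the language runtime turns into an ordinary error exit, so a real harness would also need to inspect the program's output.

```python
import os
import subprocess


def fuzz_once(argv, nbytes=4096, timeout=5.0):
    """Feed nbytes of random bytes to argv on stdin and classify the result."""
    data = os.urandom(nbytes)  # each byte value is equally likely
    try:
        proc = subprocess.run(
            argv,
            input=data,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # The program did not terminate within the time limit.
        return "fail: timeout (possible infinite loop)"
    if proc.returncode < 0:
        # A negative code means the process was killed by a signal (a trap).
        return f"fail: killed by signal {-proc.returncode}"
    if proc.returncode == 0:
        return "pass: normal termination"
    return f"pass: error exit {proc.returncode}"
```

Calling `fuzz_once(["./prog"])` repeatedly, where `./prog` is a placeholder for the program under test, gives one classification per trial.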
Background: Some operating systems launch programs with very predictable addresses, so the code segment always begins at address C and the stack segment always begins at address S. Other operating systems launch programs with effectively random addresses for the stack segment and, with a bit of work, the code segment.
Recall that there are two models for sharing main memory between user processes: In one model, all processes share the same virtual address space while in the other model, each process has a separate virtual address space.
a) Describe an attack which the randomization of stack segment addresses will interfere with. (1 point)

Randomized stack segment addresses prevent buffer overflow attacks that attempt to overwrite pointers in the activation record with pointers to some of the injected data; this includes attacks that inject machine code as part of the buffer and hope to force the execution of that code.
There were odd answers here that were very difficult to classify. This was a hard question. Many thought that randomizing the address of the stack segment would make it difficult to conduct a buffer overflow attack. It is worth noting that the buffer overflow attacks we conducted in class did not depend on the address of the stack segment!

b) Describe an attack which the randomization of program segment addresses will interfere with. (1 point)

Randomized program segment addresses prevent the types of buffer overflow attacks demonstrated in class, that is, where the attack attempts to force execution of code at a known address in the program.
Some said "clobbering the RA" or something similar. There is no problem with clobbering the return address here, since clobbering only means destroying. The problem is with exploiting this to force execution of known code. Again, this was a hard problem.

c) What extra work will the system probably have to do in order to randomize the location of the code segment each time the program is launched? (1 point)

Randomizing the location of the code segment requires relocating all of the absolute addresses in that code segment.
This was very hard; few got it. It required reaching back into what you know of computer architecture and instruction sets, something people seem to have hesitated to do.
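The relocation step can be illustrated with a toy model. The sketch assumes the code segment is a list of integer words and that a relocation table lists which words hold absolute addresses; real loaders work on binary object formats, and every name here is hypothetical.

```python
def relocate(words, reloc_table, old_base, new_base):
    """Return a copy of the code image adjusted to load at new_base.

    words       -- the code segment as a list of integer words
    reloc_table -- indexes of the words that hold absolute addresses
    """
    delta = new_base - old_base
    moved = list(words)
    for index in reloc_table:
        moved[index] += delta  # only absolute addresses change
    return moved
```

Words that are not listed in the table, such as opcodes and relative branch offsets, are copied unchanged, which is why code without absolute addresses needs no such pass.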

d) Does the use of shared code segments interfere with this scheme on a system where there is just one shared address space? (1 point)

Yes, because in such a system, all processes that share the same code segment share it at the same address. So, if your adversary runs that code in order to discover its current code segment address, and then holds the code in memory, your adversary can then attack you when you run that code.
Again, this required thinking about instruction-set level details, something many seem to have been averse to doing.

e) Does the use of shared code segments interfere with this scheme on a system where each process has a separate address space? (1 point)

Possibly. If the code segment contains no absolute code addresses (all branch and call instructions are relative), then different users can use the segment at different virtual addresses. If there are absolute code addresses in the code segment, then all users sharing that segment will have to see it at the same virtual address.
Again, this requires thinking about instruction-set level details.
Background: Consider a version of Unix on which there is no distinction between effective user ID, real user ID and saved user ID. Instead, there is just one user ID per process, the user ID. Once a program with the set-user-ID bit is launched by exec, it cannot revert to the user ID of the process that launched it.
This poses a problem for SUID applications that need to interactively open new files in the real user's domain. This problem led, at some point in the development of Unix, to the distinction between real and effective user IDs. Here, we will solve the problem in a different way.
When a SUID application wants to access a file in the user's domain, it runs a helper application provided by the user, launching that application as a parallel process by fork and exec. A typical helper might communicate with the main application as follows:
- filename - passed as a program parameter.
- direction - read, write or both, passed as a program parameter.
- standard output from the helper - data read from the file.
- standard input to the helper - data to write to the file.
The helper will most likely open a dialogue with the user, for example, outputting a message like "OK to read filename?" and waiting for the user's permission.
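The application's side of this protocol, for the read case, might look like the following sketch. It is an illustration only: `read_via_helper` is a hypothetical name, the set-user-ID aspect is not modeled, and `subprocess.run` stands in for the explicit fork and exec.

```python
import subprocess


def read_via_helper(helper_path, filename):
    """Ask the user's helper to read filename; return the bytes it outputs."""
    proc = subprocess.run(
        [helper_path, filename, "r"],  # parameters: filename, then direction
        stdin=subprocess.DEVNULL,      # nothing to write in the read case
        stdout=subprocess.PIPE,        # helper's standard output carries the data
        check=True,                    # a nonzero exit status raises an error
    )
    return proc.stdout
```

Passing the parameters as a list means that no shell parses the file name on the application's side; what the helper does with it is a separate matter.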
a) For this scheme to work, who must be the owner of the helper application and what must be the access rights to the helper application? (1 point)

The helper must be owned by the user, with rights ??s--x--x. (Groups have not been mentioned in the question, and the group membership of the various participants and files is unknown, so the group and other access rights fields are identical.) The question marks indicate rights that don't matter.
The odd answers here involved strange assumptions about groups and group rights.
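Rights patterns written in this notation can be checked mechanically. The sketch below is a hypothetical helper function, not part of any standard tool; it treats each ? as a wildcard and compares the pattern with the string that Python's `stat.filemode` produces for a regular file.

```python
import stat


def rights_match(pattern, mode):
    """True if the 9-character rights pattern matches the mode bits.

    A '?' in the pattern matches any character in that position.
    """
    actual = stat.filemode(stat.S_IFREG | mode)[1:]  # drop the file-type char
    return len(pattern) == 9 and all(
        p in ("?", a) for p, a in zip(pattern, actual)
    )
```

For example, octal mode 4711 (rws--x--x) is one instance of ??s--x--x, while 4755 is not, because it grants read rights to group and others.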

b) Why does the helper need to directly access /dev/tty (the controlling terminal) to communicate with the user instead of letting the application pass its standard input and output files to the helper? (1 point)

If the helper were to try to use standard input and output to communicate with the terminal, the SUID application could redirect these, for example, to a file that answers "yes" to any request to permit access to a file. Therefore, it is essential that the helper have a direct path to the user's controlling terminal so that the user, and nobody else, can authorize access to his or her files.
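A helper's confirmation dialogue over the controlling terminal might be sketched as follows. This assumes a helper written in Python, which the exam does not specify; `ask_user` and its parameters are hypothetical names. The terminal device is opened unbuffered because it is not seekable.

```python
def ask_user(question, tty_path="/dev/tty"):
    """Prompt on the controlling terminal, ignoring stdin and stdout."""
    with open(tty_path, "r+b", buffering=0) as tty:
        tty.write(question.encode() + b" [y/n] ")
        answer = tty.readline()
    return answer.strip().lower() in (b"y", b"yes")
```

Because the function never touches standard input or output, redirecting those streams in the parent application does not change who answers the question.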

c) Why is it necessary for the helper's owner to keep the file name of the helper application a closely guarded secret? How could the SUID application threaten the helper's owner by recording the helper application's file name? (these are really the same question.) (1 point)

If the file name were known to the public, anyone could directly execute the helper in order to steal or alter the helper's owner's files. An evil SUID application could steal the file name so that the SUID application's owner could go run the helper later in order to steal files from the helper's owner.
This was a relatively easy question. Many did well.

d) Most users will not write their own helper applications. Rather, they will make a copy of a standard helper provided by the system administrator, /bin/helper. Why must the user make a copy of this file instead of just allowing the SUID application to run the system's helper? (1 point)

Each user needs to make a copy so that the user may set the ownership and access rights himself.
This was a relatively easy question. Many did well.

e) What must the owner and access rights be for the file /bin/helper mentioned in part d? (1 point)

The owner should be root, with rights ??-r-?r-?. The root's rights to its own helper don't matter. If others execute the root's helper, it does not run as an SUID program, so it is no threat to the root.
Again, strange use of group rights was a problem. Some also got very elaborate with SUID and SGID rights here, a bad idea.
Background: Here is a shell script written as a first draft of the helper application described above:
```
#!/bin/tcsh
# helper filename direction
#   filename  -- the path name of a user's file
#   direction -- r, w or rw

if      ($argv[2] == r) then
	cat	$argv[1]
else if ($argv[2] == w) then
	cat	- > $argv[1]
	exit 0
else if ($argv[2] == rw) then
	cat	$argv[1]
	cat	- > $argv[1]
	exit 0
else
#	failure
	exit 1
endif
```
This script is vulnerable to the following injection attack.
```
> helper xxx " 1 ) rm -f helper "
```
a) Suggest a reasonable candidate for an injection attack through the filename parameter. (1 point)

A first try might be helper "xxx ; rm -f helper" rw
Many did reasonably here. A surprising number simply copied the example attack on the direction parameter.

b) In general, what problem do you face in trying to write parameter validation code in an interpretive language such as one of the Unix shells? [consider writing tests of the form if ( $1 =~ pattern )...] (1 point)

As illustrated in the injection attack given, the if statement itself is vulnerable to injection attacks, so writing an if statement to check the parameter will need to be done with extreme care.
Many students gave platitudes about how hard it is to defend against injection attacks.
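One way to sidestep the problem is to validate parameters in a language where comparisons operate on data that is never re-scanned as program text. The sketch below uses Python for this; the function names are hypothetical. Note that `shlex.quote` targets POSIX shells, and its output is not guaranteed to be safe under tcsh, so that part is an assumption about the shell in use.

```python
import shlex

ALLOWED_DIRECTIONS = {"r", "w", "rw"}


def validate_direction(direction):
    """Exact string comparison; the argument is never re-parsed as code."""
    return direction in ALLOWED_DIRECTIONS


def quoted_for_shell(filename):
    """Quote a file name so a POSIX shell treats it as a single word."""
    return shlex.quote(filename)
```

With this check, the attack string from the example above is simply a value that is not in the allowed set.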

c) Is the above script vulnerable to a search-path misdirection attack? (1 point)

Quite possibly, because it uses the cat command instead of /bin/cat.
Saying just "yes" was worth half credit. Some thought cat was a built-in shell command. It is not built into tcsh.
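The effect of the search path can be demonstrated with Python's `shutil.which`, which performs the same kind of directory-by-directory lookup as a shell. The wrapper name below is hypothetical.

```python
import shutil


def resolve_command(name, search_path):
    """Return the file a shell-style PATH search would run for name."""
    return shutil.which(name, path=search_path)
```

If an attacker can put a directory containing an executable named cat ahead of /bin in the search path, `resolve_command("cat", ...)` returns the attacker's file. Naming the command by its absolute path avoids the search altogether.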
Background: A blind path in Unix-like systems is the path-name of a file, such as a/b/c where a user has access to the file, but where the user's only rights to some of the directories along the path are --x or -wx. If I tell you the name of a file with a blind path, you can try to open that file, but you cannot explore the directory structure to find its name without being told or without making a very lucky guess.
Consider the convention that Unix users could use, where each user has a directory ~/InBox. If Alice wants to give Bob a file, Alice puts a link to that file in ~/bob/InBox.
Assume that Carol is an adversary and must not read the file that Alice is giving Bob. Explain how to use a blind path to solve this problem. Specifically:
a) What access rights should be used for ~/bob/InBox? Assume the file has owner ID and group ID set to bob. (1 point)

The rights should be drw?-w?-w?. Note that the need of the owner to traverse the directory without reading it is unclear, and the need of others to traverse it is also unclear (although there are compelling arguments to forbid it). Others must definitely be able to write the directory in order to deposit files there, and they must not be able to read it, so they cannot see what third parties have deposited.
Again, there were strange assumptions about groups, even though group membership was never mentioned in the problem. Many seem to have misunderstood the meaning of the x and s access rights on directories. The s access right on a directory (set-ID, often loosely called sticky; strictly, the sticky bit is t) sets the owner or group of files created in that directory. If you link to a file from that directory, it is not created.

b) Assume Alice's file has alice as its owner ID and group ID. What access rights should she put on this file before linking to it from ~/bob/InBox? (1 point)

The rights should be ???r??r??. Alice may elect to retain rights to the file, but need not do so. Alice must let Bob access the file, so the group and other rights must allow reading. Execute rights may also be extended, if the file is executable. Write rights may be extended.
Odd answers here were common, again with elaborate group membership schemes or misunderstandings about sticky directories dominating.

c) Does Alice need to tell Bob the file name, or can Bob find the file himself? (1 point)

Bob can read his own directory, so it is easy for Bob to find files that have been left there.
This was easy; most got it.
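The whole convention can be sketched in a few lines. This is a minimal illustration with hypothetical function names. The ? positions in the directory rights are filled with x here as an assumption, since on most Unix systems adding an entry to a directory needs search permission as well as write permission. The link name is random by default so that Carol cannot guess it.

```python
import os
import secrets


def make_inbox(home):
    """Create home/InBox as a drop box that others can write but not list."""
    inbox = os.path.join(home, "InBox")
    os.mkdir(inbox)
    os.chmod(inbox, 0o733)  # rwx-wx-wx: others may add entries, not list them
    return inbox


def deliver(path, inbox, name=None):
    """Make the file readable, then hard-link it into the inbox."""
    name = name or secrets.token_hex(16)  # hard-to-guess link name
    os.chmod(path, 0o644)  # rw-r--r--: one instance of ???r??r??
    os.link(path, os.path.join(inbox, name))
    return name
```

Bob would run `make_inbox` once on his own home directory, and Alice would call `deliver` on her file. Bob, as the directory's owner, can list the directory to find the new link.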
Background: Suppose I have an application I want to pass to your application so that you can run it. In Unix, the only way to do so is to pass your application the name of the file holding my application. This means your code could save the name, so I cannot keep the identity of my application a secret.
Consider this alternative: The new execfve() kernel call is just like execve() except that, where execve takes a file name as a parameter, execfve takes an open file, where that file must have been opened for execution. The right to execute the file is determined at the time the file is opened, but the SUID bit and file ownership apply at the time the file is actually executed.
A Last Problem: There are security problems posed by other questions on this exam that this new service would solve. Which ones? Be specific. (1 point)

Problem 3c mentioned that the helper application needed to remain anonymous when passed to the SUID application. With this new service, the original user can pass an open helper to the SUID application, so that the application only sees an open executable file descriptor.
This was deliberately hard, yet a number of students got it, or at least suggested problem 3, for partial credit. Many others had odd answers that seemed to be wild guesses.
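POSIX.1-2008 includes a similar call, fexecve(), which executes a program from an open file descriptor. Python exposes it by letting `os.execve` take a descriptor on platforms listed in `os.supports_fd`. The sketch below uses that real call to illustrate the idea; it is not the exam's execfve, and how execute permission and the SUID bit are checked is left to the platform. The function name is hypothetical.

```python
import os


def run_open_file(fd, argv):
    """Fork, execute the program open on fd in the child, return its exit status."""
    if os.execve not in os.supports_fd:
        raise OSError("this platform cannot execute from a file descriptor")
    pid = os.fork()
    if pid == 0:
        try:
            os.execve(fd, argv, os.environ)  # replaces the child on success
        finally:
            os._exit(127)  # reached only if the exec failed
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

The caller of `run_open_file` holds only a descriptor, so it has no file name to record.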