Exam 2: Final

Solutions and Commentary

Part of the assignments for 22C:169, Spring 2010
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Grade Distributions

Final Exam

                      X
Median = 10.7     X   X       X             X
                  X X X X     X   X     X   X   X X
           _____X_X_X_X_X___X_X_X_X_____X_X_X___X_X___X_________________
             2 . 4 . 6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30

Midterm plus Final

Mean   = 22.09
Median = 20.5     X     X
                X X     X X     X X                         X   X
 _______X_____X_X_X_X___X_X___X_X_X_X_X_____X_____X_X___X_X_X_X_X_________
   6 . 8 . 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30. 32. 34. 36. 38. 40

Homeworks 1 to 10

Mean   = 42.73                                                    X X
Median = 46.7                                                     X X
                                                                  X X
                                                            X   X X X
                                                        X   X   X X X
 _________X_________________X___X_X_X_______X_X___X___X_X___X___X_X_X_X___
   16. 18. 20. 22. 24. 26. 28. 30. 32. 34. 36. 38. 40. 42. 44. 46. 48. 50

Overall Total Scores

Mean   = 64.82                        G
Median = 64.9             G           U           G
                      U E U   G   E   U       G   G G
   _________U_U___U___U_E_U_G_U___E_G_U_G_____G___G_U_G______
     36. 40. 44. 48. 52. 56. 60. 64. 68. 72. 76. 80. 84. 88
Grade Scale   +|-   - B     B +   +|-   - A     A +   +

U = undergraduate
E = engineering undergrad
G = grad student

Exam

General Background: A sandbox is a secure protection domain in which untrusted code can be executed without fear that it will escape into the larger system. Typically, sandboxes are created by limiting the available resources and filtering access to all operations that might be dangerous. Sandboxes can be created at many levels, ranging from the programming language level to the system call level, but only in systems with certain features.
  1. A question: Explain how the chroot() system call relates to building a sandbox. Make sure to indicate something it does not do as well as things it does.

    It isolates a process (and processes it may launch) to one subtree of the file system. It only relates to the file system, and not to any other resources a process might use.

    6 did well. 18 got partial credit, saying nothing about what it does not do. 5 got partial credit, focusing exclusively on defense against rm -r /.

  2. A question: You have two sandboxes that are completely disjoint, where untrusted code can be launched in each of them by chroot() followed by exec(). The program you intend to run in one sandbox needs to produce output that will be consumed by the program you will run in the other. Explain how you can set things up to permit this.

    There are several solutions. One of them is to use hard links to install a file into each of the formerly disjoint subtrees of the file system prior to changing the roots to enter either file system. Several solutions rest on the fact that files open prior to the process of launching a program in a sandbox can remain open, so access to shared files can be established this way.
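
    One concrete variant of the open-before-launch approach is sketched below, assuming a launcher running as root sets up both sandboxes; the path and program names are hypothetical. A pipe created before either chroot() stays open across the exec() calls, so the two programs can share it:

      #include <stdio.h>
      #include <stdlib.h>
      #include <unistd.h>

      int main(void) {
          int fd[2];                            /* created before either chroot() */
          if (pipe(fd) < 0) { perror("pipe"); exit(1); }

          if (fork() == 0) {                    /* producer's sandbox */
              if (chroot("/sandbox/producer") || chdir("/")) _exit(1);
              dup2(fd[1], 1);                   /* producer writes via stdout */
              close(fd[0]); close(fd[1]);
              execl("/bin/producer", "producer", (char *) 0);  /* path inside the new root */
              _exit(1);
          }
          if (fork() == 0) {                    /* consumer's sandbox */
              if (chroot("/sandbox/consumer") || chdir("/")) _exit(1);
              dup2(fd[0], 0);                   /* consumer reads via stdin */
              close(fd[0]); close(fd[1]);
              execl("/bin/consumer", "consumer", (char *) 0);
              _exit(1);
          }
          close(fd[0]); close(fd[1]);
          return 0;
      }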

    8 did well. 12 earned no credit. A surprising number suggested using a shared directory or nesting relationships, for partial credit.

  3. Background: The application at takes, as parameters, a time and date and a shell command. It records the parameters given, the current effective user ID, the current effective group ID, the current working directory, and the current environment. At the indicated time on that date, a daemon process reads the record of saved information in order to restore the user's domain and then launch a shell to run the indicated command.

    a) The application at must record the saved information in some file. How can this file be protected from corruption by any code other than authorized users? How can the at command access this file?

    The schedule file can be owned by a special user, with no other user having access to it, and the at command can be a setuid application with that user ID.

    3 did well. 10 earned no credit. The most common error (leading to partial credit) was to focus on the schedule file and forget the second half of the question.

    b) What rights does the daemon process need in order to restore the appropriate user environment to launch the user's command? Don't forget the prerequisites for access to the file of pending actions as well as the prerequisites for restoring the user's domain.

    The daemon needs root access, since its job includes changing the user ID and group ID to the saved IDs of the user that scheduled the event.
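
    A minimal sketch of the launch step, assuming the daemon runs as root and has already read the saved record from the schedule file; all names here are hypothetical. Both setgid() and setuid() fail for an unprivileged caller, which is why root access is required:

      #include <sys/types.h>
      #include <unistd.h>

      void launch(uid_t saved_uid, gid_t saved_gid, const char *saved_cwd,
                  char *const saved_env[], const char *command) {
          if (fork() == 0) {
              /* order matters: set the group before giving up root privileges */
              if (setgid(saved_gid) || setuid(saved_uid) || chdir(saved_cwd))
                  _exit(1);
              /* run the saved command under a shell in the saved environment */
              execle("/bin/sh", "sh", "-c", command, (char *) 0, saved_env);
              _exit(1);
          }
      }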

    6 got this right. 13 earned no credit. Several students got partial credit for an enumeration of the things the process needed the right to do, without indicating how a Unix process could get the right to do these things.

    c) Given what you have inferred above, could at be used from a sandbox established using chroot()?

    Given that the new root's /bin/at is linked to the correct binary, and given that a hard link to the appropriate schedule file is put in the right place, yes. This only works if a single schedule file is maintained, with one record per pending event. If you try to change it to a directory, with one file per pending event, it cannot be linked.

    1 did well, 14 earned no credit. Partial credit was given to those who answered without offering an explanation.

  4. A question: Explain how a user setting up an environment in preparation for use of a sandbox made using chroot() could make a mistake that allows a sandboxed application to escape.

    Including a link to a file used by an outside application launcher would do this. Consider, for example, an implementation of at that did not record and restore the root of the file system. Users of this version of at would be able to launch code outside their sandbox.

    Nobody did well, but 7 earned partial credit.

  5. Background: One way to create a sandbox for some untrusted code is to compile that code in an environment where all system calls are intercepted by wrapper routines. This could be done by compiling the untrusted code with sandboxlib instead of stdlib. The wrapper for open() in sandboxlib could, for example, edit or check the filename to make sure that only permitted files are used, and then call the real open() system service.

    a) Consider the problem of writing sandboxlib for C under Unix. What problem do you face making your code for (for example) open() call the system's version of open()? Can you solve this problem?

    The problem is, how to call a subroutine with the same name as the calling subroutine. A solution will require some way of renaming the system routines.
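
    One renaming trick, not necessarily the intended answer, works when sandboxlib is built as a shared library loaded ahead of the C library: look the real routine up by name at run time with dlsym(). In the sketch below, path_is_permitted() is a hypothetical policy check:

      #define _GNU_SOURCE
      #include <dlfcn.h>
      #include <errno.h>
      #include <fcntl.h>
      #include <stdarg.h>
      #include <string.h>

      /* hypothetical policy: only names under /sandbox are permitted */
      static int path_is_permitted(const char *path) {
          return strncmp(path, "/sandbox/", 9) == 0;
      }

      int open(const char *path, int flags, ...) {
          static int (*real_open)(const char *, int, ...);
          if (real_open == 0)                   /* find the system's open() by name */
              real_open = (int (*)(const char *, int, ...))
                          dlsym(RTLD_NEXT, "open");

          if (!path_is_permitted(path)) {
              errno = EACCES;
              return -1;
          }
          if (flags & O_CREAT) {                /* a mode argument is present only here */
              va_list ap;
              va_start(ap, flags);
              mode_t mode = va_arg(ap, mode_t);
              va_end(ap);
              return real_open(path, flags, mode);
          }
          return real_open(path, flags);
      }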

    2 did well. 2 more got partial credit for mentioning the renaming problem without giving any detail.

    b) Assuming you've filled sandboxlib with interfaces to guard each and every kernel call, how could a C program tunnel through the walls of your sandbox and make a direct kernel call? Hint: See part a!

    Whatever mechanism allows the sandboxlib versions of the system routines to call the real system routines could be used directly by the application to do the same thing.
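
    A minimal, Linux-specific sketch of such tunneling; the file name is hypothetical, and nothing here ever passes through sandboxlib's wrapper for open():

      #define _GNU_SOURCE
      #include <fcntl.h>
      #include <sys/syscall.h>
      #include <unistd.h>

      int main(void) {
          /* ask the kernel directly, bypassing the sandboxlib wrapper */
          long fd = syscall(SYS_openat, AT_FDCWD, "/outside/secret", O_RDONLY);
          return fd < 0;
      }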

    6 did well, some with general answers such as the above, others with answers that focused on C's lack of strong typing.

  6. Background: IBM's VM operating system for their line of enterprise servers creates a number of virtual machines. Each virtual machine has its own virtual address space, and runs in user mode, so user-mode instructions run at full speed, except when there is a page fault, and all privileged instructions force traps to handlers in VM. The trap handlers in VM emulate the trapped instructions exactly, so that, for example, I/O instructions are still I/O instructions, although they may address a different device than the one the user thinks is being addressed.

    The virtual machine emulation is so complete that any operating system that would run on an IBM enterprise server can be run on a virtual machine under VM. Even VM can be run under itself. It is common to use VM to create two virtual machines, one to run Unix for access to the internet, and one to run a legacy IBM operating system such as OS, with VM mediating communication between the two.

    a) Relate the concept of a virtual machine to the concept of a sandbox.

    A virtual machine, in the sense used here, can be thought of as a sandbox for containing an entire operating system as well as the applications it runs.

    5 did well, 23 got partial credit. Very few earned no credit.

    b) Relate the idea of a virtual machine, as supported by VM, to the idea of wrappers for system calls discussed in the previous question.

    A virtual machine can be thought of as a wrapper around the privileged instructions of an architecture. Note that the way a virtual machine moderates access to memory is not cleanly described by the concept of a wrapper.

    3 did well. 22 earned partial credit.

  7. Background: The classical Unix access rights are rwx, meaning read, write and execute for regular files, and, for directories, the right to inspect, to modify, and to traverse as part of a path name. Certain system calls such as open() and exec() require certain access rights.

    Event logs, as used for forensic examination of system failures, record sequences of event records, each typically taking the form of a time stamp followed by a note about what happened at that time. Event log management policies dictate how long event logs are retained. Event logging policies dictate what events must be logged.

    a) What combination of file ownership, group membership, and access rights maximizes the accuracy of enforcement of access to event logs under Unix? A complete answer will document the log file itself as well as an example program that logs events and an example event log manager.

    For the log file:
      rw--w---- owned by logmanager, group logger

    For the log manager application:
      --s--?--? owned by logmanager; depending on how the log manager is written, it may be safe to allow anyone to execute this program.

    For the logger application:
      --?--s--? group logger; the right to launch the logger depends on context that has not been given.

    1 did well. 8 earned no credit. The most common error was to give the log file's access rights, but to say nothing about the logger or the log manager, one of which needs SUID rights, the other SGID rights. It is interesting to note that there is significant overlap between this problem and problem 3, especially part b.

    b) If you could add an additional access right to Unix to improve the protection of event logs, what right would you add? What misbehavior on the part of the event-log manager or the programs that log events would this prevent that is not prevented under Unix?

    It would be nice to limit the logger to append-only access to the log file. This would prevent the logger from making random changes to the log file.
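
    For example, the sketch below (the log file name is hypothetical) shows the misbehavior that ordinary Unix write permission allows and that an append-only right would forbid: a logger holding write access can seek backward and overwrite existing records.

      #include <fcntl.h>
      #include <unistd.h>

      int main(void) {
          int fd = open("/var/log/events", O_WRONLY);
          if (fd < 0) return 1;
          lseek(fd, 0, SEEK_SET);               /* back up over old records */
          write(fd, "rewritten entry\n", 16);   /* quietly rewrite history */
          return 0;
      }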

    8 did well here. 10 earned no credit.

    c) Show the idealized access matrix for the log file manager, the program that logs events, and the event log file.

             |log file|
    ---------+--------+
    logger   | append |
    ---------+--------+
    manager  |   rw   |
    ---------+--------+
    

    4 did well here. 12 earned no credit. Strong partial credit was given to those who expressed the right content but had trouble with the format. Many tried to give Unix-specific information, comparable to the output of ls -l; that kind of information is more appropriate in answer to part a.

  8. Background: Andrew Tanenbaum's Amoeba system is explicitly object-oriented. Directory entries map textual names to capabilities, where a capability is an object handle and a set of access rights. Amoeba allows 8 bits for the access-rights field of a capability, but does not impose an interpretation on those bits. Instead, it is up to the code implementing each class to impose meaning on the access rights.

    Consider the double-ended queue as an example of a class you could implement under Amoeba, with the following methods: enqueuehead, dequeuehead, enqueuetail, dequeuetail, freespace (how much free space is there in the queue), and data (how much data is there in the queue).

    a) In an object-oriented system, what natural interpretation of access rights allows a clean and fairly fine-grained protection model?

    One access right per method of each class.
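
    For the double-ended queue, that interpretation might look like the sketch below; the bit assignments are hypothetical, since Amoeba leaves the meaning of the bits to the code implementing the class:

      #include <stdint.h>

      /* one right per method, packed into the 8-bit rights field */
      #define RIGHT_ENQUEUEHEAD 0x01
      #define RIGHT_DEQUEUEHEAD 0x02
      #define RIGHT_ENQUEUETAIL 0x04
      #define RIGHT_DEQUEUETAIL 0x08
      #define RIGHT_FREESPACE   0x10
      #define RIGHT_DATA        0x20

      /* each method begins by checking its own bit in the capability */
      static int permits(uint8_t rights, uint8_t needed) {
          return (rights & needed) == needed;
      }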

    7 did well. 2 more got partial credit for suggesting such things as a read/write/delete model.

    b) If an object-oriented protection system were built with only two access-rights bits, read and write, which rights would each dequeue method listed above require?

    The dequeue methods require both read and write, read to get the item dequeued, and write to delete that item from the queue.

    16 got this. 4 more were vague about the need for write access, 5 suggested read only, and 3 suggested write only.

  9. Background: You can now buy quantum cryptographic hardware that offers genuinely unbreakable crypto over the communications line between any two computers that can be directly connected by a single optical fiber. The underlying cryptographic mechanism only operates at the link level.

    Your boss asks, "Since we have unbreakable quantum crypto on every link in the network, why are we bothering to encrypt messages at the application level?"

    A Problem: Answer your boss's question.

    Protecting the data links is not the only problem. You also need to worry about corrupt software between the application and the network on the sending and receiving machines, and you need to worry about corrupt software on the gateways, firewalls and other machines on the path from sender to receiver. In sum, you need complete end-to-end coverage, and link-level encryption does not do this.

    5 did well. Among those who earned partial credit, 14 discussed the possibility of malware without clear focus, 7 discussed protecting the data when it was stored on disk, and 2 discussed protection from errors rather vaguely.

  10. Background: You have been hired to establish an e-mint. Your e-mint sells e-cash certificates in exchange for payments in conventional dollars or valid e-cash certificates. An e-cash certificate is a record that includes, in plaintext form, a denomination (for example, $10), a serial number, and a key. The key is much, much larger than the serial number, and it determines the validity of the e-cash certificate.

    The recipient of an e-cash certificate is responsible for seeing that it is never replicated. The recipient may transfer it to others as payment, at which point it is the recipient's responsibility not to duplicate the certificate.

    The first one to return any particular e-cash certificate to the e-mint will receive its cash value. Presumably, the e-mint makes money like any bank, by investing its deposits, since the holder of an e-certificate only gets back the face value of the certificate.

    a) How can the e-mint defend itself against forgery of e-cash certificates? (Hint: This is a server-side authentication problem.)

    The mint must retain a table of all currently valid certificates in order to guarantee that it only cashes each certificate once. Indexing this table by serial number makes sense, and using it to verify the denomination is obvious. If all mint employees are trustworthy, it is sufficient to have the key be a very large random number. If there is a fear that someone might take a copy of the table of valid certificates, then the table entry should be the result of applying a one-way function to the key. Random keys may still be used.
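
    A minimal sketch of the redemption check, assuming SHA-256 (here via OpenSSL) as the one-way function; the record layout and all names are hypothetical:

      #include <string.h>
      #include <openssl/sha.h>

      struct entry {                            /* one row of the mint's table */
          unsigned long serial;
          unsigned denomination;                /* face value in dollars */
          unsigned char hashed_key[SHA256_DIGEST_LENGTH];
          int spent;                            /* set once the certificate is cashed */
      };

      /* returns nonzero and marks the certificate spent if it is genuine;
         only a hash of the key is stored, so a stolen table is useless */
      int redeem(struct entry table[], int n, unsigned long serial,
                 unsigned denomination, const unsigned char *key, size_t len) {
          unsigned char digest[SHA256_DIGEST_LENGTH];
          SHA256(key, len, digest);
          for (int i = 0; i < n; i++) {
              if (table[i].serial == serial && !table[i].spent
                  && table[i].denomination == denomination
                  && memcmp(table[i].hashed_key, digest, sizeof digest) == 0) {
                  table[i].spent = 1;
                  return 1;
              }
          }
          return 0;
      }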

    3 did well. 10 earned no credit. The others generally suggested using cryptographic signatures of some sort, either assuming that the use of cryptography was obvious or giving insufficient detail to figure out exactly what input was being used where.

    b) You are a user of this e-cash service. You understand that, when you receive an e-payment from another user, that user might have cheated and given you a forged certificate or a copy of one that has already been paid to someone else. As the recipient of an e-cash payment, how can you defend against, or at least detect, such cheating so that you can blacklist the cheater and defend against being blacklisted by someone you make payments to?

    On receipt of an e-cash certificate, you should immediately cash it in or use it to buy a new certificate. Any delay gives your partner in trade an opportunity to use the same certificate to pay someone else.

    10 did well. 11 more thought it was sufficient to immediately check the validity of the certificate.

    c) Compare the anonymity of this form of e-cash with the anonymity of conventional paper money. Is this more or less anonymous?

    The effective requirement that you check with the bank each time you receive a certificate implies that the bank is in a position to monitor every transaction in the entire economy. With conventional cash, even though each bill has a serial number that you can record each time you see a bill, neither you nor those with whom you trade learn anything significant by tracking those numbers because many of the bills change hands many times before returning to any one agency.

    6 did well here, while 5 more gave far more speculative answers.

    d) Consider a counterfeiter who buys (using real money) a sequence of e-cash certificates and attempts, on this basis, to hack the system and deduce the next valid e-cash certificate. Describe how you would go about assessing the counterfeiter's chance of success, based on your answer to part a.

    If we assume sequential assignment of serial numbers, the counterfeiter has no difficulty guessing the next serial number the bank will issue. The problem is, can the counterfeiter guess the next key. If keys are truly random, such a guess will be purely a matter of chance, based on the size of the key space. If we assume pseudo-random generation of keys where the counterfeiter knows the algorithm, then we must examine the leakage of information about the pseudo-random number seed into the sequence of outputs from the generator. In the worst case, if each pseudo-random number is n bits, this reveals n bits of the seed.
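
    For a sense of scale, assuming truly random n-bit keys: each guess succeeds with probability 2^-n, so with n = 128 and a counterfeiter making about a billion (roughly 2^30) guesses per second, trying the whole key space would take on the order of 2^(128-30) = 2^98 seconds, vastly longer than the age of the universe.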

    6 did well. Over 20 earned no credit, many of whom gave detailed analyses of schemes that appear to have been invented on the spot and had little to do with the answers given to part a. Also note, the question was not "what is the security" but "how would you go about assessing" it.