Homework 11

22C:116, Fall 1998

Due Monday Nov 16, 1998, in class

Douglas W. Jones

Background The two-phase model of atomic transactions re-arranges the text of a transaction into two phases. In phase 1, all locks are acquired and any necessary data is read, while in phase 2, all writes are completed. No data item may be written twice, and the lock on that item is released immediately when it is written. Part A: Rewrite the following to conform to the two-phase model:
```
	begin example transaction
	  read x
	  if x < 10 abort transaction
	  x = x - 10
          write x
          read y
	  y = y + 10
	  write y
	end transaction
```
Part B: Prove that all possible transactions can be rewritten to conform to the two-phase model. One acceptable proof is a general set of rewrite rules that can be applied to any code fragment in order to convert it into a two-phase model. An informal statement of such a rule set is sufficient.
Part C: The two-phase model is not important if all database updates are cached and only performed as the transaction is committed. Under what set of assumptions about the implementation of read and write, and under what set of assumptions about system failure is the two-phase model useful.
Background In a simple client-server interaction, either the client or the server can assign the transaction ID, but the client must remember this ID and present it to the server with each subsequent interaction involved in any particular atomic transaction.
Part A: What security or reliability problems are introduced by allowing the client to assign the transaction ID, and which of these problems remain if the server assigns the ID.
Part B: In a multiple-server interaction, for example, where one client deals with multiple servers, or where a server deals with subsidiary servers, what party is the most logical choice for assigning the transaction ID?
Consider our example thread package. (Ah, you wondered when we'd get back to it!) One problem with this package is that, if two threads write to a file, their writes are serialized. This is the expected behavior for multiple processes writing to a device like a printer, but if two processes write to the same disk file, each has its own logical position in the file that is updated by the write operations of that process alone. Suggest how this semantics could be achieved for our thread package.
Note! Your suggestion should be above the level of pseudocode, but it may involve suggesting a set of new functions to add to the thread package, and it may involve suggesting a set of new data structures to go with those functions.