Two Phase Locking with two-phase commit (2PL+2PC)
How to ensure a serializable schedule?
- Locking-based approach
- Strawman solution 1:
- Grab global lock when transaction starts
- release global lock when transaction finishes committing
- Strawman solution 2:
- grab lock on item X before read/write X
- release lock on item X after read/write X
Problem with strawman 2?
- Permits this non-serializable interleaving
R_1(C), R_1(S), W_1(C), R_3(C), R_3(S), Commit3, W_1(S), Commit1
- Look at W_1(C), if lock on C is released, then another transaction T’ can read new value of C, but T’ must be able to read new value of S that T has not yet written yet.
- If write lock is not held, —> read of uncommitted value
- So, write lock must be held till the transaction commits
(Extra question) How about read locks, must it also be held till transaction commits?
R_3(C), R_1(C), R_1(S), W_1(S), W_1(S), Commit1, R_3(S), Commit3
- This is a non-serializable schedule
- R_3(C) reads old value
- R_3(S) reads new value (non-repeatable reads)
- So, read locks must also be held till commit time
2-phase-locking (2PL)
- a growing phase in which transaction is acquiring locks
- a shrinking phase in which locks are released
- in practice
- growing phase is the entire transaction
- shrinking phase is during commit
- Optimization: use read/write locks instead of exclusive locks
- More on 2PL:
- what is a lock is unavailable?
- deadlock possible?
- how to cope with deadlocks?
- grab locks in order? not always possible
- (central) system detects deadlock cycles and aborts involved transactions
- deadlock prevention
- wound-wait
- wait-die
2-phase-commit (2PC)
- lock without 2PC
- coordinator –> server-a: lock a, log a=1, write a=1 to database state, unlock a
- coordinator –> server-b: lock b, log b=1, write b=1 to database state, unlock b
- Problem (failure to commit)
- what if transaction cannot commit (e.g. deadlocks)
- Problem (failure):
- coordinator crashes after message to a (before message to b).
- A later transaction T2 sees a=1, but the information about b=1 is permanently lost!
- Problem (serializability violation)
- a’s message to server A arrives read(a = 1) read(b = 0)
- b’s message to server B arrives
- 2PC (Two-phase commit)
- coordinator –> server-a: prepare-T1: server-a lock a, logs a=1,
- coordinator –> server-b: prepare-T1: server-b lock b, logs b=1
- coordinator –> server-a: commit-T1: write a=1 to database state, unlock a
- coordinator –> server-b: commit-T1: write b=1 to database state, unlock b
- Now if coordinator crashes after prepare-a before prepare-b, a recovery protocol should abort T1
- (no other transactions can read a=1 since a is still locked)
- Now if coordinator crashes after commit-a before commit-b, a recovery protocol should send commit-T1 to b.
- How does the recovery protocol work? Two options:
- Option 1:
- Coordinator can unilaterially determine the commit status of a transaction
- e.g. Coordinator receives prepare-ok(T1) from server-a, but times out on server-b,
- Coordinator can abort T1 (even if server-b has successfully prepared T1).
- Coordinator durably logs its decision (e.g. to a Paxos RSM).
- Recovery protocol reads from coordinator’s log to decide to commit or abort.
- Option 2:
- Coordinator can not unilaterially determine the commit status of a transaction
- If both server-a and server-b successfully prepared T1, then T1 must commit
- participants log must be durable against failure (e.g. log replicated via Paxos RSM)
- Recovery protocol must read all participating server’s log to decide commit or abort.
- Option 1: