tags : Database, Concurrency, Concurrency Consistency Models, Distributed Systems
A transaction may contain arbitrary code, but it is fundamentally single-threaded.
FAQ
Are SELECT statements also transactions?
- Yes. Since โReadโ is subject to Isolation level, it needs to be a transaction. Eg. What is there that it can see depends on isolation level.
Commits and Rollbacks
- Commit
- When: If a transaction completes
- Change is made permanent in the DB
 
- Rollback
- When: If a transaction aborts/fails
- Undo all changes by the transaction
 
Distributed transaction vs normal DB transaction
- Transaction Protocol = Concurrency Control(CC) + Atomic Commit Protocol
- Distributed Transaction Protocol = Transaction Protocol + Replication Protocol
Normal DB transaction
- Single DB system
- Set of operations that we want to perform on our data
- Typically, a transaction is an all-or-nothing affair
- ACID properties can be implemented directly
Distributed transaction
- Multiple systems/DB systems interacting. i.e Commit happens to more than one piece of hardware.
- ACID properties over the network(handle network partitions)
- Involves more coordination, messaging, logging etc. than normal transactions
- Needs mechanisms like Two Phase Commit (2PC)
Normal Database Transactions
This is done differently for different DBs, following is PostgreSQLโs approach
How to do transaction?
- Start: BEGINorSTART TRANSACTION
- End: COMMITorROLLBACK
- i.e You canโt have multiple COMMITin a transaction,COMMIT/ROLLBACKmarks the end of the transaction. (Nested transactions behave little differently)
- You can have multiple ROLLBACKif you useSAVEPOINT
- Sometimes they can ABORT
- SQL statements outside of explicit transactions(BEGIN/COMMIT) automatically use single-statement transactions.
Lifecycle & Ordering
- T1(started first) and- T2(started later) both are RW transaction
- T1does a lot of reads and then writes
- T2writes and then does some reads
- In this case, even if T1started before, it does not guarantee that writes fromT1would go in first.
Distributed Transactions
- See Distributed Systems, Data Replication
- See Using Atomic Transactions to Power an Idempotent API
- See An illustrated proof of the CAP theorem (2018) | Hacker News
- See CAP
- This is Atomicity v/s Partial Failures
- Transaction Protocol = Concurrency Control(CC) + Atomic Commit Protocol
- Distributed Transaction Protocol = Transaction Protocol + Replication Protocol
Approaches
Two Phase Commit

- Two Phase Commit (2PC)
- We need to use this when we have physically Sharded Databases
Active-passive replicas

Multi-active distributed transactions

- This is the case where Sharding is handled for us by the database (Eg. CockroachDB)
- This is similar to Two Phase Commit (2PC) + Some Replication protocol
Doing it in PostgreSQL directly
- How to process distributed transaction within postgresql? - Stack Overflow
- Distributed transaction on PostgreSQL - Stack Overflow
- Atomic Commit of Distributed Transactions - PostgreSQL wiki
Long running transactions (Saga)
- See Microservices
- It operates differently comparerd to ACID but achieves similar things at business levels applied correctly
- Long-running transaction - Wikipedia
- Pattern: Saga