Topic Overview

Two-Phase Commit (2PC)

Learn the Two-Phase Commit protocol for achieving atomicity in distributed transactions.

Intermediate9 min read

Two-Phase Commit (2PC) is a distributed consensus protocol that ensures all participants in a transaction either commit or abort together, maintaining atomicity.


Overview

2PC ensures atomicity: either all nodes commit the transaction, or all abort. It's a blocking protocol with a coordinator and participants.


Protocol Phases

Phase 1: Prepare (Voting)

  1. Coordinator sends "prepare" message to all participants
  2. Each participant:
    • Writes transaction to log (prepare record)
    • Votes "yes" (ready to commit) or "no" (must abort)
    • Sends vote to coordinator
  3. Coordinator collects votes

Phase 2: Commit/Abort (Decision)

  1. If all vote "yes":

    • Coordinator writes "commit" to log
    • Sends "commit" to all participants
    • Participants commit and send acknowledgment
  2. If any votes "no":

    • Coordinator writes "abort" to log
    • Sends "abort" to all participants
    • Participants abort and send acknowledgment

Implementation

1class TwoPhaseCommitCoordinator {
2 async executeTransaction(participants: Participant[]): Promise<boolean> {
3 // Phase 1: Prepare
4 const votes = await Promise.all(
5 participants.map(p => this.preparePhase(p))
6 );
7
8 // Phase 2: Commit or Abort
9 if (votes.every(v => v === 'yes')) {
10 await this.commitPhase(participants);

Examples

Database Replication with 2PC

1class ReplicatedDatabase {
2 async writeTransaction(data: any): Promise<void> {
3 const coordinator = this.selectCoordinator();
4 const replicas = this.getAllReplicas();
5
6 // Phase 1: Prepare on all replicas
7 const votes = await Promise.all(
8 replicas.map(replica => replica.prepareWrite(data))
9 );
10
11 if (votes.every(v v

Common Pitfalls

  • Coordinator failure: Participants block indefinitely. Fix: Use timeouts, elect new coordinator, or use 3PC
  • Network partition: Cannot proceed if participants unreachable. Fix: Use majority-based commit or eventual consistency
  • Not logging state: Cannot recover from failures. Fix: Write all state to durable log
  • Blocking behavior: Participants wait for coordinator. Fix: Use timeouts, consider alternative patterns
  • Single point of failure: Coordinator is critical. Fix: Use coordinator replication or alternative protocols

Interview Questions

Beginner

Q: What is Two-Phase Commit and how does it work?

A: Two-Phase Commit (2PC) is a protocol that ensures all participants in a distributed transaction either all commit or all abort.

How it works:

  1. Phase 1 (Prepare): Coordinator asks all participants if they can commit. Participants vote yes/no.
  2. Phase 2 (Commit/Abort):
    • If all vote yes: Coordinator tells everyone to commit
    • If any votes no: Coordinator tells everyone to abort

Goal: Atomicity - all nodes agree on the outcome.


Intermediate

Q: What are the limitations of 2PC? How would you address them?

A:

Limitations:

  1. Blocking: If coordinator fails, participants block waiting for decision
  2. Single point of failure: Coordinator is critical
  3. Not partition-tolerant: Requires all nodes to be reachable
  4. High latency: Multiple round trips (prepare + commit)
  5. Synchronous: All participants must respond

Solutions:

  1. Timeouts: Participants timeout and abort if coordinator doesn't respond
  2. 3PC: Three-Phase Commit reduces blocking (adds pre-commit phase)
  3. Saga pattern: Use compensating transactions for eventual consistency
  4. Paxos/Raft: Use consensus algorithms for better fault tolerance
  5. Majority commit: Commit if majority agrees (sacrifice some consistency)

When to use: Short transactions, strong consistency required, all nodes must agree.


Senior

Q: Design a fault-tolerant 2PC system. How do you handle coordinator failures, participant failures, and network partitions? How do you ensure no data loss?

A:

Fault-Tolerant 2PC Design:

1class FaultTolerant2PC {
2 private coordinator: Coordinator | null = null;
3 private participants: Participant[] = [];
4 private log: DurableLog;
5
6 async executeWithFaultTolerance(transaction: Transaction): Promise<void> {
7 // Elect or select coordinator
8 this.coordinator = await this.electCoordinator();
9
10 // Phase 1: Prepare with timeout
11 const prepareResults = await Promise.allSettled(
12 this.participantsp

Handling Failures:

  1. Coordinator failure:

    • Participants timeout and can abort
    • Or elect new coordinator to recover
    • New coordinator reads log, queries participants, makes decision
  2. Participant failure:

    • Coordinator continues with remaining participants
    • Failed participant recovers from log on restart
    • Can query coordinator or other participants for decision
  3. Network partition:

    • Majority partition can proceed (if configured)
    • Minority partition blocks or aborts
    • Resolve conflicts when partition heals

Data Loss Prevention:

  • Durable logging: Write all state to persistent log before responding
  • Write-ahead log (WAL): Log before applying changes
  • Replication: Replicate coordinator log for high availability
  • Quorum: Require majority for commit decisions

  • 2PC ensures atomicity: All participants commit or all abort

  • Two phases: Prepare (voting) and Commit/Abort (decision)

  • Blocking protocol: Participants wait for coordinator

  • Coordinator is critical: Single point of failure

  • Not partition-tolerant: Requires all nodes reachable

  • Use for: Short transactions, strong consistency, all-or-nothing requirements

  • Alternatives: 3PC (less blocking), Saga (eventual consistency), Paxos/Raft (better fault tolerance)

  • Three-Phase Commit (3PC) - Non-blocking alternative to 2PC

  • Distributed Transactions - Maintaining ACID properties across nodes

  • Consensus Algorithms (Raft, Paxos) - Alternative consensus protocols

  • Fault Tolerance - Handling failures in distributed systems

  • Partition Tolerance - CAP theorem and network partitions

Key Takeaways

2PC ensures atomicity: All participants commit or all abort

Two phases: Prepare (voting) and Commit/Abort (decision)

Blocking protocol: Participants wait for coordinator

Coordinator is critical: Single point of failure

Not partition-tolerant: Requires all nodes reachable

Use for: Short transactions, strong consistency, all-or-nothing requirements

Alternatives: 3PC (less blocking), Saga (eventual consistency), Paxos/Raft (better fault tolerance)


About the author

InterviewCrafted helps you master system design with patience. We believe in curiosity-led engineering, reflective writing, and designing systems that make future changes feel calm.