← Back to databases

Database Topic

Isolation Levels

Master transaction isolation levels to balance consistency and concurrency. Essential for database and system design interviews.

Isolation levels control how transactions interact with each other, balancing data consistency against concurrency performance.


The Four Standard Levels

READ UNCOMMITTED

Lowest isolation, highest concurrency

  • Allows dirty reads (reading uncommitted data)
  • No locks on reads
  • Fastest but least safe
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
BEGIN;
  SELECT balance FROM accounts WHERE id = 1;  -- May read uncommitted data
COMMIT;

Problem: Transaction A reads data that Transaction B later rolls back.

Use case: Rarely used in practice due to data integrity risks.


READ COMMITTED

Default in PostgreSQL and most databases

  • Prevents dirty reads
  • Allows non-repeatable reads (same query returns different results)
  • Allows phantom reads (new rows appear)
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
  SELECT balance FROM accounts WHERE id = 1;  -- Reads committed value: $100
  -- Another transaction commits: UPDATE accounts SET balance = 50 WHERE id = 1;
  SELECT balance FROM accounts WHERE id = 1;  -- Reads new committed value: $50
COMMIT;

Implementation: Row-level locks, released after statement completion.

Use case: Most applications. Good balance of safety and performance.


REPEATABLE READ

Prevents non-repeatable reads

  • Prevents dirty reads
  • Prevents non-repeatable reads
  • Still allows phantom reads
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN;
  SELECT balance FROM accounts WHERE id = 1;  -- $100
  -- Another transaction commits: UPDATE accounts SET balance = 50 WHERE id = 1;
  SELECT balance FROM accounts WHERE id = 1;  -- Still $100 (snapshot isolation)
COMMIT;

Implementation: Snapshot isolation. Each transaction sees a consistent snapshot of the database.

Use case: When you need consistent reads within a transaction (e.g., calculating totals).


SERIALIZABLE

Highest isolation, strictest consistency

  • Prevents all anomalies: dirty reads, non-repeatable reads, phantom reads
  • Transactions appear to execute serially
  • Slowest but safest
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
  SELECT SUM(balance) FROM accounts;  -- Locks prevent concurrent modifications
  -- All concurrent transactions wait
COMMIT;

Implementation: Predicate locking, serialization conflict detection.

Use case: Critical financial operations, when absolute consistency is required.


Isolation Anomalies

Dirty Read

Reading uncommitted data that may be rolled back.

T1: UPDATE accounts SET balance = 200 WHERE id = 1;
T2: SELECT balance FROM accounts WHERE id = 1;  -- Reads 200 (uncommitted)
T1: ROLLBACK;  -- Balance is back to 100
T2: Now has incorrect data (200)

Prevented by: READ COMMITTED and above

Non-Repeatable Read

Same query returns different results within a transaction.

T1: SELECT balance FROM accounts WHERE id = 1;  -- Returns 100
T2: UPDATE accounts SET balance = 50 WHERE id = 1; COMMIT;
T1: SELECT balance FROM accounts WHERE id = 1;  -- Returns 50 (different!)

Prevented by: REPEATABLE READ and above

Phantom Read

New rows appear in a result set within a transaction.

T1: SELECT COUNT(*) FROM orders WHERE user_id = 1;  -- Returns 5
T2: INSERT INTO orders (user_id, ...) VALUES (1, ...); COMMIT;
T1: SELECT COUNT(*) FROM orders WHERE user_id = 1;  -- Returns 6 (phantom row!)

Prevented by: SERIALIZABLE


Choosing the Right Level

READ COMMITTED (Default)

Use when:

  • Most web applications
  • Good balance of safety and performance
  • Occasional inconsistencies are acceptable

REPEATABLE READ

Use when:

  • Calculating aggregates within a transaction
  • Need consistent snapshot for multiple reads
  • Reporting queries

SERIALIZABLE

Use when:

  • Financial transactions
  • Critical data integrity requirements
  • Can tolerate serialization conflicts/retries

Database-Specific Behavior

PostgreSQL

  • Default: READ COMMITTED
  • REPEATABLE READ uses snapshot isolation
  • SERIALIZABLE uses serialization conflict detection

MySQL (InnoDB)

  • Default: REPEATABLE READ
  • Uses next-key locking to prevent phantoms
  • READ COMMITTED requires different locking strategy

SQL Server

  • Default: READ COMMITTED
  • Supports all four levels
  • Can use snapshot isolation (READ COMMITTED SNAPSHOT)

Performance Implications

  • Lower isolation = Higher concurrency = Better performance
  • Higher isolation = More locking = Potential deadlocks

Best practice: Start with READ COMMITTED, increase only when necessary.


Deadlocks

Higher isolation levels increase deadlock risk:

-- Transaction 1
BEGIN;
  UPDATE accounts SET balance = balance - 50 WHERE id = 1;
  UPDATE accounts SET balance = balance + 50 WHERE id = 2;  -- Waits for T2

-- Transaction 2
BEGIN;
  UPDATE accounts SET balance = balance - 30 WHERE id = 2;
  UPDATE accounts SET balance = balance + 30 WHERE id = 1;  -- Waits for T1

-- Deadlock! Database aborts one transaction

Mitigation: Consistent lock ordering, shorter transactions, deadlock detection.


Interview Questions

1. Beginner Question

Q: What are the four isolation levels, and what problems does each prevent?

A:

  1. READ UNCOMMITTED:

    • Lowest isolation, allows dirty reads
    • Prevents: Nothing (least safe)
    • Use case: Rarely used
  2. READ COMMITTED:

    • Default in most databases
    • Prevents: Dirty reads
    • Allows: Non-repeatable reads, phantom reads
    • Use case: Most applications
  3. REPEATABLE READ:

    • Prevents: Dirty reads, non-repeatable reads
    • Allows: Phantom reads
    • Use case: When you need consistent reads within a transaction
  4. SERIALIZABLE:

    • Highest isolation, prevents all anomalies
    • Prevents: Dirty reads, non-repeatable reads, phantom reads
    • Use case: Critical financial operations

Example:

-- READ COMMITTED (default)
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
  SELECT balance FROM accounts WHERE id = 1;  -- $100
  -- Another transaction commits: balance = $50
  SELECT balance FROM accounts WHERE id = 1;  -- $50 (non-repeatable read)
COMMIT;

2. Intermediate Question

Q: Explain the difference between non-repeatable reads and phantom reads with examples.

A:

Non-repeatable read: The same query returns different results within a transaction.

-- Transaction 1
BEGIN;
  SELECT balance FROM accounts WHERE id = 1;  -- Returns $100
  -- Transaction 2 commits: UPDATE accounts SET balance = 50 WHERE id = 1;
  SELECT balance FROM accounts WHERE id = 1;  -- Returns $50 (different!)
COMMIT;

Phantom read: New rows appear in a result set within a transaction.

-- Transaction 1
BEGIN;
  SELECT COUNT(*) FROM orders WHERE user_id = 1;  -- Returns 5
  -- Transaction 2 commits: INSERT INTO orders (user_id, ...) VALUES (1, ...);
  SELECT COUNT(*) FROM orders WHERE user_id = 1;  -- Returns 6 (phantom row!)
COMMIT;

Key difference: Non-repeatable reads affect existing rows, phantom reads involve new rows.

Prevention:

  • REPEATABLE READ prevents non-repeatable reads (uses snapshot isolation)
  • SERIALIZABLE prevents phantom reads (uses predicate locking)

3. Senior-Level System Question

Q: Design a ticket booking system for a concert with 10,000 seats. Multiple users can book simultaneously. How would you use isolation levels to prevent double-booking while maintaining performance?

A:

Challenge: Prevent two users from booking the same seat while handling high concurrency.

Solution 1: SERIALIZABLE (Strong consistency, lower performance)

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
  -- Check availability
  SELECT COUNT(*) FROM bookings WHERE seat_id = ? AND event_id = ?;
  
  -- If available, book
  INSERT INTO bookings (user_id, seat_id, event_id) VALUES (?, ?, ?);
COMMIT;

Pros: Guarantees no double-booking Cons: Lower throughput, potential for serialization conflicts

Solution 2: READ COMMITTED + Optimistic Locking (Better performance)

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
  -- Use SELECT FOR UPDATE to lock the seat
  SELECT * FROM seats WHERE id = ? AND event_id = ? FOR UPDATE;
  
  -- Check if still available
  IF seat.status = 'available' THEN
    UPDATE seats SET status = 'booked', user_id = ? WHERE id = ?;
    INSERT INTO bookings (user_id, seat_id, event_id) VALUES (?, ?, ?);
  ELSE
    ROLLBACK;
    RETURN 'Seat already booked';
  END IF;
COMMIT;

Pros: Better performance, prevents double-booking Cons: Requires explicit locking

Solution 3: Application-level with database constraints (Best for high concurrency)

-- Database constraint prevents double-booking
ALTER TABLE bookings ADD CONSTRAINT unique_seat_event 
UNIQUE (seat_id, event_id);

-- Application code (READ COMMITTED)
BEGIN;
  INSERT INTO bookings (user_id, seat_id, event_id) 
  VALUES (?, ?, ?)
  ON CONFLICT (seat_id, event_id) DO NOTHING;
  
  IF ROW_COUNT() = 0 THEN
    ROLLBACK;
    RETURN 'Seat already booked';
  END IF;
COMMIT;

Pros:

  • Highest throughput (no explicit locks)
  • Database constraint ensures integrity
  • Simple application code

Cons:

  • Requires retry logic on conflict
  • Users may see "seat available" but booking fails

Solution 4: Distributed system approach (For very high scale)

# Use distributed lock (Redis) + database
def book_seat(user_id, seat_id, event_id):
    lock_key = f"seat:{event_id}:{seat_id}"
    
    # Acquire distributed lock
    if not redis.set(lock_key, user_id, nx=True, ex=30):
        return "Seat being booked by another user"
    
    try:
        # Check and book in database
        with transaction():
            seat = db.query("SELECT * FROM seats WHERE id = ? FOR UPDATE", seat_id)
            if seat.status != 'available':
                return "Seat already booked"
            
            db.execute("INSERT INTO bookings ...")
            db.execute("UPDATE seats SET status = 'booked' WHERE id = ?", seat_id)
    finally:
        redis.delete(lock_key)

Trade-offs:

  • SERIALIZABLE: Strongest guarantee, lowest performance
  • READ COMMITTED + FOR UPDATE: Good balance, explicit locking
  • Application-level + constraints: Highest performance, requires retry logic
  • Distributed locks: For very high scale, adds complexity

Recommendation: Use Solution 3 (application-level + constraints) for most cases. It provides good performance while maintaining data integrity.


Key Takeaways

  • Isolation levels balance consistency and concurrency—higher isolation = more consistency but lower performance
  • READ COMMITTED is the default in most databases and works for most applications
  • REPEATABLE READ prevents non-repeatable reads using snapshot isolation
  • SERIALIZABLE prevents all anomalies but can cause serialization conflicts and lower throughput
  • Dirty reads occur when reading uncommitted data that may be rolled back
  • Non-repeatable reads happen when the same query returns different results within a transaction
  • Phantom reads occur when new rows appear in a result set within a transaction
  • Higher isolation increases deadlock risk—use consistent lock ordering to mitigate
  • Choose isolation level based on requirements—not all applications need SERIALIZABLE
  • Understand database-specific behavior—PostgreSQL and MySQL implement isolation differently
  • Monitor for conflicts—track serialization failures and deadlocks
  • Consider optimistic locking for high-concurrency scenarios instead of high isolation levels

Keep exploring

Database concepts build on each other. Explore related topics to deepen your understanding of how data systems work.