Isolation Levels
Master transaction isolation levels to balance consistency and concurrency. Essential for database and system design interviews.
Isolation levels control how transactions interact with each other, balancing data consistency against concurrency performance.
The Four Standard Levels
READ UNCOMMITTED
Lowest isolation, highest concurrency
- Allows dirty reads (reading uncommitted data)
- No locks on reads
- Fastest but least safe
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
BEGIN;
SELECT balance FROM accounts WHERE id = 1; -- May read uncommitted data
COMMIT;
Problem: Transaction A reads data that Transaction B later rolls back.
Use case: Rarely used in practice due to data integrity risks.
READ COMMITTED
Default in PostgreSQL and most databases
- Prevents dirty reads
- Allows non-repeatable reads (same query returns different results)
- Allows phantom reads (new rows appear)
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
SELECT balance FROM accounts WHERE id = 1; -- Reads committed value: $100
-- Another transaction commits: UPDATE accounts SET balance = 50 WHERE id = 1;
SELECT balance FROM accounts WHERE id = 1; -- Reads new committed value: $50
COMMIT;
Implementation: Each statement sees only data committed before it began (a per-statement snapshot in MVCC databases); write locks are held until commit.
Use case: Most applications. Good balance of safety and performance.
REPEATABLE READ
Prevents non-repeatable reads
- Prevents dirty reads
- Prevents non-repeatable reads
- Still allows phantom reads (per the SQL standard; snapshot-based implementations such as PostgreSQL largely prevent them)
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN;
SELECT balance FROM accounts WHERE id = 1; -- $100
-- Another transaction commits: UPDATE accounts SET balance = 50 WHERE id = 1;
SELECT balance FROM accounts WHERE id = 1; -- Still $100 (snapshot isolation)
COMMIT;
Implementation: Snapshot isolation. Each transaction sees a consistent snapshot of the database.
Use case: When you need consistent reads within a transaction (e.g., calculating totals).
SERIALIZABLE
Highest isolation, strictest consistency
- Prevents all anomalies: dirty reads, non-repeatable reads, phantom reads
- Transactions appear to execute serially
- Slowest but safest
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
SELECT SUM(balance) FROM accounts; -- Consistent view of every balance
-- Conflicting concurrent transactions either block or abort at commit, depending on the implementation
COMMIT;
Implementation: Predicate locking, serialization conflict detection.
Use case: Critical financial operations, when absolute consistency is required.
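Serialization conflicts surface to the application as an error (SQLSTATE 40001 in PostgreSQL), so SERIALIZABLE code needs a retry loop. A minimal, driver-agnostic sketch; the `SerializationFailure` class here stands in for whatever exception your database driver raises:

```python
import random
import time

class SerializationFailure(Exception):
    """Stand-in for the driver-specific error, e.g. SQLSTATE 40001."""

def run_with_retry(txn_fn, max_attempts=5):
    """Run a transaction function, retrying on serialization failures
    with exponential backoff plus a little jitter."""
    for attempt in range(max_attempts):
        try:
            return txn_fn()
        except SerializationFailure:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            time.sleep((2 ** attempt) * 0.01 + random.random() * 0.01)
```

The transaction function must be safe to re-run from the top, since the database rolled back all of its effects before raising the error.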
Isolation Anomalies
Dirty Read
Reading uncommitted data that may be rolled back.
T1: UPDATE accounts SET balance = 200 WHERE id = 1;
T2: SELECT balance FROM accounts WHERE id = 1; -- Reads 200 (uncommitted)
T1: ROLLBACK; -- Balance is back to 100
T2: Now has incorrect data (200)
Prevented by: READ COMMITTED and above
Non-Repeatable Read
Same query returns different results within a transaction.
T1: SELECT balance FROM accounts WHERE id = 1; -- Returns 100
T2: UPDATE accounts SET balance = 50 WHERE id = 1; COMMIT;
T1: SELECT balance FROM accounts WHERE id = 1; -- Returns 50 (different!)
Prevented by: REPEATABLE READ and above
Phantom Read
New rows appear in a result set within a transaction.
T1: SELECT COUNT(*) FROM orders WHERE user_id = 1; -- Returns 5
T2: INSERT INTO orders (user_id, ...) VALUES (1, ...); COMMIT;
T1: SELECT COUNT(*) FROM orders WHERE user_id = 1; -- Returns 6 (phantom row!)
Prevented by: SERIALIZABLE (snapshot-based REPEATABLE READ implementations also prevent most phantoms)
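The anomalies above come down to whether a transaction reads live data or a frozen snapshot. A toy illustration, using a deep copy to stand in for a real MVCC snapshot (this is not how an actual engine implements it):

```python
import copy

class ToyDB:
    """Toy key-value store illustrating repeatable vs non-repeatable reads.
    A 'snapshot' is just a deep copy taken when the transaction starts."""
    def __init__(self):
        self.rows = {}

    def begin_snapshot(self):
        # REPEATABLE READ-style: later reads come from this frozen copy
        return copy.deepcopy(self.rows)

db = ToyDB()
db.rows[1] = {"balance": 100}

snap = db.begin_snapshot()           # transaction T1 starts
db.rows[1]["balance"] = 50           # another transaction commits an update

live_read = db.rows[1]["balance"]    # READ COMMITTED-style: sees the new value
snapshot_read = snap[1]["balance"]   # snapshot: still sees the original value
```

The live read changes mid-transaction (a non-repeatable read); the snapshot read does not.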
Choosing the Right Level
READ COMMITTED (Default)
Use when:
- Most web applications
- Good balance of safety and performance
- Occasional inconsistencies are acceptable
REPEATABLE READ
Use when:
- Calculating aggregates within a transaction
- Need consistent snapshot for multiple reads
- Reporting queries
SERIALIZABLE
Use when:
- Financial transactions
- Critical data integrity requirements
- Can tolerate serialization conflicts/retries
Database-Specific Behavior
PostgreSQL
- Default: READ COMMITTED
- REPEATABLE READ uses snapshot isolation
- SERIALIZABLE uses serialization conflict detection
MySQL (InnoDB)
- Default: REPEATABLE READ
- Uses next-key locking to prevent phantoms
- READ COMMITTED disables most gap locking, so phantom reads become possible
SQL Server
- Default: READ COMMITTED
- Supports all four levels
- Can use snapshot isolation (READ COMMITTED SNAPSHOT)
Performance Implications
- Lower isolation = Higher concurrency = Better performance
- Higher isolation = More locking = Potential deadlocks
Best practice: Start with READ COMMITTED, increase only when necessary.
Deadlocks
Higher isolation levels increase deadlock risk:
-- Transaction 1
BEGIN;
UPDATE accounts SET balance = balance - 50 WHERE id = 1;
UPDATE accounts SET balance = balance + 50 WHERE id = 2; -- Waits for T2
-- Transaction 2
BEGIN;
UPDATE accounts SET balance = balance - 30 WHERE id = 2;
UPDATE accounts SET balance = balance + 30 WHERE id = 1; -- Waits for T1
-- Deadlock! Database aborts one transaction
Mitigation: Consistent lock ordering, shorter transactions, deadlock detection.
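Consistent lock ordering can be sketched with plain threads: both transfers acquire the two "row locks" in sorted id order, so the circular wait shown above cannot occur. The lock-per-account scheme is illustrative only, not how a database implements row locks:

```python
import threading

accounts = {1: 100, 2: 100}
locks = {1: threading.Lock(), 2: threading.Lock()}

def transfer(src, dst, amount):
    """Acquire both row locks in a globally consistent order (by id),
    so two opposing transfers can never deadlock."""
    first, second = sorted((src, dst))
    with locks[first]:
        with locks[second]:
            accounts[src] -= amount
            accounts[dst] += amount

# Opposing transfers, like Transaction 1 and Transaction 2 above
t1 = threading.Thread(target=transfer, args=(1, 2, 50))
t2 = threading.Thread(target=transfer, args=(2, 1, 30))
t1.start(); t2.start()
t1.join(); t2.join()
```

Without the `sorted()` step, each thread could grab its first lock and wait forever for the other's, which is exactly the cycle the SQL example shows.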
Interview Questions
1. Beginner Question
Q: What are the four isolation levels, and what problems does each prevent?
A:
- READ UNCOMMITTED:
- Lowest isolation, allows dirty reads
- Prevents: Nothing (least safe)
- Use case: Rarely used
- READ COMMITTED:
- Default in most databases
- Prevents: Dirty reads
- Allows: Non-repeatable reads, phantom reads
- Use case: Most applications
- REPEATABLE READ:
- Prevents: Dirty reads, non-repeatable reads
- Allows: Phantom reads
- Use case: When you need consistent reads within a transaction
- SERIALIZABLE:
- Highest isolation, prevents all anomalies
- Prevents: Dirty reads, non-repeatable reads, phantom reads
- Use case: Critical financial operations
Example:
-- READ COMMITTED (default)
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
SELECT balance FROM accounts WHERE id = 1; -- $100
-- Another transaction commits: balance = $50
SELECT balance FROM accounts WHERE id = 1; -- $50 (non-repeatable read)
COMMIT;
2. Intermediate Question
Q: Explain the difference between non-repeatable reads and phantom reads with examples.
A:
Non-repeatable read: The same query returns different results within a transaction.
-- Transaction 1
BEGIN;
SELECT balance FROM accounts WHERE id = 1; -- Returns $100
-- Transaction 2 commits: UPDATE accounts SET balance = 50 WHERE id = 1;
SELECT balance FROM accounts WHERE id = 1; -- Returns $50 (different!)
COMMIT;
Phantom read: New rows appear in a result set within a transaction.
-- Transaction 1
BEGIN;
SELECT COUNT(*) FROM orders WHERE user_id = 1; -- Returns 5
-- Transaction 2 commits: INSERT INTO orders (user_id, ...) VALUES (1, ...);
SELECT COUNT(*) FROM orders WHERE user_id = 1; -- Returns 6 (phantom row!)
COMMIT;
Key difference: Non-repeatable reads affect existing rows, phantom reads involve new rows.
Prevention:
- REPEATABLE READ prevents non-repeatable reads (uses snapshot isolation)
- SERIALIZABLE prevents phantom reads (uses predicate locking)
3. Senior-Level System Question
Q: Design a ticket booking system for a concert with 10,000 seats. Multiple users can book simultaneously. How would you use isolation levels to prevent double-booking while maintaining performance?
A:
Challenge: Prevent two users from booking the same seat while handling high concurrency.
Solution 1: SERIALIZABLE (Strong consistency, lower performance)
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
-- Check availability
SELECT COUNT(*) FROM bookings WHERE seat_id = ? AND event_id = ?;
-- If available, book
INSERT INTO bookings (user_id, seat_id, event_id) VALUES (?, ?, ?);
COMMIT;
Pros: Guarantees no double-booking.
Cons: Lower throughput, potential for serialization conflicts.
Solution 2: READ COMMITTED + Pessimistic Locking (Better performance)
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
-- Use SELECT FOR UPDATE to lock the seat
SELECT * FROM seats WHERE id = ? AND event_id = ? FOR UPDATE;
-- Check if still available (pseudocode; this branch runs in application
-- code or a stored procedure)
IF seat.status = 'available' THEN
  UPDATE seats SET status = 'booked', user_id = ? WHERE id = ?;
  INSERT INTO bookings (user_id, seat_id, event_id) VALUES (?, ?, ?);
ELSE
  ROLLBACK;
  RETURN 'Seat already booked';
END IF;
COMMIT;
Pros: Better performance, prevents double-booking.
Cons: Requires explicit locking.
Solution 3: Application-level with database constraints (Best for high concurrency)
-- Database constraint prevents double-booking
ALTER TABLE bookings ADD CONSTRAINT unique_seat_event
UNIQUE (seat_id, event_id);
-- Application code (READ COMMITTED)
BEGIN;
INSERT INTO bookings (user_id, seat_id, event_id)
VALUES (?, ?, ?)
ON CONFLICT (seat_id, event_id) DO NOTHING;
IF ROW_COUNT() = 0 THEN -- pseudocode: check the affected-row count via your driver
  ROLLBACK;
  RETURN 'Seat already booked';
END IF;
COMMIT;
Pros:
- Highest throughput (no explicit locks)
- Database constraint ensures integrity
- Simple application code
Cons:
- Requires retry logic on conflict
- Users may see "seat available" but booking fails
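The constraint approach from Solution 3 can be exercised end to end with sqlite3, which uses `INSERT OR IGNORE` where PostgreSQL uses `ON CONFLICT ... DO NOTHING` (same table and column names as above; a sketch, not production code):

```python
import sqlite3

# In-memory stand-in for the bookings table
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE bookings (
        user_id INTEGER,
        seat_id INTEGER,
        event_id INTEGER,
        UNIQUE (seat_id, event_id)
    )
""")

def book(user_id, seat_id, event_id):
    cur = conn.execute(
        "INSERT OR IGNORE INTO bookings (user_id, seat_id, event_id) "
        "VALUES (?, ?, ?)",
        (user_id, seat_id, event_id),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows inserted means the seat was taken

first = book(42, seat_id=7, event_id=1)   # succeeds
second = book(99, seat_id=7, event_id=1)  # same seat, silently ignored
```

The unique constraint does the real work; the application only inspects the affected-row count to decide whether to tell the user the seat was taken.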
Solution 4: Distributed system approach (For very high scale)
# Use a distributed lock (Redis) in front of the database
def book_seat(user_id, seat_id, event_id):
    lock_key = f"seat:{event_id}:{seat_id}"
    # Acquire the lock only if the key does not exist (NX), with a 30s expiry
    if not redis.set(lock_key, user_id, nx=True, ex=30):
        return "Seat being booked by another user"
    try:
        # Check and book in the database
        with transaction():
            seat = db.query("SELECT * FROM seats WHERE id = ? FOR UPDATE", seat_id)
            if seat.status != 'available':
                return "Seat already booked"
            db.execute("INSERT INTO bookings ...")
            db.execute("UPDATE seats SET status = 'booked' WHERE id = ?", seat_id)
            return "Booked"
    finally:
        redis.delete(lock_key)
Trade-offs:
- SERIALIZABLE: Strongest guarantee, lowest performance
- READ COMMITTED + FOR UPDATE: Good balance, explicit locking
- Application-level + constraints: Highest performance, requires retry logic
- Distributed locks: For very high scale, adds complexity
Recommendation: Use Solution 3 (application-level + constraints) for most cases. It provides good performance while maintaining data integrity.
Key Takeaways
- Isolation levels balance consistency and concurrency—higher isolation = more consistency but lower performance
- READ COMMITTED is the default in most databases and works for most applications
- REPEATABLE READ prevents non-repeatable reads using snapshot isolation
- SERIALIZABLE prevents all anomalies but can cause serialization conflicts and lower throughput
- Dirty reads occur when reading uncommitted data that may be rolled back
- Non-repeatable reads happen when the same query returns different results within a transaction
- Phantom reads occur when new rows appear in a result set within a transaction
- Higher isolation increases deadlock risk—use consistent lock ordering to mitigate
- Choose isolation level based on requirements—not all applications need SERIALIZABLE
- Understand database-specific behavior—PostgreSQL and MySQL implement isolation differently
- Monitor for conflicts—track serialization failures and deadlocks
- Consider optimistic locking for high-concurrency scenarios instead of high isolation levels
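The optimistic-locking idea from the last takeaway can be sketched with a version column: a write succeeds only if the version it originally read is still current. A sqlite3 sketch with an assumed `seats` schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seats (id INTEGER PRIMARY KEY, status TEXT, version INTEGER)"
)
conn.execute("INSERT INTO seats VALUES (1, 'available', 0)")
conn.commit()

def book_optimistic(seat_id, expected_version):
    """Optimistic locking: the UPDATE matches only if the version is
    unchanged since we read it; every successful write bumps the version."""
    cur = conn.execute(
        "UPDATE seats SET status = 'booked', version = version + 1 "
        "WHERE id = ? AND version = ?",
        (seat_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1

# Two clients both read version 0; only the first write wins
win = book_optimistic(1, expected_version=0)
lose = book_optimistic(1, expected_version=0)
```

No locks are held between read and write; the loser simply re-reads and retries, which scales better than high isolation under contention.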