Topic Overview
Clock Synchronization (NTP, Lamport)
Learn how distributed systems synchronize clocks and order events using NTP and Lamport clocks.
In distributed systems, nodes have independent clocks that drift apart. Clock synchronization ensures nodes agree on time or can order events correctly.
The Problem
Physical clocks drift: Every clock runs at a slightly different rate (quartz oscillators especially, atomic clocks far less), so node clocks diverge over time.
No global time: In distributed systems, there's no single source of truth for time.
Event ordering: Need to determine if event A happened before event B across nodes; the sketch below shows how even a small clock skew can get this wrong.
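As a minimal sketch of why this matters (the 50 ms skew is a made-up value, not a measurement): two nodes timestamp events with their own wall clocks, and even a small skew can reverse the apparent order of two causally related events.

// Hypothetical illustration: node B's wall clock runs 50 ms behind node A's.
const skewMs: Record<'A' | 'B', number> = { A: 0, B: -50 };

// Each node's "local clock" is real time plus its fixed skew.
function wallClock(node: 'A' | 'B'): number {
  return Date.now() + skewMs[node];
}

// Event 1 happens on A; A's message then triggers event 2 on B.
const t1 = wallClock('A');
const t2 = wallClock('B'); // causally later than t1...
console.log(t2 < t1);      // ...yet its physical timestamp is earlier: true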
Physical Clock Synchronization: NTP
Network Time Protocol (NTP) synchronizes physical clocks across the network.
How NTP Works
- Client sends a request to an NTP server, recording its local send time
- Server replies with its own receive and transmit timestamps
- Client records the reply's arrival time, computes the clock offset and round-trip delay, and adjusts its clock
- Stratum hierarchy: each server synchronizes with a server at a lower stratum number (closer to the reference clock)
class NTPClient {
  async synchronize(): Promise<void> {
    const server = this.selectNTPServer();

    const t0 = this.getLocalTime();          // client send time
    const response = await server.getTime();
    const t1 = this.getLocalTime();          // client receive time

    // Server timestamps: t2 = server receive time, t3 = server send time
    const { t2, t3 } = response;

    // Clock offset: average of the two one-way time differences
    const offset = ((t2 - t0) + (t3 - t1)) / 2;
    // Round-trip network delay, excluding server processing time;
    // useful for discarding samples taken over slow or asymmetric paths
    const delay = (t1 - t0) - (t3 - t2);

    // Apply the offset (real NTP daemons slew the clock gradually rather than stepping it)
    this.adjustClock(offset);
  }
}
NTP Stratum
- Stratum 0: Atomic clocks, GPS
- Stratum 1: Servers directly connected to stratum 0
- Stratum 2: Servers sync with stratum 1
- And so on, up to stratum 15 (stratum 16 means unsynchronized); a rough server-selection sketch follows
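As a rough illustration of how a client might use the hierarchy (the ServerSample shape and pickServer helper are hypothetical, not part of any NTP library), a client typically prefers a reachable server with the lowest stratum number, breaking ties by measured round-trip delay:

interface ServerSample {
  host: string;
  stratum: number;   // 1 = directly attached to a reference clock
  delayMs: number;   // measured round-trip delay
}

// Prefer the lowest stratum; among equal strata, prefer the lowest measured delay.
function pickServer(samples: ServerSample[]): ServerSample | undefined {
  return [...samples].sort(
    (a, b) => a.stratum - b.stratum || a.delayMs - b.delayMs
  )[0];
}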
Logical Clocks: Lamport Clocks
Lamport clocks provide a logical ordering of events without synchronized physical clocks.
Lamport Timestamps
Each node maintains a single counter and:
- increments the counter before each local event
- attaches the current counter value to every message it sends
- on receiving a message, sets the counter to max(local, received) + 1
class LamportClock {
  private counter: number = 0;

  // Local event: increment the counter and use the new value as the event's timestamp
  tick(): number {
    this.counter++;
    return this.counter;
  }

  // Send: increment the counter, then attach it to the outgoing message
  send(message: Message): Message {
    this.counter++;
    return {
      ...message,
      timestamp: this.counter
    };
  }

  // Receive: jump ahead of both the local counter and the sender's timestamp
  receive(message: Message): void {
    this.counter = Math.max(this.counter, message.timestamp) + 1;
  }

  // Note: timestamp1 < timestamp2 does NOT imply causality, and equal timestamps
  // do not prove concurrency; Lamport clocks alone cannot detect concurrent events.
  // Ties are usually broken by node ID to obtain a total order.
  compare(timestamp1: number, timestamp2: number): 'before' | 'after' | 'tie' {
    if (timestamp1 < timestamp2) return 'before';
    if (timestamp1 > timestamp2) return 'after';
    return 'tie';
  }
}
Properties
- Happens-before relation: If A → B (causally), then L(A) < L(B)
- Converse not true: L(A) < L(B) doesn't mean A → B (the events may be concurrent)
- Total ordering: Break ties using node ID to get a total order (see the sketch below)
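A small sketch of that tie-breaking rule (the StampedEvent shape is made up for illustration): order events by Lamport timestamp first, then by node ID, which gives a deterministic total order that respects happens-before.

interface StampedEvent {
  timestamp: number; // Lamport timestamp
  nodeId: string;    // used only to break ties
}

// Total order: by timestamp, then lexicographically by node ID.
function totalOrder(a: StampedEvent, b: StampedEvent): number {
  return a.timestamp - b.timestamp || a.nodeId.localeCompare(b.nodeId);
}

// Two events with the same Lamport timestamp are still deterministically ordered.
const sorted = [
  { timestamp: 3, nodeId: 'node-b' },
  { timestamp: 3, nodeId: 'node-a' },
  { timestamp: 1, nodeId: 'node-c' },
].sort(totalOrder);
// => node-c (1), node-a (3), node-b (3)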
Vector Clocks
Vector clocks improve on Lamport clocks by detecting concurrent events.
Vector Timestamps
Each node maintains a vector of counters (one per node).
class VectorClock {
  private vector: Map<string, number> = new Map();
  private myNodeId: string;

  constructor(myNodeId: string) {
    this.myNodeId = myNodeId;
  }

  // Local event: increment this node's own entry
  tick(): Map<string, number> {
    const current = this.vector.get(this.myNodeId) || 0;
    this.vector.set(this.myNodeId, current + 1);
    return new Map(this.vector);
  }

  // Snapshot of the current vector (used by the delivery checks further below)
  getVector(): Map<string, number> {
    return new Map(this.vector);
  }

  // Send: increment own entry, then attach a copy of the whole vector
  send(message: Message): Message {
    this.tick();
    return {
      ...message,
      vector: new Map(this.vector)
    };
  }

  // Receive: take the element-wise max with the incoming vector, then increment own entry
  receive(message: Message): void {
    const receivedVector = message.vector;
    // Update: max(local[i], received[i]) for all i
    for (const [nodeId, timestamp] of receivedVector) {
      const local = this.vector.get(nodeId) || 0;
      this.vector.set(nodeId, Math.max(local, timestamp));
    }
    // Increment own counter
    this.tick();
  }

  // V(A) < V(B) iff every entry of A is <= the matching entry of B and at least one is strictly less
  compare(v1: Map<string, number>, v2: Map<string, number>): 'before' | 'after' | 'concurrent' | 'equal' {
    let v1Less = false;
    let v2Less = false;
    const allNodes = new Set([...v1.keys(), ...v2.keys()]);
    for (const nodeId of allNodes) {
      const t1 = v1.get(nodeId) || 0;
      const t2 = v2.get(nodeId) || 0;
      if (t1 < t2) v1Less = true;
      if (t2 < t1) v2Less = true;
    }
    if (v1Less && !v2Less) return 'before';
    if (v2Less && !v1Less) return 'after';
    if (!v1Less && !v2Less) return 'equal';
    return 'concurrent';
  }
}
Vector Clock Properties
- Causality detection: V(A) < V(B) iff A causally precedes B
- Concurrency detection: If neither V(A) < V(B) nor V(B) < V(A), events are concurrent
- More precise than Lamport clocks, but requires O(n) space per timestamp; a short usage sketch follows
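A quick usage sketch of the compare() method above (the 'observer' instance and node names are illustrative): two replicas that each accept a write before exchanging state produce vectors that compare as concurrent, which is exactly the signal that conflict resolution is needed.

// Replica A and replica B each accept one write before exchanging anything.
const clock = new VectorClock('observer'); // any instance works; compare() is stateless

const writeAtA = new Map([['A', 1], ['B', 0]]);
const writeAtB = new Map([['A', 0], ['B', 1]]);

console.log(clock.compare(writeAtA, writeAtB)); // 'concurrent' -> needs conflict resolution

// After B merges A's write and writes again, B's new vector dominates A's old one.
const laterAtB = new Map([['A', 1], ['B', 2]]);
console.log(clock.compare(writeAtA, laterAtB)); // 'before'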
Examples
Distributed Event Ordering
class EventOrdering {
  private lamport: LamportClock = new LamportClock();
  private events: Event[] = [];

  async processEvent(event: Event): Promise<void> {
    // Local event: stamp it with the incremented Lamport counter
    const timestamp = this.lamport.tick();
    this.events.push({ ...event, timestamp });
    // Send to other nodes with the timestamp attached
    await this.broadcast({ ...event, timestamp });
  }

  async receiveEvent(event: Event): Promise<void> {
    // Advance the local clock past the sender's timestamp
    this.lamport.receive(event);
    this.events.push(event);
    // Sort by Lamport timestamp; break ties by sender ID if a strict total order is needed
    this.events.sort((a, b) => a.timestamp - b.timestamp);
  }
}
Causality Detection with Vector Clocks
class CausalOrdering {
  private vector: VectorClock;
  private pending: Map<string, Message> = new Map();

  constructor(private myNodeId: string) {
    this.vector = new VectorClock(myNodeId);
  }

  async send(message: Message): Promise<void> {
    // Stamp the message with this node's vector clock before broadcasting
    const stamped = this.vector.send(message);
    await this.broadcast(stamped);
  }

  // deliver() and deliverPending() are as in the fuller design shown in the Senior answer below
  async receive(message: Message): Promise<void> {
    // Deliver only if all causally preceding messages have already been delivered
    if (this.canDeliver(message)) {
      this.deliver(message);
      this.deliverPending(); // buffered messages may now be deliverable
    } else {
      // Store for later
      this.pending.set(message.id, message);
    }
  }

  canDeliver(message: Message): boolean {
    const myVector = this.vector.getVector();
    // Check that every message that causally precedes this one has been delivered
    for (const [nodeId, timestamp] of message.vector) {
      const myTimestamp = myVector.get(nodeId) || 0;
      if (nodeId === message.sender) {
        // From the sender we need everything up to timestamp - 1
        if (timestamp > myTimestamp + 1) return false;
      } else {
        // From other nodes we need at least up to their timestamp
        if (timestamp > myTimestamp) return false;
      }
    }
    return true;
  }
}
Common Pitfalls
- Assuming physical clocks are synchronized: They drift, use logical clocks for ordering
- Using Lamport clocks for concurrency detection: Can't detect concurrent events, use vector clocks
- Not handling clock skew: NTP can have errors, use logical clocks for critical ordering
- Vector clock size: Grows with number of nodes. Fix: Use bounded vectors or hierarchical clocks
- Not updating on receive: Must update clock when receiving messages
- Ignoring network delays: NTP calculations must account for network latency
- Clock rollover: Clocks can wrap around, handle overflow
Interview Questions
Beginner
Q: Why do we need clock synchronization in distributed systems?
A: Distributed systems need clock synchronization because:
- Event ordering: Determine if event A happened before event B across nodes
- Causality: Understand cause-and-effect relationships
- Consistency: Ensure operations happen in correct order
- Debugging: Timestamps help debug distributed systems
- Scheduling: Coordinate timed operations across nodes
Challenge: Physical clocks drift and aren't perfectly synchronized, so we use logical clocks (Lamport, Vector) for ordering.
Intermediate
Q: Compare Lamport clocks and Vector clocks. When would you use each?
A:
Lamport Clocks:
- Single counter per node
- O(1) space per timestamp
- Total ordering: Can order all events (with node-ID tie-breaking)
- Limitation: Can't detect concurrent events (L(A) < L(B) doesn't mean A → B)
Vector Clocks:
- Vector of counters (one per node)
- O(n) space per timestamp
- Causality detection: V(A) < V(B) iff A causally precedes B
- Concurrency detection: Can detect if events are concurrent
When to use:
Lamport clocks:
- Simple event ordering (don't need to detect concurrency)
- Large-scale systems (space efficient)
- When total ordering is sufficient
Vector clocks:
- Need to detect concurrent events
- Need precise causality tracking
- Conflict resolution (detect concurrent writes)
- Small to medium clusters (space overhead acceptable)
Example: Database replication - use vector clocks to detect concurrent writes that need conflict resolution (sketched below).
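A hedged sketch of that replication example (the VersionedValue shape and reconcile helper are made up for illustration): each replica tags a value with the vector clock at write time, and a read that sees two versions whose clocks compare as concurrent must surface both for conflict resolution, much as Dynamo-style stores keep siblings.

interface VersionedValue {
  value: string;
  vector: Map<string, number>; // vector clock captured at write time
}

// Decide what a reader should do with two versions of the same key.
// The clock parameter is only used for its compare() helper from the VectorClock class above.
function reconcile(a: VersionedValue, b: VersionedValue, clock: VectorClock): VersionedValue[] {
  const order = clock.compare(a.vector, b.vector);
  if (order === 'before') return [b];  // b supersedes a
  if (order === 'after') return [a];   // a supersedes b
  if (order === 'equal') return [a];   // same version
  return [a, b];                       // concurrent writes: surface both for conflict resolution
}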
Senior
Q: Design a distributed system that maintains causal ordering of messages. How do you handle clock synchronization, message delivery, and ensure no message is delivered before its causally preceding messages?
A:
Architecture:
- Vector clocks for causality tracking
- Message buffering for out-of-order messages
- Delivery guarantees to ensure causal ordering
Design:
class CausalMessageSystem {
  private vector: VectorClock;
  private pending: Map<string, PendingMessage> = new Map();
  private delivered: Set<string> = new Set();

  constructor(
    private myNodeId: string,
    private application: { onMessage(message: CausalMessage): Promise<void> }
  ) {
    this.vector = new VectorClock(myNodeId);
  }

  async send(message: Message): Promise<void> {
    // Increment the vector clock and attach a copy of it to the message
    const vector = this.vector.tick();
    const causalMessage = {
      ...message,
      id: this.generateId(),
      vector: new Map(vector),
      sender: this.myNodeId
    };
    // Broadcast to all nodes
    await this.broadcast(causalMessage);
  }

  async receive(message: CausalMessage): Promise<void> {
    if (this.delivered.has(message.id)) {
      return; // Already delivered (duplicate or retransmission)
    }
    // Check if we can deliver (causally ready)
    if (this.isCausallyReady(message)) {
      await this.deliver(message);
      await this.deliverPending(); // Try to deliver pending messages
    } else {
      // Buffer for later
      this.pending.set(message.id, {
        message,
        receivedAt: Date.now()
      });
    }
  }

  isCausallyReady(message: CausalMessage): boolean {
    const myVector = this.vector.getVector();
    const msgVector = message.vector;
    // For each node, check that every message that causally precedes this one has been delivered
    for (const [nodeId, timestamp] of msgVector) {
      const myTimestamp = myVector.get(nodeId) || 0;
      if (nodeId === message.sender) {
        // From the sender we must have delivered all messages up to timestamp - 1
        if (timestamp > myTimestamp + 1) {
          return false; // Missing an earlier message from the sender
        }
      } else {
        // From every other node we must have delivered at least up to its timestamp
        if (timestamp > myTimestamp) {
          return false; // Missing messages from this node
        }
      }
    }
    return true;
  }

  async deliver(message: CausalMessage): Promise<void> {
    // Merge the message's vector into the local clock
    this.vector.receive(message);
    // Mark as delivered
    this.delivered.add(message.id);
    // Remove from pending
    this.pending.delete(message.id);
    // Deliver to the application
    await this.application.onMessage(message);
  }

  async deliverPending(): Promise<void> {
    // Keep sweeping the buffer: delivering one message can make others causally ready
    let progress = true;
    while (progress) {
      progress = false;
      for (const { message } of Array.from(this.pending.values())) {
        if (this.isCausallyReady(message)) {
          await this.deliver(message);
          progress = true;
        }
      }
    }
  }

  // Handle node failures and message retransmission
  async handleNodeFailure(nodeId: string): Promise<void> {
    // Remove the failed node from the vector clock (removeNode is assumed as a VectorClock extension)
    this.vector.removeNode(nodeId);
    // Clean up pending messages from the failed node
    for (const [id, pending] of this.pending) {
      if (pending.message.sender === nodeId) {
        this.pending.delete(id);
      }
    }
  }

  // Periodic cleanup of old delivered message IDs
  cleanup(): void {
    // Keep only recent delivered IDs (sliding window)
    const maxAge = 3600000; // 1 hour
    // Implementation depends on the ID format (e.g. IDs that embed a creation timestamp)
  }
}
Optimizations:
- Bounded vectors: For large clusters, use hierarchical or compressed vectors
- Garbage collection: Periodically clean up old message IDs
- Piggybacking: Include vector clock in all messages (even acks)
- Batching: Batch multiple messages with single vector update
Handling Network Issues:
- Retransmission: Retry failed messages
- Timeout: Detect missing messages, request retransmission
- Partition handling: Buffer messages during partition, deliver when partition heals
Monitoring:
- Track pending message count
- Measure delivery latency
- Detect causality violations (should never happen)
- Monitor vector clock size
Key Takeaways
- Physical clocks drift: Use NTP for approximate synchronization, but don't rely on it for ordering
- Lamport clocks: Provide a total ordering (with node-ID tie-breaking) using O(1) space, but can't detect concurrency
- Vector clocks: Detect causality and concurrency with O(n) space per timestamp
- Causal ordering: Messages must be delivered after all causally preceding messages
- Message buffering: Store out-of-order messages until causally ready
- NTP hierarchy: Stratum-based hierarchy for scalable time synchronization
- Trade-offs: Lamport (simple, space-efficient) vs Vector (accurate, space-intensive)
- Use logical clocks for event ordering in distributed systems, not physical clocks