How Do You Handle Data Consistency in Microservices?
A senior-architect view of consistency across service boundaries: local transactions, eventual consistency, sagas, outbox, CDC, idempotency, reconciliation, and when stronger coordination is justified—without pretending distributed systems behave like one big database.
After twenty years of shipping and rescuing distributed systems, the question "How do you handle data consistency in microservices?" still comes up in every serious architecture review. The honest answer is not a single product or protocol. It is a layered strategy: strong guarantees inside a service, explicit weaker guarantees across services, and operations that detect and repair the gap reality always leaves open.
This article is not a vendor pitch. It is the mental model I use with teams when we move from one database and one deploy to many.
Why consistency gets harder when you split services
In a monolith with one relational database, you can often wrap a business action in a single ACID transaction. The database enforces atomicity and isolation; everyone reads the same committed state.
In microservices, each service typically owns its own datastore. A single user action—place order, charge card, reserve inventory, send confirmation—becomes multiple commits separated by networks, processes, and clocks that are not synchronized in the way your intuition wants.
That creates several classes of inconsistency:
- Temporal inconsistency: Service A has applied an update; Service B has not yet seen it. This is normal under asynchrony.
- Partial failure: Service A committed; the call to B timed out. You do not know whether B succeeded. Retries can duplicate work.
- Ordering: Events arrive out of order. Consumers must tolerate or reorder explicitly.
- Split brain in processes: Two operators or jobs "fix" the same row with conflicting assumptions.
If you ignore these, you get production tickets that sound like: "Customer was charged but order shows pending" or "Inventory says shipped, fulfillment says not picked."
The goal is not to eliminate all temporal lag—that is often impossible at scale without unacceptable cost. The goal is to choose invariants, design workflows that converge, and observe when convergence fails.
First clarify what "consistent" means for each use case
Interviewers and product owners often say "we need consistency" without specifying which consistency. Before picking patterns, answer:
- Which reads must be immediately accurate? (e.g. available balance before a withdrawal)
- Which writes must be atomic with which others? (e.g. debit and ledger entry)
- What is acceptable temporary divergence? (e.g. search index a few seconds behind the source of truth)
- Who is the single writer for each aggregate or entity lifecycle?
Rule of thumb I use in reviews: default to strong consistency inside one service's aggregate (one bounded context, one transactional boundary), and eventual consistency across services unless a documented requirement forces something heavier.
Layer 1 — Local ACID transactions (your default inside a service)
Inside a service, prefer short, well-scoped transactions against one primary database (or a small set you explicitly treat as one unit, for example two schemas on the same Postgres cluster with one team owning all writes to those tables).
What "local ACID" buys you
- Atomicity: either every statement in the transaction commits, or none of it does.
- Isolation: concurrent transactions do not see each other's half-finished work, to a degree controlled by the isolation level you choose.
- Durability: once committed, the database has taken responsibility for the write surviving process crashes (subject to your replication and backup story).
Aggregate and ownership
- Model writes around one aggregate root per transaction when possible (one order header + its line items, one wallet + its ledger entries for that operation).
- One writer per aggregate in code review terms: a single code path "owns" transitions (e.g. only the Order service transitions PLACED → PAID).
Transaction shape
- Keep transactions short: fewer round trips, smaller lock windows, less deadlock risk.
- Avoid holding transactions open across RPC to another service or user think time—that is how you get timeouts that leave you guessing whether you committed.
Isolation levels (practical note)
- Default read committed is common; repeatable read or serializable when you have strict invariants (e.g. inventory not oversold) and contention is understood.
- Higher isolation reduces anomalies but increases retries and deadlocks under load—tune deliberately, measure contention, and document why.
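The shape of a short, well-scoped local transaction with an optimistic version check can be sketched as follows. This is a minimal illustration, not a prescription: the orders table, column names, and the PLACED → PAID transition are hypothetical, and sqlite3 stands in for whatever primary database the service owns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT, version INTEGER)")
conn.execute("INSERT INTO orders VALUES ('o1', 'PLACED', 1)")
conn.commit()

def mark_paid(conn, order_id, expected_version):
    """Transition PLACED -> PAID in one short transaction with a version check."""
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE orders SET status = 'PAID', version = version + 1 "
            "WHERE id = ? AND status = 'PLACED' AND version = ?",
            (order_id, expected_version),
        )
        if cur.rowcount != 1:
            raise RuntimeError("lost a race or wrong state; re-read and retry")

mark_paid(conn, "o1", 1)
status, version = conn.execute(
    "SELECT status, version FROM orders WHERE id = 'o1'"
).fetchone()
print(status, version)  # PAID 2
```

The version predicate in the WHERE clause is what enforces the single-writer invariant under concurrency: a stale caller updates zero rows and must re-read instead of silently clobbering a newer state.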
When two tables must always commit together
If two tables must always move together and cannot tolerate partial commits, they usually belong in the same service (and often the same database) until you have a very strong reason to split them. Splitting storage before you have a cross-service workflow story is how you accidentally build a distributed monolith with worse reliability than the monolith.
What to avoid as a "default"
- Distributed transactions across databases (2PC-style) as a casual tool: they are brittle, operationally expensive, and couple failure domains.
- Long chains of local transactions in one request without clear domain boundaries—often a sign the slice is wrong.
Layer 2 — Eventual consistency across services (the normal case)
Across services, eventual consistency means: given enough time and no new conflicting updates, all replicas or projections converge so that reads reflect the same underlying facts (within whatever your model allows—see below).
This is not "weak"—it is explicit. You document:
- Maximum staleness (or a probability bound) for each read path that matters.
- Who is authoritative for which fact at time T.
- What happens when a read hits a projection that has not caught up yet (hide the row, show "pending sync", read-through to source, etc.).
Why it is the normal case
Networks partition, processes restart, consumers lag, and indexes rebuild. Pretending every service sees every other's commit instantly is how junior designs die in production. Senior designs say: strong where the business pays for it, eventual elsewhere, with tests and metrics for both.
Mechanisms (when to use which)
- Domain events after local commit (with an outbox—Layer 3): best when you control the writer and want a clear, ordered story of what happened in the domain.
- CDC (change data capture) from the system of record: best when legacy writers cannot all publish events, or many systems read the same table—you tail the log of changes and fan out to projections without changing every app path on day one.
- Read models / projections: materialized views, search indexes, analytics stores, caches—all eventually aligned with the source of truth if the pipeline is healthy.
CQRS in one sentence
Often you separate the model you write (normalized, transactional) from the model you read (denormalized, optimized for queries). The gap between them is lag you must measure and cap per use case.
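One way to make that lag measurable: have the projector record the highest source position it has applied, and compare it against the head of the source's change stream. A minimal sketch with in-memory stand-ins (real systems would read these positions from the source database and the read model, and export the gap as a metric):

```python
import time

class Projection:
    """A read model that tracks the last source position it applied."""
    def __init__(self):
        self.rows = {}
        self.applied_position = 0
        self.applied_at = 0.0

    def apply(self, position, key, value):
        self.rows[key] = value
        self.applied_position = position
        self.applied_at = time.time()

def projection_lag(source_head_position, projection):
    """Positions still unapplied; alert when this exceeds the documented cap."""
    return source_head_position - projection.applied_position

proj = Projection()
proj.apply(1, "order-1", {"status": "PLACED"})
proj.apply(2, "order-1", {"status": "PAID"})

# The source has committed up to position 5; the projection is 3 positions behind.
lag = projection_lag(5, proj)
print(lag)  # 3
```

A position-based lag (offsets, LSNs, outbox ids) is usually more trustworthy than wall-clock timestamps, which suffer from clock skew between writer and projector.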
Conflict and ordering
- If two updates can race (two admins editing the same entity), you need a policy: last-write-wins with version checks, domain merge rules, or human resolution.
- Ordering: do not assume global order across all events; assume per-partition or per-aggregate order if your broker and keys are set up for it.
Product framing
Tell stakeholders which screens may be seconds behind the source of truth, and which flows must hit the owning service or block until fresh (e.g. balance before withdrawal). Put that in runbooks, not only in architecture diagrams.
Layer 3 — The transactional outbox (reliable "write then notify")
A classic bug: commit to DB, then send a message to a broker. If the process crashes after commit and before publish, nobody else learns the change—permanent inconsistency between the database and the rest of the system.
The transactional outbox fixes the atomicity problem between "business state" and "intent to notify": both are persisted in one commit.
Pattern (conceptual)
- In the same database transaction as your business write, insert a row into an outbox table (payload, event type, correlation id, metadata, often a monotonic id for ordering).
- A separate relay (or outbox processor) reads pending rows, publishes to Kafka/RabbitMQ/SNS/etc., then marks rows published or deletes them—again in a transaction with the publish acknowledgment strategy you choose.
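A minimal sketch of the pattern, with sqlite3 standing in for the service's database and a plain callback standing in for the broker publish (table, event, and function names are all hypothetical):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- monotonic id for ordering
        event_type TEXT, payload TEXT, published INTEGER DEFAULT 0
    );
""")

def place_order(conn, order_id):
    # Business write and outbox row commit atomically in one local transaction.
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, 'PLACED')", (order_id,))
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("OrderPlaced", json.dumps({"orderId": order_id})),
        )

def relay_once(conn, publish):
    # A separate process in real life: read pending rows, publish, mark published.
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, payload)  # the relay may crash after this line...
        with conn:                    # ...so delivery is at-least-once, not exactly-once
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))

published = []
place_order(conn, "o1")
relay_once(conn, lambda t, p: published.append((t, p)))
print(published)
```

The comment in relay_once is the whole point of Layer 5: because publish and mark-published are not atomic, consumers must tolerate duplicates.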
Relay implementation options
- Polling: simple, works everywhere; tune batch size and SELECT … FOR UPDATE SKIP LOCKED (or equivalent) to avoid double-processing.
- Log-based tailing: some databases support reading the WAL or change stream into the outbox relay for lower latency.
Delivery semantics
This preserves at-least-once delivery to the bus: the relay may publish successfully but crash before marking the row processed, so it may publish again. Consumers must be idempotent (Layer 5).
Ordering
If order matters for downstream, use a single partition key per aggregate (e.g. orderId) and document that ordering is per-partition, not global.
Poison rows
If one outbox payload is bad, skip with DLQ or quarantine after N attempts so you do not block the whole relay; alert and fix forward.
CDC vs outbox
- Outbox: explicit domain events, full control of payload shape, easiest to reason about for new code.
- CDC: great when you must integrate without touching every writer; you may emit lower-level change events and need mapping to domain language upstream or in a stream processor.
Layer 4 — Sagas for multi-step business workflows
When one business operation spans multiple services, you need a workflow model, not a chain of pray-and-RPC on the synchronous request path.
A saga breaks the operation into local transactions (each commits or rolls back in one service). If a later step cannot complete, you run compensating transactions to undo prior steps—or a forward recovery path when undo is impossible (money already settled, goods already shipped).
Choreography vs orchestration
- Choreography: services subscribe to events and react. Fewer moving parts early; the interaction graph can become hard to visualize and change as the number of services grows.
- Orchestration: a workflow engine or orchestrator issues commands and tracks state. Easier audit trail and timeouts in one place; the orchestrator is a critical dependency you must scale and harden.
Compensation rules
- Compensations are business operations, not blind DELETE—refund, release reservation, cancel shipment request.
- Define the compensation order (release inventory before refund, or the reverse—depends on fees and policies).
- Some steps cannot be compensated; then you need manual intervention, a support workflow, or a forward fix with a clear state (PAID_BUT_FULFILLMENT_PENDING).
Timeouts and stuck states
- Every external call in the saga should have a timeout and a defined next state (retry, compensate, escalate).
- Stuck is not an edge case—it is normal under partial outages. Alert on age in state and count by state.
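The forward-then-compensate shape can be reduced to a toy orchestrator: each step carries a local action and a compensating action, and a failure triggers the compensations for completed steps in reverse order. All step names and the failure mode here are hypothetical; a production orchestrator would also persist saga state and enforce timeouts per step.

```python
class SagaStep:
    """One local transaction plus its compensating action."""
    def __init__(self, name, action, compensate):
        self.name, self.action, self.compensate = name, action, compensate

def run_saga(steps, ctx):
    # Run forward; on failure, compensate completed steps in reverse order.
    done = []
    for step in steps:
        try:
            step.action(ctx)
            done.append(step)
        except Exception:
            for prev in reversed(done):
                prev.compensate(ctx)  # compensations must themselves be idempotent
            return "COMPENSATED"
    return "COMPLETED"

log = []

def charge_payment(ctx):
    raise RuntimeError("card declined")  # simulated failure at step 2

steps = [
    SagaStep("reserve_inventory",
             lambda c: log.append("reserved"),
             lambda c: log.append("released")),
    SagaStep("charge_payment", charge_payment,
             lambda c: log.append("refunded")),
]
outcome = run_saga(steps, {})
print(outcome, log)  # COMPENSATED ['reserved', 'released']
```

Note what the sketch deliberately omits: persisted state between steps, timeouts, and retry policy—exactly the parts that distinguish a production saga from a try/except block.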
Observability
- Persist a correlation id and saga instance id across all steps; export to tracing.
- Dashboards: counts by state, transition rates, compensation rate, mean time in each state.
Non-negotiables in production sagas
- Idempotency keys on every external effect (payment, shipment, email).
- Timeouts and explicit stuck-state handling (human or automated).
- Observable state machine (what state is this order in, and why).
Layer 5 — Idempotency and exactly-once illusion
Brokers and HTTP are generally at-least-once. Exactly-once end-to-end across heterogeneous systems is a marketing phrase unless you define it narrowly—for example idempotent operations plus a dedupe store, which gives you effectively-once side effects.
Why duplicates appear
- Retries after timeouts (client does not know if server succeeded).
- Broker redelivery after ack loss or consumer crash before offset commit.
- At-least-once messaging guarantees by design.
HTTP APIs (synchronous)
- Client sends Idempotency-Key (or equivalent) on mutating requests.
- Server stores (key → result or in-progress) in a durable store with a TTL longer than the client retry window.
- The first request executes; replays return the same response body and status without re-executing side effects.
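The server side of that contract can be sketched with an in-memory dict standing in for the durable key → result store (a real deployment needs a shared store with TTL and an in-progress marker for concurrent replays; the charge function and names here are hypothetical):

```python
results = {}  # idempotency key -> stored response; a durable store in real life

def handle_charge(idempotency_key, amount_cents, charge_fn):
    """Execute the charge at most once per key; replays return the stored response."""
    if idempotency_key in results:
        return results[idempotency_key]   # replay: no new side effect
    response = charge_fn(amount_cents)    # first execution only
    results[idempotency_key] = response
    return response

charges = []
def charge(amount_cents):
    charges.append(amount_cents)          # the side effect we must not duplicate
    return {"status": "charged", "amount": amount_cents}

first = handle_charge("key-123", 500, charge)
retry = handle_charge("key-123", 500, charge)  # client timed out and retried
print(first == retry, len(charges))  # True 1
```

The sketch skips the race where two requests with the same key arrive concurrently; that is what the in-progress marker (or a unique constraint on the key) handles in real implementations.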
Async consumers
- Use natural keys: (aggregateId, commandType) or (orderId, sagaStep), processed once and recorded in a processed_commands table.
- Or store broker offset + partition with the processing result—but idempotency keys are often easier for business logic.
Kafka-oriented notes
- Producer idempotence helps within a session; it does not replace application idempotency for business operations.
- Consumer should treat handling as: parse → dedupe → effect, where effect is safe to repeat.
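The parse → dedupe → effect shape for a consumer, sketched with a set standing in for a durable processed-keys table (message fields are hypothetical; in real code the effect and the key write should commit together, or the effect itself must be safe to repeat):

```python
import json

processed = set()   # (orderId, step) keys; a durable processed_commands table in real life
shipments = []

def handle_message(raw):
    msg = json.loads(raw)               # parse
    key = (msg["orderId"], msg["step"])
    if key in processed:                # dedupe
        return "duplicate"
    shipments.append(msg["orderId"])    # effect
    processed.add(key)
    return "processed"

m = json.dumps({"orderId": "o1", "step": "ship"})
r1 = handle_message(m)
r2 = handle_message(m)  # broker redelivery of the same message
print(r1, r2, shipments)  # processed duplicate ['o1']
```

Ordering the effect before the key write (as here) means a crash between the two lines repeats the effect on redelivery—acceptable only when the effect is itself idempotent; otherwise record the key in the same transaction as the effect.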
TTL and storage
- Dedupe tables grow; use TTL aligned to retry policy and legal retention needs.
- For high throughput, partition the dedupe store by key hash.
What "exactly once" can honestly mean
- Exactly-once effect for a payment: charge once, which you achieve with an idempotent charge API at the PSP plus your own dedupe key—not with magic middleware.
Layer 6 — Reconciliation (the safety net adults use)
No design prevents all edge cases: gateway quirks, manual DB fixes, clock skew, buggy deploys, partial compensations. Reconciliation compares authoritative external or sibling systems to your internal state and detects drift before customers or auditors do.
Typical pairs
- Payment gateway settlements vs internal ledger and order payment flags.
- Inventory reservations vs warehouse WMS or 3PL feeds.
- Shipment tracking events vs order fulfillment state.
- Subscription billing provider vs entitlement service.
Job design
- Frequency: continuous streaming checks for high-value drift; nightly batches for long-tail.
- Idempotent fixes: applying a fix twice must not double-charge or double-ship.
- Human queue for ambiguous cases (two systems disagree and neither is obviously wrong).
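A minimal drift check comparing a gateway settlement feed against the internal ledger. The record shapes are hypothetical; a real job pages through both sides, matches on stable identifiers, and routes ambiguous disagreements to the human queue rather than auto-fixing:

```python
def reconcile(gateway, ledger):
    """Return charges present on one side but missing on the other."""
    gw = {(r["chargeId"], r["amountCents"]) for r in gateway}
    lg = {(r["chargeId"], r["amountCents"]) for r in ledger}
    return {
        "missing_in_ledger": sorted(gw - lg),   # settled but never recorded
        "missing_in_gateway": sorted(lg - gw),  # recorded but never settled
    }

gateway = [{"chargeId": "c1", "amountCents": 500},
           {"chargeId": "c2", "amountCents": 700}]
ledger = [{"chargeId": "c1", "amountCents": 500}]

drift = reconcile(gateway, ledger)
print(drift)  # {'missing_in_ledger': [('c2', 700)], 'missing_in_gateway': []}
```

Matching on (id, amount) pairs means an amount mismatch shows up as drift on both sides—exactly the ambiguous case that belongs in a human queue, not an auto-fix.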
Alerting
- Drift count and amount (e.g. cents) above threshold.
- Age of oldest unreconciled item.
- Spike in auto-fix rate (often signals a bad deploy, not random noise).
Auto-fix only when safe
- Prefer flagging and tickets until rules are proven.
- When auto-fixing, log before/after, tie to correlation id, and support replay for support tools.
This is not an admission of failure—it is an admission of reality. Distributed systems always have edge cases; reconciliation is how mature teams keep trust in the data.
When stronger coordination is justified (narrowly)
Two-phase commit (2PC) and similar protocols across heterogeneous databases are rare in large microservice estates because they hurt availability and couple failure domains.
Consider stronger coordination only when:
- Regulatory or financial rules demand immediate cross-entity invariants.
- The blast radius of temporary inconsistency is unacceptable and small enough to coordinate.
Even then, prefer saga + reconciliation first and measure whether you truly need more.
Anti-patterns I still see in production
- Long synchronous chains (A calls B calls C) on the user request path—latency multiplies, failures become ambiguous, retries duplicate work.
- Dual writes without reconciliation (write to DB and cache "at the same time" with no single ordering story).
- Shared database between "microservices"—you get distributed pain without clear ownership.
- Ignoring outbox and hoping "publish after commit" never races with crashes.
- No stuck-state metrics—orders in PAYMENT_PENDING forever with no alert.
Operational checklist
- Workflow state visible in admin tools and metrics (counts by state, age histograms).
- Tracing with correlation IDs across HTTP and async boundaries.
- Dead-letter queues and replay procedures with idempotency.
- Reconciliation dashboards and runbooks for common drifts.
Interview framing (one minute)
"Inside a service I use ACID on one aggregate. Across services I assume at-least-once delivery and design idempotent consumers. Multi-step business flows use sagas with compensations and timeouts. I use an outbox or CDC so I never rely on commit-then-publish luck. I run reconciliation because distributed systems always have edge cases."
Related reading on this site
For a story-driven view of migration and consistency failures (big-bang cutover, stuck orders), see The Monolith to Microservices Migration That Almost Failed. For incremental replacement of a monolith, see What Is the Strangler Pattern?.
FAQs
Q: Is eventual consistency just "broken consistency"?
A: No. It is an explicit guarantee: replicas or services will converge after bounded delay under defined conditions. You document which reads may lag and you build reconciliation for when convergence does not happen.
Q: Can I use distributed transactions (2PC) across all my microservices?
A: You can, but in practice it rarely scales operationally: participants block, partitions hurt availability, and debugging is painful. Prefer local transactions + saga + idempotency + reconciliation unless a narrow domain truly requires stronger coupling.
Q: What is the difference between choreography and orchestration in sagas?
A: Choreography is event-driven decentralization—each service knows what to do when it sees an event. Orchestration is a central coordinator that issues commands and tracks state. Choreography is lighter early; orchestration often wins for complex workflows and compliance audits.
Q: How do I handle duplicate messages from Kafka?
A: Treat consumption as at-least-once: idempotent handlers, dedupe keys, or store processed message IDs. Design your side effects (payments, shipments) so repeating the same logical operation is safe.
Q: What is the transactional outbox for?
A: To make "database write" and "message publish" atomic from the application's perspective by persisting the intent to publish in the same DB transaction, then having a relay publish reliably.