Real Engineering Stories

The DNS Change That Pointed Production at Staging

A DNS configuration mistake during a migration pointed production traffic at the staging environment, causing a data-corruption scare and a frantic rollback that took over an hour to fully take effect. Learn about DNS propagation, change management, and environment isolation.

Intermediate · 22 min read

This is a story about how a single DNS record change, copy-pasted into the wrong environment, routed production traffic to staging. The bad record stayed live for 45 minutes, and cached answers kept some traffic misrouted for another 20. It's about change management, DNS propagation, and why "it's just a config change" can be the most dangerous phrase in operations.


Context

We were migrating our API to a new region. The plan was to update DNS to point at the new servers. The staging change was tested first, and the production change was scheduled for a maintenance window.

What Went Wrong:

  • Engineer had both staging and production DNS configs open
  • Applied the staging DNS change to production by mistake
  • Production traffic started hitting staging infrastructure
  • Staging database began receiving production writes
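
A mix-up like this can be caught mechanically before a change is applied. The sketch below is a minimal illustration, not the tooling used in this incident. It assumes each proposed change declares the environment it was written for, and that each environment has a known set of allowed target suffixes. All names (`DnsChange`, `ALLOWED_TARGET_SUFFIXES`, the `example.com` hostnames) are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical allowlist: hostnames a record in each environment may point to.
ALLOWED_TARGET_SUFFIXES = {
    "production": (".prod.example.com",),
    "staging": (".staging.example.com",),
}


@dataclass(frozen=True)
class DnsChange:
    record_name: str   # e.g. "api.example.com"
    target: str        # hostname the record will point to
    declared_env: str  # environment the config file was written for


def validate_change(change: DnsChange, apply_to_env: str) -> list[str]:
    """Return a list of problems; an empty list means the change may proceed."""
    problems = []
    # Check 1: the config's declared environment must match where it is applied.
    if change.declared_env != apply_to_env:
        problems.append(
            f"config is for {change.declared_env!r} "
            f"but is being applied to {apply_to_env!r}"
        )
    # Check 2: the target must belong to the environment being changed.
    suffixes = ALLOWED_TARGET_SUFFIXES.get(apply_to_env, ())
    if not change.target.endswith(suffixes):
        problems.append(
            f"target {change.target!r} is not an allowed {apply_to_env} endpoint"
        )
    return problems


# A staging config applied to production fails both checks.
staging_change = DnsChange("api.example.com", "lb.staging.example.com", "staging")
for problem in validate_change(staging_change, "production"):
    print("BLOCKED:", problem)
```

Running a check like this in the deployment pipeline, rather than relying on the engineer to notice which file is open, turns a copy-paste slip into a blocked change.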

The Incident

  • T+0: DNS change applied to production (wrong config)
  • T+5 min: First production requests hitting staging. Staging DB receiving production data
  • T+15 min: Users reporting 'wrong data' and 'missing features'
  • T+25 min: On-call identified DNS misconfiguration. Initiated rollback
  • T+45 min: DNS reverted. Propagation took 20 more minutes
  • T+65 min: All traffic restored to correct environment
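
The 20-minute gap between the revert and full recovery is what cached DNS answers would produce. As a rough sketch, assuming resolvers honor the record's TTL, a resolver that cached the bad answer just before the revert can keep serving it for up to one full TTL. The 1200-second TTL below is an assumption chosen to match the timeline, not a figure reported from the incident.

```python
def worst_case_recovery_minute(revert_minute: float, ttl_seconds: int) -> float:
    """Latest minute (relative to T+0) at which a TTL-honoring resolver
    could still serve the bad record, given when the revert was applied."""
    return revert_minute + ttl_seconds / 60


# Revert applied at T+45 with an assumed TTL of 1200 s (20 min).
print(worst_case_recovery_minute(45, 1200))  # 65.0

# The same revert with a 60 s TTL bounds the stale window at about a minute.
print(worst_case_recovery_minute(45, 60))    # 46.0
```

This is why teams often lower the TTL well ahead of a planned DNS change: the rollback window is bounded by the TTL that was in effect when the bad answer was cached, not by how quickly the record is fixed.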

Key Lessons

  1. DNS propagation is slow and unpredictable. Resolvers cache answers for up to the record's TTL, so a rollback takes time to reach every client.
  2. Change management: Require approval for production DNS changes. Use different accounts for staging and production.
  3. Environment isolation: Make staging and production obviously different, with separate domains and separate credentials.
  4. Runbooks: Document exactly which config goes where. One wrong click can be catastrophic.
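
Lesson 3 can also be enforced by the servers themselves, as a last line of defense when routing goes wrong. The sketch below assumes staging is served under its own hostname, so a request that arrives carrying the production hostname can be rejected instead of processed. The hostnames and function names are hypothetical, and a real service would wire this into its web framework or load balancer.

```python
# Hypothetical set of hostnames this staging deployment is willing to serve.
STAGING_HOSTS = frozenset({"api.staging.example.com"})


def is_expected_host(host_header: str, allowed_hosts: frozenset) -> bool:
    """True if the request's Host header names a host this environment serves."""
    # Drop an optional ":port" suffix (IPv6 literals are ignored for brevity),
    # then normalize case and any trailing dot before comparing.
    hostname = host_header.split(":", 1)[0]
    return hostname.strip().lower().rstrip(".") in allowed_hosts


def status_for_request(host_header: str, allowed_hosts: frozenset = STAGING_HOSTS) -> int:
    """Return 200 for expected hosts, 421 (Misdirected Request) otherwise."""
    return 200 if is_expected_host(host_header, allowed_hosts) else 421


# Production traffic misrouted to staging still carries the production hostname.
print(status_for_request("api.example.com"))              # 421
print(status_for_request("api.staging.example.com:443"))  # 200
```

With a guard like this, misrouted production requests fail loudly at the edge of staging instead of writing to the staging database.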
