Topic Overview

DNS Resolution Flow

Understand how DNS resolves domain names to IP addresses: iterative vs recursive queries, DNS hierarchy, and caching.

DNS (Domain Name System) translates human-readable domain names (like example.com) into IP addresses (like 93.184.216.34). It's a distributed, hierarchical system that enables the internet to function without requiring users to remember numeric IP addresses.


DNS Hierarchy

DNS uses a hierarchical tree structure:

                    . (root)
                   / | \
              com  org  net  (TLD - Top Level Domain)
              /     |     \
         example  wikipedia  google  (Second-level domains)
          /         |          \
      www  mail  www  mail  www  mail  (Subdomains)

DNS Server Types

  1. Root DNS Servers: 13 root servers (a.root-servers.net through m.root-servers.net)
  2. TLD (Top-Level Domain) Servers: Manage .com, .org, .net, etc.
  3. Authoritative DNS Servers: Own the DNS records for specific domains
  4. Recursive DNS Servers (Resolvers): Query on behalf of clients (e.g., 8.8.8.8, 1.1.1.1)

Recursive vs Iterative Queries

Recursive Query

The DNS resolver handles all queries and returns the final answer to the client.

Client → Resolver: "What's the IP of example.com?"
Resolver → Root: "Where is .com?"
Resolver → TLD: "Where is example.com?"
Resolver → Authoritative: "What's the IP of example.com?"
Resolver → Client: "93.184.216.34"

Client perspective: Single request, single response

Iterative Query

Each server responds with a referral to the next server in the hierarchy.

Client → Root: "What's the IP of example.com?"
Root → Client: "Ask .com TLD server (IP: 192.5.6.30)"
Client → TLD: "What's the IP of example.com?"
TLD → Client: "Ask example.com authoritative (IP: 199.43.135.53)"
Client → Authoritative: "What's the IP of example.com?"
Authoritative → Client: "93.184.216.34"

Client perspective: Multiple requests, multiple responses

In practice: Clients use recursive queries (to resolvers), resolvers use iterative queries (to DNS hierarchy)


DNS Resolution Process

Step-by-Step Flow

1. Client checks local cache
   ↓ (cache miss)
2. Client queries recursive resolver (e.g., 8.8.8.8)
3. Resolver checks its cache
   ↓ (cache miss)
4. Resolver queries root server (iterative)
   Root → "Ask .com TLD server at 192.5.6.30"
5. Resolver queries .com TLD server (iterative)
   TLD → "Ask example.com NS at 199.43.135.53"
6. Resolver queries authoritative server (iterative)
   Authoritative → "A record: 93.184.216.34"
7. Resolver caches result and returns to client
8. Client caches result

Detailed Example: Resolving www.example.com

Step 1: Client → Recursive Resolver (8.8.8.8)
  Query: "What's the IP of www.example.com?"

Step 2: Resolver → Root Server (a.root-servers.net)
  Query: "Where is .com?"
  Response: "Ask .com TLD server at 192.5.6.30"

Step 3: Resolver → .com TLD Server (192.5.6.30)
  Query: "Where is example.com?"
  Response: "Ask example.com NS at ns1.example.com (199.43.135.53)"

Step 4: Resolver → Authoritative Server (199.43.135.53)
  Query: "What's the IP of www.example.com?"
  Response: "A record: 93.184.216.34"

Step 5: Resolver → Client
  Response: "93.184.216.34"
  (Resolver caches this result)

DNS Record Types

Common Record Types

TypePurposeExample
AIPv4 addressexample.com A 93.184.216.34
AAAAIPv6 addressexample.com AAAA 2606:2800:220:1:248:1893:25c8:1946
CNAMECanonical name (alias)www.example.com CNAME example.com
MXMail exchangeexample.com MX 10 mail.example.com
NSName serverexample.com NS ns1.example.com
TXTText record (SPF, DKIM)example.com TXT "v=spf1 ..."
PTRReverse DNS (IP to name)34.216.184.93.in-addr.arpa PTR example.com
SOAStart of AuthorityZone metadata (serial, refresh, etc.)

Examples

DNS Query Using dig

# Query A record
dig example.com A

# Response:
# ;; ANSWER SECTION:
# example.com.    3600    IN    A    93.184.216.34

# Query with trace (shows full resolution path)
dig +trace example.com

# Shows:
# 1. Query to root server
# 2. Query to TLD server
# 3. Query to authoritative server
# 4. Final answer

DNS Resolution in Code

import socket

def resolve_dns(hostname):
    """Resolve hostname to IP address"""
    try:
        # Uses system's DNS resolver (recursive query)
        ip = socket.gethostbyname(hostname)
        return ip
    except socket.gaierror as e:
        print(f"DNS resolution failed: {e}")
        return None

# Usage
ip = resolve_dns("example.com")
print(f"IP: {ip}")  # Output: IP: 93.184.216.34
import dns.resolver

def resolve_dns_detailed(hostname, record_type='A'):
    """Resolve with detailed DNS information"""
    try:
        answers = dns.resolver.resolve(hostname, record_type)
        
        for rdata in answers:
            print(f"{hostname} {record_type}: {rdata}")
            print(f"TTL: {answers.rrset.ttl}")
            
        return [str(rdata) for rdata in answers]
    except dns.resolver.NXDOMAIN:
        print(f"Domain {hostname} does not exist")
        return None
    except dns.resolver.NoAnswer:
        print(f"No {record_type} record for {hostname}")
        return None

# Usage
resolve_dns_detailed("example.com", "A")
resolve_dns_detailed("example.com", "MX")
resolve_dns_detailed("example.com", "NS")

DNS Caching Implementation

import time
from collections import OrderedDict

class DNSCache:
    def __init__(self, max_size=1000):
        self.cache = OrderedDict()
        self.max_size = max_size
    
    def get(self, hostname, record_type='A'):
        key = (hostname, record_type)
        
        if key in self.cache:
            entry = self.cache[key]
            # Check TTL
            if time.time() < entry['expires']:
                # Move to end (LRU)
                self.cache.move_to_end(key)
                return entry['value']
            else:
                # Expired, remove
                del self.cache[key]
        
        return None
    
    def set(self, hostname, record_type, value, ttl=3600):
        key = (hostname, record_type)
        
        # Remove oldest if cache full
        if len(self.cache) >= self.max_size:
            self.cache.popitem(last=False)
        
        self.cache[key] = {
            'value': value,
            'expires': time.time() + ttl
        }
        self.cache.move_to_end(key)

# Usage
cache = DNSCache()
cache.set("example.com", "A", "93.184.216.34", ttl=3600)
ip = cache.get("example.com", "A")

Common Pitfalls

  • Not understanding TTL: DNS records have TTL (Time To Live). Fix: Understand caching behavior, TTL affects how long records are cached
  • DNS propagation delays: Changes take time to propagate. Fix: Set appropriate TTL, wait for propagation, use DNS checkers
  • CNAME conflicts: CNAME can't coexist with other records. Fix: Use A records for root domain, CNAME for subdomains
  • Circular dependencies: CNAME pointing to itself or loops. Fix: Validate DNS records, check for cycles
  • DNS cache poisoning: Malicious DNS responses. Fix: Use DNSSEC, validate responses, use trusted resolvers
  • Not handling DNS failures: Applications crash when DNS fails. Fix: Implement retries, fallback resolvers, cache results
  • Confusing recursive vs iterative: Not understanding query types. Fix: Clients use recursive (to resolver), resolvers use iterative (to hierarchy)

Interview Questions

Beginner

Q: Explain how DNS resolution works. What happens when you type a URL in your browser?

A:

Process:

  1. Browser checks cache: Browser first checks its DNS cache for the domain
  2. OS checks cache: If not in browser cache, OS checks system DNS cache
  3. Query recursive resolver: If cache miss, query configured DNS resolver (e.g., 8.8.8.8)
  4. Resolver queries root: Resolver queries root DNS server for TLD information
  5. Resolver queries TLD: Resolver queries TLD server (e.g., .com) for domain information
  6. Resolver queries authoritative: Resolver queries authoritative DNS server for the domain
  7. Return IP address: Authoritative server returns A record (IP address)
  8. Cache result: Resolver and client cache the result (respecting TTL)
  9. Browser connects: Browser uses IP address to establish TCP connection

Example: www.example.com

Browser → OS Cache → Resolver (8.8.8.8)
Resolver → Root → TLD (.com) → Authoritative (example.com)
Authoritative → "93.184.216.34"
Resolver → Browser: "93.184.216.34"
Browser → Connects to 93.184.216.34:80

Intermediate

Q: What is the difference between recursive and iterative DNS queries? When is each used?

A:

Recursive Query:

  • Client asks resolver: "What's the IP of example.com?"
  • Resolver does all the work (queries root, TLD, authoritative)
  • Resolver returns final answer to client
  • Used by: Clients querying DNS resolvers

Iterative Query:

  • Client asks server: "What's the IP of example.com?"
  • Server responds: "I don't know, but ask this other server"
  • Client must query the next server itself
  • Used by: DNS resolvers querying DNS hierarchy (root, TLD, authoritative)

Flow:

Client → Resolver (recursive): "example.com?"
  Resolver → Root (iterative): "example.com?"
    Root → Resolver: "Ask .com TLD at 192.5.6.30"
  Resolver → TLD (iterative): "example.com?"
    TLD → Resolver: "Ask example.com NS at 199.43.135.53"
  Resolver → Authoritative (iterative): "example.com?"
    Authoritative → Resolver: "93.184.216.34"
  Resolver → Client: "93.184.216.34" (final answer)

Why this design?

  • Efficiency: Resolvers cache results for multiple clients
  • Load distribution: Root/TLD servers don't handle all queries
  • Security: Authoritative servers only serve their domains

Senior

Q: Design a high-performance DNS resolver that handles millions of queries per second. How do you implement caching, handle DNS failures, and ensure low latency?

A:

class HighPerformanceDNSResolver {
  private cache: DNSCache;
  private upstreamResolvers: string[];
  private queryQueue: QueryQueue;
  private stats: ResolverStats;
  
  constructor() {
    // Multi-level cache
    this.cache = new MultiLevelCache({
      l1: new InMemoryCache(100000),      // Hot cache
      l2: new RedisCache(),                // Distributed cache
      l3: new PersistentCache()            // Disk cache
    });
    
    // Multiple upstream resolvers for redundancy
    this.upstreamResolvers = [
      "8.8.8.8",
      "1.1.1.1",
      "9.9.9.9"
    ];
    
    // Query batching and queue
    this.queryQueue = new QueryQueue({
      batchSize: 100,
      batchTimeout: 10 // ms
    });
  }
  
  // 1. Caching Strategy
  async resolve(hostname: string, recordType: string = 'A'): Promise<DNSRecord> {
    const cacheKey = `${hostname}:${recordType}`;
    
    // Check cache (multi-level)
    const cached = await this.cache.get(cacheKey);
    if (cached && !this.isExpired(cached)) {
      this.stats.recordCacheHit();
      return cached.value;
    }
    
    // Cache miss: resolve
    const record = await this.resolveUpstream(hostname, recordType);
    
    // Cache with TTL
    await this.cache.set(cacheKey, record, { ttl: record.ttl });
    
    this.stats.recordCacheMiss();
    return record;
  }
  
  // 2. Upstream Resolution with Failover
  async resolveUpstream(hostname: string, recordType: string): Promise<DNSRecord> {
    // Try resolvers in order
    for (const resolver of this.upstreamResolvers) {
      try {
        const record = await this.queryWithTimeout(resolver, hostname, recordType, 100);
        return record;
      } catch (error) {
        this.stats.recordResolverFailure(resolver);
        continue; // Try next resolver
      }
    }
    
    throw new Error("All DNS resolvers failed");
  }
  
  // 3. Query Batching
  async batchResolve(queries: Query[]): Promise<Map<string, DNSRecord>> {
    // Group queries by upstream resolver
    const batches = this.groupByResolver(queries);
    
    // Execute batches in parallel
    const results = await Promise.all(
      Array.from(batches.entries()).map(([resolver, batch]) =>
        this.batchQuery(resolver, batch)
      )
    );
    
    return this.mergeResults(results);
  }
  
  // 4. Negative Caching
  async resolveWithNegativeCache(hostname: string): Promise<DNSRecord | null> {
    const cacheKey = `negative:${hostname}`;
    
    // Check negative cache (NXDOMAIN)
    const negative = await this.cache.get(cacheKey);
    if (negative) {
      return null; // Domain doesn't exist
    }
    
    try {
      return await this.resolve(hostname);
    } catch (error) {
      if (error.code === 'NXDOMAIN') {
        // Cache negative result (shorter TTL)
        await this.cache.set(cacheKey, null, { ttl: 60 });
      }
      throw error;
    }
  }
  
  // 5. Prefetching and Preloading
  async prefetchCommonDomains(): Promise<void> {
    const commonDomains = [
      'google.com',
      'facebook.com',
      'amazon.com',
      // ... top domains
    ];
    
    // Prefetch in background
    Promise.all(
      commonDomains.map(domain => this.resolve(domain).catch(() => {}))
    );
  }
  
  // 6. Monitoring and Health Checks
  async healthCheck(): Promise<HealthStatus> {
    const checks = await Promise.all([
      this.checkUpstreamResolvers(),
      this.checkCacheHealth(),
      this.checkQueryLatency()
    ]);
    
    return {
      healthy: checks.every(c => c.healthy),
      resolvers: checks[0],
      cache: checks[1],
      latency: checks[2]
    };
  }
}

// 7. DNS over HTTPS (DoH) / DNS over TLS (DoT)
class SecureDNSResolver extends HighPerformanceDNSResolver {
  async resolveSecure(hostname: string): Promise<DNSRecord> {
    // Use DoH/DoT for encrypted DNS queries
    const dohEndpoint = "https://cloudflare-dns.com/dns-query";
    return await this.queryDoH(dohEndpoint, hostname);
  }
}

Optimizations:

  1. Multi-level caching: In-memory (hot), Redis (warm), disk (cold)
  2. Query batching: Batch multiple queries to reduce overhead
  3. Failover: Multiple upstream resolvers with automatic failover
  4. Negative caching: Cache NXDOMAIN responses (shorter TTL)
  5. Prefetching: Preload common domains
  6. Connection pooling: Reuse TCP connections for DNS queries
  7. Anycast: Deploy resolvers in multiple locations
  8. Monitoring: Track cache hit rates, latency, failures

Key Takeaways

  • DNS hierarchy: Root → TLD → Authoritative servers in a tree structure
  • Recursive queries: Clients query resolvers (resolver does all work)
  • Iterative queries: Resolvers query DNS hierarchy (each server refers to next)
  • DNS caching: Results cached at multiple levels (browser, OS, resolver) with TTL
  • Record types: A (IPv4), AAAA (IPv6), CNAME (alias), MX (mail), NS (name server)
  • TTL (Time To Live): Controls how long DNS records are cached
  • DNS propagation: Changes take time to propagate through DNS hierarchy
  • Performance: Use caching, batching, failover, and connection pooling for high-performance resolvers
  • Security: Use DNSSEC, DoH/DoT for encrypted DNS queries
  • Common issues: Cache poisoning, propagation delays, CNAME conflicts, DNS failures

About the author

InterviewCrafted helps you master system design with patience. We believe in curiosity-led engineering, reflective writing, and designing systems that make future changes feel calm.