Topic Overview
DNS Resolution Flow
Understand how DNS resolves domain names to IP addresses: iterative vs recursive queries, DNS hierarchy, and caching.
DNS (Domain Name System) translates human-readable domain names (like example.com) into IP addresses (like 93.184.216.34). It's a distributed, hierarchical system that enables the internet to function without requiring users to remember numeric IP addresses.
DNS Hierarchy
DNS uses a hierarchical tree structure:
                  . (root)
            /       |       \
         com       org       net          (TLD - Top-Level Domain)
        /   \       |
  example  google  wikipedia              (Second-level domains)
   /  \     /  \     /  \
 www  mail www mail www  mail             (Subdomains)
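To make the tree concrete, the sketch below lists the zones a name passes through, from the root down. The function name `delegation_chain` is a hypothetical helper introduced for illustration; it only manipulates strings and sends no DNS queries.

```python
def delegation_chain(hostname: str) -> list[str]:
    """Return the names walked from the root zone down to the full hostname."""
    # Drop empty labels so a trailing dot ("example.com.") is handled too.
    labels = [label for label in hostname.split(".") if label]
    chain = ["."]  # the root zone
    # Build suffixes right to left: com. -> example.com. -> www.example.com.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


print(delegation_chain("www.example.com"))
# ['.', 'com.', 'example.com.', 'www.example.com.']
```

Reading the result left to right mirrors the hierarchy: root, TLD, second-level domain, subdomain.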
DNS Server Types
- Root DNS Servers: 13 root server names (a.root-servers.net through m.root-servers.net), each served by many anycast instances worldwide
- TLD (Top-Level Domain) Servers: Manage .com, .org, .net, etc.
- Authoritative DNS Servers: Own the DNS records for specific domains
- Recursive DNS Servers (Resolvers): Query on behalf of clients (e.g., 8.8.8.8, 1.1.1.1)
Recursive vs Iterative Queries
Recursive Query
The DNS resolver handles all queries and returns the final answer to the client.
Client → Resolver: "What's the IP of example.com?"
Resolver → Root: "Where is .com?"
Resolver → TLD: "Where is example.com?"
Resolver → Authoritative: "What's the IP of example.com?"
Resolver → Client: "93.184.216.34"
Client perspective: Single request, single response
Iterative Query
Each server responds with a referral to the next server in the hierarchy.
Client → Root: "What's the IP of example.com?"
Root → Client: "Ask .com TLD server (IP: 192.5.6.30)"
Client → TLD: "What's the IP of example.com?"
TLD → Client: "Ask example.com authoritative (IP: 199.43.135.53)"
Client → Authoritative: "What's the IP of example.com?"
Authoritative → Client: "93.184.216.34"
Client perspective: Multiple requests, multiple responses
In practice: Clients send recursive queries to resolvers; resolvers send iterative queries to the DNS hierarchy
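The referral-following behaviour of an iterative lookup can be sketched without any network access. Everything here is a toy model: the `SERVERS` table, `query`, and `resolve_iteratively` are hypothetical names, and the addresses are the illustrative ones used in the text, not live data.

```python
# Hypothetical in-memory "servers": each one either refers the caller to the
# next server for a name suffix or holds the final answer.
SERVERS = {
    "root": {"referrals": {"com.": "192.5.6.30"}},
    "192.5.6.30": {"referrals": {"example.com.": "199.43.135.53"}},
    "199.43.135.53": {"answers": {"example.com.": "93.184.216.34"}},
}


def query(server: str, name: str) -> tuple[str, str]:
    """Answer one iterative query: ('answer', ip) or ('referral', next_server)."""
    zone = SERVERS[server]
    if name in zone.get("answers", {}):
        return "answer", zone["answers"][name]
    for suffix, next_server in zone.get("referrals", {}).items():
        if name == suffix or name.endswith("." + suffix):
            return "referral", next_server
    raise LookupError(f"{server} knows nothing about {name}")


def resolve_iteratively(name: str, max_hops: int = 10) -> tuple[str, list[str]]:
    """Follow referrals from the root until some server gives an answer."""
    server, path = "root", []
    for _ in range(max_hops):
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        server = value  # referral: ask the next server ourselves
    raise RuntimeError("too many referrals")


ip, path = resolve_iteratively("example.com.")
print(ip)    # 93.184.216.34
print(path)  # ['root', '192.5.6.30', '199.43.135.53']
```

The caller of `resolve_iteratively` sees one request and one response, which is the recursive view; the loop inside it is the iterative view.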
DNS Resolution Process
Step-by-Step Flow
1. Client checks local cache
   ↓ (cache miss)
2. Client queries recursive resolver (e.g., 8.8.8.8)
   ↓
3. Resolver checks its cache
   ↓ (cache miss)
4. Resolver queries root server (iterative)
   Root → "Ask .com TLD server at 192.5.6.30"
   ↓
5. Resolver queries .com TLD server (iterative)
   TLD → "Ask example.com NS at 199.43.135.53"
   ↓
6. Resolver queries authoritative server (iterative)
   Authoritative → "A record: 93.184.216.34"
   ↓
7. Resolver caches result and returns to client
   ↓
8. Client caches result
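The effect of steps 3 and 7 can be sketched with a toy resolver that counts how often it has to go upstream. `CachingResolver`, its `upstream` callable, and the injectable `clock` are assumptions made for this illustration; the walk through the hierarchy (steps 4 to 6) is replaced by a stand-in function.

```python
import time


class CachingResolver:
    """Toy recursive resolver: reuses an answer until its TTL runs out."""

    def __init__(self, upstream, clock=time.monotonic):
        self.upstream = upstream    # callable: name -> (ip, ttl_seconds)
        self.clock = clock          # injectable so expiry can be shown without sleeping
        self.cache = {}             # name -> (ip, expires_at)
        self.upstream_queries = 0

    def resolve(self, name):
        hit = self.cache.get(name)
        if hit and self.clock() < hit[1]:
            return hit[0]                             # step 3: cache hit
        self.upstream_queries += 1                    # steps 4-6: ask the hierarchy
        ip, ttl = self.upstream(name)
        self.cache[name] = (ip, self.clock() + ttl)   # step 7: cache the result
        return ip


now = [0.0]  # fake clock
resolver = CachingResolver(lambda name: ("93.184.216.34", 300), clock=lambda: now[0])
resolver.resolve("example.com")    # miss: goes upstream
resolver.resolve("example.com")    # hit: answered from cache
print(resolver.upstream_queries)   # 1
now[0] = 301.0                     # the 300 s TTL has elapsed
resolver.resolve("example.com")    # miss again
print(resolver.upstream_queries)   # 2
```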
Detailed Example: Resolving www.example.com
Step 1: Client → Recursive Resolver (8.8.8.8)
  Query: "What's the IP of www.example.com?"
Step 2: Resolver → Root Server (a.root-servers.net)
  Query: "Where is .com?"
  Response: "Ask .com TLD server at 192.5.6.30"
Step 3: Resolver → .com TLD Server (192.5.6.30)
  Query: "Where is example.com?"
  Response: "Ask example.com NS at ns1.example.com (199.43.135.53)"
Step 4: Resolver → Authoritative Server (199.43.135.53)
  Query: "What's the IP of www.example.com?"
  Response: "A record: 93.184.216.34"
Step 5: Resolver → Client
  Response: "93.184.216.34"
  (Resolver caches this result)
DNS Record Types
Common Record Types
| Type | Purpose | Example |
|---|---|---|
| A | IPv4 address | example.com A 93.184.216.34 |
| AAAA | IPv6 address | example.com AAAA 2606:2800:220:1:248:1893:25c8:1946 |
| CNAME | Canonical name (alias) | www.example.com CNAME example.com |
| MX | Mail exchange | example.com MX 10 mail.example.com |
| NS | Name server | example.com NS ns1.example.com |
| TXT | Text record (SPF, DKIM) | example.com TXT "v=spf1 ..." |
| PTR | Reverse DNS (IP to name) | 34.216.184.93.in-addr.arpa PTR example.com |
| SOA | Start of Authority | Zone metadata (serial, refresh, etc.) |
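
The PTR row shows the IP address written backwards under in-addr.arpa. As a small sketch, Python's standard `ipaddress` module can build that reverse-lookup name; `ptr_name` is a hypothetical wrapper introduced here, and it only formats the name without querying DNS.

```python
import ipaddress


def ptr_name(ip: str) -> str:
    """Return the name under which a PTR record for `ip` would be published."""
    return ipaddress.ip_address(ip).reverse_pointer


print(ptr_name("93.184.216.34"))  # 34.216.184.93.in-addr.arpa
```

IPv6 addresses follow the same idea, with reversed hexadecimal nibbles under ip6.arpa.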
Examples
DNS Query Using dig
# Query A record
dig example.com A
# Response:
# ;; ANSWER SECTION:
# example.com. 3600 IN A 93.184.216.34
# Query with trace (shows full resolution path)
dig +trace example.com
# Shows:
# 1. Query to root server
# 2. Query to TLD server
# 3. Query to authoritative server
# 4. Final answer
DNS Resolution in Code
import socket

def resolve_dns(hostname):
    """Resolve hostname to an IPv4 address, or return None on failure"""
    try:
        # Delegates to the system's stub resolver, which sends a recursive query
        ip = socket.gethostbyname(hostname)
        return ip
    except socket.gaierror as e:
        print(f"DNS resolution failed: {e}")
        return None

# Usage
ip = resolve_dns("example.com")
print(f"IP: {ip}")  # e.g. IP: 93.184.216.34 (the actual address may differ)
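`socket.gethostbyname` returns a single IPv4 address only. A sketch of a variant that also reports IPv6 addresses uses `socket.getaddrinfo`; the helper name `resolve_all` is an assumption for this example.

```python
import socket


def resolve_all(hostname: str) -> list[str]:
    """Return every IPv4/IPv6 address the system resolver reports for hostname."""
    try:
        # proto=IPPROTO_TCP avoids one duplicate entry per socket type.
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    addresses = []
    for *_, sockaddr in infos:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


# A numeric address needs no lookup, so this works even offline.
print(resolve_all("127.0.0.1"))  # ['127.0.0.1']
```

For a real hostname the list depends on the network and on which A and AAAA records the domain currently publishes.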
import dns.exception
import dns.resolver  # third-party package: dnspython (2.0 or later for resolve())

def resolve_dns_detailed(hostname, record_type='A'):
    """Resolve with detailed DNS information"""
    try:
        answers = dns.resolver.resolve(hostname, record_type)
        for rdata in answers:
            print(f"{hostname} {record_type}: {rdata}")
        print(f"TTL: {answers.rrset.ttl}")
        return [str(rdata) for rdata in answers]
    except dns.resolver.NXDOMAIN:
        print(f"Domain {hostname} does not exist")
        return None
    except dns.resolver.NoAnswer:
        print(f"No {record_type} record for {hostname}")
        return None
    except (dns.resolver.NoNameservers, dns.exception.Timeout) as e:
        # Servers unreachable or too slow: report instead of crashing
        print(f"Lookup for {hostname} failed: {e}")
        return None

# Usage
resolve_dns_detailed("example.com", "A")
resolve_dns_detailed("example.com", "MX")
resolve_dns_detailed("example.com", "NS")
DNS Caching Implementation
import time
from collections import OrderedDict

class DNSCache:
    def __init__(self, max_size=1000):
        self.cache = OrderedDict()
        self.max_size = max_size

    def get(self, hostname, record_type='A'):
        key = (hostname, record_type)
        if key in self.cache:
            entry = self.cache[key]
            # Check TTL
            if time.time() < entry['expires']:
                # Move to end (LRU)
                self.cache.move_to_end(key)
                return entry['value']
            else:
                # Expired, remove
                del self.cache[key]
        return None

    def set(self, hostname, record_type, value, ttl=3600):
        key = (hostname, record_type)
        # Evict the least recently used entry only when a new key
        # is added to a full cache (updating an existing key needs no eviction)
        if key not in self.cache and len(self.cache) >= self.max_size:
            self.cache.popitem(last=False)
        self.cache[key] = {
            'value': value,
            'expires': time.time() + ttl
        }
        self.cache.move_to_end(key)

# Usage
cache = DNSCache()
cache.set("example.com", "A", "93.184.216.34", ttl=3600)
ip = cache.get("example.com", "A")
Common Pitfalls
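- Not understanding TTL: Every DNS record carries a TTL (Time To Live) that tells resolvers how long they may cache it. Fix: Choose TTLs deliberately; a long TTL reduces query load but delays changes, a short TTL does the opposite
- DNS propagation delays: Changes are not pushed out; old answers linger in caches until their TTL expires. Fix: Lower the TTL ahead of a planned change, wait for the old TTL to elapse, use DNS checkers
- CNAME conflicts: A CNAME can't coexist with other record types at the same name, and the root of a zone always carries SOA and NS records. Fix: Use A/AAAA records for root domain, CNAME for subdomains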
- Circular dependencies: CNAME pointing to itself or loops. Fix: Validate DNS records, check for cycles
- DNS cache poisoning: Malicious DNS responses. Fix: Use DNSSEC, validate responses, use trusted resolvers
- Not handling DNS failures: Applications crash when DNS fails. Fix: Implement retries, fallback resolvers, cache results
- Confusing recursive vs iterative: Not understanding query types. Fix: Clients use recursive (to resolver), resolvers use iterative (to hierarchy)
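The "check for cycles" advice for CNAME records can be sketched as follows. The function `follow_cnames`, the plain-dict representation of CNAME records, and the `max_depth` limit of 8 are assumptions chosen for illustration, not values taken from any standard.

```python
def follow_cnames(name: str, cnames: dict[str, str], max_depth: int = 8) -> str:
    """Follow CNAME records from `name`; raise ValueError on a loop or overlong chain."""
    seen = {name}
    while name in cnames:
        name = cnames[name]
        if name in seen:
            raise ValueError(f"CNAME loop detected at {name}")
        seen.add(name)
        if len(seen) > max_depth:
            raise ValueError("CNAME chain too long")
    return name


records = {"www.example.com": "example.com"}
print(follow_cnames("www.example.com", records))  # example.com

try:
    follow_cnames("a.example.com", {"a.example.com": "b.example.com",
                                    "b.example.com": "a.example.com"})
except ValueError as e:
    print(e)  # CNAME loop detected at a.example.com
```

Running a check like this before publishing a zone catches loops that would otherwise only show up as failed lookups.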
Interview Questions
Beginner
Q: Explain how DNS resolution works. What happens when you type a URL in your browser?
A:
Process:
- Browser checks cache: Browser first checks its DNS cache for the domain
- OS checks cache: If not in browser cache, OS checks system DNS cache
- Query recursive resolver: If cache miss, query configured DNS resolver (e.g., 8.8.8.8)
- Resolver queries root: Resolver queries root DNS server for TLD information
- Resolver queries TLD: Resolver queries TLD server (e.g., .com) for domain information
- Resolver queries authoritative: Resolver queries authoritative DNS server for the domain
- Return IP address: Authoritative server returns A record (IP address)
- Cache result: Resolver and client cache the result (respecting TTL)
- Browser connects: Browser uses IP address to establish TCP connection
Example: www.example.com
Browser → OS Cache → Resolver (8.8.8.8)
Resolver → Root → TLD (.com) → Authoritative (example.com)
Authoritative → "93.184.216.34"
Resolver → Browser: "93.184.216.34"
Browser → Connects to 93.184.216.34:80 (or :443 for HTTPS)
Intermediate
Q: What is the difference between recursive and iterative DNS queries? When is each used?
A:
Recursive Query:
- Client asks resolver: "What's the IP of example.com?"
- Resolver does all the work (queries root, TLD, authoritative)
- Resolver returns final answer to client
- Used by: Clients querying DNS resolvers
Iterative Query:
- Client asks server: "What's the IP of example.com?"
- Server responds: "I don't know, but ask this other server"
- Client must query the next server itself
- Used by: DNS resolvers querying DNS hierarchy (root, TLD, authoritative)
Flow:
Client → Resolver (recursive): "example.com?"
Resolver → Root (iterative): "example.com?"
Root → Resolver: "Ask .com TLD at 192.5.6.30"
Resolver → TLD (iterative): "example.com?"
TLD → Resolver: "Ask example.com NS at 199.43.135.53"
Resolver → Authoritative (iterative): "example.com?"
Authoritative → Resolver: "93.184.216.34"
Resolver → Client: "93.184.216.34" (final answer)
Why this design?
- Efficiency: Resolvers cache results for multiple clients
- Load distribution: Root/TLD servers don't handle all queries
- Security: Authoritative servers only serve their domains
Senior
Q: Design a high-performance DNS resolver that handles millions of queries per second. How do you implement caching, handle DNS failures, and ensure low latency?
A:
// Design sketch: MultiLevelCache, InMemoryCache, RedisCache, PersistentCache,
// QueryQueue, ResolverStats, DNSRecord, Query, HealthStatus and the helper
// methods called below (isExpired, queryWithTimeout, ...) are assumed to exist.
class HighPerformanceDNSResolver {
  private cache: MultiLevelCache;
  private upstreamResolvers: string[];
  private queryQueue: QueryQueue;
  private stats: ResolverStats;

  constructor() {
    // Multi-level cache
    this.cache = new MultiLevelCache({
      l1: new InMemoryCache(100000), // Hot cache
      l2: new RedisCache(),          // Distributed cache
      l3: new PersistentCache()      // Disk cache
    });
    // Multiple upstream resolvers for redundancy
    this.upstreamResolvers = [
      "8.8.8.8",
      "1.1.1.1",
      "9.9.9.9"
    ];
    // Query batching and queue
    this.queryQueue = new QueryQueue({
      batchSize: 100,
      batchTimeout: 10 // ms
    });
    this.stats = new ResolverStats();
  }

  // 1. Caching Strategy
  async resolve(hostname: string, recordType: string = 'A'): Promise<DNSRecord> {
    const cacheKey = `${hostname}:${recordType}`;
    // Check cache (multi-level)
    const cached = await this.cache.get(cacheKey);
    if (cached && !this.isExpired(cached)) {
      this.stats.recordCacheHit();
      return cached.value;
    }
    // Cache miss: resolve
    this.stats.recordCacheMiss();
    const record = await this.resolveUpstream(hostname, recordType);
    // Cache with TTL
    await this.cache.set(cacheKey, record, { ttl: record.ttl });
    return record;
  }

  // 2. Upstream Resolution with Failover
  async resolveUpstream(hostname: string, recordType: string): Promise<DNSRecord> {
    // Try resolvers in order
    for (const resolver of this.upstreamResolvers) {
      try {
        return await this.queryWithTimeout(resolver, hostname, recordType, 100);
      } catch (error) {
        // NXDOMAIN is a definitive answer, not a resolver failure: do not fail over
        if ((error as { code?: string }).code === 'NXDOMAIN') throw error;
        this.stats.recordResolverFailure(resolver);
        continue; // Try next resolver
      }
    }
    throw new Error("All DNS resolvers failed");
  }

  // 3. Query Batching
  async batchResolve(queries: Query[]): Promise<Map<string, DNSRecord>> {
    // Group queries by upstream resolver
    const batches = this.groupByResolver(queries);
    // Execute batches in parallel
    const results = await Promise.all(
      Array.from(batches.entries()).map(([resolver, batch]) =>
        this.batchQuery(resolver, batch)
      )
    );
    return this.mergeResults(results);
  }

  // 4. Negative Caching
  async resolveWithNegativeCache(hostname: string): Promise<DNSRecord | null> {
    const cacheKey = `negative:${hostname}`;
    // Check negative cache (NXDOMAIN)
    const negative = await this.cache.get(cacheKey);
    if (negative?.nxdomain) {
      return null; // Domain doesn't exist
    }
    try {
      return await this.resolve(hostname);
    } catch (error) {
      if ((error as { code?: string }).code === 'NXDOMAIN') {
        // Cache negative result (shorter TTL). Store a marker object rather
        // than null, so the lookup above can tell a hit from a miss.
        await this.cache.set(cacheKey, { nxdomain: true }, { ttl: 60 });
      }
      throw error;
    }
  }

  // 5. Prefetching and Preloading
  async prefetchCommonDomains(): Promise<void> {
    const commonDomains = [
      'google.com',
      'facebook.com',
      'amazon.com',
      // ... top domains
    ];
    // Prefetch in background (deliberately not awaited; failures are ignored)
    void Promise.all(
      commonDomains.map(domain => this.resolve(domain).catch(() => {}))
    );
  }

  // 6. Monitoring and Health Checks
  async healthCheck(): Promise<HealthStatus> {
    const checks = await Promise.all([
      this.checkUpstreamResolvers(),
      this.checkCacheHealth(),
      this.checkQueryLatency()
    ]);
    return {
      healthy: checks.every(c => c.healthy),
      resolvers: checks[0],
      cache: checks[1],
      latency: checks[2]
    };
  }
}

// 7. DNS over HTTPS (DoH) / DNS over TLS (DoT)
class SecureDNSResolver extends HighPerformanceDNSResolver {
  async resolveSecure(hostname: string): Promise<DNSRecord> {
    // Use DoH/DoT for encrypted DNS queries
    const dohEndpoint = "https://cloudflare-dns.com/dns-query";
    return await this.queryDoH(dohEndpoint, hostname);
  }
}
Optimizations:
- Multi-level caching: In-memory (hot), Redis (warm), disk (cold)
- Query batching: Batch multiple queries to reduce overhead
- Failover: Multiple upstream resolvers with automatic failover
- Negative caching: Cache NXDOMAIN responses (shorter TTL)
- Prefetching: Preload common domains
- Connection pooling: Reuse TCP/TLS connections where DNS runs over them (DoT, DoH, TCP fallback); most plain DNS queries use UDP
- Anycast: Deploy resolvers in multiple locations
- Monitoring: Track cache hit rates, latency, failures
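One related way to cut upstream load, not part of the design above, is to coalesce identical queries that are in flight at the same time so that only one reaches the upstream resolver. The sketch below uses `asyncio`; `CoalescingResolver` and `fake_upstream` are hypothetical names, and the upstream call is a stand-in rather than a real network query.

```python
import asyncio


class CoalescingResolver:
    """Share one upstream lookup among concurrent callers asking for the same name."""

    def __init__(self, upstream):
        self.upstream = upstream   # async callable: name -> answer
        self.in_flight = {}        # name -> asyncio.Task

    async def resolve(self, name):
        task = self.in_flight.get(name)
        if task is None:
            task = asyncio.create_task(self.upstream(name))
            self.in_flight[name] = task
            # Forget the task once it finishes so later lookups start afresh.
            task.add_done_callback(lambda _: self.in_flight.pop(name, None))
        return await task


async def demo():
    calls = []

    async def fake_upstream(name):   # stand-in for a real network query
        calls.append(name)
        await asyncio.sleep(0.01)
        return "93.184.216.34"

    resolver = CoalescingResolver(fake_upstream)
    answers = await asyncio.gather(
        *(resolver.resolve("example.com") for _ in range(5))
    )
    return answers, len(calls)


answers, upstream_calls = asyncio.run(demo())
print(upstream_calls)  # 1: five callers shared a single upstream query
```

Combined with a cache, this keeps a burst of requests for an uncached name from turning into a burst of upstream queries.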
Key Takeaways
- DNS hierarchy: Root → TLD → Authoritative servers in a tree structure
- Recursive queries: Clients query resolvers (resolver does all work)
- Iterative queries: Resolvers query DNS hierarchy (each server refers to next)
- DNS caching: Results cached at multiple levels (browser, OS, resolver) with TTL
- Record types: A (IPv4), AAAA (IPv6), CNAME (alias), MX (mail), NS (name server)
- TTL (Time To Live): Controls how long DNS records are cached
- DNS propagation: Changes become visible gradually as cached records expire (governed by TTL)
- Performance: Use caching, batching, failover, and connection pooling for high-performance resolvers
- Security: Use DNSSEC to authenticate responses and DoH/DoT to encrypt DNS queries
- Common issues: Cache poisoning, propagation delays, CNAME conflicts, DNS failures