Operating Systems Topic

Virtual Memory: Concepts, Internals & Interview Use Cases

Understand virtual memory: how operating systems provide larger address space than physical memory using paging, swapping, and address translation.

Medium11 min read

Virtual Memory

Why This Matters

Think of virtual memory like a hotel manager giving each guest a room key. The guest thinks they have room 101, but the manager might actually put them in room 205. The guest doesn't need to know—they just use their key. Virtual memory works the same way—your process thinks it has memory at address 0x1000, but the OS might actually put it at a different physical address. Your process doesn't need to know—it just uses the address.

This matters because virtual memory enables many things: each process can have its own address space (isolation), processes can use more memory than physically available (by swapping to disk), and memory can be shared between processes (for code and libraries). Understanding virtual memory helps you understand how processes work, why memory leaks happen, and how to optimize memory usage.

In interviews, when someone asks "How does the OS manage memory?", they're testing whether you understand virtual memory. Do you know how address translation works? Do you understand paging and swapping? Most engineers don't. They just allocate memory and wonder why the system slows down.

What Engineers Usually Get Wrong

Most engineers think "virtual memory means I have infinite RAM." But virtual memory is just an abstraction. When you allocate 1GB of virtual memory, you're not getting 1GB of RAM—you're getting a promise of RAM. The OS only gives you real RAM when you actually write to that memory. Also, when RAM is full, the OS swaps pages to disk, which is 1000x slower than RAM access. This is why swapping kills performance.

Engineers also don't understand that virtual memory allocation is cheap, but physical memory allocation is expensive. You can allocate terabytes of virtual memory instantly, but if you actually use it, the OS must provide physical RAM or swap space. This is why "memory leaks" can be subtle—you might think you freed memory, but the OS hasn't reclaimed it yet.

How This Breaks Systems in the Real World

A Python service was processing large JSON files. It loaded entire files into memory, processed them, and then moved to the next file. Under normal load, this worked. But when processing 100 files concurrently, the service started swapping—the OS moved memory pages to disk because RAM was full. Swapping is 1000x slower than RAM access. The service became unusable, taking minutes to process files that should take seconds.

The fix? Process files in chunks, or limit concurrent processing. But the real lesson is: the OS will swap when RAM is full, and swapping kills performance. Understanding this helps you design systems that work within memory limits.

Another story: A service was allocating memory in a loop without freeing it. The service ran for days, slowly leaking memory. After a week, the service had used all available RAM and started swapping. Performance degraded gradually—users noticed it was slow, but didn't realize it was a memory leak. The fix? Use memory profiling tools to detect leaks, and always free memory you allocate.

What is Virtual Memory?

Virtual memory provides:

Larger address space: Processes can use more memory than RAM
Memory protection: Each process has isolated address space
Simplified programming: Processes don't manage physical memory
Memory sharing: Processes can share code and data

How it works:

Virtual addresses: Process uses virtual addresses (0 to 2^32-1 or 2^64-1)
Physical addresses: OS maps virtual addresses to physical RAM
Paging: Memory divided into pages, mapped to frames
Swapping: Pages can be stored on disk when not in use

Address Translation

Virtual to Physical Address

Virtual Address: [Page Number | Offset]
                  ↓
            Page Table Lookup
                  ↓
         Physical Frame Number
                  ↓
    Physical Address: [Frame Number | Offset]

Example:

Virtual Address: 0x1234
  Page Number: 0x1 (page 1)
  Offset: 0x234 (offset within page)
  
Page Table:
  Page 1 → Frame 5
  
Physical Address: 0x5234
  Frame Number: 5
  Offset: 0x234 (same)

Page Table Entry (PTE)

[Frame Number | Present | Modified | Referenced | Protection]

Flags:

Present: Page in memory (1) or on disk (0)
Modified: Page has been written to (dirty bit)
Referenced: Page recently accessed
Protection: Read, write, execute permissions

Page Fault

Page fault occurs when process accesses page not in memory.

Page Fault Handling

1. Process accesses virtual address
2. Page table: Present bit = 0 (page not in memory)
3. CPU generates page fault (interrupt)
4. OS page fault handler:
   a. Check if page exists on disk
   b. If invalid address: Segmentation fault
   c. Find free frame (or evict page)
   d. Load page from disk to frame
   e. Update page table (Present = 1)
   f. Resume process execution

Types of Page Faults

1. Minor Page Fault:

Page in memory but not in page table
Fast (just update page table)

2. Major Page Fault:

Page not in memory, must load from disk
Slow (disk I/O required)

3. Invalid Page Fault:

Invalid address (segmentation fault)
Terminate process

Swapping

Swapping moves pages between RAM and disk.

Swap Out (Page to Disk)

When memory is full:
1. Select page to evict (page replacement algorithm)
2. If page modified: Write to swap space on disk
3. Update page table: Present = 0, disk address stored
4. Free frame for new page

Swap In (Page from Disk)

On page fault:
1. Check page table: Present = 0, disk address available
2. Find free frame (or evict page)
3. Load page from swap space to frame
4. Update page table: Present = 1
5. Resume process

Page Replacement Algorithms

When memory is full, OS must evict a page:

1. FIFO (First In First Out)

Evict oldest page:

Pages in memory: [A, B, C]
New page D arrives → Evict A (oldest)
Memory: [B, C, D]

Problem: May evict frequently used pages

2. LRU (Least Recently Used)

Evict least recently used page:

Pages in memory: [A, B, C]
Access pattern: A, B, A, C
New page D arrives → Evict B (least recently used)
Memory: [A, C, D]

Optimal: Minimizes page faults (but requires tracking access)

3. Clock (Second Chance)

Circular buffer with reference bit:

Pages: [A(ref=1), B(ref=0), C(ref=1)]
Clock hand points to B
New page D → Evict B (ref=0)
If ref=1: Clear ref, move to next

4. Optimal (Theoretical)

Evict page that won't be used for longest time:

Pages: [A, B, C]
Future access: A, C, A, B
Evict B (won't be used for longest)

Not practical: Requires knowledge of future

Examples

Virtual Memory Address Space

#include <stdio.h>
#include <stdlib.h>

int main() {
    // Virtual addresses (same for all processes)
    int x = 42;
    int *ptr = malloc(1000 * sizeof(int));
    
    printf("Virtual address of x: %p\n", (void*)&x);
    printf("Virtual address of ptr: %p\n", (void*)ptr);
    
    // Each process sees same virtual addresses
    // But mapped to different physical addresses
    return 0;
}

Page Fault Simulation

class VirtualMemory:
    def __init__(self, physical_memory_size, page_size=4096):
        self.physical_memory = {}  # frame → page
        self.page_table = {}  # page → frame or None
        self.swap_space = {}  # page → data on disk
        self.page_size = page_size
        self.max_frames = physical_memory_size // page_size
    
    def access(self, virtual_address):
        """Access virtual address, handle page fault if needed"""
        page_number = virtual_address // self.page_size
        
        # Check page table
        if page_number not in self.page_table:
            # Page fault: Page not in memory
            return self.handle_page_fault(page_number)
        
        frame_number = self.page_table[page_number]
        if frame_number is None:
            # Page fault: Page on disk
            return self.handle_page_fault(page_number)
        
        # Page in memory
        return self.physical_memory[frame_number]
    
    def handle_page_fault(self, page_number):
        """Handle page fault"""
        # Check if page exists on disk
        if page_number not in self.swap_space:
            raise SegmentationFault("Invalid page")
        
        # Find free frame
        frame_number = self.find_free_frame()
        if frame_number is None:
            # No free frame: Page replacement
            frame_number = self.evict_page()
        
        # Load page from disk
        page_data = self.swap_space[page_number]
        self.physical_memory[frame_number] = page_data
        
        # Update page table
        self.page_table[page_number] = frame_number
        
        return page_data
    
    def evict_page(self):
        """Evict page using LRU"""
        # Find least recently used frame
        lru_frame = min(
            self.physical_memory.keys(),
            key=lambda f: self.get_access_time(f)
        )
        
        # Get page number
        page_number = self.get_page_for_frame(lru_frame)
        
        # If modified, write to disk
        if self.is_modified(page_number):
            self.swap_space[page_number] = self.physical_memory[lru_frame]
        
        # Remove from memory
        del self.physical_memory[lru_frame]
        self.page_table[page_number] = None
        
        return lru_frame

Common Pitfalls

Thrashing: Too many page faults, system spends time swapping. Fix: Increase RAM, reduce number of processes, optimize memory usage
Not handling page faults: Assuming all memory always available. Fix: Handle page faults gracefully
Poor page replacement: Evicting frequently used pages. Fix: Use LRU or similar algorithm
Swap space too small: Can't swap enough pages. Fix: Allocate sufficient swap space
Memory leaks: Processes not freeing memory. Fix: Proper memory management, garbage collection

Interview Questions

Beginner

Q: What is virtual memory and why is it used?

Virtual memory is an abstraction that gives each process its own large address space, independent of physical RAM.

Why used:

Larger address space: Processes can use more memory than physically available
Memory protection: Each process has isolated address space
Simplified programming: Processes don't manage physical memory
Memory sharing: Processes can share code and data

How it works:

Virtual addresses: Process uses virtual addresses (0 to 2^32-1)
Page table: OS maps virtual pages to physical frames
Page fault: If page not in memory, load from disk
Swapping: Pages can be stored on disk when not in use

Example:

Process A: Virtual address 0x1000 → Physical address 0x5000
Process B: Virtual address 0x1000 → Physical address 0x8000

Same virtual address, different physical addresses

Benefits:

Processes isolated (can't access each other's memory)
Can use more memory than RAM (using disk)
Easier programming (don't worry about physical addresses)

Intermediate

Q: Explain how page faults work and how the OS handles them.

Page Fault occurs when process accesses page not in memory.

Process:

Process accesses virtual address
```
Process tries to access 0x1234
```

Page table lookup

Page number = 0x1
Page table entry: Present bit = 0 (not in memory)

CPU generates page fault

CPU interrupts, transfers control to OS

OS page fault handler

def handle_page_fault(page_number):
    # Check if page exists on disk
    if page_number not in swap_space:
        raise SegmentationFault()
    
    # Find free frame
    frame = find_free_frame()
    if frame is None:
        frame = evict_page()  # Page replacement
    
    # Load page from disk
    page_data = swap_space[page_number]
    physical_memory[frame] = page_data
    
    # Update page table
    page_table[page_number] = frame
    page_table[page_number].present = 1

Resume process
```
Process continues execution
```

Types:

Minor page fault: Page in memory but not in page table (fast)
Major page fault: Page on disk, must load (slow, disk I/O)
Invalid page fault: Invalid address (segmentation fault)

Performance:

Page in memory: Fast (nanoseconds)
Page on disk: Slow (milliseconds, disk I/O)

Senior

Q: Design a virtual memory system that handles 64-bit address spaces efficiently. How do you optimize page table size, TLB performance, and page replacement?

class VirtualMemorySystem {
  private pageTable: MultiLevelPageTable;
  private tlb: TLB;
  private frameAllocator: FrameAllocator;
  private pageReplacer: PageReplacer;
  private swapSpace: SwapSpace;
  
  constructor() {
    // Multi-level page table for 64-bit
    this.pageTable = new MultiLevelPageTable({
      levels: 4, // 4-level page table
      pageSize: 4096
    });
    
    // TLB for fast translation
    this.tlb = new TLB({ size: 512 });
    
    this.frameAllocator = new FrameAllocator();
    this.pageReplacer = new PageReplacer({ algorithm: 'LRU' });
    this.swapSpace = new SwapSpace();
  }
  
  // 1. Address Translation with TLB
  translateAddress(virtualAddress: bigint): bigint {
    const { pageNumber, offset } = this.splitAddress(virtualAddress);
    
    // Check TLB first (fast path)
    const tlbEntry = this.tlb.lookup(pageNumber);
    if (tlbEntry) {
      return (tlbEntry.frameNumber * PAGE_SIZE) + offset;
    }
    
    // TLB miss: Check page table (slow path)
    const pte = this.pageTable.lookup(pageNumber);
    
    if (!pte.present) {
      // Page fault
      return this.handlePageFault(pageNumber, offset);
    }
    
    // Update TLB
    this.tlb.insert(pageNumber, pte.frameNumber);
    
    return (pte.frameNumber * PAGE_SIZE) + offset;
  }
  
  // 2. Multi-Level Page Table (64-bit)
  class MultiLevelPageTable {
    private rootTable: PageTableLevel;
    private levels: number = 4;
    
    lookup(pageNumber: bigint): PageTableEntry {
      // 64-bit: 4-level page table
      // 48 bits for virtual address (typical)
      // 12 bits offset, 9 bits per level (4 levels × 9 = 36 bits)
      
      const level1 = Number((pageNumber >> 39n) & 0x1FFn);
      const level2 = Number((pageNumber >> 30n) & 0x1FFn);
      const level3 = Number((pageNumber >> 21n) & 0x1FFn);
      const level4 = Number((pageNumber >> 12n) & 0x1FFn);
      
      // Traverse levels
      let table = this.rootTable;
      table = table[level1];
      if (!table) return null;
      
      table = table[level2];
      if (!table) return null;
      
      table = table[level3];
      if (!table) return null;
      
      return table[level4];
    }
  }
  
  // 3. TLB (Translation Lookaside Buffer)
  class TLB {
    private entries: Map<bigint, TLBEntry>;
    private maxSize: number;
    
    lookup(pageNumber: bigint): TLBEntry | null {
      return this.entries.get(pageNumber) || null;
    }
    
    insert(pageNumber: bigint, frameNumber: number): void {
      if (this.entries.size >= this.maxSize) {
        // Evict (LRU)
        const lru = this.findLRU();
        this.entries.delete(lru);
      }
      
      this.entries.set(pageNumber, {
        frameNumber,
        accessTime: Date.now()
      });
    }
  }
  
  // 4. Page Replacement (LRU with Clock)
  class PageReplacer {
    private accessTimes: Map<number, number>;
    private referenceBits: Map<number, boolean>;
    
    async evict(): Promise<number> {
      // Clock algorithm: Circular scan
      let frame = this.clockHand;
      
      while (true) {
        const refBit = this.referenceBits.get(frame);
        
        if (!refBit) {
          // Not recently used: Evict
          this.evictFrame(frame);
          this.clockHand = (frame + 1) % this.maxFrames;
          return frame;
        } else {
          // Recently used: Clear bit, continue
          this.referenceBits.set(frame, false);
          frame = (frame + 1) % this.maxFrames;
        }
      }
    }
  }
  
  // 5. Page Fault Handling
  async handlePageFault(pageNumber: bigint, offset: number): Promise<bigint> {
    // Check if page exists on disk
    if (!await this.swapSpace.hasPage(pageNumber)) {
      throw new SegmentationFault();
    }
    
    // Allocate frame
    let frameNumber = this.frameAllocator.allocate();
    if (frameNumber === null) {
      frameNumber = await this.pageReplacer.evict();
    }
    
    // Load page from disk
    await this.swapSpace.loadPage(pageNumber, frameNumber);
    
    // Update page table
    this.pageTable.update(pageNumber, {
      frameNumber,
      present: true,
      modified: false,
      referenced: true
    });
    
    // Update TLB
    this.tlb.insert(pageNumber, frameNumber);
    
    return (frameNumber * PAGE_SIZE) + offset;
  }
  
  // 6. Optimization: Large Pages
  enableLargePages(): void {
    // Use 2MB pages for large allocations
    // Reduces page table size, TLB entries
  }
  
  // 7. Optimization: Prefetching
  async prefetchPages(pageNumber: bigint): Promise<void> {
    // Prefetch adjacent pages
    const adjacentPages = [
      pageNumber - 1n,
      pageNumber + 1n
    ];
    
    for (const page of adjacentPages) {
      if (!this.pageTable.isPresent(page)) {
        await this.handlePageFault(page, 0);
      }
    }
  }
}

Features:

Multi-level page table: Efficient for 64-bit address spaces
TLB: Fast translation cache
Page replacement: LRU with clock algorithm
Large pages: 2MB pages for large allocations
Prefetching: Load adjacent pages proactively
Optimization: Minimize page table size, maximize TLB hit rate

Failure Stories You'll Recognize

The Swapping Storm: A Python service was processing large JSON files. It loaded entire files into memory, processed them, and then moved to the next file. Under normal load, this worked. But when processing 100 files concurrently, the service started swapping—the OS moved memory pages to disk because RAM was full. Swapping is 1000x slower than RAM access. The service became unusable, taking minutes to process files that should take seconds. The fix? Process files in chunks, or limit concurrent processing. But the real lesson is: the OS will swap when RAM is full, and swapping kills performance.

The Memory Leak: A service was allocating memory in a loop without freeing it. The service ran for days, slowly leaking memory. After a week, the service had used all available RAM and started swapping. Performance degraded gradually—users noticed it was slow, but didn't realize it was a memory leak. The fix? Use memory profiling tools to detect leaks, and always free memory you allocate.

The Virtual Memory Confusion: A developer allocated 10GB of virtual memory and thought they had 10GB of RAM. They wrote code that used all 10GB. The system started swapping heavily. Performance died. The developer was confused—they thought they had enough RAM. The fix? Understand that virtual memory allocation is cheap, but physical memory allocation is expensive. Monitor actual RAM usage, not virtual memory allocation.

What Interviewers Are Really Testing

They want to hear you talk about virtual memory as an abstraction, not infinite RAM. Junior engineers say "virtual memory means I have infinite RAM." Senior engineers say "virtual memory is an abstraction. When RAM is full, the OS swaps to disk, which is 1000x slower. Monitor actual RAM usage. Understand that virtual memory allocation is cheap, but physical memory allocation is expensive."

When they ask "How does the OS manage memory?", they're testing:

Do you understand address translation and paging?
Do you know what happens when RAM is full (swapping)?
Can you design systems that work within memory limits?

Key Takeaways

Virtual memory: Abstraction providing larger address space than physical RAM

Address translation: Virtual addresses mapped to physical addresses via page table

Page fault: Occurs when page not in memory, OS loads from disk

Swapping: Pages moved between RAM and disk

Page replacement: Evict pages when memory full (LRU, FIFO, Clock)

TLB: Translation Lookaside Buffer caches page table entries for fast access

Multi-level page table: Efficient for 64-bit address spaces

Benefits: Larger address space, memory protection, simplified programming

Best practices: Optimize TLB hit rate, use large pages, implement prefetching

Keep exploring

Kernel concepts stack on each other. Return to the hub and pick the next topic that closes a gap you noticed here.