Context Switching: Concepts, Internals & Interview Use Cases

Understand how the OS switches between processes: saving and restoring CPU state, PCB (Process Control Block), and performance implications.

Medium · 10 min read

Why This Matters

Think of context switching like switching between tasks on your computer. When you switch from a browser to a text editor, the browser's state (what page you're on, cursor position) is set aside and the text editor's state (what file is open, cursor position) is brought back. Context switching is the OS doing this for processes: saving one process's state and restoring another's.

This matters because context switching has overhead. Every time the OS switches processes, it must save registers, switch memory mappings, and restore the new process's state. The direct cost is typically 1-10 microseconds, and the newly scheduled process then runs with cold caches for a while. If switches happen too frequently, the overhead becomes significant. Understanding this helps you design systems that minimize unnecessary switches.

In interviews, when someone asks "How would you optimize a system with many processes?", they're testing whether you understand context switching overhead. Do you know when context switching happens? Do you understand the cost? Most engineers don't. They create many processes and wonder why performance degrades.

What Engineers Usually Get Wrong

Most engineers think "more processes = better parallelism." But a 4-core CPU can run only 4 processes at any instant, and each process switch has overhead. If you create 1000 processes on that CPU, most of them will be waiting, and the OS will spend significant time switching between them. This overhead can hurt performance. The key is balance: enough processes to utilize the CPU cores, but not so many that switching overhead dominates.

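Engineers also often overlook the difference between process and thread context switches. A process switch is expensive because the OS must also switch the address space (page table, and possibly a TLB flush). A switch between threads of the same process is cheaper because the address space stays the same and only registers and the stack pointer change. For CPU-bound work, threads are often more efficient than processes.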

How This Breaks Systems in the Real World

A service was processing files. It created a new process for each file. Under normal load (10 files), this worked fine. But during a traffic spike (1000 files), the service created 1000 processes. The OS spent most of its time switching between processes instead of actually processing files, and performance degraded significantly.

The fix? Use a process pool with a bounded size (say, 10 processes). Process files in batches. This reduced context switching overhead and improved performance. But the real lesson is: too many processes can hurt performance due to context switching overhead.

Another story: A service was using threads for I/O-bound work. Each request created a new thread. Under normal load, this worked. But during high traffic, the service created 10,000 threads. The OS spent significant time switching between threads. Also, threads share memory, so one buggy thread could corrupt data used by others. The fix? Use a thread pool with a bounded size, or use async I/O (event-driven) to avoid threads altogether.
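Both fixes come down to capping concurrency. Here is a minimal sketch of a bounded pool using Python's standard `concurrent.futures`. The `process_file` function, the file names, and the pool size of 10 are hypothetical placeholders for the real workload:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

worker_ids = set()  # records which worker threads actually ran tasks


def process_file(name):
    """Hypothetical unit of work standing in for real file processing."""
    worker_ids.add(threading.get_ident())
    time.sleep(0.001)  # simulate a short blocking I/O call
    return name.upper()


def process_all(names, max_workers=10):
    """Process every file with at most max_workers threads alive at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() queues all tasks; only max_workers of them run concurrently
        return list(pool.map(process_file, names))


results = process_all([f"file{i}.txt" for i in range(1000)])
print(len(results), "files processed by", len(worker_ids), "threads")
```

However many files arrive, the scheduler never sees more than `max_workers` runnable threads from this service. For CPU-bound work, `ProcessPoolExecutor` from the same module offers the same `map` interface with a bounded number of processes.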


What is Context Switching?

Context switching involves:

  • Saving state: Save current process's CPU state
  • Loading state: Load next process's CPU state
  • Switching: Transfer control to new process

When it happens:

  • Time slice expired: Process used up its time quantum
  • I/O wait: Process blocked on I/O
  • Higher priority: Higher priority process ready
  • Process termination: Current process finishes
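These triggers fall into two groups: voluntary switches (the process blocks, for example on I/O) and involuntary ones (the scheduler preempts it). On Unix-like systems you can observe both counters for your own process through the standard `resource` module. This is a rough sketch, assuming a platform where `getrusage` reports these fields; the loop of short sleeps is only an illustrative way to force blocking:

```python
import resource
import time


def context_switch_counts():
    """Return (voluntary, involuntary) context switches for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


before_vol, before_invol = context_switch_counts()

for _ in range(50):
    time.sleep(0.001)  # blocking call: the process gives up the CPU voluntarily

after_vol, after_invol = context_switch_counts()
print("voluntary switches:  ", after_vol - before_vol)
print("involuntary switches:", after_invol - before_invol)
```

A high voluntary count usually points at blocking calls, while a high involuntary count suggests more runnable work than cores.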

Process Control Block (PCB)

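The PCB is the kernel's per-process record of everything needed to suspend a process and resume it later (Linux, for example, keeps this in its task_struct).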

PCB Contents:

- Process ID (PID)
- Program Counter (PC)
- CPU Registers (AX, BX, CX, DX, etc.)
- Memory Management Info (page table, limits)
- I/O Status (open files, devices)
- Scheduling Info (priority, state)
- Accounting Info (CPU time, start time)

Context Switch Process

Step-by-Step

1. Save current process state
   - Save CPU registers to PCB
   - Save program counter
   - Save memory management info

2. Update process state
   - Current process: RUNNING → READY (or BLOCKED)
   - Next process: READY → RUNNING

3. Switch memory context
   - Switch page table
   - Update memory management unit (MMU)

4. Restore next process state
   - Load CPU registers from PCB
   - Load program counter
   - Load memory management info

5. Transfer control
   - Jump to program counter
   - Resume execution

Detailed Flow

Current Process (P1) running:
  CPU Registers: [AX=10, BX=20, PC=0x1234]
  Memory: Page table for P1
  
Timer interrupt:
  → Save P1 state to PCB1
  → Update P1 state: RUNNING → READY
  → Select next process (P2)
  → Update P2 state: READY → RUNNING
  → Switch page table to P2
  → Load P2 state from PCB2
  → Jump to P2's program counter
  
P2 now running:
  CPU Registers: [AX=5, BX=15, PC=0x5678]
  Memory: Page table for P2

Context Switch Cost

Overhead:

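  • Time: Typically 1-10 microseconds of direct cost per switch
  • CPU cycles: Save/restore registers, update kernel tables
  • Cache effects: Cache misses after the switch, which can cost more than the switch itself
  • TLB flush: Switching address spaces may flush the TLB (Translation Lookaside Buffer)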

Factors affecting cost:

  • Number of registers: More registers = more to save
  • Memory management: Page table switching
  • Cache state: Cache may be cold for new process
  • Hardware support: Some CPUs optimize context switching
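To get a feel for when this cost matters, here is a back-of-the-envelope model, assuming every time quantum ends in exactly one switch and ignoring cache and TLB effects. The 5 microsecond switch cost is an illustrative value from the range above, not a measurement:

```python
def switching_overhead(switch_cost_us, quantum_us):
    """Fraction of CPU time spent switching when every quantum ends in a switch."""
    return switch_cost_us / (quantum_us + switch_cost_us)


for quantum_us in (10_000, 1_000, 100, 10):
    share = switching_overhead(switch_cost_us=5, quantum_us=quantum_us)
    print(f"quantum {quantum_us:>6} us -> {share:.2%} of CPU time spent switching")
```

With a 10 ms quantum the direct overhead is negligible (about 0.05%), but if processes run for only 10 microseconds between switches, a third of the CPU goes to switching.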

Examples

Context Switch Simulation

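class ProcessControlBlock:
    """Per-process record of everything needed to resume execution."""

    def __init__(self, pid, entry_point=0):
        self.pid = pid
        self.registers = {
            'AX': 0,
            'BX': 0,
            'CX': 0,
            'DX': 0,
            'PC': entry_point,  # Program Counter
        }
        self.state = 'READY'
        self.memory_info = {'page_table': f'page_table_{pid}'}

    def save_state(self, cpu_state):
        """Save a snapshot of the CPU registers into the PCB"""
        self.registers = cpu_state.copy()

    def restore_state(self):
        """Return a copy of the saved registers to load onto the CPU"""
        return self.registers.copy()


class SimulatedCPU:
    """Illustrative stand-in for hardware: registers plus the active page table."""

    def __init__(self):
        self.registers = {'AX': 0, 'BX': 0, 'CX': 0, 'DX': 0, 'PC': 0}
        self.page_table = None


class ContextSwitcher:
    def __init__(self, cpu):
        self.cpu = cpu
        self.current_process = None  # PCB of the running process
        self.ready_queue = []

    def switch(self, next_process):
        """Perform a context switch to next_process (a PCB)"""
        if self.current_process:
            # Save current process state and put it back on the ready queue
            self.save_context(self.current_process)
            self.current_process.state = 'READY'
            self.ready_queue.append(self.current_process)

        # Switch to next process
        if next_process in self.ready_queue:
            self.ready_queue.remove(next_process)
        self.current_process = next_process
        self.current_process.state = 'RUNNING'
        self.restore_context(self.current_process)

    def save_context(self, pcb):
        """Copy the live CPU state into the outgoing process's PCB"""
        pcb.save_state(self.cpu.registers)
        pcb.memory_info['page_table'] = self.cpu.page_table

    def restore_context(self, pcb):
        """Load the incoming process's saved state onto the CPU"""
        self.cpu.page_table = pcb.memory_info['page_table']
        # Loading PC is the simulated "jump": execution resumes there
        self.cpu.registers = pcb.restore_state()


# Usage: P1 runs, is switched out, and later resumes with its registers intact
cpu = SimulatedCPU()
switcher = ContextSwitcher(cpu)
p1 = ProcessControlBlock(1, entry_point=0x1234)
p2 = ProcessControlBlock(2, entry_point=0x5678)
switcher.switch(p1)
cpu.registers['AX'] = 10   # P1 does some work
switcher.switch(p2)        # P1's AX=10 is saved in its PCB
switcher.switch(p1)        # P1 resumes with AX=10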

Context Switch with Threads

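class ThreadContextSwitch:
    """Threads are modeled as dicts with a 'registers' entry (illustrative)."""

    def __init__(self):
        self.current_thread = None
        self.cpu_registers = {'SP': 0, 'PC': 0}  # simulated CPU state

    def switch_thread(self, next_thread):
        """Switch between threads (same process)"""
        if self.current_thread:
            # Save thread state (registers, stack pointer)
            self.save_thread_state(self.current_thread)

        # Switch to next thread
        self.current_thread = next_thread
        self.restore_thread_state(next_thread)

        # Note: Same process, no page table switch needed
        # Faster than process context switch

    def save_thread_state(self, thread):
        """Copy the simulated registers into the thread's saved context"""
        thread['registers'] = self.cpu_registers.copy()

    def restore_thread_state(self, thread):
        """Load the thread's saved registers onto the simulated CPU"""
        self.cpu_registers = thread['registers'].copy()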

Common Pitfalls

  • Too frequent switching: High overhead. Fix: Lengthen the time quantum, reduce unnecessary switches (fewer runnable threads, batched work)
  • Not saving all state: Missing registers (including floating-point and SIMD state) corrupt the resumed process. Fix: Save all CPU state
  • Cache thrashing: Frequent switches cause cache misses. Fix: Optimize scheduling, use CPU affinity
  • TLB flush overhead: Flushing TLB on every address-space switch. Fix: Use ASID (Address Space ID, called PCID on x86-64) to avoid flush

Interview Questions

Beginner

Q: What is context switching and when does it occur?

A:

Context switching is saving the state of the current process and restoring the state of another process.

When it occurs:

  1. Time slice expired: Process used up its time quantum
  2. I/O wait: Process blocked waiting for I/O
  3. Higher priority: Higher priority process becomes ready
  4. Process termination: Current process finishes
  5. Interrupt: A hardware interrupt makes a different process runnable (for example, I/O completion wakes a blocked process)

What is saved:

  • CPU registers: AX, BX, CX, DX, etc.
  • Program counter: Where process was executing
  • Memory management: Page table, memory limits
  • I/O status: Open files, devices

Example:

Process A running:
  Registers: [AX=10, BX=20, PC=0x1234]
  
Context switch:
  Save Process A state → PCB of A
  Load Process B state ← PCB of B
  
Process B running:
  Registers: [AX=5, BX=15, PC=0x5678]

Cost: Typically 1-10 microseconds overhead


Intermediate

Q: Explain the context switching process step by step. What is stored in the PCB?

A:

Context Switch Steps:

  1. Save current process state

    # Save CPU registers to the outgoing process's PCB
    current_process.pcb.registers = cpu.registers.copy()
    current_process.pcb.program_counter = cpu.program_counter
    current_process.pcb.stack_pointer = cpu.stack_pointer
    
  2. Update process states

    current_process.state = 'READY'  # or 'BLOCKED'
    next_process.state = 'RUNNING'
    
  3. Switch memory context

    # Switch page table
    mmu.page_table = next_process.page_table
    # May need to flush TLB (not needed if entries are tagged with an ASID)
    
  4. Restore next process state

    # Load from the incoming process's PCB
    cpu.registers = next_process.pcb.registers.copy()
    cpu.stack_pointer = next_process.pcb.stack_pointer
    
  5. Transfer control

    # Jump to the saved program counter
    cpu.jump_to(next_process.pcb.program_counter)
    

PCB (Process Control Block) Contents:

  • Process ID: Unique identifier
  • CPU Registers: AX, BX, CX, DX, SP, PC, etc.
  • Memory Management: Page table, memory limits
  • I/O Status: Open files, devices
  • Scheduling Info: Priority, state, CPU time
  • Accounting: Start time, CPU time used

Performance:

  • Time: 1-10 microseconds
  • Overhead: Save/restore registers, update tables
  • Cache effects: Cache may be cold for new process

Senior

Q: Design a context switching system that minimizes overhead. How do you optimize register saving, handle TLB flushes, and reduce cache misses?

A:

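// Design sketch: Process, Thread, TLB, MMU and RegisterFile, plus helpers such as
// restoreRegisters() and jumpToPC(), are assumed to be defined elsewhere.
class OptimizedContextSwitcher {
  private currentProcess: Process;
  private currentThread: Thread;
  private tlb: TLB;
  private mmu: MMU;
  private registerFile: RegisterFile;
  
  constructor() {
    this.tlb = new TLB();
    this.mmu = new MMU();
    this.registerFile = new RegisterFile();
  }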
  
  // 1. Optimized Register Saving
  async switchContext(nextProcess: Process): Promise<void> {
    // Save only modified registers (lazy saving)
    await this.saveModifiedRegisters(this.currentProcess);
    
    // Update process states
    this.currentProcess.state = 'READY';
    nextProcess.state = 'RUNNING';
    
    // Switch memory context (optimized)
    await this.switchMemoryContext(nextProcess);
    
    // Restore next process registers
    await this.restoreRegisters(nextProcess);
    
    // Transfer control
    this.jumpToPC(nextProcess.pcb.programCounter);
  }
  
  // 2. Lazy Register Saving
  async saveModifiedRegisters(process: Process): Promise<void> {
    // Only save registers that were modified
    const modifiedRegisters = this.registerFile.getModified();
    
    for (const reg of modifiedRegisters) {
      process.pcb.registers[reg] = this.registerFile.get(reg);
    }
    
    // Clear modified flags
    this.registerFile.clearModified();
  }
  
  // 3. TLB Optimization (ASID)
  async switchMemoryContext(process: Process): Promise<void> {
    // Use ASID (Address Space ID) to avoid TLB flush
    const asid = process.addressSpaceId;
    
    // Flush only if this ASID was reused by another address space, and do it
    // before activating the ASID so stale entries are never visible
    if (this.tlb.hasConflict(asid)) {
      this.tlb.flush();
    }
    
    // Update page table
    this.mmu.setPageTable(process.pageTable);
    
    // Tag subsequent TLB lookups with this ASID (no flush needed)
    this.tlb.setASID(asid);
  }
  
  // 4. Cache Optimization
  async optimizeCache(process: Process): Promise<void> {
    // CPU affinity: Keep process on same CPU
    this.setCPUAffinity(process, this.getCurrentCPU());
    
    // Prefetch: Prefetch process data
    await this.prefetchProcessData(process);
  }
  
  // 5. Fast Path for Threads
  async switchThread(nextThread: Thread): Promise<void> {
    // Threads in same process: No page table switch
    // Only save/restore registers and stack pointer
    
    await this.saveRegisters(this.currentThread);
    await this.restoreRegisters(nextThread);
    
    // No memory context switch (faster!)
  }
  
  // 6. Hardware Support
  async useHardwareSupport(): Promise<void> {
    // Use CPU instructions for fast context switch
    // Some CPUs have optimized context switch instructions
    this.useFastContextSwitch();
  }
}

Optimizations:

  1. Lazy register saving: Only save modified registers
  2. ASID: Avoid TLB flush using Address Space ID
  3. CPU affinity: Keep process on same CPU (better cache)
  4. Thread switching: Faster (no page table switch)
  5. Hardware support: Use CPU optimizations
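To make the ASID optimization concrete, here is a toy simulation comparing a TLB that tags entries with an address space ID against one that flushes on every switch. The class names, the two four-page processes, and the frame numbers are all invented for illustration; real TLBs have limited capacity and hardware-specific replacement policies that this sketch ignores:

```python
class TaggedTLB:
    """Toy TLB whose entries are tagged with an address space ID (ASID)."""

    def __init__(self):
        self.entries = {}  # (asid, virtual_page) -> physical_frame
        self.current_asid = None
        self.hits = 0
        self.misses = 0

    def switch_to(self, asid):
        """Context switch: change the active ASID and keep every entry."""
        self.current_asid = asid

    def translate(self, virtual_page, page_table):
        """Return the physical frame, walking the page table on a miss."""
        key = (self.current_asid, virtual_page)
        if key in self.entries:
            self.hits += 1
        else:
            self.misses += 1
            self.entries[key] = page_table[virtual_page]
        return self.entries[key]


class FlushingTLB(TaggedTLB):
    """Same TLB without ASID support: every switch discards all entries."""

    def switch_to(self, asid):
        self.entries.clear()
        self.current_asid = asid


def run(tlb, rounds=3):
    """Alternate between two processes that each touch four pages."""
    page_tables = {
        1: {page: 100 + page for page in range(4)},
        2: {page: 200 + page for page in range(4)},
    }
    for _ in range(rounds):
        for asid, table in page_tables.items():
            tlb.switch_to(asid)
            for page in table:
                tlb.translate(page, table)
    return tlb.hits, tlb.misses


print("tagged  :", run(TaggedTLB()))    # (16, 8)
print("flushing:", run(FlushingTLB()))  # (0, 24)
```

With tagging, each process pays for its misses only once; with flushing, every switch starts from an empty TLB and every access misses.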

Failure Stories You'll Recognize

The Process Storm: A service was processing files. It created a new process for each file. Under normal load (10 files), this worked fine. But during a traffic spike (1000 files), the service created 1000 processes. The OS spent most of its time switching between processes instead of actually processing files. Performance degraded significantly. The service became slow. The fix? Use a process pool with a bounded size. Process files in batches. This reduced context switching overhead and improved performance.

The Thread Explosion: A service was using threads for I/O-bound work. Each request created a new thread. Under normal load, this worked. But during high traffic, the service created 10,000 threads. The OS spent significant time switching between threads. Also, threads share memory, so one buggy thread could corrupt data used by others. The fix? Use a thread pool with a bounded size, or use async I/O (event-driven) to avoid threads altogether.

The Cache Thrashing: A service was running many processes that accessed different memory regions. Frequent context switches caused cache misses—when the OS switched to a new process, its data wasn't in the CPU cache. This made the system slow. The fix? Use CPU affinity to keep processes on the same CPU cores, improving cache locality.
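A minimal sketch of the CPU affinity fix, assuming a Linux host: Python exposes the underlying system call as `os.sched_setaffinity`, which is not available on every platform. The helper name `pin_current_process` is made up for this example:

```python
import os


def pin_current_process(cpus):
    """Restrict this process to the given CPU ids; return the resulting set.

    Returns None where the platform does not expose sched_setaffinity
    (it is available on Linux but not, for example, on macOS or Windows).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    try:
        os.sched_setaffinity(0, cpus)  # pid 0 means "the calling process"
    except OSError:
        pass  # e.g. the requested CPUs are outside the allowed set
    return set(os.sched_getaffinity(0))


if hasattr(os, "sched_getaffinity"):
    allowed = set(os.sched_getaffinity(0))
    print("allowed CPUs: ", sorted(allowed))
    print("after pinning:", sorted(pin_current_process({min(allowed)})))
    pin_current_process(allowed)  # restore the original mask
```

Pinning trades scheduling flexibility for cache locality, so it tends to pay off for long-running, cache-sensitive workers rather than short-lived tasks.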

What Interviewers Are Really Testing

They want to hear you talk about context switching overhead, process vs thread switches, and optimization strategies. Junior engineers say "more processes = better parallelism." Senior engineers say "context switching has overhead. Too many processes can hurt performance. Use process/thread pools with bounded sizes. Understand the difference between process and thread switches."

When they ask "How would you optimize a system with many processes?", they're testing:

  • Do you understand context switching overhead?

  • Do you know when to use processes vs threads?

  • Can you design systems that minimize unnecessary switches?

How InterviewCrafted Will Teach This

We'll teach this through production failures, not definitions. Instead of memorizing "context switching saves and restores process state," you'll learn through scenarios like "why did our system become slow when we created 1000 processes?"

You'll see how context switching affects system performance and design. When an interviewer asks "how would you optimize a system with many processes?", you'll think about switching overhead, process pools, and optimization strategies—not just "create more processes."

Key Takeaways

Context switching: Save current process state, restore next process state

PCB: Process Control Block stores process state (registers, PC, memory info)

When it occurs: Time slice expired, I/O wait, higher priority, termination

Cost: 1-10 microseconds overhead

Steps: Save state → Update states → Switch memory → Restore state → Transfer control

Optimization: Lazy register saving, ASID for TLB, CPU affinity, thread switching

Best practices: Minimize switches, optimize register saving, use hardware support


About the author

InterviewCrafted helps you master system design with patience. We believe in curiosity-led engineering, reflective writing, and designing systems that make future changes feel calm.