Process vs Thread

Understand the fundamental differences between processes and threads: isolation, memory sharing, context switching, and when to use each.

Beginner · 10 min read

Why This Matters

Think of processes as separate apartments and threads as roommates sharing an apartment. If one roommate (thread) breaks something, everyone in that apartment is affected. But if someone in a different apartment (process) has a problem, your apartment is fine.

This distinction matters because it determines how your system fails. When a thread crashes, it can take down the entire process. When a process crashes, other processes keep running. This is why microservices run as separate processes—one service crashing doesn't kill the others.

In interviews, when someone asks "How would you design a system that handles 10,000 concurrent requests?", they're testing whether you understand this trade-off. Do you use processes for isolation, or threads for efficiency? The answer depends on what you're optimizing for.

What Engineers Usually Get Wrong

Most engineers think "threads are faster, so use threads." But that's missing the point. Threads are faster to create and switch between, but they share memory. This means one buggy thread can corrupt data used by other threads. Processes are slower to create, but they're isolated. One process crashing doesn't affect others.

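Engineers also often confuse "concurrency" with "parallelism." Concurrency means several tasks are in progress at once; parallelism means they execute at the same instant, which requires multiple CPU cores. On a single-core machine, threads (and processes) just take turns. On a multi-core machine, both threads and processes can run in parallel, unless the runtime prevents it, as CPython's Global Interpreter Lock does for threads running Python bytecode.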

How This Breaks Systems in the Real World

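A Java web service was handling requests by creating a new thread for each request. Under normal load, this worked fine. But during a traffic spike, the service tried to create 50,000 threads. The process hit the operating system's thread limit. On Linux the ceiling depends on settings such as kernel.threads-max, kernel.pid_max (historically 32,768 by default), and the per-user ulimit -u, as well as the memory available for thread stacks. The JVM threw OutOfMemoryError with "unable to create native thread" messages, and the service went down.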

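The fix? Use a thread pool with a bounded size (say, 1000 threads), so excess requests queue up or are rejected instead of spawning new threads. The lesson: threads are a finite resource, and creating them without a bound turns a traffic spike into an outage. Separately, because threads share memory, one buggy thread can corrupt data used by others. This is why race conditions are so dangerous: they're hard to reproduce and can cause data corruption.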
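
A minimal sketch of a bounded pool, using Python's standard `concurrent.futures.ThreadPoolExecutor`. The handler, the bound of 8, and the 200 simulated requests are illustrative assumptions, not values from the incident above.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 8            # hypothetical bound; tune for your workload
active = 0                 # handlers running right now
peak = 0                   # highest value `active` ever reached
state_lock = threading.Lock()

def handle_request(request_id):
    """Stand-in for real request handling (assumed to be I/O-bound)."""
    global active, peak
    with state_lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.01)       # simulate waiting on a database or network call
    with state_lock:
        active -= 1
    return request_id

# 200 requests arrive, but at most MAX_WORKERS threads ever exist;
# the rest wait in the executor's internal queue.
with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    results = list(pool.map(handle_request, range(200)))

print(f"handled {len(results)} requests, peak concurrency {peak}")
```

The executor's internal queue is itself unbounded, so a real service would probably also cap or reject queued work once the backlog grows too large.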

Another story: A Python service was doing CPU-heavy processing of files. It used threads because "threads are faster." But standard CPython has the Global Interpreter Lock (GIL), which means only one thread can execute Python bytecode at a time. So the threads weren't running in parallel; they were just taking turns, and the service was slow. (Threads still help when the work is mostly waiting on I/O, because the GIL is released during blocking calls.) The fix? Use processes instead. Each process has its own Python interpreter and its own GIL, so they can run in parallel on multiple CPU cores.
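
To make "each process has its own Python interpreter" concrete, here is a small sketch that starts one interpreter per job with the standard `subprocess` module. The sum-of-squares job is a hypothetical stand-in for the file processing in the story; in practice `multiprocessing.Pool` or `concurrent.futures.ProcessPoolExecutor` is the more usual tool.

```python
import subprocess
import sys

# Hypothetical CPU-bound job: sum of squares below n, run in a fresh interpreter.
CHILD_CODE = "import sys; n = int(sys.argv[1]); print(sum(i * i for i in range(n)))"

def run_in_separate_interpreters(inputs):
    """Start one Python process per input, then collect each result."""
    children = [
        subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE, str(n)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for n in inputs
    ]
    results = []
    for child in children:
        out, _ = child.communicate()   # wait for the child and read its output
        results.append(int(out))
    return results

results = run_in_separate_interpreters([10, 100, 1000])
print(results)   # [285, 328350, 332833500]
```

All children are started before any result is read, so on a multi-core machine the operating system is free to run them at the same time.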


Process

Definition: An independent program in execution with its own memory space.

Characteristics:

  • Isolated memory: Each process has its own address space
  • Independent execution: Processes don't share memory (by default)
  • Heavyweight: Higher overhead for creation and context switching
  • Process ID (PID): Unique identifier for each process
  • Protection: One process cannot directly access another's memory

Memory Layout:

Process A                    Process B
┌─────────────┐             ┌─────────────┐
│   Stack     │             │   Stack     │
│   Heap      │             │   Heap      │
│   Data      │             │   Data      │
│   Code      │             │   Code      │
└─────────────┘             └─────────────┘
     ↓                           ↓
  Separate                      Separate
  Address Space              Address Space

Process Creation Example (Python)

import os
import multiprocessing

def worker_process(name):
    """Worker function for a process"""
    print(f"Process {name} (PID: {os.getpid()})")
    # Each process has its own memory space
    data = [1, 2, 3]  # Isolated to this process
    print(f"Process {name} data: {data}")

if __name__ == '__main__':
    # Create processes
    p1 = multiprocessing.Process(target=worker_process, args=('A',))
    p2 = multiprocessing.Process(target=worker_process, args=('B',))
    
    p1.start()
    p2.start()
    
    p1.join()
    p2.join()

Output (PIDs will differ, and lines from the two processes may interleave):

Process A (PID: 1234)
Process A data: [1, 2, 3]
Process B (PID: 1235)
Process B data: [1, 2, 3]

Thread

Definition: A lightweight unit of execution within a process that shares the process's memory space.

Characteristics:

  • Shared memory: All threads in a process share the same address space
  • Lightweight: Lower overhead for creation and context switching
  • Thread ID (TID): Unique identifier within a process
  • Communication: Threads can directly access shared memory
  • Synchronization needed: Requires locks, mutexes to prevent race conditions

Memory Layout:

Process
┌─────────────────────────────────┐
│         Shared Memory            │
│  ┌─────────┐  ┌─────────┐       │
│  │ Thread1 │  │ Thread2 │       │
│  │  Stack  │  │  Stack  │       │
│  └─────────┘  └─────────┘       │
│         Shared Heap              │
│         Shared Data              │
│         Shared Code              │
└─────────────────────────────────┘

Thread Creation Example (Python)

import threading

shared_data = []  # Shared by all threads

def worker_thread(name):
    """Worker function for a thread"""
    print(f"Thread {name} (TID: {threading.current_thread().ident})")
    # All threads share the same memory
    shared_data.append(name)
    print(f"Shared data: {shared_data}")

# Create threads
t1 = threading.Thread(target=worker_thread, args=('A',))
t2 = threading.Thread(target=worker_thread, args=('B',))

t1.start()
t2.start()

t1.join()
t2.join()

print(f"Final shared data: {shared_data}")

Output (thread IDs will differ, and line order may vary between runs):

Thread A (TID: 140234567890432)
Shared data: ['A']
Thread B (TID: 140234567891456)
Shared data: ['A', 'B']
Final shared data: ['A', 'B']

Thread Synchronization Example

import threading

counter = 0
lock = threading.Lock()  # Mutex for synchronization

def increment():
    global counter
    for _ in range(100000):
        with lock:  # Acquire lock
            counter += 1
        # Lock released automatically

# Create threads
t1 = threading.Thread(target=increment)
t2 = threading.Thread(target=increment)

t1.start()
t2.start()

t1.join()
t2.join()

print(f"Final counter: {counter}")  # Should be 200000

Key Differences

Aspect | Process | Thread
Memory | Isolated address space | Shared address space
Communication | IPC (pipes, sockets, shared memory) | Shared memory (direct access)
Overhead | High (separate memory, resources) | Low (shared memory)
Context Switch | More expensive (switches address space; TLB and cache misses follow) | Cheaper (same address space; save/restore registers)
Fault Isolation | One process crash doesn't affect others | One thread crash can affect all threads
Creation Time | Slow | Fast
Synchronization | Needed only for explicitly shared resources (files, shared memory) | Required (shared memory)

When to Use Processes vs Threads

Use Processes When:

  • Fault isolation needed: One failure shouldn't crash the entire system
  • CPU-bound tasks: Parallel computation on multiple CPUs
  • Independent tasks: Tasks don't need to share data
  • Security: Isolated execution environments

Example: A prefork web server in which each worker process handles requests independently (for example, Apache's prefork MPM)

Use Threads When:

  • I/O-bound tasks: Waiting for network, disk I/O
  • Shared data: Tasks need to share memory efficiently
  • Lightweight concurrency: Many concurrent tasks
  • GUI applications: Responsive UI while processing

Example: A threaded web server in which each request is handled by a thread from a pool (for example, Apache's worker MPM)
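
To see why threads suit I/O-bound work, here is a small sketch in which `time.sleep` stands in for a blocking network call. The 0.2 second latency and the `fetch` helper are assumptions made for illustration.

```python
import threading
import time

def fetch(results, index):
    """Stand-in for a blocking network call (hypothetical 0.2 s latency)."""
    time.sleep(0.2)                 # this thread is blocked; others can run
    results[index] = f"response-{index}"

def fetch_all(n):
    """Run n fetches on n threads and return the results and elapsed time."""
    results = [None] * n
    threads = [threading.Thread(target=fetch, args=(results, i)) for i in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start

results, elapsed = fetch_all(5)
print(results)
print(f"elapsed: {elapsed:.2f}s")   # close to 0.2 s, not 5 x 0.2 s
```

The waits overlap, so total time is close to the slowest single call. The same code would gain nothing if `fetch` were pure computation under CPython's GIL.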


Context Switching

Process Context Switch

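Save: Outgoing process state
  - CPU registers
  - Stack pointer and program counter
  - Scheduling state in the process control block (PCB)
  
Restore: New process state
  - Switch to the new address space (page tables)
  - Flush TLB (Translation Lookaside Buffer) entries, unless the CPU tags them per address space
  - Restore registers
  
Cost: Higher (microseconds, plus cache and TLB misses afterward)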

Thread Context Switch

Save: Thread-specific state
  - CPU registers
  - Stack pointer
  - Thread control block (TCB)
  
Restore: New thread state
  - Same memory space (no TLB flush)
  - Restore registers
  
Cost: Lower (from a fraction of a microsecond to about a microsecond)

Performance (rough orders of magnitude; actual cost depends on hardware, OS, and cache effects):

  • Process context switch: ~1-10 microseconds
  • Thread context switch: ~0.1-1 microsecond
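
The figures above vary a lot between machines. One way to at least observe how often your own process is switched out is the standard `resource` module. This is a sketch that assumes a Unix-like system; it counts switches and does not measure their cost.

```python
import resource   # Unix-only module
import time

def context_switches():
    """Return (voluntary, involuntary) context switches for this process so far."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw

before = context_switches()
for _ in range(50):
    time.sleep(0.001)     # each blocking sleep gives up the CPU voluntarily
after = context_switches()

print(f"voluntary:   {after[0] - before[0]}")
print(f"involuntary: {after[1] - before[1]}")
```

Voluntary switches happen when the process blocks (sleep, I/O, locks); involuntary ones happen when the scheduler preempts it. A high involuntary count is often a hint that more threads are runnable than there are cores.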

Failure Stories You'll Recognize

The Thread Pool Exhaustion: A service created threads on demand. During a traffic spike, it tried to create 10,000 threads. The OS limit was 8,000. The service crashed. The fix? Use a bounded thread pool.

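The Race Condition That Corrupted Data: A service used threads to process orders. Two threads tried to update the same order simultaneously. One thread's changes overwrote the other's. Orders were lost. The fix? Serialize updates to the same order with a lock (or a database transaction), or give each order a single owner so no two workers update it at once. Moving to processes removes shared memory, but updates to a shared database still need coordination.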
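
A minimal sketch of the lock-based fix. The in-memory `orders` dictionary and the `add_items` helper are hypothetical, and the short sleep only exists to widen the window in which a race would occur.

```python
import threading
import time

orders = {"order-1": {"quantity": 0}}     # hypothetical in-memory order store
order_lock = threading.Lock()

def add_items(order_id, count):
    """Read-modify-write on a shared order, made atomic by the lock."""
    with order_lock:
        current = orders[order_id]["quantity"]            # read
        time.sleep(0.001)                                 # widen the race window
        orders[order_id]["quantity"] = current + count    # write

threads = [threading.Thread(target=add_items, args=("order-1", 1)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(orders["order-1"]["quantity"])   # 20 with the lock in place
```

Without the `with order_lock:` line, several threads can read the same `current` value and overwrite each other's writes, which is the lost-update pattern from the story.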

The Process Fork Bomb: A script accidentally called itself recursively, spawning new processes without end. Within seconds, the server had thousands of processes competing for CPU time. The system became unresponsive. The fix? Kill the parent process, or reboot, and cap the number of processes per user (for example with ulimit -u or a cgroup pids.max limit) so it can't happen again. The lesson? Even though a single process is quick to create, unbounded process creation will kill your system.

What Interviewers Are Really Testing

They want to hear you think about isolation, memory sharing, and when to use each. Junior engineers say "use threads for concurrency." Senior engineers say "processes for isolation and CPU-bound work, threads for I/O-bound work, and always use pools with bounds."

When they ask "How would you design a concurrent web server?", they're testing:

  • Do you understand that threads share memory and need synchronization?
  • Do you know that processes are isolated but have higher overhead?
  • Can you choose the right tool for the job?

Interview Questions

Beginner

Q: What is the difference between a process and a thread?

A:

Process:

  • Independent program in execution
  • Has its own isolated memory space
  • Heavyweight (higher overhead)
  • Process ID (PID) for identification
  • One process crash doesn't affect others
  • Communication via IPC (Inter-Process Communication)

Thread:

  • Lightweight unit of execution within a process
  • Shares memory space with other threads in the same process
  • Lightweight (lower overhead)
  • Thread ID (TID) for identification
  • One thread crash can affect all threads in the process
  • Communication via shared memory (direct access)

Key Difference:

  • Process: Isolated memory, independent execution
  • Thread: Shared memory, requires synchronization

Example:

Process: Like separate apartments (isolated)
Thread: Like roommates sharing one apartment (shared)

Intermediate

Q: When would you use processes vs threads? Explain with examples.

A:

Use Processes When:

  1. Fault Isolation

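    # Web server: each request is handled in its own process
    # If one handler crashes, the others continue
    # (handle_request and requests are placeholders; bound the count in production)
    import multiprocessing

    for request in requests:
        process = multiprocessing.Process(target=handle_request, args=(request,))
        process.start()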
    
  2. CPU-Bound Tasks (Python)

    # Parallel computation on multiple CPUs
    # CPython's GIL limits threads for CPU-bound work, so use processes
    # (compute_heavy_task and data are placeholders)
    import multiprocessing

    with multiprocessing.Pool() as pool:
        results = pool.map(compute_heavy_task, data)
    
  3. Independent Tasks

    # Tasks don't need to share data
    # Each process has its own memory
    import multiprocessing

    processes = []
    for task in independent_tasks:
        p = multiprocessing.Process(target=task)
        p.start()
        processes.append(p)
    for p in processes:
        p.join()  # Wait for every task to finish
    

Use Threads When:

  1. I/O-Bound Tasks

    # Network requests, file I/O
    # A blocked thread waits on I/O while the others keep running
    import threading

    threads = []
    for url in urls:
        t = threading.Thread(target=fetch_url, args=(url,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    
  2. Shared Data

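    # Multiple threads work on a shared, thread-safe data structure
    import queue
    import threading

    shared_queue = queue.Queue()  # Queue handles its own locking
    
    producer = threading.Thread(target=produce, args=(shared_queue,))
    consumer = threading.Thread(target=consume, args=(shared_queue,))
    producer.start()
    consumer.start()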
    
  3. GUI Applications

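    # Keep UI responsive while processing
    import threading

    def process_data():
        # Long-running task runs off the UI thread
        result = heavy_computation()
        # Most GUI toolkits require UI updates on the main thread,
        # so update_ui is assumed to hand the result back safely
        update_ui(result)
    
    thread = threading.Thread(target=process_data)
    thread.start()  # UI remains responsive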
    

Rule of Thumb:

  • Processes: CPU-bound, fault isolation, independent tasks
  • Threads: I/O-bound, shared data, lightweight concurrency

Senior

Q: Design a concurrent web server that handles 10,000 concurrent connections. Should you use processes or threads? How do you handle context switching, memory management, and fault isolation?

A:

Hybrid Approach: Process Pool + Thread Pool

import * as os from "os";

// Sketch only: ProcessPool, WorkerProcess, Request, and Response are
// hypothetical types, not a real library API.
interface Request { path: string }
interface Response { status: number; body: string }

interface WorkerProcess {
  // Runs the handler on one of this worker's pooled threads
  handleInThread(request: Request): Promise<Response>;
}

interface ProcessPool {
  // Returns the next worker (for example, round-robin)
  getProcess(): WorkerProcess;
}

interface PoolOptions {
  processes: number;         // worker processes, for fault isolation
  threadsPerProcess: number; // bounded thread pool inside each worker
  queueSize: number;         // pending requests allowed before rejecting
}

class ConcurrentWebServer {
  private processPool: ProcessPool;

  constructor(createPool: (options: PoolOptions) => ProcessPool) {
    // Hybrid approach: Processes for isolation, threads for I/O
    this.processPool = createPool({
      processes: os.cpus().length,  // One process per CPU
      threadsPerProcess: 1000,      // 1000 threads per process
      queueSize: 10000
    });
  }

  async handleRequest(request: Request): Promise<Response> {
    // 1. Assign to process (load balanced)
    const worker = this.processPool.getProcess();

    // 2. Handle in that worker's thread pool (I/O: database, network)
    return worker.handleInThread(request);
  }
}

Design Decisions:

  1. Hybrid Approach: Processes for isolation, threads for I/O

    • Processes: Fault isolation, one per CPU core
    • Threads: I/O concurrency, many per process
  2. Context Switching Optimization

    • Use epoll/kqueue (event-driven I/O)
    • Minimize context switches
    • Thread pool to reuse threads
  3. Memory Management

    • Each process: Isolated memory (crash doesn't affect others)
    • Shared memory: Only for connection state (if needed)
    • Connection pooling: Reuse connections
  4. Fault Isolation

    • Process crash: Only affects connections in that process
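    • Thread crash: An exception caught in the handler affects only that request; a fatal fault (segfault, memory corruption) takes down every connection in that process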
    • Health checks: Restart failed processes
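
The "restart failed processes" idea can be sketched as a tiny supervisor loop. The worker here is a hypothetical one-liner that fails on its first two attempts; a real server would more likely rely on a process manager such as systemd or a container orchestrator.

```python
import subprocess
import sys

# Hypothetical worker: "crashes" (non-zero exit) on its first two attempts.
WORKER_CODE = "import sys; sys.exit(0 if int(sys.argv[1]) >= 2 else 1)"

def supervise(max_restarts=5):
    """Restart the worker until it exits cleanly or the restart budget runs out."""
    for attempt in range(max_restarts + 1):
        result = subprocess.run([sys.executable, "-c", WORKER_CODE, str(attempt)])
        if result.returncode == 0:
            return attempt          # number of restarts that were needed
        print(f"worker exited with {result.returncode}, restarting")
    raise RuntimeError("worker kept failing; giving up")

restarts = supervise()
print(f"worker healthy after {restarts} restarts")
```

The restart budget matters: a worker that crashes on startup every time should be reported, not restarted forever.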

Alternative: Event-Driven (Node.js style)

// Single-threaded event loop
// Handles 10,000 connections with async I/O
// Very little thread context-switching overhead
// But: One crash affects all connections, and CPU-heavy work blocks the loop

Trade-offs:

  • Processes + Threads: Better fault isolation, higher overhead
  • Event-driven: Lower overhead, less fault isolation
  • Hybrid: Balance of both
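
The event-driven alternative can be sketched in Python with the standard `asyncio` module. The 10 ms sleep stands in for real network I/O, and `handle_connection` is a hypothetical handler, so this only shows the shape of the approach.

```python
import asyncio
import threading

async def handle_connection(conn_id):
    """Simulate one connection that waits on I/O (hypothetical 10 ms delay)."""
    await asyncio.sleep(0.01)          # yields to the event loop while "waiting"
    return conn_id, threading.get_ident()

async def serve(n_connections):
    """Run all handlers concurrently on the single event-loop thread."""
    tasks = [handle_connection(i) for i in range(n_connections)]
    return await asyncio.gather(*tasks)

results = asyncio.run(serve(10_000))
thread_ids = {tid for _, tid in results}
print(len(results), "connections handled on", len(thread_ids), "thread")
```

Every handler runs on the same thread, which is why a single blocking or CPU-heavy call inside a handler stalls all 10,000 connections.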

Examples

Example 1: Process Isolation

Scenario: Web server with multiple services

Using processes:

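# Each service is a separate process
# (serve_web, serve_api, and serve_db are placeholders)
from multiprocessing import Process

web_server = Process(target=serve_web)
api_server = Process(target=serve_api)
db_server = Process(target=serve_db)
for service in (web_server, api_server, db_server):
    service.start()

# If web_server crashes, api_server and db_server continue
# Isolated memory, independent execution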

Using threads:

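# All services in same process
from threading import Thread

web_thread = Thread(target=serve_web)
api_thread = Thread(target=serve_api)
db_thread = Thread(target=serve_db)
for service in (web_thread, api_thread, db_thread):
    service.start()

# If web_thread hits a fatal fault (for example, a segfault in native code),
# the entire process dies and takes api_thread and db_thread with it
# Shared memory, one crash affects all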
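
Here is a runnable sketch of the process side of this comparison. The two "services" are hypothetical one-line programs started with the standard `subprocess` module: one aborts hard, the other finishes normally, and the parent survives both.

```python
import subprocess
import sys

# Hypothetical services: one aborts hard, the other finishes normally.
CRASHING_SERVICE = "import os; os.abort()"
HEALTHY_SERVICE = "print('api ok')"

def run_service(code):
    """Run a service in its own process and report how it ended."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )

crashed = run_service(CRASHING_SERVICE)
healthy = run_service(HEALTHY_SERVICE)

print("crashed service exit code:", crashed.returncode)   # non-zero
print("healthy service output:", healthy.stdout.strip())
print("parent process is still running")
```

If `os.abort()` were called from a thread inside the parent instead, the parent and all of its other threads would die with it.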

Example 2: Context Switching Cost

Process context switch:

Save: CPU registers; switch page tables (address space)
Time: ~1-10 microseconds

Thread context switch:

Save: CPU registers only (same memory space)
Time: ~0.1-1 microsecond

Performance: Thread switching is often several times faster. These figures are illustrative and vary by hardware and OS.

Example 3: Memory Sharing

Processes (isolated):

# Process A
data = [1, 2, 3]  # In Process A's memory

# Process B
data = [4, 5, 6]  # In Process B's memory (different)

# No sharing, must use IPC
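
"Must use IPC" can be shown with a pipe. In this sketch the parent serializes a list as JSON, sends it to a child process over stdin, and reads the reply from stdout; the child program and the JSON format are illustrative choices.

```python
import json
import subprocess
import sys

# The child runs in its own address space; it only sees what arrives on stdin.
CHILD_CODE = """
import json, sys
data = json.load(sys.stdin)      # the child's private copy
data.append(4)                   # changes only the child's copy
json.dump(data, sys.stdout)
"""

def send_to_child(data):
    """Serialize data, pipe it to a child process, and read back its reply."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=json.dumps(data),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)

parent_data = [1, 2, 3]
child_reply = send_to_child(parent_data)
print(parent_data)   # [1, 2, 3]: the parent's list is untouched
print(child_reply)   # [1, 2, 3, 4]: the child's copy came back over the pipe
```

The copy and the serialization are the price of isolation: nothing the child does can corrupt the parent's list.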

Threads (shared):

# Shared data
shared_data = [1, 2, 3]

# Thread A
shared_data.append(4)  # Modifies shared memory

# Thread B
shared_data.append(5)  # Sees Thread A's changes

# Requires synchronization (locks)

Common Pitfalls

Pitfall 1: Using threads for everything

  • Problem: Threads share memory, one buggy thread can crash entire process
  • Solution: Use processes for fault isolation, threads for I/O concurrency
  • Example: Using threads for independent services (one crash kills all)

Pitfall 2: Creating too many processes

  • Problem: Process creation and context switching are expensive
  • Solution: Use process pools, limit process count, use threads for I/O
  • Example: Creating 10,000 processes exhausts system resources

Pitfall 3: Not synchronizing threads

  • Problem: Threads share memory, race conditions cause data corruption
  • Solution: Use mutexes, semaphores, or lock-free data structures
  • Example: Multiple threads modifying shared counter without locks

Pitfall 4: Confusing concurrency with parallelism

  • Problem: Threads provide concurrency, but parallelism requires multiple CPUs
  • Solution: Understand that threads on single CPU just take turns
  • Example: Creating many threads on single-core CPU doesn't improve performance

Pitfall 5: Ignoring resource limits

  • Problem: System has limits on processes and threads
  • Solution: Monitor resource usage, use pools, understand system limits
  • Example: Hitting the process or thread limit (on Linux, governed by settings such as kernel.pid_max, historically 32,768 by default)
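
A quick way to check the limits on your own machine, as a sketch that assumes a Unix-like system. The `/proc/sys/kernel` files exist only on Linux, so the helper returns None elsewhere.

```python
import resource   # Unix-only module
from pathlib import Path

def read_kernel_setting(name):
    """Return a Linux kernel setting as an int, or None if it is unavailable."""
    path = Path("/proc/sys/kernel") / name
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None

# Per-user cap on processes and threads as (soft, hard);
# resource.RLIM_INFINITY means unlimited
soft_nproc, hard_nproc = resource.getrlimit(resource.RLIMIT_NPROC)

limits = {
    "RLIMIT_NPROC soft": soft_nproc,
    "RLIMIT_NPROC hard": hard_nproc,
    "kernel.threads-max": read_kernel_setting("threads-max"),
    "kernel.pid_max": read_kernel_setting("pid_max"),
}
for name, value in limits.items():
    print(f"{name}: {value}")
```

Whichever of these limits is lowest is the one a runaway service hits first, so size thread and process pools well below it.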

How InterviewCrafted Will Teach This

We'll teach this through production failures, not definitions. Instead of memorizing "a process is an instance of a running program," you'll learn through scenarios like "what happens when your service tries to create 10,000 threads?"

You'll see how the choice between processes and threads affects system reliability, performance, and debugging. When an interviewer asks "how would you design a concurrent system?", you'll think about isolation, memory sharing, and resource limits—not just "use threads."

Key Takeaways

Process: Isolated memory space, independent execution, heavyweight, fault isolation

Thread: Shared memory space, lightweight, requires synchronization, one crash can affect all

Use processes for: CPU-bound tasks, fault isolation, independent tasks

Use threads for: I/O-bound tasks, shared data, lightweight concurrency

Context switching: A process switch is more expensive (it changes the address space), a thread switch is cheaper (same address space; save/restore registers)

Communication: Processes use IPC, threads use shared memory

Synchronization: Threads need locks/mutexes; processes need coordination only for resources they explicitly share

Best practice: Use processes for isolation, threads for I/O concurrency, hybrid approach for high-performance servers


About the author

InterviewCrafted helps you master system design with patience. We believe in curiosity-led engineering, reflective writing, and designing systems that make future changes feel calm.