Topic Overview
Process vs Thread
Understand the fundamental differences between processes and threads: isolation, memory sharing, context switching, and when to use each.
Why This Matters
Think of processes as separate apartments and threads as roommates sharing an apartment. If one roommate (thread) breaks something, everyone in that apartment is affected. But if someone in a different apartment (process) has a problem, your apartment is fine.
This distinction matters because it determines how your system fails. When a thread crashes, it can take down the entire process. When a process crashes, other processes keep running. This is why microservices run as separate processes—one service crashing doesn't kill the others.
In interviews, when someone asks "How would you design a system that handles 10,000 concurrent requests?", they're testing whether you understand this trade-off. Do you use processes for isolation, or threads for efficiency? The answer depends on what you're optimizing for.
What Engineers Usually Get Wrong
Most engineers think "threads are faster, so use threads." But that's missing the point. Threads are faster to create and switch between, but they share memory. This means one buggy thread can corrupt data used by other threads. Processes are slower to create, but they're isolated. One process crashing doesn't affect others.
Also, engineers often confuse "concurrency" with "parallelism." Concurrency means multiple tasks are in progress at once; parallelism means they execute at the same instant, which requires multiple CPU cores. On a single-core machine, threads just take turns. On a multi-core machine, both threads and processes can run in parallel, unless the runtime prevents it (as CPython's GIL does for threads executing Python code).
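To make the concurrency half of that distinction concrete, here is a minimal sketch in which `time.sleep` stands in for an I/O wait. The function names and delays are illustrative assumptions. The threaded version finishes in roughly the time of one wait rather than the sum, because the waits overlap even when no two threads compute at the same instant.

```python
import threading
import time


def fake_io_task(delay: float) -> None:
    """Stand-in for a network or disk wait; a sleeping thread does not use the CPU."""
    time.sleep(delay)


def run_sequential(n_tasks: int, delay: float) -> float:
    """Run the waits one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n_tasks):
        fake_io_task(delay)
    return time.perf_counter() - start


def run_threaded(n_tasks: int, delay: float) -> float:
    """Run each wait in its own thread and return the elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io_task, args=(delay,)) for _ in range(n_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sequential: {run_sequential(5, 0.2):.2f}s")
    print(f"threaded:   {run_threaded(5, 0.2):.2f}s")
```

This speedup comes from overlapping waits, not from extra cores, so it would appear on a single-core machine too.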
How This Breaks Systems in the Real World
A Java web service was handling requests by creating a new thread for each request. Under normal load, this worked fine. But during a traffic spike, the service tried to create 50,000 threads. The OS refused to create more: the thread count is capped by settings such as kernel.threads-max, kernel.pid_max, and per-user limits (a ceiling of around 32,000 is common on Linux), and every thread also reserves stack memory. The JVM started throwing OutOfMemoryError with "unable to create native thread", and the service went down.
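The fix? Use a thread pool with a bounded size (say, 1000 threads), so excess requests wait in a queue or get rejected instead of spawning more threads. The lesson: threads are a finite resource, so cap them. A second, separate hazard is that threads share memory, so one buggy thread can corrupt data used by others. This is why race conditions are so dangerous: they're hard to reproduce and can cause data corruption.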
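A bounded pool along those lines might look like the sketch below. `MAX_WORKERS`, `ConcurrencyTracker`, `handle_request`, and `serve` are hypothetical names chosen for illustration, and the sleep stands in for real request handling. The tracker exists only to show that the number of requests in flight never exceeds the pool size.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 8  # assumed bound for illustration; size it from measurements


class ConcurrencyTracker:
    """Records the highest number of requests in flight at once."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def __enter__(self) -> "ConcurrencyTracker":
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        return self

    def __exit__(self, *exc) -> None:
        with self._lock:
            self._active -= 1


def handle_request(request_id: int, tracker: ConcurrencyTracker) -> int:
    with tracker:
        time.sleep(0.01)  # stand-in for real work
        return request_id


def serve(n_requests: int, max_workers: int = MAX_WORKERS):
    """Handle n_requests on at most max_workers threads; return results and peak concurrency."""
    tracker = ConcurrencyTracker()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda i: handle_request(i, tracker), range(n_requests)))
    return results, tracker.peak


if __name__ == "__main__":
    results, peak = serve(200)
    print(f"handled {len(results)} requests, peak concurrency {peak}")
```

Note that `ThreadPoolExecutor` bounds the threads but not its internal work queue, so a real server would also need to shed or reject load once the backlog grows too large.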
Another story: A Python service was doing CPU-heavy processing on files. It used threads because "threads are faster." But CPython has the Global Interpreter Lock (GIL), which means only one thread can execute Python bytecode at a time. So the threads weren't actually running in parallel; they were just taking turns. The service was slow. The fix? Use processes instead. Each process has its own Python interpreter and its own GIL, so they can actually run in parallel on multiple CPU cores.
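The effect can be observed with a small experiment, sketched here under the assumption of a standard CPython build with the GIL enabled. `count_primes` is a made-up CPU-bound workload and `timed_map` is a hypothetical helper; on a multi-core machine the process pool will usually finish faster, but the exact timings depend on the hardware.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit: int) -> int:
    """Deliberately CPU-heavy pure-Python work: count primes below limit."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed_map(executor_cls, inputs, workers: int = 4):
    """Run count_primes over inputs with the given executor; return results and seconds."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(count_primes, inputs))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    inputs = [30_000] * 4
    thread_results, thread_secs = timed_map(ThreadPoolExecutor, inputs)
    process_results, process_secs = timed_map(ProcessPoolExecutor, inputs)
    assert thread_results == process_results  # same answers, different wall time
    print(f"threads:   {thread_secs:.2f}s")
    print(f"processes: {process_secs:.2f}s")
```

Both executors share the same interface, which makes it cheap to try both on a representative workload before committing to one.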
Process
Definition: An independent program in execution with its own memory space.
Characteristics:
- Isolated memory: Each process has its own address space
- Independent execution: Processes don't share memory (by default)
- Heavyweight: Higher overhead for creation and context switching
- Process ID (PID): Unique identifier for each process
- Protection: One process cannot directly access another's memory
Memory Layout:
   Process A           Process B
┌─────────────┐     ┌─────────────┐
│    Stack    │     │    Stack    │
│    Heap     │     │    Heap     │
│    Data     │     │    Data     │
│    Code     │     │    Code     │
└─────────────┘     └─────────────┘
       ↓                   ↓
   Separate            Separate
 Address Space       Address Space
Process Creation Example (Python)
import os
import multiprocessing

def worker_process(name):
    """Worker function for a process"""
    print(f"Process {name} (PID: {os.getpid()})")
    # Each process has its own memory space
    data = [1, 2, 3]  # Isolated to this process
    print(f"Process {name} data: {data}")

if __name__ == '__main__':
    # Create processes
    p1 = multiprocessing.Process(target=worker_process, args=('A',))
    p2 = multiprocessing.Process(target=worker_process, args=('B',))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
Output (PIDs and line order may vary between runs):
Process A (PID: 1234)
Process A data: [1, 2, 3]
Process B (PID: 1235)
Process B data: [1, 2, 3]
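The example above gives each process its own local list, which does not quite prove isolation. The following sketch, with hypothetical names `counter`, `bump_and_report`, and `run_isolation_demo`, mutates a module-level variable in a child process and shows that the parent's copy is untouched. The child's value only reaches the parent through an explicit `multiprocessing.Queue`.

```python
import multiprocessing

counter = 0  # module-level state; each process works on its own copy


def bump_and_report(queue) -> None:
    """Runs in the child: change the child's copy, then send it to the parent."""
    global counter
    counter += 100
    queue.put(counter)


def run_isolation_demo():
    """Return (parent's counter, child's counter) after the child has run."""
    queue = multiprocessing.Queue()
    child = multiprocessing.Process(target=bump_and_report, args=(queue,))
    child.start()
    child_value = queue.get()  # explicit IPC is the only way the value crosses over
    child.join()
    return counter, child_value


if __name__ == "__main__":
    parent_value, child_value = run_isolation_demo()
    print(f"parent sees counter={parent_value}, child saw counter={child_value}")
```

Running it prints `parent sees counter=0, child saw counter=100`: the child's write never touched the parent's memory.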
Thread
Definition: A lightweight unit of execution within a process that shares the process's memory space.
Characteristics:
- Shared memory: All threads in a process share the same address space
- Lightweight: Lower overhead for creation and context switching
- Thread ID (TID): Unique identifier within a process
- Communication: Threads can directly access shared memory
- Synchronization needed: Requires locks, mutexes to prevent race conditions
Memory Layout:
Process
┌─────────────────────────────────┐
│          Shared Memory          │
│    ┌─────────┐   ┌─────────┐    │
│    │ Thread1 │   │ Thread2 │    │
│    │  Stack  │   │  Stack  │    │
│    └─────────┘   └─────────┘    │
│           Shared Heap           │
│           Shared Data           │
│           Shared Code           │
└─────────────────────────────────┘
Thread Creation Example (Python)
import threading

shared_data = []  # Shared by all threads

def worker_thread(name):
    """Worker function for a thread"""
    print(f"Thread {name} (TID: {threading.current_thread().ident})")
    # All threads share the same memory
    shared_data.append(name)
    print(f"Shared data: {shared_data}")

# Create threads
t1 = threading.Thread(target=worker_thread, args=('A',))
t2 = threading.Thread(target=worker_thread, args=('B',))
t1.start()
t2.start()
t1.join()
t2.join()
print(f"Final shared data: {shared_data}")
Output (thread IDs and interleaving may vary between runs):
Thread A (TID: 140234567890432)
Shared data: ['A']
Thread B (TID: 140234567891456)
Shared data: ['A', 'B']
Final shared data: ['A', 'B']
Thread Synchronization Example
import threading

counter = 0
lock = threading.Lock()  # Mutex for synchronization

def increment():
    global counter
    for _ in range(100000):
        with lock:  # Acquire lock
            counter += 1
        # Lock released automatically

# Create threads
t1 = threading.Thread(target=increment)
t2 = threading.Thread(target=increment)
t1.start()
t2.start()
t1.join()
t2.join()
print(f"Final counter: {counter}")  # Should be 200000
Key Differences
| Aspect | Process | Thread |
|---|---|---|
| Memory | Isolated address space | Shared address space |
| Communication | IPC (pipes, sockets, shared memory) | Shared memory (direct access) |
| Overhead | High (separate memory, resources) | Low (shared memory) |
| Context Switch | More expensive (switch address space; TLB and cache effects) | Cheaper (save/restore registers and stack pointer; same address space) |
| Fault Isolation | One process crash doesn't affect others | One thread crash can affect all threads |
| Creation Time | Slow | Fast |
| Synchronization | Not needed for private memory; needed for explicitly shared resources (shared memory, files) | Required (shared memory) |
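Isolation covers only a process's private memory. Once processes opt into shared memory, they need the same care as threads. The sketch below uses `multiprocessing.Value` to place an integer in shared memory and takes its built-in lock around each increment; `add_many` and `run_shared_counter` are hypothetical names.

```python
import multiprocessing


def add_many(shared_total, n: int) -> None:
    """Increment the shared integer n times."""
    for _ in range(n):
        # += on a shared Value is a read-modify-write, so hold its lock
        with shared_total.get_lock():
            shared_total.value += 1


def run_shared_counter(n_procs: int = 4, n_increments: int = 10_000) -> int:
    """Start n_procs processes that all increment one shared integer."""
    shared_total = multiprocessing.Value("i", 0)  # C int living in shared memory
    workers = [
        multiprocessing.Process(target=add_many, args=(shared_total, n_increments))
        for _ in range(n_procs)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return shared_total.value


if __name__ == "__main__":
    print(run_shared_counter())  # n_procs * n_increments when the lock is held
```

Dropping the `get_lock()` block would reintroduce the lost-update race that threads are known for, this time across process boundaries.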
When to Use Processes vs Threads
Use Processes When:
- Fault isolation needed: One failure shouldn't crash the entire system
- CPU-bound tasks: Parallel computation on multiple CPUs
- Independent tasks: Tasks don't need to share data
- Security: Isolated execution environments
Example: Web server that handles each request in its own worker process, so a crashing request cannot take down the others
Use Threads When:
- I/O-bound tasks: Waiting for network, disk I/O
- Shared data: Tasks need to share memory efficiently
- Lightweight concurrency: Many concurrent tasks
- GUI applications: Responsive UI while processing
Example: Web server that handles each request on a pooled thread, so many mostly waiting requests stay cheap
Context Switching
Process Context Switch
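Save: Outgoing process state
- CPU registers
- Program counter and stack pointer
- Scheduling state in the process control block (PCB)
(Memory mappings and open files are part of the process state the kernel tracks, but they are not copied on every switch.)
Restore: Incoming process state
- Switch to the new process's page tables (memory mappings)
- Flush the TLB (Translation Lookaside Buffer), unless the CPU tags entries per address space
- Restore registers
Cost: High (microseconds)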
Thread Context Switch
Save: Thread-specific state
- CPU registers
- Stack pointer
- Thread control block (TCB)
Restore: New thread state
- Same memory space (no TLB flush)
- Restore registers
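Cost: Lower (hundreds of nanoseconds to about a microsecond)
Performance (rough orders of magnitude; actual figures depend on hardware, kernel, and cache state):
- Process context switch: ~1-10 microseconds
- Thread context switch: ~0.1-1 microsecond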
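One rough way to get a feel for switching cost is a ping-pong test, where two threads hand a token back and forth and each handoff forces the scheduler to switch between them. This is only a sketch: `thread_ping_pong` is a hypothetical helper, and in CPython the result includes interpreter, GIL, and queue overhead, so treat it as a loose upper bound rather than a measurement of the kernel's switch time.

```python
import queue
import threading
import time


def thread_ping_pong(round_trips: int = 2_000) -> float:
    """Return the average seconds per handoff between two threads."""
    ping = queue.Queue()
    pong = queue.Queue()

    def echo() -> None:
        # Wait for a token on ping and send it straight back on pong
        for _ in range(round_trips):
            pong.put(ping.get())

    worker = threading.Thread(target=echo)
    worker.start()
    start = time.perf_counter()
    for i in range(round_trips):
        ping.put(i)
        pong.get()  # blocks until the worker has run and replied
    elapsed = time.perf_counter() - start
    worker.join()
    return elapsed / (round_trips * 2)  # two handoffs per round trip


if __name__ == "__main__":
    print(f"~{thread_ping_pong() * 1e6:.1f} microseconds per handoff")
```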
Failure Stories You'll Recognize
The Thread Pool Exhaustion: A service created threads on demand. During a traffic spike, it tried to create 10,000 threads. The OS limit was 8,000. The service crashed. The fix? Use a bounded thread pool.
The Race Condition That Corrupted Data: A service used threads to process orders. Two threads tried to update the same order simultaneously. One thread's changes overwrote the other's. Orders were lost. The fix? Use locks or use processes (which don't share memory).
The Process Fork Bomb: A script accidentally called itself recursively, spawning new processes in an infinite loop. Within seconds, the server had thousands of processes competing for CPU time. The system became unresponsive. The fix? Kill the whole process tree, or reboot. The lesson? A process costs more to create than a thread, but it is still cheap enough that unlimited process creation will kill your system. Cap it with per-user process limits (for example, ulimit -u).
What Interviewers Are Really Testing
They want to hear you think about isolation, memory sharing, and when to use each. Junior engineers say "use threads for concurrency." Senior engineers say "processes for isolation and CPU-bound work, threads for I/O-bound work, and always use pools with bounds."
When they ask "How would you design a concurrent web server?", they're testing:
- Do you understand that threads share memory and need synchronization?
- Do you know that processes are isolated but have higher overhead?
- Can you choose the right tool for the job?
Interview Questions
Beginner
Q: What is the difference between a process and a thread?
A:
Process:
- Independent program in execution
- Has its own isolated memory space
- Heavyweight (higher overhead)
- Process ID (PID) for identification
- One process crash doesn't affect others
- Communication via IPC (Inter-Process Communication)
Thread:
- Lightweight unit of execution within a process
- Shares memory space with other threads in the same process
- Lightweight (lower overhead)
- Thread ID (TID) for identification
- One thread crash can affect all threads in the process
- Communication via shared memory (direct access)
Key Difference:
- Process: Isolated memory, independent execution
- Thread: Shared memory, requires synchronization
Example:
Process: Like separate apartments (isolated)
Thread: Like roommates sharing one apartment (shared)
Intermediate
Q: When would you use processes vs threads? Explain with examples.
A:
Use Processes When:
- Fault Isolation
    # Web server: Each request = process
    # If one request crashes, others continue
    for request in requests:
        process = multiprocessing.Process(target=handle_request, args=(request,))
        process.start()
- CPU-Bound Tasks (Python)
    # Parallel computation on multiple CPUs
    # Python GIL limits threads, use processes
    with multiprocessing.Pool() as pool:
        results = pool.map(compute_heavy_task, data)
- Independent Tasks
    # Tasks don't need to share data
    # Each process has its own memory
    processes = []
    for task in independent_tasks:
        p = multiprocessing.Process(target=task)
        p.start()
        processes.append(p)
Use Threads When:
- I/O-Bound Tasks
    # Network requests, file I/O
    # Threads wait while I/O happens
    threads = []
    for url in urls:
        t = threading.Thread(target=fetch_url, args=(url,))
        t.start()
        threads.append(t)
- Shared Data
    # Multiple threads work on shared data structure
    shared_queue = queue.Queue()
    producer = threading.Thread(target=produce, args=(shared_queue,))
    consumer = threading.Thread(target=consume, args=(shared_queue,))
- GUI Applications
    # Keep UI responsive while processing
    def process_data():
        # Long-running task
        result = heavy_computation()
        # Most GUI toolkits require UI changes on the main thread;
        # update_ui is assumed to hand the result over safely
        update_ui(result)
    thread = threading.Thread(target=process_data)
    thread.start()
    # UI remains responsive
Rule of Thumb:
- Processes: CPU-bound, fault isolation, independent tasks
- Threads: I/O-bound, shared data, lightweight concurrency
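The shared-data fragment above leaves `produce` and `consume` undefined. A runnable sketch of that producer/consumer pattern could look like this, assuming a squaring step as the stand-in for real work and `None` as an end-of-stream marker (so the items themselves must not be `None`). `run_pipeline` is a hypothetical wrapper.

```python
import queue
import threading

SENTINEL = None  # tells the consumer that no more items are coming


def produce(q: queue.Queue, items) -> None:
    for item in items:
        q.put(item)  # blocks while the queue is full, which gives backpressure
    q.put(SENTINEL)


def consume(q: queue.Queue, results: list) -> None:
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        results.append(item * item)  # stand-in for real processing


def run_pipeline(items) -> list:
    """Push items through a bounded queue from one thread to another."""
    q = queue.Queue(maxsize=10)
    results = []
    producer = threading.Thread(target=produce, args=(q, items))
    consumer = threading.Thread(target=consume, args=(q, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(range(5)))  # [0, 1, 4, 9, 16]
```

`queue.Queue` does its own locking, so neither thread needs an explicit lock here; only the consumer ever touches `results`.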
Senior
Q: Design a concurrent web server that handles 10,000 concurrent connections. Should you use processes or threads? How do you handle context switching, memory management, and fault isolation?
A:
Hybrid Approach: Process Pool + Thread Pool
import * as os from 'os';

// Architectural sketch, not runnable as-is: ThreadPool, ProcessPool,
// ConnectionManager, Request, Response, acceptConnection, and processRequest
// are placeholders rather than classes or methods from a real library.
class ConcurrentWebServer {
  private threadPool: ThreadPool;
  private processPool: ProcessPool;
  private connectionManager: ConnectionManager;

  constructor() {
    // Hybrid approach: Processes for isolation, threads for I/O
    this.processPool = new ProcessPool({
      size: os.cpus().length, // One process per CPU
      strategy: 'prefork'
    });
    // Thread pool within each process
    this.threadPool = new ThreadPool({
      size: 1000, // 1000 threads per process
      queueSize: 10000
    });
    this.connectionManager = new ConnectionManager();
  }

  async handleRequest(request: Request): Promise<Response> {
    // 1. Accept connection (I/O-bound, use thread)
    const connection = await this.acceptConnection(request);
    // 2. Assign to process (load balanced)
    const process = this.processPool.getProcess();
    // 3. Handle in thread pool (I/O-bound)
    return await process.handleInThread(connection, async () => {
      // Process request (I/O: database, network)
      const response = await this.processRequest(connection);
      return response;
    });
  }
}
Design Decisions:
- Hybrid Approach: Processes for isolation, threads for I/O
  - Processes: Fault isolation, one per CPU core
  - Threads: I/O concurrency, many per process
- Context Switching Optimization
  - Use epoll/kqueue (event-driven I/O)
  - Minimize context switches
  - Thread pool to reuse threads
- Memory Management
  - Each process: Isolated memory (crash doesn't affect others)
  - Shared memory: Only for connection state (if needed)
  - Connection pooling: Reuse connections
- Fault Isolation
  - Process crash: Only affects connections in that process
  - Thread crash: A caught exception affects only that thread's connection; a hard fault (such as a segfault) takes down the whole process and every connection in it
  - Health checks: Restart failed processes
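The hybrid layout can be sketched in Python with the standard `concurrent.futures` executors. This is a simplification under stated assumptions: requests are plain integers split into batches up front, whereas a real prefork server would have each worker process accept connections from a shared listening socket. `THREADS_PER_PROCESS`, `handle_request`, `worker_process`, and `serve` are hypothetical names.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

THREADS_PER_PROCESS = 16  # assumed value for illustration


def handle_request(request_id: int):
    """Stand-in for I/O-bound request handling; reports which process ran it."""
    return request_id, os.getpid()


def worker_process(batch):
    """Runs inside one worker process and fans its batch out to threads."""
    with ThreadPoolExecutor(max_workers=THREADS_PER_PROCESS) as threads:
        return list(threads.map(handle_request, batch))


def serve(request_ids, n_processes=None):
    """Spread a list of request ids over processes, then threads within each."""
    n_processes = n_processes or os.cpu_count() or 1
    # Round-robin the requests into one batch per process
    batches = [request_ids[i::n_processes] for i in range(n_processes)]
    with ProcessPoolExecutor(max_workers=n_processes) as processes:
        return [pair for batch in processes.map(worker_process, batches) for pair in batch]


if __name__ == "__main__":
    handled = serve(list(range(100)))
    pids = {pid for _, pid in handled}
    print(f"{len(handled)} requests handled by {len(pids)} processes")
```

The process count bounds parallelism and the blast radius of a crash, while the per-process thread count bounds how many requests can wait on I/O at once.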
Alternative: Event-Driven (Node.js style)
// Single-threaded event loop
// Handles 10,000 connections with async I/O
// No thread context switches between connections
// But: One crash affects all connections
Trade-offs:
- Processes + Threads: Better fault isolation, higher overhead
- Event-driven: Lower overhead, less fault isolation
- Hybrid: Balance of both
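For comparison, the event-driven style can be sketched in Python with `asyncio`, which runs every connection on a single thread and switches between them only at `await` points. The echo protocol, `handle_client`, `echo_once`, and `demo` are assumptions made for illustration; binding to port 0 asks the OS for any free port.

```python
import asyncio


async def handle_client(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Echo one line back; each await lets other connections make progress."""
    line = await reader.readline()
    writer.write(line)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def echo_once(message: bytes, port: int) -> bytes:
    """Client side: send one line and return the server's reply."""
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(message)
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    return reply


async def demo(n_clients: int = 50):
    """Start an echo server and drive n_clients concurrent connections through it."""
    server = await asyncio.start_server(handle_client, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        messages = [f"hello {i}\n".encode() for i in range(n_clients)]
        replies = await asyncio.gather(*(echo_once(m, port) for m in messages))
    return messages, replies


if __name__ == "__main__":
    sent, received = asyncio.run(demo())
    print(f"echoed {len(received)} connections on one thread")
```

Because everything shares one thread, a handler that blocks or crashes the loop stalls every connection, which is the fault-isolation cost listed in the trade-offs.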
Examples
Example 1: Process Isolation
Scenario: Web server with multiple services
Using processes:
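from multiprocessing import Process

# Each service is a separate process
# (serve_web, serve_api, serve_db are assumed to be defined elsewhere)
web_server = Process(target=serve_web)
api_server = Process(target=serve_api)
db_server = Process(target=serve_db)
for service in (web_server, api_server, db_server):
    service.start()
# If web_server crashes, api_server and db_server continue
# Isolated memory, independent execution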
Using threads:
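from threading import Thread

# All services in same process
# (serve_web, serve_api, serve_db are assumed to be defined elsewhere)
web_thread = Thread(target=serve_web)
api_thread = Thread(target=serve_api)
db_thread = Thread(target=serve_db)
for service in (web_thread, api_thread, db_thread):
    service.start()
# If web_thread hits a hard fault (e.g. a segfault in native code), the entire process crashes
# In Python an uncaught exception ends only that thread, but shared state it corrupted stays corrupted
# Shared memory, one crash affects all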
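The process half of this comparison can be made runnable with a tiny supervisor. In the sketch below, `flaky_service` and `supervise` are hypothetical names, and `os._exit(1)` stands in for a hard crash. The parent survives the child's crash, reads its exit code, and restarts it.

```python
import multiprocessing
import os


def flaky_service(attempt: int) -> None:
    """Crash hard on the first attempt, exit cleanly afterwards."""
    if attempt == 0:
        os._exit(1)  # simulate a hard crash: exit at once, no cleanup


def supervise(max_restarts: int = 3) -> list:
    """Run the service, restarting it after a crash; return the exit codes seen."""
    exit_codes = []
    for attempt in range(max_restarts + 1):
        child = multiprocessing.Process(target=flaky_service, args=(attempt,))
        child.start()
        child.join()
        exit_codes.append(child.exitcode)  # the parent is still alive to observe this
        if child.exitcode == 0:
            break
    return exit_codes


if __name__ == "__main__":
    print(supervise())  # [1, 0]: one crash, then a clean run
```

A thread offers no equivalent of this: if it takes the process down, there is no surviving parent inside that process to restart it.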
Example 2: Context Switching Cost
Process context switch:
Save: CPU registers; switch page tables (memory mappings)
Time: ~1-10 microseconds
Thread context switch:
Save: CPU registers only (same memory space)
Time: ~0.1-1 microsecond
Performance: By these rough figures, thread switching is about 10x faster; real numbers vary with hardware, kernel, and cache state
Example 3: Memory Sharing
Processes (isolated):
# Process A
data = [1, 2, 3] # In Process A's memory
# Process B
data = [4, 5, 6] # In Process B's memory (different)
# No sharing, must use IPC
Threads (shared):
# Shared data
shared_data = [1, 2, 3]
# Thread A
shared_data.append(4) # Modifies shared memory
# Thread B
shared_data.append(5) # Sees Thread A's changes
# Requires synchronization (locks)
Common Pitfalls
Pitfall 1: Using threads for everything
- Problem: Threads share memory, one buggy thread can crash entire process
- Solution: Use processes for fault isolation, threads for I/O concurrency
- Example: Using threads for independent services (one crash kills all)
Pitfall 2: Creating too many processes
- Problem: Process creation and context switching are expensive
- Solution: Use process pools, limit process count, use threads for I/O
- Example: Creating 10,000 processes exhausts system resources
Pitfall 3: Not synchronizing threads
- Problem: Threads share memory, race conditions cause data corruption
- Solution: Use mutexes, semaphores, or lock-free data structures
- Example: Multiple threads modifying shared counter without locks
Pitfall 4: Confusing concurrency with parallelism
- Problem: Threads provide concurrency, but parallelism requires multiple CPUs
- Solution: Understand that threads on single CPU just take turns
- Example: Creating many threads on single-core CPU doesn't improve performance
Pitfall 5: Ignoring resource limits
- Problem: System has limits on processes and threads
- Solution: Monitor resource usage, use pools, understand system limits
- Example: Hitting the maximum process/thread count (often around 32,000 on Linux, depending on kernel.pid_max and per-user limits)
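The limits in question can be inspected before they bite. This sketch assumes a Unix-like system (the `resource` module is not available on Windows) and reads the two kernel settings from `/proc`, which exists only on Linux; `read_kernel_setting` and `current_limits` are hypothetical helper names.

```python
import resource  # Unix-only standard library module
from pathlib import Path


def read_kernel_setting(name: str):
    """Return a /proc/sys/kernel value as an int, or None where it is unavailable."""
    path = Path("/proc/sys/kernel") / name
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def current_limits() -> dict:
    """Collect the per-user process limit and, on Linux, two kernel-wide ceilings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        "max_user_processes_soft": soft,  # resource.RLIM_INFINITY means unlimited
        "max_user_processes_hard": hard,
        "kernel_threads_max": read_kernel_setting("threads-max"),
        "kernel_pid_max": read_kernel_setting("pid_max"),
    }


if __name__ == "__main__":
    for key, value in current_limits().items():
        print(f"{key}: {value}")
```

Comparing these numbers with a pool's configured size is a quick check that the pool's bound sits well below what the OS will allow.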
How InterviewCrafted Will Teach This
We'll teach this through production failures, not definitions. Instead of memorizing "a process is an instance of a running program," you'll learn through scenarios like "what happens when your service tries to create 10,000 threads?"
You'll see how the choice between processes and threads affects system reliability, performance, and debugging. When an interviewer asks "how would you design a concurrent system?", you'll think about isolation, memory sharing, and resource limits—not just "use threads."
Key Takeaways
Process: Isolated memory space, independent execution, heavyweight, fault isolation
Thread: Shared memory space, lightweight, requires synchronization, one crash can affect all
Use processes for: CPU-bound tasks, fault isolation, independent tasks
Use threads for: I/O-bound tasks, shared data, lightweight concurrency
Context switching: Process switch is more expensive (switches address space, disturbs TLB and caches), thread switch is cheaper (save/restore registers)
Communication: Processes use IPC, threads use shared memory
Synchronization: Threads need locks/mutexes; processes need them only for resources they explicitly share
Best practice: Use processes for isolation, threads for I/O concurrency, hybrid approach for high-performance servers
Related Topics
Context Switching
How the OS switches between processes and threads, and the performance overhead involved
Synchronization (Mutex, Semaphore)
How threads coordinate access to shared memory and prevent race conditions
Shared Memory vs Message Passing
Communication mechanisms between processes and threads
PCB (Process Control Block)
How the OS tracks process and thread state
Thread Models (1:1, N:1, M:N)
Different threading models and their trade-offs