Graceful Shutdown: Draining, Timeouts & In-Flight Requests

Shut down safely: signal handling, connection draining, in-flight requests, timeouts, and load balancer coordination.

Why Engineers Care About This

Graceful shutdown lets a service stop without dropping requests. When you deploy or restart a service, it receives a shutdown signal, typically SIGTERM. Without graceful shutdown, the process exits immediately, abandoning in-flight requests and returning errors to clients. With graceful shutdown, the service stops accepting new requests, finishes the requests already in flight, and then exits, so deployments do not surface as errors.

If deployments cause request errors, in-flight requests are dropped, or services take too long to stop, you're hitting graceful shutdown problems. These problems compound: without graceful shutdown, every deployment causes request errors, and slow shutdowns delay every deployment. Good graceful shutdown avoids both by making each shutdown clean and bounded in time.

In interviews, when someone asks "How would you handle service shutdown?", they're really asking: "Do you understand graceful shutdown? Do you know how to handle shutdown signals? Do you understand connection draining and in-flight requests?" Many candidates don't. They let services stop immediately (dropping requests) or don't handle shutdown signals at all.

Core Intuitions You Must Build

  • Shutdown signals (SIGTERM, SIGINT) trigger graceful shutdown. When a service is stopped (deployment, restart), it receives a shutdown signal: usually SIGTERM from the orchestrator or process manager, or SIGINT from Ctrl+C. Handle these signals: stop accepting new requests, finish in-flight requests, then exit. With no handler, the default action terminates the process immediately; if the process ignores the signal or takes too long to stop, the supervisor typically follows up with SIGKILL, which cannot be caught. Either way, in-flight requests are dropped.

  • Connection draining stops accepting new requests while finishing in-flight requests. When the shutdown signal arrives, stop accepting new requests (remove the instance from the load balancer, close the listener) but keep processing requests that are already in flight. Set a timeout (e.g., 30 seconds); if requests haven't finished by then, force the shutdown.

  • In-flight requests must be allowed to complete. When the shutdown signal arrives, don't stop immediately; wait for in-flight requests to complete. Track them with a request counter and wait until it reaches zero. Set a timeout, and force shutdown if requests don't complete within it. Don't drop in-flight requests: a request cut off mid-processing returns an error to the client and can leave data in an inconsistent state.

  • Background jobs and cleanup must be handled during shutdown. Services often have background work (workers, timers) and long-lived resources (connections, buffers). During shutdown, stop background jobs gracefully (finish the current unit of work, don't start new work) and clean up resources (close connections, flush buffers). Don't let background jobs run indefinitely; set timeouts and force them to stop if needed.

  • Shutdown timeout prevents indefinite waiting. Set a shutdown timeout (e.g., 30 seconds). If shutdown doesn't complete within the timeout, force it (kill the process). This keeps stuck requests or deadlocks from hanging the service during shutdown. Balance the timeout: too short drops requests, too long delays deployments.

  • Health checks should fail during shutdown. During graceful shutdown, health checks should fail (return a non-200 status such as 503) so load balancers stop sending traffic. This is what makes connection draining work: the load balancer stops routing new requests while the service finishes in-flight ones. Don't return 200 OK during shutdown, or new requests will keep arriving at a service that is shutting down.
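
The request-counter and health-check intuitions above can be sketched in a few lines. This is a minimal illustration, not a framework API: the class name ShutdownState and all of its methods are hypothetical, and a real service would call them from its request middleware and its health endpoint.

```python
import threading


class ShutdownState:
    """Tracks in-flight requests and whether the service is draining.

    Hypothetical helper for illustration; not part of any framework.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = 0
        self._draining = False

    def begin_shutdown(self):
        """Flip to draining: health checks fail and new requests are refused."""
        with self._lock:
            self._draining = True

    def try_begin_request(self):
        """Count the request and return True, or return False if draining."""
        with self._lock:
            if self._draining:
                return False
            self._in_flight += 1
            return True

    def end_request(self):
        """Mark one previously admitted request as finished."""
        with self._lock:
            self._in_flight -= 1

    def in_flight(self):
        """Number of admitted requests that have not finished yet."""
        with self._lock:
            return self._in_flight

    def health_status(self):
        """HTTP status for the health endpoint: 503 once draining starts."""
        with self._lock:
            return 503 if self._draining else 200

    def is_drained(self):
        """True once draining has begun and no admitted request is running."""
        with self._lock:
            return self._draining and self._in_flight == 0
```

A request handler would wrap its work in try_begin_request() and end_request() (the latter in a finally block), and the shutdown path would call begin_shutdown() and then poll is_drained() until it returns True or the shutdown timeout expires.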

Subtopics (Taught Through Real Scenarios)

Shutdown Signal Handling

What people usually get wrong:

Engineers often don't handle shutdown signals. When a service is stopped (deployment, restart), it receives SIGTERM, but with no handler installed the default action terminates the process on the spot. (A process that ignores SIGTERM fares no better: it is typically sent SIGKILL once the grace period expires.) Either way, in-flight requests are dropped and clients see errors. Handle shutdown signals: stop accepting new requests, finish in-flight requests, then exit. This prevents request drops during deployments.

How this breaks systems in the real world:

A service didn't handle shutdown signals. During deployments, the service was killed immediately, dropping in-flight requests. Users experienced errors (requests failed mid-processing). The fix? Handle SIGTERM: stop accepting new requests, wait for in-flight requests to complete, then exit. Now deployments don't cause request errors. But the real lesson is: shutdown signals must be handled. Without a handler, the process is terminated mid-request.
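
As a rough sketch of the fix described above, the handler below only records that a signal arrived and leaves the actual draining to the main program. It uses Python's standard signal module; the names ShutdownFlag, install_signal_handlers, and wait_for_shutdown are made up for this example.

```python
import signal
import time


class ShutdownFlag:
    """Records that a shutdown signal arrived (hypothetical helper)."""

    def __init__(self):
        self.requested = False
        self.signum = None

    def handle(self, signum, frame):
        # Keep the handler tiny: record the signal and return.
        self.requested = True
        self.signum = signum


def install_signal_handlers(flag):
    """Route SIGTERM and SIGINT to flag.handle. Call from the main thread."""
    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, flag.handle)


def wait_for_shutdown(flag, poll_seconds=0.1):
    """Block until a shutdown signal has been recorded."""
    while not flag.requested:
        time.sleep(poll_seconds)
```

A main function would typically call install_signal_handlers(flag), start serving in the background, call wait_for_shutdown(flag), and only then begin draining. Keeping the handler to two assignments avoids doing slow or lock-taking work inside signal handling.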

What interviewers are really listening for:

They want to hear you talk about shutdown signals, graceful shutdown, and request handling. Junior engineers say "services just stop." Senior engineers say "handle shutdown signals (SIGTERM, SIGINT)—stop accepting new requests, finish in-flight requests, then exit to prevent request drops." They're testing whether you understand that shutdown is a process, not just "stopping."

Connection Draining

What people usually get wrong:

Engineers often stop services immediately when the shutdown signal is received, which drops in-flight requests. Connection draining instead stops accepting new requests while finishing the ones in flight. Remove the service from the load balancer (fail the health check), close the listener (stop accepting connections), but continue processing in-flight requests. New requests go elsewhere while existing requests finish.

How this breaks systems in the real world:

A service stopped immediately when the shutdown signal was received. In-flight requests were dropped, causing errors. During deployments, users experienced request failures. The fix? Implement connection draining: stop accepting new requests (remove from load balancer, close listener) but continue processing in-flight requests. Now deployments don't drop requests. But the real lesson is: connection draining prevents request drops. Stop accepting new requests, finish in-flight requests.
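
To make the "close the listener, finish in-flight requests" step concrete, here is a small sketch built on Python's standard http.server module. SlowHandler, DrainingHTTPServer, and drain_and_stop are hypothetical names for this example, and the 0.3-second sleep stands in for real work. The sketch has no timeout of its own, so a hung request would block it indefinitely.

```python
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that takes a moment, like a real request would."""

    def do_GET(self):
        time.sleep(0.3)  # simulated work
        body = b"done"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


class DrainingHTTPServer(ThreadingHTTPServer):
    # Non-daemon handler threads are tracked, so server_close() joins them.
    daemon_threads = False


def drain_and_stop(server):
    """Stop accepting, then wait for requests that were already accepted.

    Must run in a different thread from the one inside serve_forever().
    """
    server.shutdown()      # ends the accept loop; no new connections are picked up
    server.server_close()  # closes the listening socket, then joins handler threads
```

In this arrangement serve_forever() runs in a background thread and the shutdown path calls drain_and_stop(server). Requests accepted before the call run to completion, while later connection attempts are refused. In production the instance would normally be taken out of the load balancer first, so that few or no clients hit the closed listener.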

What interviewers are really listening for:

They want to hear you talk about connection draining, stopping new requests, and finishing in-flight requests. Junior engineers say "just stop the service." Senior engineers say "implement connection draining—stop accepting new requests (remove from load balancer, close listener) but continue processing in-flight requests to prevent drops." They're testing whether you understand that shutdown is about finishing work, not just "stopping."

Shutdown Timeout

What people usually get wrong:

Engineers often wait indefinitely for in-flight requests to complete. But some requests might hang (deadlocks, slow external APIs), which hangs the service during shutdown. Set a shutdown timeout (e.g., 30 seconds); if shutdown doesn't complete within the timeout, force it. Balance the timeout: too short drops requests, too long delays deployments.

How this breaks systems in the real world:

A service waited indefinitely for in-flight requests to complete during shutdown. Some requests hung (waiting for an external API that was down), so the service hung during shutdown and deployments were delayed while waiting for it to stop. The fix? Set a shutdown timeout (30 seconds); if requests don't complete within the timeout, force shutdown. Now deployments complete within an acceptable time. But the real lesson is: a shutdown timeout prevents indefinite waiting. Set a timeout and force shutdown if needed.
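
One simple way to bound shutdown time is to run the drain step in a helper thread and wait on it with a deadline. The sketch below assumes a caller-supplied drain function; run_with_deadline is a hypothetical name, not a library call.

```python
import threading


def run_with_deadline(drain, timeout_seconds):
    """Run drain() in a helper thread and wait at most timeout_seconds.

    Returns True if drain() finished in time, False if the deadline passed.
    The thread is a daemon so a stuck drain() cannot keep the process alive.
    """
    worker = threading.Thread(target=drain, daemon=True)
    worker.start()
    worker.join(timeout_seconds)
    return not worker.is_alive()
```

The caller decides what "force shutdown" means. A common choice is to log the requests that were still running and exit with a non-zero status when the function returns False. If the platform sends SIGKILL after its own grace period, the in-process deadline needs to be shorter than that grace period for the forced exit to stay orderly.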

What interviewers are really listening for:

They want to hear you talk about shutdown timeouts, balancing the timeout duration, and forced shutdown. Junior engineers say "just wait for requests to complete." Senior engineers say "set a shutdown timeout (e.g., 30 seconds): if shutdown doesn't complete within the timeout, force shutdown to prevent hangs and delayed deployments." They're testing whether you understand that shutdown is about completing within an acceptable time, not just "waiting."


Key Takeaways

Shutdown signals (SIGTERM, SIGINT) trigger graceful shutdown—handle them to prevent request drops

Connection draining stops accepting new requests while finishing in-flight requests—prevents new requests while allowing existing requests to finish

In-flight requests must be allowed to complete—track requests and wait until counter reaches zero

Background jobs and cleanup must be handled during shutdown—stop jobs gracefully and clean up resources

Shutdown timeout prevents indefinite waiting—set timeout and force shutdown if needed

Health checks should fail during shutdown—return non-200 so load balancers stop sending traffic

Good graceful shutdown prevents request drops and errors during deployments


About the author

InterviewCrafted helps you master system design with patience. We believe in curiosity-led engineering, reflective writing, and designing systems that make future changes feel calm.