Topic Overview
API Gateways
Understand API gateways as the single entry point for microservices. Learn routing, authentication, rate limiting, request/response transformation, and circuit breaking.
An API Gateway is a single entry point for all client requests to microservices. It handles cross-cutting concerns like routing, authentication, rate limiting, and request/response transformation, allowing microservices to focus on business logic.
What is an API Gateway?
Definition: A reverse proxy that sits between clients and microservices, providing a unified interface and handling common concerns.
Benefits:
- Single entry point: Clients interact with one endpoint
- Cross-cutting concerns: Authentication, rate limiting, logging centralized
- Protocol translation: HTTP to gRPC, WebSocket, etc.
- Request routing: Route to appropriate microservice
- Load balancing: Distribute requests across service instances
- API versioning: Manage multiple API versions
API Gateway Functions
1. Request Routing
Route requests to appropriate microservices based on path, method, headers.
Client Request: GET /api/v1/users/123
↓
API Gateway: Route to user-service
↓
user-service: Handle request
Example:
routes:
  - path: /api/v1/users/*
    service: user-service
    methods: [GET, POST, PUT, DELETE]
  - path: /api/v1/orders/*
    service: order-service
    methods: [GET, POST]
  - path: /api/v1/payments/*
    service: payment-service
    methods: [POST]
2. Authentication & Authorization
Validate tokens, check permissions before forwarding requests.
Client Request: GET /api/v1/users/123
↓
API Gateway: Validate JWT token
↓
API Gateway: Check user has permission
↓
user-service: Process request
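The validation step in the flow above can be sketched with a minimal HMAC-signed token. This is illustrative only: the secret and claim names are assumptions, and a real gateway would verify standard JWTs with a library such as PyJWT.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b'gateway-signing-key'  # hypothetical shared secret

def sign(claims: dict) -> str:
    """Issue a minimal HMAC-signed token (not a real JWT)."""
    body = base64.urlsafe_b64encode(json.dumps(claims).encode())
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest().encode()
    return (body + b'.' + sig).decode()

def validate(token: str) -> Optional[dict]:
    """Return the claims if the signature checks out, else None."""
    try:
        body, sig = token.encode().rsplit(b'.', 1)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered or wrongly signed
    return json.loads(base64.urlsafe_b64decode(body))

token = sign({'user_id': 123, 'role': 'admin'})
print(validate(token))        # {'user_id': 123, 'role': 'admin'}
print(validate(token + 'x'))  # None
```

The key point is that only the gateway needs the secret; downstream services can trust the user context the gateway forwards.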
3. Rate Limiting
Limit requests per client, API key, or IP address.
Client Request: GET /api/v1/users/123
↓
API Gateway: Check rate limit (100 req/hour)
↓
If exceeded: Return 429 Too Many Requests
If OK: Forward to service
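The check above uses a fixed quota per window. Another common approach is a token bucket, which permits short bursts while enforcing a steady average rate; a minimal sketch (class and parameter names are illustrative):

```python
import time

class TokenBucket:
    """Token-bucket limiter: refills `rate` tokens/second, bursts up to `capacity`."""
    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the gateway would answer 429 Too Many Requests here

bucket = TokenBucket(rate=1, capacity=5)
print([bucket.allow() for _ in range(6)])  # burst of 5 allowed, 6th denied
```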
4. Request/Response Transformation
Modify requests and responses (add headers, transform data, aggregate).
Client Request: GET /api/v1/users/123
↓
API Gateway: Add internal headers
↓
user-service: Returns user data
↓
API Gateway: Transform response format
↓
Client: Receives formatted response
5. Circuit Breaking
Prevent cascading failures by stopping requests to failing services.
Client Request: GET /api/v1/users/123
↓
API Gateway: Check circuit breaker
↓
If open: Return cached response or error
If closed: Forward to service
6. Load Balancing
Distribute requests across multiple service instances.
Client Request: GET /api/v1/users/123
↓
API Gateway: Load balance across instances
↓
user-service-1, user-service-2, user-service-3
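The distribution step above can be sketched as a simple round-robin selector. This is illustrative only; a production gateway would combine instance selection with health checks, and often uses least-connections instead.

```python
import itertools

class RoundRobinBalancer:
    """Cycle through instances in order (no health checks, for illustration)."""
    def __init__(self, instances):
        self._cycle = itertools.cycle(instances)

    def select(self) -> str:
        return next(self._cycle)

balancer = RoundRobinBalancer(
    ['user-service-1', 'user-service-2', 'user-service-3']
)
print([balancer.select() for _ in range(4)])
# ['user-service-1', 'user-service-2', 'user-service-3', 'user-service-1']
```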
API Gateway Architecture
Basic Architecture
                   ┌───────────────┐
                   │    Clients    │
                   │ (Web, Mobile) │
                   └───────┬───────┘
                           │
                   ┌───────▼───────┐
                   │  API Gateway  │
                   │               │
                   │  - Routing    │
                   │  - Auth       │
                   │  - Rate Limit │
                   └───────┬───────┘
                           │
        ┌──────────────────┼──────────────────┐
        │                  │                  │
┌───────▼───────┐  ┌───────▼───────┐  ┌───────▼───────┐
│ user-service  │  │ order-service │  │payment-service│
└───────────────┘  └───────────────┘  └───────────────┘
With Service Mesh
Clients → API Gateway → Service Mesh (Istio/Linkerd) → Microservices
Examples
Simple API Gateway Implementation
from flask import Flask, request, jsonify
import requests
from functools import wraps

app = Flask(__name__)

# Service registry
SERVICES = {
    'users': 'http://user-service:8001',
    'orders': 'http://order-service:8002',
    'payments': 'http://payment-service:8003'
}
# 1. Request Routing
@app.route('/api/v1/<service>/<path:path>', methods=['GET', 'POST', 'PUT', 'DELETE'])
def route_request(service, path):
    if service not in SERVICES:
        return jsonify({'error': 'Service not found'}), 404

    # Get target service URL
    target_url = f"{SERVICES[service]}/{path}"

    # Forward request (drop the client's Host header so it doesn't
    # confuse the upstream service)
    response = requests.request(
        method=request.method,
        url=target_url,
        headers={k: v for k, v in request.headers if k.lower() != 'host'},
        params=request.args,
        json=request.get_json() if request.is_json else None,
        timeout=30
    )
    return jsonify(response.json()), response.status_code
# 2. Authentication Middleware
def require_auth(f):
    @wraps(f)
    def decorated_function(*args, **kwargs):
        token = request.headers.get('Authorization')
        if not token:
            return jsonify({'error': 'Unauthorized'}), 401
        # Validate token (JWT, OAuth, etc.)
        if not validate_token(token):
            return jsonify({'error': 'Invalid token'}), 401
        return f(*args, **kwargs)
    return decorated_function
# 3. Rate Limiting
from collections import defaultdict
from time import time

rate_limits = defaultdict(list)

def rate_limit(max_requests=100, window=3600):
    def decorator(f):
        @wraps(f)
        def decorated_function(*args, **kwargs):
            client_id = request.remote_addr
            now = time()
            # Remove requests that fell out of the window
            rate_limits[client_id] = [
                req_time for req_time in rate_limits[client_id]
                if now - req_time < window
            ]
            if len(rate_limits[client_id]) >= max_requests:
                return jsonify({
                    'error': 'Rate limit exceeded',
                    'retry_after': window
                }), 429
            rate_limits[client_id].append(now)
            return f(*args, **kwargs)
        return decorated_function
    return decorator
# 4. Circuit Breaker
class CircuitBreaker:
    def __init__(self, failure_threshold=5, timeout=60):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.failures = 0
        self.last_failure_time = None
        self.state = 'CLOSED'  # CLOSED, OPEN, HALF_OPEN

    def call(self, func, *args, **kwargs):
        if self.state == 'OPEN':
            if time() - self.last_failure_time > self.timeout:
                self.state = 'HALF_OPEN'
            else:
                raise Exception('Circuit breaker is OPEN')
        try:
            result = func(*args, **kwargs)
            if self.state == 'HALF_OPEN':
                self.state = 'CLOSED'
                self.failures = 0
            return result
        except Exception as e:
            self.failures += 1
            self.last_failure_time = time()
            if self.failures >= self.failure_threshold:
                self.state = 'OPEN'
            raise e

circuit_breakers = defaultdict(lambda: CircuitBreaker())
# 5. Request Transformation
@app.route('/api/v1/users/<user_id>')
@require_auth
@rate_limit(max_requests=100, window=3600)
def get_user(user_id):
    service = 'users'
    circuit_breaker = circuit_breakers[service]
    try:
        # Add internal headers (get_user_from_token is an assumed helper
        # that extracts the user ID from the validated token)
        headers = dict(request.headers)
        headers['X-Internal-Request'] = 'true'
        headers['X-User-ID'] = get_user_from_token(request.headers.get('Authorization'))

        def make_request():
            return requests.get(
                f"{SERVICES[service]}/users/{user_id}",
                headers=headers,
                timeout=10
            )

        response = circuit_breaker.call(make_request)

        # Transform response: expose only public fields, hide internal ones
        data = response.json()
        return jsonify({
            'id': data['id'],
            'name': data['name'],
            'email': data['email']
        }), response.status_code
    except Exception:
        return jsonify({'error': 'Service unavailable'}), 503
API Gateway with Kong
# Kong declarative configuration
_format_version: "3.0"

services:
  - name: user-service
    url: http://user-service:8001
    routes:
      - name: user-route
        paths:
          - /api/v1/users
    plugins:
      - name: rate-limiting
        config:
          minute: 100
      - name: jwt
        config:
          secret_is_base64: false
      - name: cors
        config:
          origins:
            - "*"
Common Pitfalls
- Single point of failure: The API gateway can become a bottleneck or outage point. Fix: Run multiple gateway instances behind a load balancer
- Not handling timeouts: Requests hang when services are slow. Fix: Set appropriate timeouts, use circuit breakers
- Authentication overhead: Validating tokens on every request. Fix: Cache token validation, use short-lived tokens
- No request/response logging: Difficult to debug issues. Fix: Log all requests/responses, use correlation IDs
- Not versioning APIs: Breaking changes affect clients. Fix: Version APIs (/v1, /v2), support multiple versions
- Inefficient routing: Slow path matching. Fix: Use efficient routing algorithms, cache routes
- Not monitoring: No visibility into gateway performance. Fix: Add metrics, alerts, dashboards
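For the versioning pitfall above, a minimal sketch of a version-aware route table (service names and ports are hypothetical) shows how /v1 and /v2 can stay mounted side by side during a migration:

```python
from typing import Optional

# Hypothetical route table keyed by (version, resource); v1 and v2 of the
# users API are both mounted so existing clients keep working
ROUTES = {
    ('v1', '/users'): 'user-service-v1:8001',
    ('v2', '/users'): 'user-service-v2:8004',
}

def resolve(path: str) -> Optional[str]:
    # Expects paths of the form /api/<version>/<resource>/...
    parts = path.strip('/').split('/')
    if len(parts) < 3 or parts[0] != 'api':
        return None
    version, resource = parts[1], '/' + parts[2]
    return ROUTES.get((version, resource))

print(resolve('/api/v1/users/123'))  # user-service-v1:8001
print(resolve('/api/v2/users/123'))  # user-service-v2:8004
print(resolve('/api/v3/users/123'))  # None (unknown version)
```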
Interview Questions
Beginner
Q: What is an API Gateway and why is it used in microservices architecture?
A:
An API Gateway is a single entry point for all client requests to microservices. It sits between clients and services, handling cross-cutting concerns.
Why use it:
- Single entry point: Clients interact with one endpoint instead of multiple services
- Centralized concerns: Authentication, rate limiting, logging in one place
- Protocol translation: Convert HTTP to gRPC, WebSocket, etc.
- Request routing: Route requests to appropriate microservices
- Load balancing: Distribute requests across service instances
- API versioning: Manage multiple API versions (/v1, /v2)
- Security: Centralized authentication, authorization, SSL termination
Example:
Without Gateway:
Client → user-service (port 8001)
Client → order-service (port 8002)
Client → payment-service (port 8003)
With Gateway:
Client → API Gateway (port 80)
→ Routes to user-service
→ Routes to order-service
→ Routes to payment-service
Benefits:
- Clients only need to know one endpoint
- Services can change without affecting clients
- Centralized security and monitoring
Intermediate
Q: How does an API Gateway handle authentication and authorization? What are the common patterns?
A:
Authentication Patterns:

1. API Key Authentication
   Client → Gateway: API-Key: abc123
   Gateway → Validates key → Forwards to service

2. JWT Token Authentication
   Client → Gateway: Authorization: Bearer <token>
   Gateway → Validates JWT signature → Extracts user info → Forwards to service

3. OAuth 2.0
   Client → Gateway: Authorization: Bearer <access_token>
   Gateway → Validates token with OAuth server → Forwards to service

Authorization Patterns:

1. Role-Based Access Control (RBAC)
   # Gateway checks user role
   if user.role == 'admin':
       allow_request()
   else:
       deny_request()

2. Attribute-Based Access Control (ABAC)
   # Gateway checks attributes
   if user.department == 'finance' and resource.type == 'payment':
       allow_request()

3. Policy-Based
   policies:
     - path: /api/v1/payments/*
       roles: [admin, finance]
     - path: /api/v1/users/*
       roles: [admin, hr]
Implementation:

import jwt  # PyJWT

def authenticate_request(request):
    auth_header = request.headers.get('Authorization')
    if not auth_header:
        return None, 401
    # Strip the "Bearer " prefix before decoding
    token = auth_header.removeprefix('Bearer ')
    # Validate JWT
    try:
        payload = jwt.decode(token, SECRET_KEY, algorithms=['HS256'])
        user = get_user(payload['user_id'])
        return user, 200
    except jwt.InvalidTokenError:
        return None, 401

def authorize_request(user, path, method):
    # Check permissions
    if not user.has_permission(path, method):
        return False, 403
    return True, 200
Best Practices:
- Cache token validation results
- Use short-lived tokens
- Validate tokens at gateway (not in every service)
- Pass user context to services (headers, not tokens)
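The "cache token validation" practice above can be sketched with a small TTL cache in front of the decode step. This is illustrative: the stand-in user lookup replaces a real jwt.decode() call, and the TTL should be no longer than the token lifetime.

```python
import time

class TTLCache:
    """Minimal TTL cache for token-validation results (illustrative)."""
    def __init__(self, ttl: float):
        self.ttl = ttl
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if time.monotonic() > expires:
            del self._store[key]  # expired: force re-validation
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

validated = TTLCache(ttl=300)  # keep results for 5 minutes

def authenticate_cached(token: str) -> dict:
    user = validated.get(token)
    if user is None:
        user = {'user_id': 123}  # stand-in for a real jwt.decode() call
        validated.set(token, user)
    return user

print(authenticate_cached('abc'))  # {'user_id': 123}
```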
Senior
Q: Design a high-performance API Gateway that handles 100,000 requests per second. How do you handle routing, caching, and ensure low latency?
A:
class HighPerformanceAPIGateway {
  private routeCache: LRUCache;
  private authCache: LRUCache;
  private servicePool: Map<string, ConnectionPool>;
  private loadBalancer: LoadBalancer;
  private redis: RedisClient;        // response cache backend (illustrative)
  private metrics: MetricsRecorder;  // async metrics sink (illustrative)

  constructor() {
    // Route cache for fast path matching
    this.routeCache = new LRUCache({
      max: 10000,
      ttl: 60000 // 1 minute
    });

    // Auth cache to avoid repeated token validation
    this.authCache = new LRUCache({
      max: 100000,
      ttl: 300000 // 5 minutes
    });

    // Connection pools for services
    this.servicePool = new Map();

    // Load balancer with health checks
    this.loadBalancer = new LoadBalancer({
      algorithm: 'least-connections',
      healthCheckInterval: 5000
    });
  }

  // 1. Fast Route Matching
  async routeRequest(path: string, method: string): Promise<Service> {
    const cacheKey = `${method}:${path}`;

    // Check cache first
    let route = this.routeCache.get(cacheKey);
    if (route) {
      return route;
    }

    // Match route (use a trie for O(path_length) matching)
    route = this.matchRoute(path, method);

    // Cache result
    this.routeCache.set(cacheKey, route);
    return route;
  }

  // 2. Cached Authentication
  async authenticate(token: string): Promise<User | null> {
    // Check cache
    const cached = this.authCache.get(token);
    if (cached) {
      return cached;
    }

    // Validate token (async, non-blocking)
    const user = await this.validateToken(token);

    // Cache result
    if (user) {
      this.authCache.set(token, user);
    }
    return user;
  }

  // 3. Connection Pooling
  getServiceConnection(service: Service): Connection {
    if (!this.servicePool.has(service.name)) {
      this.servicePool.set(service.name, new ConnectionPool({
        max: 100,
        min: 10,
        idleTimeout: 30000
      }));
    }
    return this.servicePool.get(service.name).acquire();
  }

  // 4. Async Processing
  async handleRequest(request: Request): Promise<Response> {
    // Parse request (non-blocking)
    const parsed = await this.parseRequest(request);

    // Parallel operations
    const [route, user, rateLimit] = await Promise.all([
      this.routeRequest(parsed.path, parsed.method),
      this.authenticate(parsed.token),
      this.checkRateLimit(parsed.clientId)
    ]);

    // Reject if the client has exceeded its rate limit
    if (!rateLimit.allowed) {
      return this.errorResponse(429);
    }

    // Check authorization
    if (!this.authorize(user, route, parsed.method)) {
      return this.errorResponse(403);
    }

    // Get service instance (load balanced)
    const serviceInstance = this.loadBalancer.select(route.service);

    // Forward request (with connection pooling)
    const connection = this.getServiceConnection(route.service);
    try {
      const response = await connection.request(parsed, {
        timeout: 5000,
        retries: 2
      });

      // Transform response
      return this.transformResponse(response);
    } finally {
      connection.release();
    }
  }

  // 5. Response Caching
  async getCachedResponse(cacheKey: string): Promise<Response | null> {
    // Check cache for GET requests
    const cached = await this.redis.get(cacheKey);
    if (cached) {
      return JSON.parse(cached);
    }
    return null;
  }

  // 6. Monitoring
  async recordMetrics(request: Request, response: Response, latency: number) {
    // Async metrics recording (non-blocking)
    this.metrics.record({
      path: request.path,
      method: request.method,
      status: response.status,
      latency,
      timestamp: Date.now()
    });
  }
}
Optimizations:
- Route caching: Cache route matches (LRU cache)
- Auth caching: Cache token validation results
- Connection pooling: Reuse connections to services
- Async processing: Parallel operations (route, auth, rate limit)
- Response caching: Cache GET responses
- Load balancing: Distribute requests efficiently
- Health checks: Remove unhealthy instances
- Monitoring: Track latency, errors, throughput
Key Takeaways
- API Gateway: Single entry point for microservices, handles cross-cutting concerns
- Functions: Routing, authentication, rate limiting, transformation, circuit breaking, load balancing
- Authentication: API keys, JWT, OAuth 2.0 - validate at gateway
- Authorization: RBAC, ABAC, policy-based - check permissions before forwarding
- Performance: Route caching, auth caching, connection pooling, async processing
- Resilience: Circuit breakers, timeouts, retries, health checks
- Monitoring: Track latency, errors, throughput, cache hit rates
- Best practices: Fail fast, cache aggressively, use connection pooling, monitor everything