Topic Overview
API Gateways
Understand API gateways as the single entry point for microservices. Learn routing, authentication, rate limiting, request/response transformation, and circuit breaking.
An API Gateway is a single entry point for all client requests to microservices. It handles cross-cutting concerns like routing, authentication, rate limiting, and request/response transformation, allowing microservices to focus on business logic.
What is an API Gateway?
Definition: A reverse proxy that sits between clients and microservices, providing a unified interface and handling common concerns.
Benefits:
- Single entry point: Clients interact with one endpoint
- Cross-cutting concerns: Authentication, rate limiting, logging centralized
- Protocol translation: HTTP to gRPC, WebSocket, etc.
- Request routing: Route to appropriate microservice
- Load balancing: Distribute requests across service instances
- API versioning: Manage multiple API versions
API Gateway Functions
1. Request Routing
Route requests to appropriate microservices based on path, method, headers.
Client Request: GET /api/v1/users/123
↓
API Gateway: Route to user-service
↓
user-service: Handle request
Example:
routes:
  - path: /api/v1/users/*
    service: user-service
    methods: [GET, POST, PUT, DELETE]
  - path: /api/v1/orders/*
    service: order-service
    methods: [GET, POST]
  - path: /api/v1/payments/*
    service: payment-service
    methods: [POST]
2. Authentication & Authorization
Validate tokens, check permissions before forwarding requests.
Client Request: GET /api/v1/users/123
↓
API Gateway: Validate JWT token
↓
API Gateway: Check user has permission
↓
user-service: Process request
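The validation step in the flow above can be sketched with a minimal HMAC-signed token. This is illustrative only: the secret and claim names are assumptions, and a real gateway would verify standard JWTs with a library such as PyJWT.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b'gateway-signing-key'  # hypothetical shared secret

def sign(claims: dict) -> str:
    """Issue a minimal HMAC-signed token (not a real JWT)."""
    body = base64.urlsafe_b64encode(json.dumps(claims).encode())
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest().encode()
    return (body + b'.' + sig).decode()

def validate(token: str) -> Optional[dict]:
    """Return the claims if the signature checks out, else None."""
    try:
        body, sig = token.encode().rsplit(b'.', 1)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered or wrongly signed
    return json.loads(base64.urlsafe_b64decode(body))

token = sign({'user_id': 123, 'role': 'admin'})
print(validate(token))        # {'user_id': 123, 'role': 'admin'}
print(validate(token + 'x'))  # None
```

The key point is that only the gateway needs the secret; downstream services can trust the user context the gateway forwards.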
3. Rate Limiting
Limit requests per client, API key, or IP address.
Client Request: GET /api/v1/users/123
↓
API Gateway: Check rate limit (100 req/hour)
↓
If exceeded: Return 429 Too Many Requests
If OK: Forward to service
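The check above uses a fixed quota per window. Another common approach is a token bucket, which permits short bursts while enforcing a steady average rate; a minimal sketch (class and parameter names are illustrative):

```python
import time

class TokenBucket:
    """Token-bucket limiter: refills `rate` tokens/second, bursts up to `capacity`."""
    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the gateway would answer 429 Too Many Requests here

bucket = TokenBucket(rate=1, capacity=5)
print([bucket.allow() for _ in range(6)])  # burst of 5 allowed, 6th denied
```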
4. Request/Response Transformation
Modify requests and responses (add headers, transform data, aggregate).
Client Request: GET /api/v1/users/123
↓
API Gateway: Add internal headers
↓
user-service: Returns user data
↓
API Gateway: Transform response format
↓
Client: Receives formatted response
5. Circuit Breaking
Prevent cascading failures by stopping requests to failing services.
Client Request: GET /api/v1/users/123
↓
API Gateway: Check circuit breaker
↓
If open: Return cached response or error
If closed: Forward to service
6. Load Balancing
Distribute requests across multiple service instances.
Client Request: GET /api/v1/users/123
↓
API Gateway: Load balance across instances
↓
user-service-1, user-service-2, user-service-3
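The distribution step above can be sketched as a simple round-robin selector. This is illustrative only; a production gateway would combine instance selection with health checks, and often uses least-connections instead.

```python
import itertools

class RoundRobinBalancer:
    """Cycle through instances in order (no health checks, for illustration)."""
    def __init__(self, instances):
        self._cycle = itertools.cycle(instances)

    def select(self) -> str:
        return next(self._cycle)

balancer = RoundRobinBalancer(
    ['user-service-1', 'user-service-2', 'user-service-3']
)
print([balancer.select() for _ in range(4)])
# ['user-service-1', 'user-service-2', 'user-service-3', 'user-service-1']
```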
API Gateway Architecture
Basic Architecture
                   ┌───────────────┐
                   │    Clients    │
                   │ (Web, Mobile) │
                   └───────┬───────┘
                           │
                   ┌───────▼───────┐
                   │  API Gateway  │
                   │               │
                   │  - Routing    │
                   │  - Auth       │
                   │  - Rate Limit │
                   └───────┬───────┘
                           │
        ┌──────────────────┼──────────────────┐
        │                  │                  │
┌───────▼───────┐  ┌───────▼───────┐  ┌───────▼───────┐
│ user-service  │  │ order-service │  │payment-service│
└───────────────┘  └───────────────┘  └───────────────┘
With Service Mesh
Clients → API Gateway → Service Mesh (Istio/Linkerd) → Microservices
Examples
Simple API Gateway Implementation
from flask import Flask, request, jsonify
import requests
from functools import wraps

app = Flask(__name__)

# Service registry
SERVICES = {
    'users': 'http://user-service:8001',
    'orders': 'http://order-service:8002',
    'payments': 'http://payment-service:8003'
}
# 1. Request Routing
@app.route('/api/v1/<service>/<path:path>', methods=['GET', 'POST', 'PUT', 'DELETE'])
def route_request(service, path):
    if service not in SERVICES:
        return jsonify({'error': 'Service not found'}), 404

    # Get target service URL
    target_url = f"{SERVICES[service]}/{path}"

    # Forward request (drop the client's Host header so it doesn't
    # confuse the upstream service)
    response = requests.request(
        method=request.method,
        url=target_url,
        headers={k: v for k, v in request.headers if k.lower() != 'host'},
        params=request.args,
        json=request.get_json() if request.is_json else None,
        timeout=30
    )
    return jsonify(response.json()), response.status_code
# 2. Authentication Middleware
def require_auth(f):
    @wraps(f)
    def decorated_function(*args, **kwargs):
        token = request.headers.get('Authorization')
        if not token:
            return jsonify({'error': 'Unauthorized'}), 401
        # Validate token (JWT, OAuth, etc.)
        if not validate_token(token):
            return jsonify({'error': 'Invalid token'}), 401
        return f(*args, **kwargs)
    return decorated_function
# 3. Rate Limiting
from collections import defaultdict
from time import time

rate_limits = defaultdict(list)

def rate_limit(max_requests=100, window=3600):
    def decorator(f):
        @wraps(f)
        def decorated_function(*args, **kwargs):
            client_id = request.remote_addr
            now = time()
            # Remove requests that fell out of the window
            rate_limits[client_id] = [
                req_time for req_time in rate_limits[client_id]
                if now - req_time < window
            ]
            if len(rate_limits[client_id]) >= max_requests:
                return jsonify({
                    'error': 'Rate limit exceeded',
                    'retry_after': window
                }), 429
            rate_limits[client_id].append(now)
            return f(*args, **kwargs)
        return decorated_function
    return decorator
# 4. Circuit Breaker
class CircuitBreaker:
    def __init__(self, failure_threshold=5, timeout=60):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.failures = 0
        self.last_failure_time = None
        self.state = 'CLOSED'  # CLOSED, OPEN, HALF_OPEN

    def call(self, func, *args, **kwargs):
        if self.state == 'OPEN':
            if time() - self.last_failure_time > self.timeout:
                self.state = 'HALF_OPEN'
            else:
                raise Exception('Circuit breaker is OPEN')
        try:
            result = func(*args, **kwargs)
            if self.state == 'HALF_OPEN':
                self.state = 'CLOSED'
                self.failures = 0
            return result
        except Exception as e:
            self.failures += 1
            self.last_failure_time = time()
            if self.failures >= self.failure_threshold:
                self.state = 'OPEN'
            raise e

circuit_breakers = defaultdict(lambda: CircuitBreaker())
# 5. Request Transformation
@app.route('/api/v1/users/<user_id>')
@require_auth
@rate_limit(max_requests=100, window=3600)
def get_user(user_id):
    service = 'users'
    circuit_breaker = circuit_breakers[service]
    try:
        # Add internal headers (get_user_from_token is an assumed helper
        # that extracts the user ID from the validated token)
        headers = dict(request.headers)
        headers['X-Internal-Request'] = 'true'
        headers['X-User-ID'] = get_user_from_token(request.headers.get('Authorization'))

        def make_request():
            return requests.get(
                f"{SERVICES[service]}/users/{user_id}",
                headers=headers,
                timeout=10
            )

        response = circuit_breaker.call(make_request)

        # Transform response: expose only public fields, hide internal ones
        data = response.json()
        return jsonify({
            'id': data['id'],
            'name': data['name'],
            'email': data['email']
        }), response.status_code
    except Exception:
        return jsonify({'error': 'Service unavailable'}), 503
API Gateway with Kong
# Kong declarative configuration
_format_version: "3.0"

services:
  - name: user-service
    url: http://user-service:8001
    routes:
      - name: user-route
        paths:
          - /api/v1/users
    plugins:
      - name: rate-limiting
        config:
          minute: 100
      - name: jwt
        config:
          secret_is_base64: false
      - name: cors
        config:
          origins:
            - "*"
Common Pitfalls
- Single point of failure: The API gateway can become a bottleneck or outage point. Fix: Run multiple gateway instances behind a load balancer
- Not handling timeouts: Requests hang when services are slow. Fix: Set appropriate timeouts, use circuit breakers
- Authentication overhead: Validating tokens on every request. Fix: Cache token validation, use short-lived tokens
- No request/response logging: Difficult to debug issues. Fix: Log all requests/responses, use correlation IDs
- Not versioning APIs: Breaking changes affect clients. Fix: Version APIs (/v1, /v2), support multiple versions
- Inefficient routing: Slow path matching. Fix: Use efficient routing algorithms, cache routes
- Not monitoring: No visibility into gateway performance. Fix: Add metrics, alerts, dashboards
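For the versioning pitfall above, a minimal sketch of a version-aware route table (service names and ports are hypothetical) shows how /v1 and /v2 can stay mounted side by side during a migration:

```python
from typing import Optional

# Hypothetical route table keyed by (version, resource); v1 and v2 of the
# users API are both mounted so existing clients keep working
ROUTES = {
    ('v1', '/users'): 'user-service-v1:8001',
    ('v2', '/users'): 'user-service-v2:8004',
}

def resolve(path: str) -> Optional[str]:
    # Expects paths of the form /api/<version>/<resource>/...
    parts = path.strip('/').split('/')
    if len(parts) < 3 or parts[0] != 'api':
        return None
    version, resource = parts[1], '/' + parts[2]
    return ROUTES.get((version, resource))

print(resolve('/api/v1/users/123'))  # user-service-v1:8001
print(resolve('/api/v2/users/123'))  # user-service-v2:8004
print(resolve('/api/v3/users/123'))  # None (unknown version)
```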
Interview Questions
Beginner
Q: What is an API Gateway and why is it used in microservices architecture?
A:
An API Gateway is a single entry point for all client requests to microservices. It sits between clients and services, handling cross-cutting concerns.
Why use it:
- Single entry point: Clients interact with one endpoint instead of multiple services
- Centralized concerns: Authentication, rate limiting, logging in one place
- Protocol translation: Convert HTTP to gRPC, WebSocket, etc.
- Request routing: Route requests to appropriate microservices
- Load balancing: Distribute requests across service instances
- API versioning: Manage multiple API versions (/v1, /v2)
- Security: Centralized authentication, authorization, SSL termination
Example:
Without Gateway:
Client → user-service (port 8001)
Client → order-service (port 8002)
Client → payment-service (port 8003)
With Gateway:
Client → API Gateway (port 80)
→ Routes to user-service
→ Routes to order-service
→ Routes to payment-service
Benefits:
- Clients only need to know one endpoint
- Services can change without affecting clients
- Centralized security and monitoring
Intermediate
Q: How does an API Gateway handle authentication and authorization? What are the common patterns?
A:
Authentication Patterns:

1. API Key Authentication
   Client → Gateway: API-Key: abc123
   Gateway → Validates key → Forwards to service

2. JWT Token Authentication
   Client → Gateway: Authorization: Bearer <token>
   Gateway → Validates JWT signature → Extracts user info → Forwards to service

3. OAuth 2.0
   Client → Gateway: Authorization: Bearer <access_token>
   Gateway → Validates token with OAuth server → Forwards to service

Authorization Patterns:

1. Role-Based Access Control (RBAC)
   # Gateway checks user role
   if user.role == 'admin':
       allow_request()
   else:
       deny_request()

2. Attribute-Based Access Control (ABAC)
   # Gateway checks attributes
   if user.department == 'finance' and resource.type == 'payment':
       allow_request()

3. Policy-Based
   policies:
     - path: /api/v1/payments/*
       roles: [admin, finance]
     - path: /api/v1/users/*
       roles: [admin, hr]
Implementation:

import jwt  # PyJWT

def authenticate_request(request):
    auth_header = request.headers.get('Authorization')
    if not auth_header:
        return None, 401
    # Strip the "Bearer " prefix before decoding
    token = auth_header.removeprefix('Bearer ')
    # Validate JWT
    try:
        payload = jwt.decode(token, SECRET_KEY, algorithms=['HS256'])
        user = get_user(payload['user_id'])
        return user, 200
    except jwt.InvalidTokenError:
        return None, 401

def authorize_request(user, path, method):
    # Check permissions
    if not user.has_permission(path, method):
        return False, 403
    return True, 200
Best Practices:
- Cache token validation results
- Use short-lived tokens
- Validate tokens at gateway (not in every service)
- Pass user context to services (headers, not tokens)
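The "cache token validation" practice above can be sketched with a small TTL cache in front of the decode step. This is illustrative: the stand-in user lookup replaces a real jwt.decode() call, and the TTL should be no longer than the token lifetime.

```python
import time

class TTLCache:
    """Minimal TTL cache for token-validation results (illustrative)."""
    def __init__(self, ttl: float):
        self.ttl = ttl
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if time.monotonic() > expires:
            del self._store[key]  # expired: force re-validation
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

validated = TTLCache(ttl=300)  # keep results for 5 minutes

def authenticate_cached(token: str) -> dict:
    user = validated.get(token)
    if user is None:
        user = {'user_id': 123}  # stand-in for a real jwt.decode() call
        validated.set(token, user)
    return user

print(authenticate_cached('abc'))  # {'user_id': 123}
```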
Senior
Q: Design a high-performance API Gateway that handles 100,000 requests per second. How do you handle routing, caching, and ensure low latency?
A:
class HighPerformanceAPIGateway {
  private routeCache: LRUCache;
  private authCache: LRUCache;
  private servicePool: Map<string, ConnectionPool>;
  private loadBalancer: LoadBalancer;
  private redis: RedisClient;        // response cache backend (illustrative)
  private metrics: MetricsRecorder;  // async metrics sink (illustrative)

  constructor() {
    // Route cache for fast path matching
    this.routeCache = new LRUCache({
      max: 10000,
      ttl: 60000 // 1 minute
    });

    // Auth cache to avoid repeated token validation
    this.authCache = new LRUCache({
      max: 100000,
      ttl: 300000 // 5 minutes
    });

    // Connection pools for services
    this.servicePool = new Map();

    // Load balancer with health checks
    this.loadBalancer = new LoadBalancer({
      algorithm: 'least-connections',
      healthCheckInterval: 5000
    });
  }

  // 1. Fast Route Matching
  async routeRequest(path: string, method: string): Promise<Service> {
    const cacheKey = `${method}:${path}`;

    // Check cache first
    let route = this.routeCache.get(cacheKey);
    if (route) {
      return route;
    }

    // Match route (use a trie for O(path_length) matching)
    route = this.matchRoute(path, method);

    // Cache result
    this.routeCache.set(cacheKey, route);
    return route;
  }

  // 2. Cached Authentication
  async authenticate(token: string): Promise<User | null> {
    // Check cache
    const cached = this.authCache.get(token);
    if (cached) {
      return cached;
    }

    // Validate token (async, non-blocking)
    const user = await this.validateToken(token);

    // Cache result
    if (user) {
      this.authCache.set(token, user);
    }
    return user;
  }

  // 3. Connection Pooling
  getServiceConnection(service: Service): Connection {
    if (!this.servicePool.has(service.name)) {
      this.servicePool.set(service.name, new ConnectionPool({
        max: 100,
        min: 10,
        idleTimeout: 30000
      }));
    }
    return this.servicePool.get(service.name).acquire();
  }

  // 4. Async Processing
  async handleRequest(request: Request): Promise<Response> {
    // Parse request (non-blocking)
    const parsed = await this.parseRequest(request);

    // Parallel operations
    const [route, user, rateLimit] = await Promise.all([
      this.routeRequest(parsed.path, parsed.method),
      this.authenticate(parsed.token),
      this.checkRateLimit(parsed.clientId)
    ]);

    // Reject if the client has exceeded its rate limit
    if (!rateLimit.allowed) {
      return this.errorResponse(429);
    }

    // Check authorization
    if (!this.authorize(user, route, parsed.method)) {
      return this.errorResponse(403);
    }

    // Get service instance (load balanced)
    const serviceInstance = this.loadBalancer.select(route.service);

    // Forward request (with connection pooling)
    const connection = this.getServiceConnection(route.service);
    try {
      const response = await connection.request(parsed, {
        timeout: 5000,
        retries: 2
      });

      // Transform response
      return this.transformResponse(response);
    } finally {
      connection.release();
    }
  }

  // 5. Response Caching
  async getCachedResponse(cacheKey: string): Promise<Response | null> {
    // Check cache for GET requests
    const cached = await this.redis.get(cacheKey);
    if (cached) {
      return JSON.parse(cached);
    }
    return null;
  }

  // 6. Monitoring
  async recordMetrics(request: Request, response: Response, latency: number) {
    // Async metrics recording (non-blocking)
    this.metrics.record({
      path: request.path,
      method: request.method,
      status: response.status,
      latency,
      timestamp: Date.now()
    });
  }
}
Optimizations:
- Route caching: Cache route matches (LRU cache)
- Auth caching: Cache token validation results
- Connection pooling: Reuse connections to services
- Async processing: Parallel operations (route, auth, rate limit)
- Response caching: Cache GET responses
- Load balancing: Distribute requests efficiently
- Health checks: Remove unhealthy instances
- Monitoring: Track latency, errors, throughput
Key Takeaways
- API Gateway: Single entry point for microservices, handles cross-cutting concerns
- Functions: Routing, authentication, rate limiting, transformation, circuit breaking, load balancing
- Authentication: API keys, JWT, OAuth 2.0 - validate at gateway
- Authorization: RBAC, ABAC, policy-based - check permissions before forwarding
- Performance: Route caching, auth caching, connection pooling, async processing
- Resilience: Circuit breakers, timeouts, retries, health checks
- Monitoring: Track latency, errors, throughput, cache hit rates
- Best practices: Fail fast, cache aggressively, use connection pooling, monitor everything