Topic Overview

Message Queues (Kafka, RabbitMQ)

Compare message queue systems: Kafka for event streaming vs RabbitMQ for traditional messaging. Learn use cases, patterns, and when to choose each.

Message queues enable asynchronous communication between services. Kafka and RabbitMQ are two popular systems with different strengths: Kafka excels at event streaming, while RabbitMQ is better for traditional message queuing.


What are Message Queues?

Message queues provide:

  • Asynchronous communication: Services don't need to be available simultaneously
  • Decoupling: Services communicate via queue, not directly
  • Reliability: Messages persisted until consumed
  • Scalability: Handle high message volumes

Use cases:

  • Microservices communication
  • Event-driven architecture
  • Task queues
  • Log aggregation
  • Real-time analytics
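
To make the decoupling concrete, here is a minimal in-process sketch using Python's standard-library queue module (a stand-in for a real broker; the task names are illustrative):

import queue
import threading

# The queue decouples producer and consumer: neither calls the other directly.
task_queue = queue.Queue()

def producer():
    for i in range(5):
        task_queue.put(f"task-{i}")  # fire-and-forget; the producer moves on

def consumer():
    while True:
        task = task_queue.get()       # blocks until a message is available
        print(f"Processing {task}")
        task_queue.task_done()        # analogous to a broker acknowledgment

threading.Thread(target=consumer, daemon=True).start()
producer()
task_queue.join()  # wait until every message has been acknowledged

A real broker adds what this sketch lacks: persistence across restarts, delivery to consumers on other machines, and redelivery on failure.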

RabbitMQ

Type: Traditional message broker
Protocol: AMQP (Advanced Message Queuing Protocol)
Model: Queue-based (point-to-point)

Key Features

  • Queues: Messages stored in queues
  • Exchanges: Route messages to queues
  • Bindings: Connect exchanges to queues
  • Acknowledgment: Messages acknowledged after processing
  • Durability: Messages can be persisted

Architecture

Producer → Exchange → Queue → Consumer

Exchange Types:

  • Direct: Route by routing key
  • Topic: Route by pattern matching
  • Fanout: Broadcast to all queues
  • Headers: Route by header attributes
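
As a sketch of how exchange routing works, assuming a local broker and the pika client, a topic exchange routes on dot-separated routing-key patterns (the exchange and queue names here are illustrative):

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Topic exchange: routes on dot-separated routing-key patterns
channel.exchange_declare(exchange='logs', exchange_type='topic')

# This queue only receives error-level messages, from any service
channel.queue_declare(queue='errors')
channel.queue_bind(queue='errors', exchange='logs', routing_key='*.error')

# Routing key matches '*.error', so the message lands in 'errors'
channel.basic_publish(exchange='logs', routing_key='payments.error',
                      body='payment failed')
connection.close()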

Example

import pika

# Producer
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

channel.queue_declare(queue='task_queue', durable=True)

channel.basic_publish(
    exchange='',
    routing_key='task_queue',
    body='Hello World!',
    properties=pika.BasicProperties(delivery_mode=2)  # Make message persistent
)

# Consumer
def callback(ch, method, properties, body):
    print(f"Received: {body}")
    ch.basic_ack(delivery_tag=method.delivery_tag)  # Acknowledge

channel.basic_consume(queue='task_queue', on_message_callback=callback)
channel.start_consuming()

Kafka

Type: Distributed event streaming platform
Protocol: Custom binary protocol over TCP
Model: Topic-based (pub-sub)

Key Features

  • Topics: Messages organized in topics
  • Partitions: Topics split into partitions for parallelism
  • Producers: Write to topics
  • Consumers: Read from topics (consumer groups)
  • Retention: Messages retained for a configurable time (see the topic-config sketch below)
  • High throughput: Designed for high message volumes
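
Retention and partition count are set per topic. A sketch using kafka-python's admin client (the broker address, topic name, and values are assumptions):

from kafka.admin import KafkaAdminClient, NewTopic

admin = KafkaAdminClient(bootstrap_servers=['localhost:9092'])

# 12 partitions for parallelism; keep messages for 7 days
admin.create_topics([
    NewTopic(
        name='user-events',
        num_partitions=12,
        replication_factor=1,  # single-broker dev setup
        topic_configs={'retention.ms': str(7 * 24 * 60 * 60 * 1000)}
    )
])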

Architecture

Producer → Topic (Partitions) → Consumer Groups

Concepts:

  • Topic: Category/stream of messages
  • Partition: Topic split into partitions (parallelism)
  • Offset: Position in partition
  • Consumer Group: Group of consumers sharing work
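
A simplified sketch of key-based partitioning: Kafka's default partitioner murmur2-hashes the key bytes, and Python's built-in hash stands in here purely for illustration:

NUM_PARTITIONS = 3

def pick_partition(key: str) -> int:
    # Same key -> same partition, which is what preserves per-key ordering
    return hash(key) % NUM_PARTITIONS

for user in ['user123', 'user456', 'user123']:
    print(user, '-> partition', pick_partition(user))
# 'user123' maps to the same partition both times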

Example

from kafka import KafkaProducer, KafkaConsumer

# Producer
producer = KafkaProducer(bootstrap_servers=['localhost:9092'])

producer.send('user-events', key=b'user123', value=b'{"action": "login"}')
producer.flush()

# Consumer
consumer = KafkaConsumer(
    'user-events',
    bootstrap_servers=['localhost:9092'],
    group_id='analytics-group',
    auto_offset_reset='earliest'
)

for message in consumer:
    print(f"Topic: {message.topic}, Partition: {message.partition}, Offset: {message.offset}")
    print(f"Key: {message.key}, Value: {message.value}")

Comparison

Feature             RabbitMQ                         Kafka
Model               Queue-based (point-to-point)     Topic-based (pub-sub)
Message retention   Deleted after consumption        Retained for a configurable period
Throughput          Lower (tens of thousands/sec)    Higher (millions/sec)
Latency             Low (milliseconds)               Higher when batching (ms to seconds)
Ordering            Per-queue                        Per-partition
Replay              No                               Yes (messages retained)
Use case            Task queues, RPC                 Event streaming, log aggregation

Use Cases

Use RabbitMQ When:

  • Task queues: Background job processing
  • RPC: Request-response patterns (client-side sketch below)
  • Low latency: Need immediate processing
  • Simple routing: Direct, topic, fanout exchanges
  • Message acknowledgment: Need delivery guarantees

Example:

  • Email sending
  • Image processing
  • PDF generation
  • Notification delivery
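
For the RPC pattern mentioned above, the client attaches a reply_to queue and a correlation_id to each request. A minimal client-side sketch with pika ('rpc_queue' and the payload are assumptions):

import uuid
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Exclusive auto-named queue where the server should send the reply
result = channel.queue_declare(queue='', exclusive=True)
callback_queue = result.method.queue

channel.basic_publish(
    exchange='',
    routing_key='rpc_queue',  # the server consumes from this queue
    body='ping',
    properties=pika.BasicProperties(
        reply_to=callback_queue,          # where to respond
        correlation_id=str(uuid.uuid4())  # matches the reply to this request
    )
)
# The server replies to properties.reply_to, echoing the correlation_id,
# and the client consumes from callback_queue to receive the response.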

Use Kafka When:

  • Event streaming: High-volume event streams
  • Log aggregation: Collecting logs from many sources
  • Real-time analytics: Processing events in real-time
  • Replay capability: Need to replay messages
  • High throughput: Millions of messages per second

Example:

  • User activity tracking
  • IoT sensor data
  • Financial transactions
  • Clickstream analysis

Examples

RabbitMQ: Task Queue

# Producer (Task dispatcher)
import pika
import json

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

channel.queue_declare(queue='image_processing', durable=True)

def send_task(image_url):
    message = json.dumps({'image_url': image_url, 'task_id': '123'})
    channel.basic_publish(
        exchange='',
        routing_key='image_processing',
        body=message,
        properties=pika.BasicProperties(delivery_mode=2)
    )

# Consumer (Worker)
def process_image_task(image_url):
    """Placeholder for the actual image-processing work."""
    return f"processed {image_url}"

def process_image(ch, method, properties, body):
    task = json.loads(body)
    # Process image
    result = process_image_task(task['image_url'])

    # Acknowledge only after processing succeeds
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_qos(prefetch_count=1)  # Fair dispatch
channel.basic_consume(queue='image_processing', on_message_callback=process_image)
channel.start_consuming()

Kafka: Event Streaming

# Producer (Event publisher)
from kafka import KafkaProducer
import json
import time

producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
    key_serializer=lambda k: k.encode('utf-8') if k else None
)

def publish_event(user_id, event_type, data):
    producer.send('user-events', key=user_id, value={
        'user_id': user_id,
        'event_type': event_type,
        'data': data,
        'timestamp': time.time()
    })

# Consumer (Event processor)
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'user-events',
    bootstrap_servers=['localhost:9092'],
    group_id='analytics-processor',
    value_deserializer=lambda m: json.loads(m.decode('utf-8'))
)

def process_event(event):
    """Placeholder for the actual downstream handling."""
    print(event)

for message in consumer:
    event = message.value
    # Process event
    process_event(event)
    # Offsets are committed automatically (enable_auto_commit defaults to True)

Common Pitfalls

  • Choosing wrong system: Using Kafka for simple task queues. Fix: Use RabbitMQ for task queues, Kafka for event streaming
  • Not handling failures: Messages lost on failure. Fix: Use acknowledgments, persistence, replication
  • Consumer lag: Consumers can't keep up. Fix: Scale consumers, optimize processing, increase partitions
  • Message ordering: Assuming global ordering. Fix: Kafka orders per-partition, RabbitMQ per-queue
  • Not monitoring: No visibility into queue health. Fix: Monitor queue depth, consumer lag, throughput (see the lag-check sketch below)
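
A sketch of the lag check mentioned in the last pitfall, using kafka-python (topic and group names are assumptions):

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'user-events',
    bootstrap_servers=['localhost:9092'],
    group_id='analytics-group',
    enable_auto_commit=False
)

consumer.poll(timeout_ms=1000)  # join the group and receive an assignment

for tp in consumer.assignment():
    end = consumer.end_offsets([tp])[tp]  # newest offset in the partition
    pos = consumer.position(tp)           # this consumer's next offset
    print(f"{tp.topic}[{tp.partition}] lag = {end - pos}")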

Interview Questions

Beginner

Q: What is the difference between RabbitMQ and Kafka? When would you use each?

A:

RabbitMQ:

  • Type: Traditional message broker
  • Model: Queue-based (point-to-point)
  • Message retention: Deleted after consumption
  • Throughput: Lower (tens of thousands/sec)
  • Latency: Lower (milliseconds)
  • Use case: Task queues, RPC, simple messaging

Kafka:

  • Type: Distributed event streaming platform
  • Model: Topic-based (pub-sub)
  • Message retention: Retained for configurable time
  • Throughput: Higher (millions/sec)
  • Latency: Higher when batching (milliseconds to seconds)
  • Use case: Event streaming, log aggregation, real-time analytics

When to use:

  • RabbitMQ: Task queues, background jobs, RPC, low latency needed
  • Kafka: Event streaming, high throughput, replay capability, log aggregation

Example:

RabbitMQ: Email sending queue
  Producer → Queue → Consumer (sends email, deletes message)

Kafka: User activity events
  Producer → Topic → Multiple Consumers (analytics, logging, etc.)
  Messages retained for replay

Intermediate

Q: Explain how Kafka achieves high throughput and how consumer groups work.

A:

High Throughput:

  1. Partitioning

    Topic: user-events
      Partition 0: [msg1, msg4, msg7]
      Partition 1: [msg2, msg5, msg8]
      Partition 2: [msg3, msg6, msg9]
    
    Producers write to partitions in parallel
    Consumers read from partitions in parallel
    
  2. Batching

    Producer batches messages before sending
    Reduces network overhead (see the producer-config sketch after this list)
    
  3. Zero-copy

    Kafka uses zero-copy for efficient data transfer
    Data not copied multiple times in memory
    
  4. Sequential I/O

    Messages appended sequentially to disk
    Fast disk writes (vs random access)
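
A sketch of throughput-oriented producer settings with kafka-python (the values are illustrative, not tuned recommendations):

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    batch_size=65536,         # accumulate up to 64KB per partition batch
    linger_ms=10,             # wait up to 10ms for a batch to fill
    compression_type='gzip',  # compress whole batches, not single messages
    acks=1                    # leader-only ack; trades durability for speed
)

for i in range(100_000):
    producer.send('user-events', value=f'event-{i}'.encode())
producer.flush()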
    

Consumer Groups:

Topic: user-events (3 partitions)
Consumer Group: analytics-group

Consumer 1 → Partition 0
Consumer 2 → Partition 1
Consumer 3 → Partition 2

Each consumer reads from one partition
Parallel processing

Rebalancing:

Add Consumer 4:
  Rebalance: Each consumer gets fewer partitions
  Consumer 1 → Partition 0
  Consumer 2 → Partition 1
  Consumer 3 → Partition 2
  Consumer 4 → (no partition, idle)

Remove Consumer 3:
  Rebalance: Remaining consumers get more partitions
  Consumer 1 → Partition 0, 2
  Consumer 2 → Partition 1

Benefits:

  • Parallelism: Multiple consumers process in parallel
  • Scalability: Add consumers to scale
  • Fault tolerance: If a consumer fails, others take over its partitions
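
To observe rebalancing in practice, kafka-python accepts a ConsumerRebalanceListener on subscribe; a sketch (topic and group names are assumptions):

from kafka import KafkaConsumer, ConsumerRebalanceListener

class LogRebalances(ConsumerRebalanceListener):
    def on_partitions_revoked(self, revoked):
        # Fires before a rebalance takes partitions away from this consumer
        print('Revoked:', sorted(p.partition for p in revoked))

    def on_partitions_assigned(self, assigned):
        # Fires once the group settles on a new assignment
        print('Assigned:', sorted(p.partition for p in assigned))

consumer = KafkaConsumer(bootstrap_servers=['localhost:9092'],
                         group_id='analytics-group')
consumer.subscribe(['user-events'], listener=LogRebalances())

for message in consumer:
    pass  # process as usual; the listener fires on membership changes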

Senior

Q: Design a messaging system that handles 10 million messages per second. How do you choose between Kafka and RabbitMQ, handle failures, and ensure message delivery?

A:

// Interview sketch: broker client types (KafkaCluster, RabbitMQCluster),
// Message, Consumer, and Metrics are assumed; broker I/O bodies are omitted.

class HighThroughputMessagingSystem {
  private kafka: KafkaCluster;
  private rabbitmq: RabbitMQCluster;

  constructor() {
    // Kafka for high-throughput event streaming
    this.kafka = new KafkaCluster({
      brokers: ['kafka1', 'kafka2', 'kafka3'],
      replicationFactor: 3,
      partitions: 100 // high parallelism
    });

    // RabbitMQ for task queues
    this.rabbitmq = new RabbitMQCluster({
      nodes: ['rabbit1', 'rabbit2'],
      mirroredQueues: true
    });
  }

  // 1. Route by message type
  async publish(message: Message): Promise<void> {
    if (message.type === 'event' || message.throughput > 10000) {
      // High throughput: use Kafka
      await this.kafka.publish(message.topic, message);
    } else {
      // Task queue: use RabbitMQ
      await this.rabbitmq.publish(message.queue, message);
    }
  }
}

// 2. Kafka configuration for high throughput
class KafkaCluster {
  producerConfig: ProducerConfig;
  consumerConfig: ConsumerConfig;

  async configureForHighThroughput(): Promise<void> {
    // Batch aggressively to amortize network round-trips
    this.producerConfig = {
      batchSize: 100000,  // ~100KB batches
      lingerMs: 10,       // wait up to 10ms to fill a batch
      compressionType: 'snappy'
    };

    // Many partitions -> parallel producers and consumers
    await this.createTopic('events', {
      partitions: 100,
      replicationFactor: 3
    });

    // Tune consumer fetch sizes
    this.consumerConfig = {
      fetchMinBytes: 1024,
      fetchMaxWaitMs: 500,
      maxPartitionFetchBytes: 1048576 // 1MB per partition
    };
  }
}

// 3. Message delivery guarantees
class MessageDelivery {
  constructor(private kafka: KafkaCluster, private rabbitmq: RabbitMQCluster) {}

  // Kafka: at-least-once delivery
  async publishWithAcks(message: Message): Promise<void> {
    await this.kafka.producer.send({
      topic: message.topic,
      messages: [message],
      acks: 'all' // wait for all in-sync replicas
    });
  }

  // RabbitMQ: at-least-once with publisher confirms
  // (exactly-once requires idempotent consumers)
  async publishWithConfirmation(message: Message): Promise<void> {
    const channel = await this.rabbitmq.createChannel();
    await channel.confirmSelect(); // enable publisher confirms

    await channel.publish(
      message.exchange,
      message.routingKey,
      message.body,
      { persistent: true }
    );

    await channel.waitForConfirms();
  }
}

// 4. Failure handling
class FailureHandler {
  constructor(private kafka: KafkaCluster, private rabbitmq: RabbitMQCluster) {}

  async handleBrokerFailure(broker: string): Promise<void> {
    if (this.kafka.isBroker(broker)) {
      // Kafka: replication handles the failure;
      // consumers reconnect to the new partition leaders automatically
    }
    if (this.rabbitmq.isNode(broker)) {
      // RabbitMQ: mirrored queues on other nodes fail over automatically
    }
  }

  async handleConsumerFailure(consumer: Consumer): Promise<void> {
    // Kafka: the consumer group rebalances; others take over its partitions
    // RabbitMQ: unacknowledged messages are redelivered to other consumers
  }
}

// 5. Monitoring
class Monitoring {
  constructor(private kafka: KafkaCluster, private rabbitmq: RabbitMQCluster) {}

  async getMetrics(): Promise<Metrics> {
    return {
      kafka: {
        throughput: await this.kafka.getThroughput(),
        consumerLag: await this.kafka.getConsumerLag(),
        partitionCount: await this.kafka.getPartitionCount()
      },
      rabbitmq: {
        queueDepth: await this.rabbitmq.getQueueDepth(),
        messageRate: await this.rabbitmq.getMessageRate(),
        consumerCount: await this.rabbitmq.getConsumerCount()
      }
    };
  }
}

Architecture:

High Throughput Events → Kafka
  - User events
  - Logs
  - Analytics

Task Queues → RabbitMQ
  - Email sending
  - Image processing
  - Notifications

Features:

  1. Hybrid approach: Kafka for events, RabbitMQ for tasks
  2. High throughput: Kafka configured for 10M msg/sec
  3. Delivery guarantees: At-least-once on both (acks=all in Kafka, publisher confirms in RabbitMQ); exactly-once needs idempotent consumers (sketch below)
  4. Failure handling: Replication, mirroring, automatic failover
  5. Monitoring: Track throughput, lag, queue depth
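
As a concrete counterpart to point 3, minimal Python sketches of the two acknowledgment paths (local brokers and queue/topic names assumed):

# Kafka: wait for all in-sync replicas before a send counts as successful
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers=['localhost:9092'],
                         acks='all', retries=5)
producer.send('user-events', value=b'{"action": "login"}').get(timeout=10)

# RabbitMQ: publisher confirms give at-least-once, not exactly-once;
# consumers must be idempotent to tolerate redelivery
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='task_queue', durable=True)
channel.confirm_delivery()  # raises on nacked/unroutable publishes
channel.basic_publish(exchange='', routing_key='task_queue', body=b'job',
                      properties=pika.BasicProperties(delivery_mode=2))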

Key Takeaways

  • RabbitMQ: Traditional message broker, queue-based, good for task queues, low latency
  • Kafka: Event streaming platform, topic-based, high throughput, message retention
  • Use RabbitMQ for: Task queues, RPC, simple messaging, low latency
  • Use Kafka for: Event streaming, high throughput, log aggregation, replay capability
  • Consumer groups: Kafka consumers share work via consumer groups
  • Partitioning: Kafka topics split into partitions for parallelism
  • Delivery guarantees: Both provide at-least-once by default (Kafka with acks=all, RabbitMQ with publisher confirms); exactly-once requires idempotent processing
  • Best practices: Choose right system for use case, handle failures, monitor performance

About the author

InterviewCrafted helps you master system design with patience. We believe in curiosity-led engineering, reflective writing, and designing systems that make future changes feel calm.