Time-Series Databases
Master time-series databases for metrics, logs, and IoT data. Essential for observability and monitoring system design interviews.
Why This Matters
Think of time-series databases like a data logger for a car. Every second, it records speed, RPM, temperature, etc. Over time, you have millions of data points. A time-series database is optimized for this—storing data points with timestamps, querying by time range, and aggregating over time. Regular databases aren't optimized for this workload.
This matters because many systems generate time-series data: metrics (CPU, memory, requests per second), logs (application logs, system logs), IoT sensors (temperature, humidity, motion). Time-series databases handle high write throughput (millions of points per second) and time-based queries efficiently. Regular databases struggle with this workload.
In interviews, when someone asks "How would you store metrics?", they're testing whether you understand time-series databases. Do you know when to use them? Do you understand their characteristics? Most engineers don't. They use regular databases and wonder why writes are slow.
What Engineers Usually Get Wrong
Most engineers think "time-series databases are just databases with timestamps." But time-series databases are optimized differently: they're append-only (rarely updated), they compress data by time (similar values compress well), they automatically downsample old data, and they're optimized for time-range queries. Understanding this helps you use them effectively.
Engineers also don't understand that time-series databases have different trade-offs. They're fast for time-based queries but slow for complex queries. They handle high write throughput but aren't good for updates. Use them for metrics, logs, and sensor data. Don't use them for transactional workloads.
How This Breaks Systems in the Real World
A service was storing metrics in a relational database. Each metric was a row with a timestamp. With millions of metrics per day, writes were slow. The database became a bottleneck. Queries for "average CPU over last hour" required scanning millions of rows. The fix? Use a time-series database. It's optimized for this workload—append-only writes are fast, time-range queries are efficient, and data compresses well.
Another story: A service was using a time-series database but didn't configure retention. Old data accumulated, consuming disk space. The database ran out of space. The fix? Configure retention policies. Automatically delete or downsample old data. Time-series databases are designed for recent data—old data should be archived or deleted.
What is Time-Series Data?
Data points with timestamps, typically collected over time:
timestamp | metric_name | value | tags
2024-01-15 10:00:00 | cpu_usage | 45.2 | host=server1
2024-01-15 10:00:05 | cpu_usage | 46.1 | host=server1
2024-01-15 10:00:10 | cpu_usage | 44.8 | host=server1
2024-01-15 10:00:00 | memory_usage | 62.5 | host=server1
2024-01-15 10:00:05 | memory_usage | 63.2 | host=server1
Characteristics:
- High write volume (millions of points per second)
- Append-only (rarely updated, mostly inserted)
- Time-ordered queries
- Aggregations over time windows
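As a mental model, the append-only shape above can be sketched in a few lines of Python (all names here are illustrative, not any particular database's API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Point:
    timestamp: int      # epoch seconds
    metric: str         # e.g. "cpu_usage"
    value: float
    tags: tuple         # sorted (key, value) pairs, e.g. (("host", "server1"),)

storage: list = []      # append-only log: points are inserted, never updated

def append(point):
    storage.append(point)

def query_range(metric, start, end):
    """Time-range scan: the dominant query pattern for time-series data."""
    return [p for p in storage if p.metric == metric and start <= p.timestamp < end]
```

Real engines replace the list scan with time-partitioned, compressed storage, but the access pattern — append at the end, query by time window — is exactly this.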
Why Specialized Databases?
Traditional Databases Struggle With
- High write throughput: Millions of inserts per second
- Time-based queries: Range queries, downsampling, aggregations
- Data retention: Efficiently deleting old data
- Compression: Time-series data compresses very well
Time-Series Databases Excel At
- Append-optimized storage: Optimized for writes
- Time-indexed queries: Fast time-range scans
- Automatic downsampling: Pre-aggregate data at different resolutions
- Data lifecycle: Automatic expiration of old data
Popular Time-Series Databases
InfluxDB
Purpose-built for time-series data.
-- Create database
CREATE DATABASE metrics;
-- Write data (Line Protocol)
cpu,host=server1,region=us-east value=45.2 1705312800000000000
memory,host=server1,region=us-east value=62.5 1705312800000000000
-- Query
SELECT mean("value")
FROM "cpu"
WHERE time >= now() - 1h
GROUP BY time(5m), "host";
Features:
- High write throughput
- Downsampling and retention policies
- Continuous queries
- TICK stack integration
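The Line Protocol format above (`measurement,tags fields timestamp`) is simple enough to generate by hand; a minimal sketch, ignoring the escaping rules for spaces, commas, and string fields:

```python
def line_protocol(measurement, tags, fields, ts_ns):
    """Format one point as an InfluxDB Line Protocol string (no escaping handled)."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

line = line_protocol("cpu", {"host": "server1", "region": "us-east"},
                     {"value": 45.2}, 1705312800000000000)
# -> cpu,host=server1,region=us-east value=45.2 1705312800000000000
```

In practice use the official client libraries, which handle escaping and batch points before writing.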
TimescaleDB
PostgreSQL extension for time-series.
-- Create hypertable
CREATE TABLE metrics (
time TIMESTAMPTZ NOT NULL,
device_id INT,
temperature FLOAT,
humidity FLOAT
);
SELECT create_hypertable('metrics', 'time');
-- Query
SELECT
time_bucket('1 hour', time) as hour,
device_id,
avg(temperature) as avg_temp
FROM metrics
WHERE time >= now() - INTERVAL '24 hours'
GROUP BY hour, device_id;
Features:
- SQL interface (PostgreSQL compatible)
- Automatic partitioning by time
- Compression
- Continuous aggregates
Prometheus
Monitoring and alerting toolkit with time-series database.
# Query
cpu_usage{host="server1"}[5m]
# Aggregation
rate(http_requests_total[5m])
# Functions
avg_over_time(cpu_usage[1h])
Features:
- Pull-based metrics collection
- PromQL query language
- Alerting rules
- Integration with Grafana
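PromQL's `rate()` is conceptually a per-second slope over a counter's samples. A simplified Python sketch of that idea (real Prometheus also handles counter resets and extrapolates to the window edges):

```python
def rate(samples):
    """Per-second increase of a counter over (timestamp_s, value) samples,
    roughly what PromQL's rate() computes for one series."""
    (t0, v0), (tn, vn) = samples[0], samples[-1]
    return (vn - v0) / (tn - t0)

# 300 extra requests over a 5-minute window -> 1.0 requests/second
rate([(0, 100.0), (300, 400.0)])
```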
Amazon Timestream
Fully managed time-series database on AWS.
-- Timestream tables are schemaless: you create them via the AWS console or
-- CreateTable API with no column definitions. Each record carries dimensions
-- (e.g. device_id), a measure name, a measure value, and a timestamp.
SELECT
BIN(time, 5m) as time_bin,
device_id,
AVG(measure_value) as avg_value
FROM metrics
WHERE time >= ago(1h)
GROUP BY time_bin, device_id;
Data Modeling
Metrics and Tags
Metric: cpu_usage
Tags: host=server1, region=us-east, env=production
Value: 45.2
Time: 2024-01-15 10:00:00
Tags: Used for filtering and grouping (like dimensions)
Values: The actual measurement
Series
A unique combination of metric name and tags:
Series 1: cpu_usage{host=server1, region=us-east}
Series 2: cpu_usage{host=server2, region=us-east}
Series 3: cpu_usage{host=server1, region=us-west}
Cardinality
High cardinality (many unique series) impacts performance:
Low cardinality: cpu_usage{host=server1}
High cardinality: cpu_usage{host=server1, user_id=12345, request_id=abc123}
Best practice: Limit tag cardinality; keep high-cardinality identifiers (user IDs, request IDs) as field values or logs rather than tags.
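The cardinality explosion is easy to quantify: the worst-case number of unique series is the product of the distinct values of each tag. A quick illustration (the tag counts are made up):

```python
from math import prod

def worst_case_series(tag_cardinalities):
    """Upper bound on unique series: the product of distinct values per tag."""
    return prod(tag_cardinalities.values())

# Bounded tags stay manageable:
low = worst_case_series({"host": 100, "region": 4, "env": 3})   # 1,200 series
# One unbounded tag (user_id) multiplies everything:
high = worst_case_series({"host": 100, "region": 4, "env": 3, "user_id": 1_000_000})
```

Each series carries its own index entry and storage blocks, which is why one unbounded tag can take a database from thousands to billions of series.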
Common Operations
Downsampling
Reduce data resolution over time:
Raw data: 1 point per second
After 1 hour: 1 point per minute (average)
After 1 day: 1 point per hour (average)
After 1 week: 1 point per day (average)
Benefits:
- Reduced storage
- Faster queries on historical data
- Maintains trends while discarding detail
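Downsampling is just bucketing by a fixed time window and aggregating each bucket. A minimal sketch over `(timestamp, value)` pairs:

```python
def downsample(points, window_s, agg=lambda vs: sum(vs) / len(vs)):
    """Bucket (timestamp, value) pairs into fixed windows, aggregate each bucket."""
    buckets = {}
    for ts, value in points:
        buckets.setdefault(ts - ts % window_s, []).append(value)
    return {start: agg(values) for start, values in sorted(buckets.items())}

raw = [(0, 10.0), (20, 20.0), (40, 30.0), (60, 40.0)]   # one point per 20 s
downsample(raw, 60)   # one averaged point per minute -> {0: 20.0, 60: 40.0}
```

Swapping the `agg` function gives min/max/percentile rollups; databases run this continuously so queries over old data hit the small pre-aggregated series.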
Retention Policies
Automatically delete old data:
Raw data: Keep for 7 days
Downsampled 1h: Keep for 30 days
Downsampled 1d: Keep for 1 year
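Semantically, retention is a cutoff filter over timestamps — though real databases drop whole time partitions at once rather than scanning individual points:

```python
def apply_retention(points, now_s, retention_s):
    """Keep only (timestamp, value) points inside the retention window."""
    cutoff = now_s - retention_s
    return [p for p in points if p[0] >= cutoff]
```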
Aggregations
-- Average over time window
SELECT avg(value) FROM metrics
WHERE time >= now() - 1h
GROUP BY time(5m);
-- Rate of change
SELECT derivative(mean(value), 1m)
FROM metrics
WHERE time >= now() - 1h;
-- Percentiles
SELECT percentile(value, 95)
FROM metrics
WHERE time >= now() - 1h;
Query Patterns
Range Queries
-- Last hour
SELECT * FROM metrics
WHERE time >= now() - 1h;
-- Specific time range
SELECT * FROM metrics
WHERE time >= '2024-01-15 10:00:00'
AND time < '2024-01-15 11:00:00';
Filtering by Tags
-- Filter by tag
SELECT * FROM metrics
WHERE host = 'server1'
AND time >= now() - 1h;
-- Multiple tags
SELECT * FROM metrics
WHERE region = 'us-east'
AND env = 'production'
AND time >= now() - 1h;
Aggregations
-- Group by tag
SELECT host, avg(value)
FROM metrics
WHERE time >= now() - 1h
GROUP BY host;
-- Time-based grouping
SELECT
time_bucket('5 minutes', time) as interval,
avg(value) as avg_value
FROM metrics
WHERE time >= now() - 1h
GROUP BY interval;
Use Cases
Monitoring and Observability
- System metrics: CPU, memory, disk, network
- Application metrics: Request rate, latency, error rate
- Business metrics: Revenue, user signups, conversions
IoT and Sensors
- Temperature sensors: Building climate control
- GPS tracking: Vehicle locations over time
- Smart meters: Energy consumption
Financial Data
- Stock prices: Price movements over time
- Trading volumes: Market activity
- Exchange rates: Currency fluctuations
Log Analysis
- Application logs: Error rates, request patterns
- Security logs: Failed login attempts, suspicious activity
- Audit logs: User actions over time
Best Practices
- Limit cardinality: Too many unique series slow down queries
- Use appropriate retention: Delete old data you don't need
- Downsample early: Pre-aggregate data at ingestion time
- Batch writes: Group multiple points in single write
- Index time first: Time is the primary access pattern
- Compress data: Time-series data compresses very well
Performance Considerations
Write Optimization
- Batch inserts: Group multiple points
- Async writes: Don't block on every insert
- Compression: Enable compression for storage efficiency
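Batch writes can be sketched as a buffer that flushes by size or by age (the thresholds and `write_fn` are illustrative; most real client libraries ship similar batching options):

```python
import time

class BatchWriter:
    """Buffer points and flush them in batches, by count or by age."""
    def __init__(self, write_fn, max_points=500, max_age_s=10.0):
        self.write_fn = write_fn      # e.g. one HTTP POST to the TSDB write endpoint
        self.max_points = max_points
        self.max_age_s = max_age_s
        self.buffer = []
        self.first_ts = None

    def add(self, point):
        if self.first_ts is None:
            self.first_ts = time.monotonic()
        self.buffer.append(point)
        if (len(self.buffer) >= self.max_points
                or time.monotonic() - self.first_ts >= self.max_age_s):
            self.flush()

    def flush(self):
        if self.buffer:
            self.write_fn(self.buffer)   # one round trip for many points
            self.buffer = []
            self.first_ts = None
```

The age threshold bounds how stale a buffered point can get, so dashboards stay near-real-time even when traffic is low.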
Query Optimization
- Time ranges: Always specify time ranges in queries
- Limit series: Filter by tags to reduce data scanned
- Use downsampled data: Query aggregated data when possible
- Index tags: Index frequently filtered tags
Trade-offs
Advantages
- High write throughput: Millions of points per second
- Efficient storage: Excellent compression ratios
- Time-optimized queries: Fast range scans and aggregations
- Automatic lifecycle: Built-in retention and downsampling
Limitations
- Limited updates: Not optimized for updating existing points
- Complex queries: Less flexible than general-purpose databases
- Learning curve: Specialized query languages (PromQL, Flux)
Integration Patterns
Metrics Collection
Application → StatsD/Telegraf → Time-Series DB → Grafana
Log Aggregation
Applications → Log shipper → Time-Series DB → Dashboards
Alerting
Time-Series DB → Alerting Rules → Notification System
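The first hop of the metrics-collection pipeline is often a one-line UDP datagram in the StatsD text format (`name:value|type`). A minimal gauge emitter, assuming a StatsD agent listening on the default port 8125:

```python
import socket

def statsd_gauge(sock, name, value, host="127.0.0.1", port=8125):
    """Emit one gauge in the StatsD text format over UDP."""
    sock.sendto(f"{name}:{value}|g".encode(), (host, port))

# Usage: one shared UDP socket per process
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
statsd_gauge(sock, "cpu_usage", 45.2)
```

Because it is fire-and-forget UDP, emitting a metric never blocks the application; the agent aggregates and forwards batches to the time-series database.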
Interview Questions
1. Beginner Question
Q: What is a time-series database, and when should you use one?
A: A time-series database is optimized for storing and querying data points indexed by time (metrics, logs, sensor data).
When to use:
- High write throughput: Millions of data points per second
- Time-ordered queries: Range queries, aggregations over time
- Automatic downsampling: Reduce data resolution over time
- Data retention: Efficiently delete old data
Example use cases:
- System metrics (CPU, memory, disk)
- Application metrics (request rate, latency, errors)
- IoT sensor data (temperature, GPS)
- Financial data (stock prices, trading volumes)
When NOT to use:
- Transactional data (use relational databases)
- Complex relationships (use graph databases)
- Ad-hoc queries (use data warehouses)
2. Intermediate Question
Q: Explain how time-series databases handle high write throughput and data retention.
A:
High write throughput:
- Append-only storage: Optimized for writes, not updates
- Batch writes: Group multiple points in single write
- Compression: Time-series data compresses very well (similar values)
- No indexes on every write: Indexes built asynchronously
Data retention:
-- Retention policy: Keep raw data for 7 days
CREATE RETENTION POLICY raw_data ON metrics DURATION 7d REPLICATION 1;
-- Downsampled data: Keep 1-minute averages for 30 days
CREATE RETENTION POLICY downsampled_1m ON metrics DURATION 30d REPLICATION 1;
-- Further downsampled: Keep 1-hour averages for 1 year
CREATE RETENTION POLICY downsampled_1h ON metrics DURATION 365d REPLICATION 1;
Automatic downsampling:
-- Continuous query that downsamples automatically
CREATE CONTINUOUS QUERY downsample_1m ON metrics
BEGIN
  SELECT mean(value) INTO metrics."downsampled_1m"."cpu"
  FROM metrics."raw_data"."cpu"
  GROUP BY time(1m), *
END
Benefits:
- Raw data: High detail, short retention
- Downsampled: Lower detail, longer retention
- Automatic: No manual cleanup needed
3. Senior-Level System Question
Q: Design a monitoring system for a microservices architecture with 1000 services, each emitting 1000 metrics/second. The system needs to support: real-time dashboards, alerting, and 1-year historical analysis. How would you architect this?
A:
Architecture: Multi-tier time-series storage
Data ingestion (high throughput):
Services → StatsD/Telegraf → Kafka → Time-Series DB
- Kafka buffer: Handles traffic spikes, provides backpressure
- Batching: Collect metrics for 10-30 seconds before writing
- Sharding: Partition by service/metric type
Storage tiers:
-- Tier 1: Hot storage (InfluxDB/TimescaleDB) - Last 7 days
--   Raw data, high resolution (1 second)
--   Fast queries for dashboards
-- Tier 2: Warm storage (ClickHouse) - 7-90 days
--   Downsampled to 1-minute resolution
--   Good for historical analysis
-- Tier 3: Cold storage (S3 + Parquet) - 90-365 days
--   Downsampled to 1-hour resolution
--   Cheap storage, slower queries
Real-time layer (Redis):
# Cache recent metrics for dashboards
def get_metric(metric_name, time_range):
    if time_range < 3600:  # under one hour
        # Get from Redis (in-memory, fast)
        return redis.get(f"metric:{metric_name}:{time_range}")
    else:
        # Query time-series database
        return query_tsdb(metric_name, time_range)
Downsampling pipeline:
# Continuous downsampling jobs
def downsample_metrics():
    # Every hour: downsample 1-second data to 1-minute
    raw_data = query_tsdb("raw", last_hour, resolution="1s")
    downsampled = aggregate(raw_data, "1m", function="mean")
    write_to_tsdb("downsampled_1m", downsampled)
    # Every day: downsample 1-minute to 1-hour
    minute_data = query_tsdb("downsampled_1m", last_day)
    hour_data = aggregate(minute_data, "1h", function="mean")
    write_to_s3("downsampled_1h", hour_data)  # Cold storage
Query optimization:
- Materialized views: Pre-aggregate common queries
- Time-based partitioning: Partition by day/week
- Compression: Use columnar format (Parquet) for cold storage
Alerting system:
# Real-time alerting on recent data
def check_alerts():
    recent_metrics = query_redis("metrics:recent")
    for alert_rule in alert_rules:
        if evaluate(alert_rule, recent_metrics):
            send_alert(alert_rule)
Scalability:
- Write path: 1M metrics/sec → Kafka (handles spikes) → Batch writes to TSDB
- Read path: Redis cache → Hot TSDB → Warm storage → Cold storage
- Storage: Compression + downsampling reduces storage by 1000x
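The write-path figure is worth sanity-checking with back-of-envelope arithmetic (the 16 bytes/point is an assumption for an uncompressed timestamp plus a float value):

```python
services = 1_000
metrics_per_service_per_s = 1_000
points_per_s = services * metrics_per_service_per_s        # 1,000,000 points/s

bytes_per_point_raw = 16          # assumed: 8-byte timestamp + 8-byte float
seconds_per_day = 86_400
raw_gb_per_day = points_per_s * bytes_per_point_raw * seconds_per_day / 1e9
# ~1,382 GB/day of raw data

downsample_and_compress = 1_000   # the ~1000x reduction cited above
retained_gb_per_day = raw_gb_per_day / downsample_and_compress
```

At roughly 1.4 TB/day raw, compression and downsampling are not optimizations but prerequisites for the 1-year retention requirement.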
Trade-offs:
- Latency vs. Cost: Hot storage is fast but expensive
- Detail vs. Retention: More downsampling = longer retention
- Complexity: Multiple storage tiers require data pipeline
Key Takeaways
Time-series databases are optimized for time-stamped data—metrics, logs, sensor data
Append-only storage enables high write throughput (millions of points per second)
Automatic downsampling reduces data resolution over time (1s → 1m → 1h) for longer retention
Retention policies automatically delete old data to manage storage costs
High cardinality is expensive—limit unique series (tags) for better performance
Use for monitoring and observability—system metrics, application metrics, IoT data
Not for transactional data—use relational databases for that
Compression is excellent—time-series data compresses 10-100x better than random data
Time-based partitioning is essential for large datasets
Batch writes are much more efficient than individual point writes
Downsample early—pre-aggregate data at ingestion time when possible
Multi-tier storage (hot/warm/cold) balances performance and cost
Related Topics
NoSQL Basics
Time-series databases are a type of NoSQL database. Understanding NoSQL basics helps understand time-series characteristics.
Columnar Databases
Time-series databases often use columnar storage. Understanding columnar databases helps understand time-series internals.
Query Optimization
Time-series queries have specific optimization patterns. Understanding query optimization helps optimize time-series queries.
Partitioning
Time-series databases partition by time. Understanding partitioning helps design time-series schemas.
Data Replication
Time-series databases use replication for availability. Understanding replication helps design distributed time-series systems.
Keep exploring
Database concepts build on each other. Explore related topics to deepen your understanding of how data systems work.