Instant Messaging Platform - ashtishad/system-design GitHub Wiki
1. Requirements
Functional Requirements:
- Users can send/receive 1:1 messages in real time.
- Users can create/join group chats (up to 1K members).
- Users can view message history.
Non-Functional Requirements:
- Real-Time Delivery: <1s latency for messages.
- Scalability: Handle surges (e.g., peak events like New Year’s, when ~10% of DAU send messages within a short window).
- Availability: 99.99% uptime.
- Durability: No message loss over 5 years.
- Capacity Estimation (5 years):
- DAU: 500M users.
- Messages: 100B/day * 365 * 5 = 182.5T messages.
- Storage:
- Message: 1KB (text + metadata) * 182.5T = 182.5PB raw, ~1.8EB with replication/indexes (a ~10x overhead allowance; 3x replication alone gives ~550PB).
- Users: 500M * 1KB = 500GB.
- QPS:
- Avg: 100B/day ÷ 86,400s ≈ 1.16M QPS.
- Peak (10% events): 50M users * 20 msg/user ÷ 2,400s (40-minute busy window) ≈ 416K QPS from the surge alone (this adds to, rather than replaces, the 1.16M QPS average).
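The estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch that uses only the figures stated in this section; the variable names are illustrative.

```python
# Back-of-the-envelope capacity estimate from the figures listed above.
DAU = 500_000_000
MESSAGES_PER_DAY = 100_000_000_000   # 100B messages/day
DAYS = 365 * 5                       # 5-year horizon
MESSAGE_BYTES = 1_000                # 1KB per message (text + metadata)
SECONDS_PER_DAY = 86_400

total_messages = MESSAGES_PER_DAY * DAYS
raw_storage_pb = total_messages * MESSAGE_BYTES / 1e15
user_storage_gb = DAU * 1_000 / 1e9  # 1KB per user record
avg_qps = MESSAGES_PER_DAY / SECONDS_PER_DAY

# Surge: 10% of DAU, 20 messages each, sent within a 2,400s window.
surge_users = DAU // 10
surge_qps = surge_users * 20 / 2_400

print(f"total messages: {total_messages / 1e12:.1f}T")
print(f"raw storage:    {raw_storage_pb:.1f}PB")
print(f"user storage:   {user_storage_gb:.0f}GB")
print(f"average QPS:    {avg_qps:,.0f}")
print(f"surge QPS:      {surge_qps:,.0f}")
```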
2. Core Entities
- User: {id, username, phone, status}
- Message: {id, sender_id, receiver_id/group_id, content, timestamp, status (sent/delivered/read)}
- Group: {id, name, member_ids[]}
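As a rough illustration, the three entities could be modeled as Python dataclasses. The field names come from the list above; the ID format (UUID strings), the default user status, and the "exactly one of receiver_id/group_id" check are assumptions made for this sketch.

```python
import time
import uuid
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class MessageStatus(Enum):
    SENT = "sent"
    DELIVERED = "delivered"
    READ = "read"


@dataclass
class User:
    id: str
    username: str
    phone: str
    status: str = "offline"  # assumed default; the source does not list values


@dataclass
class Group:
    id: str
    name: str
    member_ids: list[str] = field(default_factory=list)


@dataclass
class Message:
    sender_id: str
    content: str
    receiver_id: Optional[str] = None  # set for 1:1 messages
    group_id: Optional[str] = None     # set for group messages
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: float = field(default_factory=time.time)
    status: MessageStatus = MessageStatus.SENT

    def __post_init__(self) -> None:
        # A message targets either one user or one group, never both or neither.
        if (self.receiver_id is None) == (self.group_id is None):
            raise ValueError("set exactly one of receiver_id or group_id")
```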
3. APIs
- POST /messages/send: {receiver_id/group_id, content}, sends message.
- GET /messages/{user_id}?since={timestamp}: Returns message history.
- POST /groups/create: {name, member_ids[]}, creates group.
4. High-Level Design
- Client: Mobile/web app.
- API Gateway: Routes, auth, rate-limiting.
- Microservices:
- Chat Service: Manages 1:1 and group messaging.
- Group Service: Handles group metadata.
- Storage Service: Persists messages/users.
- Data Stores:
- Cassandra: Messages/users (wide-column, high write throughput).
- Redis: Real-time delivery (pub/sub), caching.
- Flow:
- Send → Chat Service → Redis pub/sub → Cassandra.
- History → Storage Service → Cassandra.
- Group → Group Service → Cassandra.
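The send flow above can be sketched with stand-ins for the two data stores. FakePubSub and FakeStore are hypothetical in-memory substitutes for Redis and Cassandra, used only to show the order of operations; a real Chat Service would call the actual client libraries.

```python
import time


class FakePubSub:
    """Stand-in for Redis pub/sub: records what was published per channel."""

    def __init__(self) -> None:
        self.published: list[tuple[str, dict]] = []

    def publish(self, channel: str, payload: dict) -> None:
        self.published.append((channel, payload))


class FakeStore:
    """Stand-in for Cassandra: appends messages to a per-user list."""

    def __init__(self) -> None:
        self.rows: dict[str, list[dict]] = {}

    def insert(self, user_id: str, message: dict) -> None:
        self.rows.setdefault(user_id, []).append(message)

    def history(self, user_id: str) -> list[dict]:
        return list(self.rows.get(user_id, []))


class ChatService:
    def __init__(self, pubsub: FakePubSub, store: FakeStore) -> None:
        self.pubsub = pubsub
        self.store = store

    def send(self, sender_id: str, receiver_id: str, content: str) -> dict:
        message = {"sender_id": sender_id, "receiver_id": receiver_id,
                   "content": content, "timestamp": time.time()}
        # Order follows the flow above: publish for real-time delivery, then persist.
        self.pubsub.publish(f"channel:{receiver_id}", message)
        self.store.insert(receiver_id, message)
        return message
```

Publishing before persisting favors latency; a design that must never show a message that later fails to persist would reverse the two steps.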
Why Cassandra?
Cassandra is chosen for its high write throughput (millions of writes/sec across a cluster), horizontal scalability, and tunable consistency. Messaging platforms need fast, durable writes (1.16M QPS on average) and can tolerate eventual consistency for reads such as history. Its partition tolerance suits global distribution, which is critical for 500M users.
5. Deep Dives
1. Real-Time Message Delivery
- Problem: Deliver messages in <1s at 416K QPS peak.
- Solution:
- Redis Pub/Sub: The sender’s Chat Service node publishes to channel:user_id; the node holding the recipient’s connection subscribes to that channel.
- WebSockets: Each client keeps a persistent connection to a Chat Service node, which pushes messages to the device as they arrive.
- Industry Example (WhatsApp): WhatsApp uses a modified XMPP over persistent TCP connections. Messages route through a gateway to the recipient’s device if it is online; otherwise they are queued. We use Redis for simplicity and scalability, and WebSockets for reliability.
- Tech Details:
- Redis: 100K ops/s per node, scales to 416K QPS with 5 nodes.
- WebSockets: 1M connections/server, heartbeat every 30s.
- Tradeoffs: Redis is fast but volatile (mitigated by Cassandra persistence); WebSockets are reliable but resource-intensive.
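The sketch below imitates the pub/sub pattern in memory to show why it is volatile: a publish to a channel with no subscriber reaches nobody and is not retained, which is why persistence happens separately. InMemoryPubSub is a hypothetical stand-in, not the Redis client API; the two-missed-heartbeats rule in is_connection_alive is an assumption, while the 30s interval and the 100K ops/s per node figure come from the text above.

```python
import math
from collections import defaultdict
from typing import Callable

HEARTBEAT_INTERVAL_S = 30      # heartbeat period from the text above
REDIS_OPS_PER_NODE = 100_000   # per-node throughput from the text above


class InMemoryPubSub:
    """Fire-and-forget pub/sub: messages go only to current subscribers."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, payload: str) -> int:
        """Deliver to every subscriber and return how many received it."""
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(payload)
        return len(callbacks)


def is_connection_alive(last_heartbeat: float, now: float,
                        missed_allowed: int = 2) -> bool:
    """Treat a socket as dead after `missed_allowed` heartbeat intervals of silence."""
    return now - last_heartbeat <= HEARTBEAT_INTERVAL_S * missed_allowed


def redis_nodes_needed(peak_qps: int) -> int:
    return math.ceil(peak_qps / REDIS_OPS_PER_NODE)
```

A return value of 0 from publish signals that the recipient is offline, so the caller can hand the message to a queue or rely on the persisted copy.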
2. Group Chat Scalability
- Problem: Support 1K-member groups at scale.
- Solution:
- Fan-Out on Write: Write message to each member’s inbox in Cassandra.
- Redis Cache: Store active group metadata (member list).
- Industry Example (Telegram): Telegram uses a “supergroup” model, sharding large groups across servers. Messages fan out to inboxes with MTProto. We adopt fan-out for simplicity, caching for speed.
- Tech Details:
- Cassandra: Partition by user_id, 416K writes/s split across 100 nodes (~4K/node).
- Redis: ~100GB cache for active group metadata (e.g., 1K members * ~100 bytes per member entry ≈ 100KB per group, across ~1M active groups).
- Tradeoffs: Fan-out scales writes linearly (costly for big groups); read-based fan-out reduces writes but delays delivery.
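Fan-out on write can be sketched as a loop over the member list that appends the message to each member's inbox. The in-memory dictionaries stand in for the Redis member-list cache and the Cassandra inbox partitions; skipping the sender's own inbox is an assumption of this sketch.

```python
from collections import defaultdict

# Stand-ins: cached member lists (Redis) and per-user inbox partitions (Cassandra).
group_members: dict[str, list[str]] = {}
inboxes: dict[str, list[dict]] = defaultdict(list)


def fan_out_on_write(group_id: str, sender_id: str, content: str) -> int:
    """Write one copy of the message per recipient; return the number of writes."""
    writes = 0
    for member_id in group_members[group_id]:
        if member_id == sender_id:
            continue  # assumption: the sender does not get an inbox copy
        inboxes[member_id].append(
            {"group_id": group_id, "sender_id": sender_id, "content": content}
        )
        writes += 1
    return writes
```

For a full 1K-member group, a single send produces 999 inbox writes, which is the linear write cost mentioned in the tradeoffs.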
3. Durability & Storage
- Problem: Store 1.8EB over 5 years.
- Solution:
- Cassandra: Multi-DC replication (3x), TTL=5 years for old messages.
- Cold Storage: Move >1-year-old data to S3.
- Tech Details:
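- Cassandra: 182.5PB ÷ 100 nodes ≈ 1.8PB/node, or ≈ 5.5PB/node with 3x replication. That exceeds what a single node can practically hold, so the hot tier needs far more than 100 nodes or must offload most data to cold storage.
- S3: 100PB, cheaper (~$0.023/GB-month vs. an assumed ~$0.10/GB-month for Cassandra-backed storage).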
- Tradeoffs: Cassandra ensures fast access but is expensive; S3 saves cost but slows retrieval.
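The tiering rule and the cost gap can be expressed as a short sketch. The one-year cutoff and the per-GB prices are the figures from this section, treated here as monthly rates; the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

HOT_RETENTION = timedelta(days=365)        # data older than this moves to S3
USD_PER_GB_MONTH = {"s3": 0.023, "cassandra": 0.10}


def storage_tier(message_time: datetime, now: datetime) -> str:
    """Return the tier a message belongs in, based on its age."""
    return "s3" if now - message_time > HOT_RETENTION else "cassandra"


def monthly_cost_usd(size_gb: float, tier: str) -> float:
    return size_gb * USD_PER_GB_MONTH[tier]


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
cold_gb = 100 * 1_000_000  # 100PB expressed in GB
savings = monthly_cost_usd(cold_gb, "cassandra") - monthly_cost_usd(cold_gb, "s3")
print(storage_tier(now - timedelta(days=400), now))
print(f"monthly savings for 100PB in S3: ${savings:,.0f}")
```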
4. Scalability for Surges
- Problem: 416K QPS during peaks.
- Solution:
- Load Balancers: Auto-scale Chat Service (CPU > 70%).
- Sharding: Cassandra by user_id, Redis by channel_id.
- Tech Details:
- 416K QPS ÷ 100 shards = 4.16K QPS/shard, fits single node.
- Auto-scaling: Add 1 node per 10K QPS; 416K QPS needs ~42 nodes, ~50 with headroom.
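A minimal sketch of the shard routing and node-count arithmetic follows. Hashing with MD5 and taking the result modulo the shard count is an assumption chosen for illustration (Cassandra applies its own partitioner internally); a stable hash is used because Python's built-in hash() is randomized per process for strings.

```python
import hashlib
import math

NUM_SHARDS = 100        # shard count from the text above
QPS_PER_NODE = 10_000   # auto-scaling rule from the text above


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user_id or channel_id to a shard with a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def nodes_needed(qps: int, qps_per_node: int = QPS_PER_NODE) -> int:
    return math.ceil(qps / qps_per_node)


print(shard_for("user:12345"))
print(nodes_needed(416_000))
```

Plain modulo sharding remaps most keys when the shard count changes, so a consistent-hashing ring is the usual choice when shards are added or removed often.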
5. High Availability
- Problem: 99.99% uptime.
- Solution:
- Multi-Region: Cassandra (3 DCs), Redis Sentinel for automatic failover.
- Fallback: Queue undelivered messages in Kafka.
- Tech Details:
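- Cassandra: Quorum reads/writes, ~5ms latency (achievable when the quorum is satisfied within one DC, e.g., LOCAL_QUORUM; a cross-DC quorum adds inter-DC round-trip time).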
- Kafka: 1M msg/s, 1-day retention.
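The numbers behind these choices can be checked with a short sketch: the downtime budget implied by 99.99%, the quorum size for a given replication factor, and a retention-bounded queue for undelivered messages. UndeliveredQueue is a hypothetical in-memory stand-in for the Kafka topic, written only to show the 1-day retention behavior.

```python
from collections import deque

RETENTION_S = 86_400  # 1-day retention from the text above


def allowed_downtime_minutes_per_year(availability: float) -> float:
    return (1 - availability) * 365 * 24 * 60


def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def read_sees_latest_write(rf: int, write_acks: int, read_acks: int) -> bool:
    """Reads overlap the latest write when R + W > RF."""
    return read_acks + write_acks > rf


class UndeliveredQueue:
    """Holds messages for offline users; entries past retention are dropped."""

    def __init__(self) -> None:
        self._items: deque[tuple[float, str]] = deque()

    def enqueue(self, message: str, now: float) -> None:
        self._items.append((now, message))

    def drain(self, now: float) -> list[str]:
        fresh = [m for t, m in self._items if now - t <= RETENTION_S]
        self._items.clear()
        return fresh
```

With a replication factor of 3, quorum reads and writes each need 2 replicas, so one replica can be down without blocking either path.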
Summary of Solutions and Industry Practices
- Authentication: PostgreSQL + JWT + Redis – WhatsApp’s secure login.
- Real-time Delivery: WebSockets + Kafka – Slack’s instant messaging.
- Persistence: Cassandra + Sharding – WhatsApp’s scalable storage.
- Group Messaging: PostgreSQL + Kafka + Redis – Telegram’s group efficiency.
- Notifications: FCM + SQS + Retry – Slack’s reliable alerts.
- Security: Signal Protocol + GDPR – WhatsApp’s privacy focus.