Live Auction - ashtishad/system-design GitHub Wiki
Functional Requirements:
- Users can list items for auction (set start price, duration).
- Users can bid on auctions (place bids in real time).
- Users can view auction status (current bid, time remaining).
Non-Functional Requirements (Brief):
- Consistency: Accurate bid tracking, no overbidding.
- Scalability: Handle 10M DAU, 10% peak surges (e.g., rare item auctions).
- Low Latency: Bid <500ms, status <200ms.
- Durability: No data loss over 5 years.
-
Capacity Estimation (5 years):
- DAU: 10M users.
- Auctions: 1M/day * 365 * 5 = 1.825B auctions.
- Bids: 10M/day * 365 * 5 = 18.25B bids.
-
Storage:
- Auctions: 1.825B * 1KB = 1.825TB.
- Bids: 18.25B * 500B = 9.125TB.
- Total: ~11TB raw, ~110TB with replication/indexes.
- QPS: Avg: 10M/day ÷ 86,400s ≈ 116 QPS; Peak (10% surges): 1M * 10 bids ÷ 2,400s ≈ 4,167 QPS.
- User: {id, name, payment_method}
- Auction: {id, seller_id, item_name, start_price, current_bid, highest_bidder_id, end_time, status (active/ended), timestamp}
- Bid: {id, auction_id, user_id, amount, timestamp}
- POST /auctions/list: {seller_id, item_name, start_price, duration}, creates auction.
- POST /auctions/{auction_id}/bid: {user_id, amount, idempotency_key}, places bid.
- GET /auctions/{auction_id}/status: Returns current bid, time remaining.
- Client: Web/mobile app.
- API Gateway: Routes, auth (JWT), rate-limiting.
-
Microservices:
- Auction Service: Manages listing and status.
- Bidding Service: Handles bid placement.
- Status Service: Tracks real-time auction updates.
-
Data Stores:
- PostgreSQL: Auctions, bids (ACID for consistency).
- Redis: Real-time bid tracking, idempotency.
- Kafka: Event streaming for bid updates.
-
Flow:
- List → Auction Service → PostgreSQL.
- Bid → Bidding Service → Redis → PostgreSQL → Kafka.
- Status → Status Service → Redis → PostgreSQL.
Why PostgreSQL?
PostgreSQL ensures ACID compliance for bid and auction state, preventing overbidding with row-level locking and MVCC, scalable to 4K QPS with sharding.
1. Listing Items (Auction Creation)
- Problem: Enable reliable auction listing at 4,167 QPS peak.
-
Approaches & Tradeoffs:
-
Direct Insert
- How: POST /auctions/list; INSERT INTO auctions (seller_id, item_name, start_price, end_time);
- Pros: Simple (~100ms), ACID-safe.
- Cons: No deduplication, slow at scale (~1K QPS), no validation.
- Use Case: Low-scale systems (<100 QPS).
-
Validated Insert
- How: Validate input, INSERT INTO auctions ... WHERE NOT EXISTS (SELECT ... WHERE seller_id={seller_id} AND item_name={item_name} AND status='active');
- Pros: Prevents duplicates (~150ms), durable.
- Cons: Higher latency, contention on index (~10ms).
- Use Case: Medium-scale apps.
-
Event-Driven Listing
- How: Write to Kafka, Auction Service consumes, inserts into PostgreSQL with uniqueness check.
- Pros: Decouples load (~500ms), scales to 10K QPS, retryable.
- Cons: Higher latency, eventual consistency risks.
- Use Case: High-throughput systems with latency tolerance.
-
Optimistic Listing
- How: Add version to auctions; INSERT ... WHERE NOT EXISTS (SELECT ... WHERE seller_id={seller_id} AND version={old_version});
- Pros: High concurrency (~2ms), no locks.
- Cons: Retries on conflict (5-10% at peak), complex logic.
- Use Case: Low-conflict systems.
-
Hybrid (Redis + PostgreSQL)
- How: Cache listing intent in Redis (auction:seller_id:item_name, TTL=5min), INSERT INTO auctions ON CONFLICT DO NOTHING; notify via Kafka.
- Pros: Fast (<50ms), idempotent, 4K QPS scalable, durable.
- Cons: Redis failure risks duplicates (mitigated by PostgreSQL), sync complexity.
- Use Case: eBay-scale auctions.
-
Direct Insert
- Industry Example (eBay): Uses a relational DB with caching for listing, ensuring uniqueness and durability.
- Optimal Solution: Hybrid—Redis for deduplication (<1ms), PostgreSQL with ON CONFLICT DO NOTHING for persistence (~5ms), Kafka for notifications.
- Why Optimal: Balances speed (Redis: 100K ops/s), consistency (PostgreSQL: 10 nodes, 400 QPS/node), and scalability (Kafka: 10K msg/s).
- Tradeoffs: Adds Redis/Kafka infra (mitigated by PostgreSQL fallback), slight latency overhead (5ms vs. 2ms for optimistic).
2. Bidding on Auctions (Real-Time Bidding)
- Problem: Process bids in real time at 4,167 QPS without overbidding or duplicates.
-
Approaches & Tradeoffs:
-
Direct Bid
- How: POST /auctions/{auction_id}/bid; BEGIN; SELECT current_bid FROM auctions WHERE id={auction_id} FOR UPDATE; INSERT INTO bids; UPDATE auctions SET current_bid={amount}, highest_bidder_id={user_id}; COMMIT;
- Pros: Simple (~100ms), ACID-safe.
- Cons: High contention (~10ms/row), no idempotency, scales poorly (~1K QPS).
- Use Case: Small-scale auctions.
-
Idempotent Bid
- How: Check idempotency_key in Redis, BEGIN; SELECT ... FOR UPDATE; INSERT INTO bids ON CONFLICT (idempotency_key) DO NOTHING; UPDATE auctions; COMMIT;
- Pros: Prevents duplicates (~5ms with Redis), durable.
- Cons: Lock contention (~10ms), Redis failure risks (mitigated by PostgreSQL).
- Use Case: Medium-scale apps (e.g., Sotheby’s online).
-
Optimistic Bidding
- How: Add version to auctions; UPDATE auctions SET current_bid={amount}, highest_bidder_id={user_id}, version=version+1 WHERE id={auction_id} AND current_bid<{amount} AND version={old_version}; INSERT INTO bids;
- Pros: High concurrency (~2ms), no locks.
- Cons: Retries on conflict (10-20% at peak), complex retry logic.
- Use Case: High-read, low-conflict systems.
-
Event-Driven Bidding
- How: Write bid to Kafka, Bidding Service consumes, updates PostgreSQL with FOR UPDATE and UNIQUE (idempotency_key).
- Pros: Decouples load (~500ms), scales to 10K QPS, retryable.
- Cons: Higher latency, eventual consistency risks.
- Use Case: High-throughput auctions with latency tolerance.
-
Hybrid (Redis + PostgreSQL + Kafka)
- How: Check idempotency_key in Redis (<1ms), write bid to Kafka, process with SELECT FOR UPDATE in PostgreSQL, update auctions and bids, notify via Kafka/WebSockets.
- Pros: Real-time (<500ms), idempotent, 4K QPS scalable, durable.
- Cons: Multi-system complexity, Redis failure risks (mitigated by PostgreSQL).
- Use Case: eBay-scale real-time bidding.
-
Direct Bid
- Industry Example (eBay): Uses a hybrid approach with in-memory caching (e.g., Redis) for bid deduplication and a relational DB for consistency, notifying via queues.
- Optimal Solution: Hybrid—Redis for idempotency (<1ms), Kafka for decoupling, PostgreSQL with FOR UPDATE and UNIQUE (idempotency_key) for consistency (~5ms), WebSockets for updates.
- Why Optimal: Meets <500ms latency, scales to 4K QPS (Redis: 100K ops/s; PostgreSQL: 10 nodes, 400 QPS/node), ensures no overbidding.
- Tradeoffs: Adds Redis/Kafka/WebSocket infra (mitigated by retries), slight latency overhead (5ms vs. 2ms for optimistic).
3. Viewing Auction Status (Real-Time Updates)
- Problem: Provide real-time auction status (bid, time) at 4,167 QPS with <200ms latency.
-
Approaches & Tradeoffs:
-
Polling
- How: Client polls GET /auctions/{auction_id}/status every 5s; SELECT current_bid, end_time FROM auctions WHERE id={auction_id};
- Pros: Simple (~100ms), no infra.
- Cons: High QPS (4K * 12/min = 48K QPS), slow updates (~5s).
- Use Case: Low-scale systems.
-
Long Polling
- How: Client sends GET /auctions/{auction_id}/status, server holds (30s), responds on update.
- Pros: Lower QPS (~4K/30s = 133 QPS), near-real-time (~1s).
- Cons: Server resource use, timeouts (~30s).
- Use Case: Medium-scale apps.
-
Server-Sent Events (SSE)
- How: Open SSE connection; server pushes auction:status updates via Kafka events.
- Pros: Real-time (<1s), efficient (4K QPS sustainable).
- Cons: Connection overhead (~1M connections/server), infra cost.
- Use Case: High-traffic auctions (e.g., Christie’s online).
-
WebSockets
- How: Bidirectional connection; server pushes bid/time updates, client queries status.
- Pros: Real-time (<1s), interactive, 4K QPS scalable.
- Cons: Higher resource use (~500K connections/server), complexity.
- Use Case: Premium auctions (e.g., eBay).
-
Hybrid (Redis + WebSockets)
- How: Redis caches auction_id:current_bid (<1ms), WebSockets push updates (<1s), PostgreSQL persists history.
- Pros: Ultra-fast (<10ms cache), real-time (<200ms), durable, 4K QPS.
- Cons: Redis volatility (mitigated by PostgreSQL), dual-system sync.
- Use Case: Scalable, real-time auctions.
-
Polling
- Industry Example (eBay): Uses WebSockets with in-memory caching (e.g., Redis) for live bid updates, ensuring <200ms status delivery.
- Optimal Solution: Hybrid—Redis for real-time bid caching (<10ms), WebSockets for push updates (<200ms), PostgreSQL for persistence.
- Why Optimal: Meets <200ms latency, scales to 4K QPS (Redis: 100K ops/s; PostgreSQL: 10 nodes, 400 QPS/node), enhances UX.
- Tradeoffs: Adds Redis/WebSocket infra (mitigated by load balancers), cache staleness tolerable for UX.
text
CollapseWrapCopy
+----------------+ +----------------+ | Client |<------->| API Gateway | | (Web/Mobile) | | (JWT, Routing) | +----------------+ +----------------+ | | | +----------------+ +-----+ +-----+-------+ | Auction Service|<---+ | | | (Listing) | | | +----------------+ | | +----------------+ +-----+ +-----+-------+ | Bidding Service|<--------| | | | | (Bids) | | | | | +----------------+ | | | | +----------------+ | | +----------+ | | Status Service |<--------| | | | | (Real-Time) | +-----+ | | | | | +----------------+ +----------------+ +----------------+ | Redis | | PostgreSQL | | Kafka | | (Cache, Bids) | | (Auctions, Bids)| | (Events) | +----------------+ +----------------+ +----------------+ | | +----------------+ +-----+ +----------------+ | WebSockets | | | Stripe | | (Real-Time) | | | (Payments) | +----------------+ +---+---------------+
- Authentication: PostgreSQL + JWT – eBay’s secure login.
- Listing Items: Redis + PostgreSQL – Sotheby’s auction creation.
- Bidding on Auctions: Redis + PostgreSQL + WebSockets – eBay’s real-time bidding.
- Viewing Status: Redis + WebSockets – Christie’s live updates.
- Consistency: PostgreSQL (ACID) – Bonhams’ bid accuracy.
- Scalability: Kafka + Redis – eBay’s surge handling.