Design Messaging Service | Expertifie - sulabh84/SystemDesign GitHub Wiki
Assignment
- Instagram Design
- URL Shortener Design (Tiny URL)
Functional Requirements
- Chat messaging between 2 users
- Group chat (out of scope)
- Check the status of the User (Online, last seen time)
- Check the status of the message (Sent, Delivered, Seen)
- Fetch the previous messages
- Delete a message from the chat (delete for yourself or for everyone)
- Share media
Non-Functional Requirements
- High Availability
- Consistency (eventual consistency is acceptable)
- Trade off consistency for availability (if the system is offline, messages should sync once the system is back up)
- Low latency -> near real-time experience when messages are sent
- Reliability
- Messages should be end-to-end (e2e) encrypted
- Retrieve previous messages/chats
Estimations
- Assumptions
- Total 500M Users
- 100M active users per day
- On average, each active user sends 20 messages per day to other users
- Average length of a message -> 100 characters
- 2 bytes per character
- 1 media file per active user per day -> 10 requests to send one media file
- QPS
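- Read QPS and write QPS are almost the same
- 100M * 20 messages per day
- 100 * 10^6 * 20 / 86400
- ≈ 2 * 10^9 / 10^5 (rounding 86400 seconds up to 10^5)
- ≈ 20000 reads per second
- ≈ 20000 writes per second
- Media -> 100M * 10 requests per day = 10^9 / 10^5 ≈ 10000 QPS for read/write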
- Capacity planning
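- 100M * 20 * 365 * 300 bytes + 100M * 1 * 365 * 1000 bytes (taking 300 bytes per stored message and 1000 bytes per media entry)
- ≈ 100M * 20 * 400 * 300 + 100M * 400 * 1000 (rounding 365 days up to 400)
- = 24 * 10^13 bytes + 4 * 10^13 bytes
- ≈ 3 * 10^14 bytes
- ≈ 300 TB of data to be stored in the next 1 year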
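The back-of-the-envelope numbers above can be re-checked without the rounding. This is a minimal sketch: the variable names are made up for illustration, and the 300-byte and 1000-byte sizes are taken directly from the capacity formula rather than derived.

```python
SECONDS_PER_DAY = 86_400

# Inputs copied from the assumptions and the capacity formula above.
daily_active_users = 100_000_000   # 100M active users per day
messages_per_user = 20             # average messages sent per user per day
media_requests_per_user = 10       # 1 media file per user, 10 requests per file
bytes_per_message = 300            # per-message size used in the capacity formula
bytes_per_media_entry = 1_000      # per-media size used in the capacity formula


def per_second(events_per_day: int) -> float:
    """Convert a daily event count into an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY


message_qps = per_second(daily_active_users * messages_per_user)
media_qps = per_second(daily_active_users * media_requests_per_user)

yearly_message_bytes = daily_active_users * messages_per_user * 365 * bytes_per_message
yearly_media_bytes = daily_active_users * 1 * 365 * bytes_per_media_entry
yearly_total_tb = (yearly_message_bytes + yearly_media_bytes) / 10**12

print(f"message QPS ~ {message_qps:,.0f}")
print(f"media QPS   ~ {media_qps:,.0f}")
print(f"storage per year ~ {yearly_total_tb:,.1f} TB")
```

Without rounding, the message rate comes out a little above 23,000 per second and the yearly storage at about 255 TB, so the rounded figures of 20000 QPS and 300 TB are in the right range.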
Detailed Design
- APIs
- SendMessage
- ReceiveMessage
- ChangeMessageStatus
- CheckLatestStatusOfMessage
- CheckUserStatus
- Database
- Message Table
- MessageId (PK)
- UserId -Sender
- UserId -Receiver
- MessageStatus - (Sent, Delivered, Seen)
- CreationTimestamp
- Message Content (for media, the URL of the file is stored)
- Blob Storage (stores media files; placed close to the servers to reduce network I/O)
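The Message table and the message-related APIs above could be sketched roughly as follows. This is only an illustration: the in-memory dictionary stands in for the real database, the class and method names are hypothetical, and the rule that a status can only move forward (Sent -> Delivered -> Seen) is an assumption, not something the notes state. CheckUserStatus is not covered.

```python
import time
import uuid
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, List


class MessageStatus(IntEnum):
    SENT = 1
    DELIVERED = 2
    SEEN = 3


@dataclass
class Message:
    sender_id: str
    receiver_id: str
    content: str  # message text, or the blob-storage URL for media
    message_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: MessageStatus = MessageStatus.SENT
    creation_timestamp: float = field(default_factory=time.time)


class InMemoryMessageStore:
    """Hypothetical stand-in for the Message table, keyed by MessageId."""

    def __init__(self) -> None:
        self._rows: Dict[str, Message] = {}

    def send_message(self, sender_id: str, receiver_id: str, content: str) -> Message:
        msg = Message(sender_id, receiver_id, content)
        self._rows[msg.message_id] = msg
        return msg

    def receive_message(self, receiver_id: str) -> List[Message]:
        """Return the receiver's messages and mark SENT ones as DELIVERED."""
        inbox = [m for m in self._rows.values() if m.receiver_id == receiver_id]
        for msg in inbox:
            if msg.status == MessageStatus.SENT:
                msg.status = MessageStatus.DELIVERED
        return inbox

    def change_message_status(self, message_id: str, new_status: MessageStatus) -> MessageStatus:
        """Move the status forward only (assumption: it never goes backwards)."""
        msg = self._rows[message_id]
        if new_status > msg.status:
            msg.status = new_status
        return msg.status

    def check_latest_status_of_message(self, message_id: str) -> MessageStatus:
        return self._rows[message_id].status


store = InMemoryMessageStore()
sent = store.send_message("user1", "user2", "hello")
store.receive_message("user2")
store.change_message_status(sent.message_id, MessageStatus.SEEN)
print(store.check_latest_status_of_message(sent.message_id).name)
```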
Sharding
- Messages can be sharded based on the userId
- Region-based sharding
- Consistent hashing on the userId, so that all the messages for a user lie in the same shard
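A minimal sketch of consistent hashing on the userId is shown below. The shard names, the number of virtual nodes, and the use of SHA-256 as the hash function are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ConsistentHashRing:
    def __init__(self, shards, virtual_nodes: int = 100) -> None:
        self._ring = []  # sorted list of (position, shard) pairs
        for shard in shards:
            self.add_shard(shard, virtual_nodes)

    def add_shard(self, shard: str, virtual_nodes: int = 100) -> None:
        # Each shard gets several positions so that load spreads more evenly.
        for i in range(virtual_nodes):
            bisect.insort(self._ring, (_hash(f"{shard}#{i}"), shard))

    def shard_for(self, user_id: str) -> str:
        # First ring position at or after the user's hash, wrapping around.
        idx = bisect.bisect_left(self._ring, (_hash(user_id),))
        return self._ring[idx % len(self._ring)][1]


ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
print(ring.shard_for("user-42"))
```

Because the shard is derived only from the userId, every message of a user maps to the same shard, and adding a shard moves only the users whose ring position now falls to the new shard.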
Replication
- Master-slave replication
- Writes go to the master, and each write is copied to p of the x slave replicas
- A lookup records which replicas hold a copy of the written data
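One possible reading of the replication notes above is sketched here: the master applies a write, tries to copy it to its replicas, counts the write as successful once at least p replicas hold it, and keeps a lookup of which replicas have each key. All class and attribute names are hypothetical, and the success rule is an assumption.

```python
class Replica:
    """A slave replica that may be unavailable."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy
        self.data = {}

    def write(self, key: str, value: str) -> bool:
        if not self.healthy:
            return False
        self.data[key] = value
        return True


class Master:
    def __init__(self, replicas, required_copies: int) -> None:
        self.data = {}
        self.replicas = replicas                # the x replicas
        self.required_copies = required_copies  # p copies needed for success
        self.copy_locations = {}                # lookup: key -> replica names

    def write(self, key: str, value: str) -> bool:
        self.data[key] = value
        holders = [r.name for r in self.replicas if r.write(key, value)]
        self.copy_locations[key] = holders
        return len(holders) >= self.required_copies

    def read_from_replica(self, key: str) -> str:
        # Use the lookup to pick a replica that is known to hold the key.
        holders = self.copy_locations.get(key, [])
        for replica in self.replicas:
            if replica.name in holders:
                return replica.data[key]
        return self.data[key]  # fall back to the master


replicas = [Replica("r1"), Replica("r2", healthy=False), Replica("r3")]
master = Master(replicas, required_copies=2)
print(master.write("msg-1", "hello"), master.copy_locations["msg-1"])
```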
Encryption/Decryption
- User1 -> User2
- User1 encrypts the message for User2 with User2's public key
- User2 decrypts the message with its own private key
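The public-key flow above can be illustrated with textbook RSA on tiny primes. This is a toy for showing the idea only and is not secure; a real client would rely on a vetted cryptography library. The primes, exponent, and function names are all assumptions chosen for the example.

```python
# Toy textbook RSA with tiny primes: illustration only, NOT secure.

def make_keypair(p: int = 61, q: int = 53, e: int = 17):
    """Return (public_key, private_key) built from two small primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (needs Python 3.8+)
    return (e, n), (d, n)


def encrypt(message: str, public_key) -> list:
    """Encrypt each UTF-8 byte separately with the receiver's public key."""
    e, n = public_key
    return [pow(byte, e, n) for byte in message.encode("utf-8")]


def decrypt(ciphertext: list, private_key) -> str:
    """Recover the text with the receiver's private key."""
    d, n = private_key
    return bytes(pow(c, d, n) for c in ciphertext).decode("utf-8")


# User2 generates a key pair and publishes only the public key.
user2_public, user2_private = make_keypair()

# User1 encrypts with User2's public key; only User2 can decrypt.
ciphertext = encrypt("hello user2", user2_public)
print(decrypt(ciphertext, user2_private))
```

Since only User2 holds the private key, a server that relays or stores the ciphertext cannot read the message.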