learning System design as a landscape architect 9
Rethink system design in a more fun way, as a former urban planner/landscape planner. Take Bigtable as an example.
Design WeChat
Scenario analysis
Basic functions:
- login
- contact list
- send messages to friends
- group chat
Other functions:
- multi-device login
- online status
WeChat:
- 1B monthly active users
- 45B daily messages
QPS: Average QPS = 45B / 86400 ≈ 500K; Peak QPS = 500K * 3 ≈ 1.5M
Storage calculation, assuming 30 bytes per message:
- 45B * 30 bytes ≈ 1.3 TB per day
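The estimate can be checked with a few lines of arithmetic. This is only a back-of-envelope sketch: it assumes 45B messages per day (the figure that matches the ~500K QPS and ~1.3T results), a peak factor of 3, and 30 bytes per message; the constant names are made up for illustration.

```python
# Back-of-envelope capacity estimate (all inputs are assumptions).
DAILY_MESSAGES = 45_000_000_000   # 45B messages per day
SECONDS_PER_DAY = 86_400
PEAK_FACTOR = 3                   # assumed peak-to-average ratio
BYTES_PER_MESSAGE = 30            # assumed average message size

average_qps = DAILY_MESSAGES / SECONDS_PER_DAY
peak_qps = average_qps * PEAK_FACTOR
daily_storage_tb = DAILY_MESSAGES * BYTES_PER_MESSAGE / 1e12  # decimal TB

print(f"average QPS ~ {average_qps:,.0f}")
print(f"peak QPS    ~ {peak_qps:,.0f}")
print(f"storage/day ~ {daily_storage_tb:.2f} TB")
```

The average comes out slightly above 500K, which is why the text rounds to 500K and 1.5M.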
Service
- Message Service
- Realtime Service
Storage
Before designing the message table, we need to clarify the basic concepts:
the WeChat Inbox has a list of Threads, and each Thread has a list of Messages.
Message Table (NoSQL)
Complex SQL would slow down queries.
Using a Thread Table and a Message Table can improve performance.
However, the same message may be unread for user A but already read for user B, so per-user state cannot live in the shared Thread row.
Method 1: separate into multiple tables
Thread - stores the basic (shared) information
UserThread - stores each user's private information about a Thread
Thread Table
id primary key bigint
last_message text
avatar varchar
created_at timestamp
User Thread Table
id primary key bigint // Or use user_id + thread_id as primary key
user_id foreign key
thread_id foreign key
unread_count int
is_muted boolean
updated_at timestamp // sort threads by the updated_at column
joined_at timestamp
- The Thread Table stores the shared Thread information
- index by thread_id to query a given thread
- participants_hash_code to find the thread shared by a given set of users
But you need the thread_id from the User Thread Table to fetch the information in the Thread Table,
and that SQL join would slow down the query.
Method 2: use only UserThread, copying the shared information into each user's UserThread row
But every update then has to be written once per participant, which causes duplicated storage and inconsistency problems.
Method 1 is still the better choice.
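To make Method 1 concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names follow the two tables above. The composite primary key, the integer timestamps, and the sample rows are assumptions made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE thread (
    id           INTEGER PRIMARY KEY,
    last_message TEXT,
    avatar       TEXT,
    created_at   INTEGER
);
CREATE TABLE user_thread (
    user_id      INTEGER NOT NULL,
    thread_id    INTEGER NOT NULL REFERENCES thread(id),
    unread_count INTEGER DEFAULT 0,
    is_muted     INTEGER DEFAULT 0,
    updated_at   INTEGER,
    joined_at    INTEGER,
    PRIMARY KEY (user_id, thread_id)   -- user_id + thread_id as primary key
);
""")

# One shared thread row, one private row per participant.
conn.execute("INSERT INTO thread VALUES (1, 'hello', 'a.png', 100)")
conn.executemany(
    "INSERT INTO user_thread VALUES (?, ?, ?, ?, ?, ?)",
    [(1001, 1, 0, 0, 100, 100),   # user A sent the message: nothing unread
     (1002, 1, 1, 0, 100, 100)],  # user B has one unread message
)

def inbox(user_id):
    """Return (thread_id, last_message, unread_count), newest thread first."""
    return conn.execute(
        """SELECT t.id, t.last_message, ut.unread_count
           FROM user_thread AS ut
           JOIN thread AS t ON t.id = ut.thread_id
           WHERE ut.user_id = ?
           ORDER BY ut.updated_at DESC""",
        (user_id,),
    ).fetchall()

print(inbox(1001))
print(inbox(1002))
```

The join in `inbox` is the cost mentioned above: shared data is stored once, but every inbox read touches both tables.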
When user A sends a message to user B, how does the server get the thread_id?
Thread id
Add a new column, participants_hash_code, to the Thread Table.
- group chat:
hash the sorted user_ids:
participants_hash_code = any_hashfunc(sorted(participants_user_ids))
If a UUID-like hash is used, there is no need to worry about hash collisions.
The row key does not use the sorted user_ids directly, because in a group chat the concatenated user_ids would be too long.
- private conversation between two users:
create a custom format, private::user1::user2
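A small sketch of both key schemes follows. SHA-256 stands in for `any_hashfunc`, and ordering the two ids inside the private key is an assumption (the text only gives the `private::user1::user2` shape); both function names are hypothetical.

```python
import hashlib

def participants_hash_code(user_ids):
    """Hash the sorted participant ids into a fixed-length key.

    Sorting makes the result independent of the order in which the
    participants are listed; SHA-256 is an illustrative choice of hash.
    """
    canonical = ",".join(str(uid) for uid in sorted(set(user_ids)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def private_thread_key(user_a, user_b):
    """Build the custom key for a two-person chat, smaller id first (assumed)."""
    low, high = sorted((user_a, user_b))
    return f"private::{low}::{high}"

print(participants_hash_code([3, 1, 2]))
print(private_thread_key(7, 3))
```

The group key stays 64 hex characters regardless of group size, which is the reason for hashing instead of concatenating the ids.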
NoSQL would be a better choice.
Storing the Thread Table in NoSQL
The NoSQL store holds the Thread Table and must support queries by thread_id and by participants_hash_code.
Two tables are needed here:
Thread Table
- row_key = thread_id
- column_key = created_at
- value = other info
ParticipantHashCode Table
- row_key = participants_hash_code
- column_key = null
- value = thread_id
Since column_key is not needed here, we can use a key-value NoSQL store, such as RocksDB.
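The lookup through the two tables can be sketched with plain dictionaries standing in for the NoSQL stores. This is an in-memory illustration only; `get_or_create_thread` and the value layout are hypothetical.

```python
import time
import uuid

# Dictionaries stand in for the two NoSQL tables.
thread_table = {}            # row_key thread_id -> {column_key created_at: value}
participant_hash_table = {}  # row_key participants_hash_code -> thread_id

def get_or_create_thread(hash_code, now=None):
    """Return the thread_id for a participant set, creating the thread if needed."""
    thread_id = participant_hash_table.get(hash_code)
    if thread_id is None:
        thread_id = uuid.uuid4().hex
        created_at = now if now is not None else time.time()
        thread_table[thread_id] = {created_at: {"last_message": None}}
        participant_hash_table[hash_code] = thread_id
    return thread_id

first = get_or_create_thread("private::1::2")
again = get_or_create_thread("private::1::2")
print(first == again)
```

A real store would also need the two writes to be applied together (or retried safely), which this sketch ignores.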
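Storing the UserThread Table in NoSQL
The UserThread Table stores each user's private Thread information.
Sharding key: user_id
UserThread Table
- row_key = user_id
- column_key = updated_at
- value = other info
Shard according to how the data is queried: a user's threads are always fetched by user_id.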
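Here is a rough sketch of how the UserThread layout serves the inbox query. It assumes four shards, modulo sharding on an integer user_id, and a sorted list standing in for columns ordered by updated_at; all names are hypothetical.

```python
import bisect

NUM_SHARDS = 4  # hypothetical shard count
# Each shard maps row_key user_id -> list of (updated_at, thread_id), kept sorted.
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_for(user_id):
    """All rows of one user live on one shard, so an inbox read hits one machine."""
    return shards[user_id % NUM_SHARDS]

def upsert_user_thread(user_id, thread_id, updated_at):
    columns = shard_for(user_id).setdefault(user_id, [])
    # Drop the old column for this thread, then insert at its new sorted position.
    columns[:] = [c for c in columns if c[1] != thread_id]
    bisect.insort(columns, (updated_at, thread_id))

def recent_threads(user_id, limit=20):
    """Return thread ids ordered by updated_at, newest first."""
    columns = shard_for(user_id).get(user_id, [])
    return [thread_id for _, thread_id in reversed(columns[-limit:])]

upsert_user_thread(42, "t1", 100)
upsert_user_thread(42, "t2", 200)
upsert_user_thread(42, "t1", 300)   # new message bumps t1 to the top
print(recent_threads(42))
```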
Dataflow
A sends a message to B
- The server receives the request and checks whether a Thread already exists between A and B, creating one if not.
- The server creates the Message under that thread_id.
- B visits the server every 10s to fetch the latest messages (Poll).
- B receives the message.
A –send message to B (and get back the messages exchanged with B)–> Web server
Web server –query Thread + create Message–> DB
B <–visit every 10s and receive messages–> Web server
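The poll-based flow can be sketched in memory as below. The data structures, `send_message`, and `poll` are hypothetical stand-ins for the web server and database; a real client would call `poll` on a 10-second timer.

```python
import itertools

_message_ids = itertools.count(1)
threads = {}    # frozenset of participant ids -> thread_id
messages = []   # messages in creation order

def send_message(sender, receiver, text):
    """Find (or create) the thread between the two users, then store the message."""
    key = frozenset((sender, receiver))
    thread_id = threads.setdefault(key, f"thread-{len(threads) + 1}")
    message = {"id": next(_message_ids), "thread_id": thread_id,
               "from": sender, "to": receiver, "text": text}
    messages.append(message)
    return message

def poll(user, last_seen_id=0):
    """Return messages addressed to `user` that are newer than `last_seen_id`."""
    return [m for m in messages if m["to"] == user and m["id"] > last_seen_id]

send_message("A", "B", "hi")
send_message("A", "B", "are you there?")
new_messages = poll("B")
print([m["text"] for m in new_messages])
```

Passing the last seen message id back on each poll keeps every response limited to what the client has not received yet.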
Scale
With polling every 10s, can users get messages in real time? How can we speed this up?
Android GCM (Google Cloud Messaging, since replaced by Firebase Cloud Messaging)
iOS APNS (Apple Push Notification Service)
Push Notification
- A sends a message to the Web Server
- The server queries the Thread, creates the Message, and notifies APNS
- APNS notifies the user of the update
- B gets the new message from the Web Server
- APNS can also send a short message directly to B
A –send message to B (and get back the messages exchanged with B)–> Web server
Web server –query Thread + create Message–> DB
Web server –notify–> APNS
B <–receive message–> Web server
But APNS does not cover web or desktop clients (e.g. WeChat for Windows).
Socket Flow
WeChat and Facebook Messenger send a lot of push notifications, so a socket connection would be a good choice.
- A opens the app –get Push Service IP–> Web Server
- A –connect (socket)–> Push Server
- B –send message to A–> Web Server
- Web Server –send message to A–> Push Server
- A gets the message notification.
Push Service
Message Service (gets messages from WeChat users)
- Push Service: sharded by user_id
  - Push Server 1
    - Socket <—> WeChat user 1
    - Socket <—> WeChat user 2
    - Socket <—> WeChat user 3
  - Push Server 2
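A minimal sketch of push servers sharded by user_id follows. Real servers would hold socket connections; here a plain list stands in for each socket. Modulo sharding, the server names, and the `PushServer` class are assumptions made for illustration.

```python
class PushServer:
    """Holds one long-lived connection per online user (a list stands in for a socket)."""

    def __init__(self, name):
        self.name = name
        self.connections = {}  # user_id -> list of delivered messages

    def connect(self, user_id):
        self.connections[user_id] = []

    def disconnect(self, user_id):
        self.connections.pop(user_id, None)

    def push(self, user_id, message):
        """Deliver to a connected user; return False if the user is offline."""
        socket = self.connections.get(user_id)
        if socket is None:
            return False
        socket.append(message)
        return True

push_servers = [PushServer("push-1"), PushServer("push-2")]

def push_server_for(user_id):
    # Sharding by user_id: the same user always maps to the same server.
    return push_servers[user_id % len(push_servers)]

push_server_for(1).connect(1)
print(push_server_for(1).push(1, "hi"))   # user 1 is connected
print(push_server_for(2).push(2, "hi"))   # user 2 never connected
```

Because the mapping is deterministic, the Web Server can tell a client which Push Server to connect to and later route messages for that user to the same server.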
Group chat
A WeChat group can have 500 people. When A sends a message to the group, the Web Server sends 500 push requests to the Push Server, even though only a few members of the group are active.
Use a Channel Service to solve this problem.
Channel Service
- Add a Channel Service
- Add Channel information for each chat thread
- For larger groups, online users first need to subscribe to the corresponding Channel
  - When a user goes online, the Web Server (Message Service) finds the channels (groups) the user belongs to and notifies the Channel Service to complete the subscription
  - The Channel knows which users are still active on it
  - If a user disconnects, the Push Service detects it and notifies the Channel Service to remove the user from the channels they belong to
- After the Message Service receives a message from a user
  - Find the corresponding channel
  - Send a message request to the Channel Service
    - This sends 1 message instead of 500
  - The Channel Service finds the currently online users
  - Then it hands the message to the Push Service to push it out
Message Service (gets messages from WeChat users)
- Subscribes users to channels and dispatches messages to the Channel Service
Real-time Service
- Channel Service
  - The Channel Service has many channel servers, sharded by channel id
- Push Service: sharded by user_id
  - Push Server 1
    - Socket <—> WeChat user 1
    - Socket <—> WeChat user 2
    - Socket <—> WeChat user 3
  - Push Server 2
The Channel Service is a key-value structure. The key is the channel name, which can be a string such as “#personal::user_1”. The value is a set of the users who have subscribed to the channel.
Redis is a good choice.
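The key-to-set structure and the publish path can be sketched in memory as follows. A `defaultdict` of sets stands in for Redis, the `#group::42` channel name is a made-up format modeled on `#personal::user_1`, and the `push` callback stands in for the Push Service.

```python
from collections import defaultdict

class ChannelService:
    """Maps a channel name to the set of currently online subscribers."""

    def __init__(self):
        self.subscribers = defaultdict(set)  # channel name -> set of user ids

    def subscribe(self, channel, user_id):
        self.subscribers[channel].add(user_id)

    def unsubscribe_all(self, user_id):
        """Called when the Push Service reports that a user disconnected."""
        for members in self.subscribers.values():
            members.discard(user_id)

    def publish(self, channel, message, push):
        """Fan one request out to online subscribers only; return who was pushed to."""
        online = sorted(self.subscribers.get(channel, ()))
        for user_id in online:
            push(user_id, message)
        return online

delivered = []
channels = ChannelService()
channels.subscribe("#group::42", 1)
channels.subscribe("#group::42", 2)
channels.unsubscribe_all(2)  # user 2 goes offline
channels.publish("#group::42", "hello", lambda user, msg: delivered.append((user, msg)))
print(delivered)
```

With Redis, the same operations would map onto its set commands: SADD to subscribe, SREM to remove a disconnected user, and SMEMBERS to list the online subscribers.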
Q: How do we know which Channels a user should subscribe to?
- Users subscribe to their own personal channel, such as #personal::user_1, and receive private chat messages on this channel.
- Group chats with fewer than a certain number of people can still be pushed through the personal channels.
- Large groups can use lazy subscription: when the user opens the app, only the most recently active groups are subscribed. Group chats that are not actively subscribed rely on poll mode to get the latest messages.
Q: Can users still receive reminders after closing the app?
A: If the app is really closed, it will not work. Therefore, many apps stay in the background to ensure that at least poll mode can still work.
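The subscription rules above (personal channel always, small groups through personal channels, lazy subscription for large groups) might be sketched as below. The thresholds, the tuple layout, and the `#group::<id>` channel name are assumptions; the text does not specify them.

```python
def channels_to_subscribe(user_id, groups, small_group_limit=100, recent_limit=5):
    """Pick the channels to subscribe to when the user opens the app.

    groups: list of (group_id, member_count, last_active_ts) tuples.
    Small groups need no channel of their own, since they are pushed
    through each member's personal channel.
    """
    channels = [f"#personal::user_{user_id}"]
    large = [g for g in groups if g[1] > small_group_limit]
    large.sort(key=lambda g: g[2], reverse=True)  # most recently active first
    channels += [f"#group::{group_id}" for group_id, _, _ in large[:recent_limit]]
    return channels

groups = [(1, 10, 500), (2, 500, 100), (3, 300, 900), (4, 400, 50)]
print(channels_to_subscribe(7, groups, small_group_limit=100, recent_limit=2))
```

Large groups left out of the result (group 4 in this example) would be the ones that fall back to poll mode.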