Learning system design as a landscape architect 3
Rethinking system design in a more fun way, from the perspective of a former urban planner and landscape planner. This post takes a user system as the example.
User System
1. What is the scenario planning for this project?
A system that allows users to register, log in, query, and modify their information.
We need to consider the query QPS.
Assume this system supports 100 M DAU (daily active users).
- register + login + modify QPS
  - 100 M * 0.1 / 86400 ≈ 100
  - 0.1 = average number of register + login + modify operations per user per day
  - Peak = 100 * 3 = 300
- query QPS
  - 100 M * 100 / 86400 ≈ 100 K
  - 100 = average number of queries per user per day (checking friends, refreshing the message page, sending messages)
  - Peak = 100 K * 3 = 300 K
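The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name `estimate_qps` and the `peak_factor` parameter are made up for illustration, and the inputs are the assumptions already stated in the notes (100 M DAU, 0.1 writes and 100 queries per user per day, peak = 3x average).

```python
SECONDS_PER_DAY = 86_400

def estimate_qps(daily_active_users, requests_per_user_per_day, peak_factor=3):
    """Return (average_qps, peak_qps) for the given daily load."""
    average = daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY
    return average, average * peak_factor

# Assumed inputs, taken from the notes above.
write_avg, write_peak = estimate_qps(100_000_000, 0.1)
read_avg, read_peak = estimate_qps(100_000_000, 100)

print(f"write: avg ~{write_avg:.0f} QPS, peak ~{write_peak:.0f} QPS")
print(f"read:  avg ~{read_avg:.0f} QPS, peak ~{read_peak:.0f} QPS")
```

The exact averages come out near 116 and 116 K; the notes round them to 100 and 100 K because only the order of magnitude matters when choosing storage.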
- Analyzing the user system's services (zoning by function)
  - AuthenticationService: register, login
  - UserService: user information storage and query
  - FriendService: stores friendship-related information
Rough throughput of common storage systems:
- MySQL / PostgreSQL: 1k QPS
- MongoDB / Cassandra (NoSQL): 10k QPS
- Redis / Memcached: 100k ~ 1m QPS
The user system is read-heavy and written to much less often, so a cache can take most of the query load off the database.
Use Memcached to speed up DB queries.
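Every write order can leave the database and the cache inconsistent (dirty data):
A: database.set(user); cache.set(key, user);
B: database.set(user); cache.delete(key);
C: cache.set(key, user); database.set(user);
Recommended: B
database.set(user); cache.delete(key);
Because the user system is read-heavy, inconsistency is much less likely with this order than with cache.delete followed by database.set. With delete-first, a concurrent read can miss the cache, load the old row from the database, and put it back into the cache before the write lands.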
Furthermore, use the cache TTL mechanism:
Set a short expiry time, such as 7 days. Then even if the data does become inconsistent, which happens with very low probability, it stays inconsistent for at most 7 days.
In other words, we allow the database and the cache to be inconsistent for a short time, but they will eventually be consistent.
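A minimal sketch of the recommended write order together with a TTL, using in-memory stand-ins. `TTLCache`, `get_user`, `set_user`, and the `user:<id>` key format are hypothetical names chosen for illustration; a real deployment would talk to Memcached and a SQL database instead of Python dicts.

```python
import time

class TTLCache:
    """In-memory stand-in for Memcached: entries expire after `ttl` seconds."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: behave like a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, time.monotonic() + ttl)

    def delete(self, key):
        self._data.pop(key, None)

SEVEN_DAYS = 7 * 24 * 3600
database = {}       # stand-in for the user table: user_id -> user
cache = TTLCache()

def get_user(user_id):
    """Read path: try the cache first, fall back to the database on a miss."""
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:
        user = database.get(user_id)
        if user is not None:
            cache.set(key, user, ttl=SEVEN_DAYS)
    return user

def set_user(user_id, user):
    """Write path: update the database first, then delete the cached copy."""
    database[user_id] = user
    cache.delete(f"user:{user_id}")
```

The write path deletes rather than overwrites the cached entry, so the next read reloads the fresh row from the database.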
Cache Aside (more frequently used)
DB <-> Web Server <-> Cache
Cache Through
Web Server <-> Cache <-> DB
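For contrast with cache-aside, here is a small sketch of the cache-through idea, assuming the usual meaning of the term: the web server talks only to the cache, and the cache itself loads from and persists to the database. `CacheThroughStore` is a hypothetical class, and a plain dict stands in for the database.

```python
class CacheThroughStore:
    """The caller only sees this object; it reads and writes the backing store itself."""

    def __init__(self, backing_store):
        self._store = backing_store  # stand-in for the database (a dict here)
        self._memory = {}            # cached entries

    def get(self, key):
        # Read-through: on a miss, load from the backing store and keep a copy.
        if key not in self._memory:
            value = self._store.get(key)
            if value is None:
                return None
            self._memory[key] = value
        return self._memory[key]

    def set(self, key, value):
        # Write-through: persist first, then update the cached copy.
        self._store[key] = value
        self._memory[key] = value

db = {"user:1": "Ann"}
store = CacheThroughStore(db)
store.set("user:2", "Bob")
print(store.get("user:1"), db["user:2"])
```

In the cache-aside version the application code coordinates the two stores itself; here that coordination is hidden inside the cache layer.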
2 Service
2.1 Authentication Service
- session
- cookie
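The notes only list "session" and "cookie", so the following is a sketch of one common arrangement rather than the design this post prescribes: on login the server creates a session record and returns a random session key, which the browser would keep in a cookie and send back with each request. All names (`register`, `login`, `current_user`, `logout`, `SESSION_TTL`) and the one-hour expiry are assumptions, and dicts stand in for the user and session tables.

```python
import hashlib
import hmac
import os
import secrets
import time

SESSION_TTL = 3600  # assumed session lifetime in seconds
users = {}          # username -> (salt, password_hash)
sessions = {}       # session_key -> (username, expires_at)

def _hash_password(password, salt):
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

def register(username, password):
    if username in users:
        raise ValueError("username already taken")
    salt = os.urandom(16)
    users[username] = (salt, _hash_password(password, salt))

def login(username, password):
    """Return a new session key (the cookie value), or None if the login fails."""
    record = users.get(username)
    if record is None:
        return None
    salt, expected = record
    if not hmac.compare_digest(_hash_password(password, salt), expected):
        return None
    session_key = secrets.token_hex(16)
    sessions[session_key] = (username, time.time() + SESSION_TTL)
    return session_key

def current_user(session_key):
    """Resolve the cookie value back to a username, or None if missing or expired."""
    entry = sessions.get(session_key)
    if entry is None:
        return None
    username, expires_at = entry
    if time.time() >= expires_at:
        del sessions[session_key]
        return None
    return username

def logout(session_key):
    sessions.pop(session_key, None)
```

Logging out simply deletes the session record, so the cookie value held by the browser stops resolving to a user.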
2.2 Friendship Service
Directed relationships (e.g. Twitter, Instagram)
Storing the data in a SQL DB
Friendship Table
from_user_id: foreign key to User (the follower)
to_user_id: foreign key to User (the followee)
- get all followees of X
SELECT * FROM friendship WHERE from_user_id = x;
- get all followers of X
SELECT * FROM friendship WHERE to_user_id = x;
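The two queries can be tried out against an in-memory SQLite database. This is a sketch under assumptions: the `users` table, the composite primary key, the extra index on `to_user_id`, and the sample rows are all made up for illustration and are not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE friendship (
    from_user_id INTEGER NOT NULL REFERENCES users(id),  -- the follower
    to_user_id   INTEGER NOT NULL REFERENCES users(id),  -- the followee
    PRIMARY KEY (from_user_id, to_user_id)
);
-- The primary key serves lookups by from_user_id; this index serves to_user_id.
CREATE INDEX idx_friendship_to ON friendship (to_user_id);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "ann"), (2, "bob"), (3, "cat")])
conn.executemany("INSERT INTO friendship VALUES (?, ?)",
                 [(1, 2), (1, 3), (2, 1)])

def followees(user_id):
    """Everyone that user_id follows."""
    rows = conn.execute(
        "SELECT to_user_id FROM friendship WHERE from_user_id = ? ORDER BY to_user_id",
        (user_id,))
    return [row[0] for row in rows]

def followers(user_id):
    """Everyone who follows user_id."""
    rows = conn.execute(
        "SELECT from_user_id FROM friendship WHERE to_user_id = ? ORDER BY from_user_id",
        (user_id,))
    return [row[0] for row in rows]

print(followees(1), followers(1))
```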
Storing the data in a NoSQL DB
Take Cassandra as an example.
A NoSQL table has three levels:
- row_key: the hash key, also called the partition key
- column_key
  - insert(row_key, column_key, value)
  - column_key is kept sorted, which supports range queries
  - query(row_key, column_start, column_end)
  - column_key can be a composite type, e.g. timestamp + user_id
- value: the data is serialized and stored in value
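To make the three-level structure concrete, here is a toy in-memory model of it. It is not how Cassandra is implemented; `WideColumnTable` is a hypothetical class that only mirrors the `insert` and `query` operations listed above, and treating the query range as inclusive on both ends is an assumption.

```python
import bisect

class WideColumnTable:
    """Toy model: row_key -> sorted column_keys -> value."""

    def __init__(self):
        # row_key -> (sorted list of column_keys, {column_key: value})
        self._rows = {}

    def insert(self, row_key, column_key, value):
        keys, values = self._rows.setdefault(row_key, ([], {}))
        if column_key not in values:
            bisect.insort(keys, column_key)  # keep the column keys sorted
        values[column_key] = value

    def query(self, row_key, column_start, column_end):
        """Return (column_key, value) pairs with column_start <= key <= column_end."""
        keys, values = self._rows.get(row_key, ([], {}))
        lo = bisect.bisect_left(keys, column_start)
        hi = bisect.bisect_right(keys, column_end)
        return [(key, values[key]) for key in keys[lo:hi]]

# Composite column keys: (timestamp, id) tuples sort by timestamp first.
table = WideColumnTable()
table.insert(1, (100, "t1"), "a")
table.insert(1, (300, "t3"), "c")
table.insert(1, (200, "t2"), "b")
recent = table.query(1, (150, ""), (300, "zz"))
print(recent)
```

Because the column keys stay sorted within a row, a range query touches only one row and one contiguous slice of it.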
How Cassandra stores the friendship table
Cassandra key = row_key + column_key
row_key <user_id 1>
  column_key <friend_user_id 2> -> value <is_mutual_friend, is_blocked, timestamp>
  column_key <friend_user_id 3> -> value <is_mutual_friend, is_blocked, timestamp>
row_key <user_id 2>
  column_key <friend_user_id 1> -> value <is_mutual_friend, is_blocked, timestamp>
How Cassandra stores the NewsFeed
row_key <owner_id 1>
  column_key <created_at_1, tweet_id1> -> value <tweet_data1>
  column_key <created_at_2, tweet_id2> -> value <tweet_data2>
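The friendship layout can be imitated with a dict keyed by (row_key, column_key), with the value serialized to a string. This is only an illustration: JSON is an assumed serialization format, the function names are hypothetical, and a real store would read a single partition instead of scanning all keys as `friends_of` does here.

```python
import json

# (row_key, column_key) -> serialized value, mirroring "key = row_key + column_key"
friendship_table = {}

def add_friend(user_id, friend_user_id, is_mutual_friend, is_blocked, timestamp):
    value = json.dumps({
        "is_mutual_friend": is_mutual_friend,
        "is_blocked": is_blocked,
        "timestamp": timestamp,
    })
    friendship_table[(user_id, friend_user_id)] = value

def get_friendship(user_id, friend_user_id):
    """Point lookup by the full key; returns the deserialized value or None."""
    raw = friendship_table.get((user_id, friend_user_id))
    return None if raw is None else json.loads(raw)

def friends_of(user_id):
    """All column keys under one row key (toy version: scans every key)."""
    return sorted(col for (row, col) in friendship_table if row == user_id)

add_friend(1, 2, True, False, 1700000000)
add_friend(1, 3, False, False, 1700000100)
add_friend(2, 1, True, False, 1700000000)
print(friends_of(1), get_friendship(1, 2)["is_mutual_friend"])
```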
SQL vs NoSQL
SQL -> transactions
SQL -> structured data, indexes
NoSQL -> distributed, auto-scaling, replicas
Most often, the User Table is stored in SQL, because it needs multiple indexes.
The Friendship Table is stored in NoSQL, which is more efficient for its queries.
But if we use Cassandra to store User, how do we find users by email address or phone number?
- save User information in UserTable
  - Redis: key = user_id, value = user information
  - Cassandra
    - row_key = user_id
    - column_key
    - value
- create other tables as indexes
  - Redis: key = email/phone/username, value = user_id
  - Cassandra
    - row_key = email/phone/username
    - column_key
    - value
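A sketch of the "other tables as index" idea with plain dicts standing in for the key-value tables. The names `user_table`, `email_index`, `phone_index`, `save_user`, and `find_by_email` are hypothetical, and the sample record is invented. The point is that a lookup by email takes two steps: email to user_id, then user_id to the user record.

```python
user_table = {}   # user_id -> user information
email_index = {}  # email -> user_id
phone_index = {}  # phone -> user_id

def save_user(user_id, info):
    """Write the user row and keep both index tables in step with it."""
    old = user_table.get(user_id)
    if old is not None:
        # Drop stale index entries if the email or phone changed.
        email_index.pop(old["email"], None)
        phone_index.pop(old["phone"], None)
    user_table[user_id] = info
    email_index[info["email"]] = user_id
    phone_index[info["phone"]] = user_id

def find_by_email(email):
    user_id = email_index.get(email)
    return None if user_id is None else user_table.get(user_id)

def find_by_phone(phone):
    user_id = phone_index.get(phone)
    return None if user_id is None else user_table.get(user_id)

save_user(7, {"name": "ann", "email": "ann@example.com", "phone": "555-0100"})
print(find_by_email("ann@example.com")["name"])
```

The cost of this design is that every write to the user table must also update the index tables, and the two can drift apart if one of the writes fails.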
How to find mutual friends between A and B
- find A's friend list
- find B's friend list
- take their intersection
Improvement:
use a cache and keep the friend lists in it
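The three steps and the cache improvement can be sketched as follows. `friend_db`, `friend_cache`, `stats`, and the sample data are made up for illustration; a dict plays the role of the friendship store, and another dict plays the role of the cache.

```python
friend_db = {            # stand-in for the friendship store
    "A": {"C", "D", "E"},
    "B": {"D", "E", "F"},
}
friend_cache = {}        # user_id -> frozenset of friend ids
stats = {"db_reads": 0}  # counts how often we had to go to the store

def get_friends(user_id):
    """Return the friend list, loading it from the store only on a cache miss."""
    if user_id not in friend_cache:
        stats["db_reads"] += 1
        friend_cache[user_id] = frozenset(friend_db.get(user_id, ()))
    return friend_cache[user_id]

def mutual_friends(a, b):
    # Steps 1 and 2: fetch both lists; step 3: intersect them.
    return get_friends(a) & get_friends(b)

print(sorted(mutual_friends("A", "B")))
mutual_friends("A", "B")  # second call is served from the cache
print(stats["db_reads"])
```

Following the write rule from the cache section, a real version would delete the cached list whenever a user's friendships change.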