learning System design as a landscape architect 1
Rethink system design in a much fun way, as a former urban planner/landscape planner. Take design twitter as example
Design twitter – Present a Landscape Design
1. what’s is the scenario planning of this project
Landscape Architects won’t jump into drawing a good site planning immediately, nor the system design. The first step is always to clarify the project and know your client demands as much as possible.
- what problem you want to solve?
- what basic function need here?
- Features, QPS, DAU, Interface analysis (site condition like land, weather, population, culture analysis in a project)
-
need basic data about twitter: MAU 330M, DAO: 200m
-
connunicate the basic function need here
- Register / Login
- User Profile Display / Edit
- Upload Image / Video
- Search
- Post / Share a tweet
- Timeline/News Feed
- Follow / unFollow a user
-
Pick the one your clients need (interviewee)
- Post
- Timeline
- News Feed
- Follow / Unfollow
- Register / Login
-
Features (site condition)
-
Concurrent User :
-
Read QPS: DAO * average number of queries by a user / seconds in a day
150 M * 50 / 86400 ~ 100k
- Peak = 100k * 3 ~ 300k
- Fast Growing: Max peak users in 3 months = Peak users * 2 (not for twitter)
- Write QPS
-
QPS
- 100: your laptop can the web server
- 1 k : a web server, need consider single point failure, SQL database
- 1 m : 100 web server cluster, need consider maintainance, NoSQL database(Memchched), Cassandra is 10 k QPS
2. master plan and zoning concept
After site analysis, It’s time to give a draft drawing sketche, with enough imformation (which need communicate with your client/team members)
In a draft drawing sketches, it would contains:
- traffic plan
- commerical plan
- green system
System design:(zoning plan)
Service
Router
+ User Service
- Register
- Login
+ Tweet Service
- Post a tweet
- News Feed
- Timeline
+ Media Service
- Upload Image
- Upload Video
+ Friendship Service
- Follow
- Unfollow
3. core area plans(1: 500)
In a draft drawing sketches, it would contains:
- pavment plan
- rainstom plan
Storage
+ User Service
- Register
- Login
- SQL
+ Tweet Service
- Post a tweet
- News Feed
- Timeline
- NoSQL
+ Media Service
- Upload Image
- Upload Video
- File System
+ Friendship Service
- Follow
- Unfollow
- SQL/NoSQL
Please design schema User Service
User Table
id integer
username varchar
email varchar
password varchar
Tweet Service
Tweet Table
id integer
user_id Foreign Key
content text
created_time timestamp
Media Service
Friendship Service
Friendship Table
from_user_id Foreign Key
to_user_id Foreign Key
created_time timestamp
Interviewer: how to storage and get the News Feed
PULL vs PUSH
PULL
algorithm: when users check New Feed, get every friend’s the first 100 tweets, merge them into the frist 100 News Feed
Mergr K sorted Arrays
time complexity: basically you can ignore the Mergr K sorted Arrays times, N times DB Reads is much more costly
User
– fetch News Feed -> Web Server
<-get followings->Friendship Table
Web Server
<-get tweets from followings->Tweet Table
Web Server
-merge and return->User
cons:
multiple DB Reads is very slow when users fetch News Feed
getNewsFeed(request)
• followings = DB.getFollowings(user=request.user)
• news_feed = empty
• for follow in followings: • tweets = DB.getTweets(follow.to_user, 100) • news_feed.merge(tweets)
• sort(news_feed)
• return news_feed[:100] # return the first 100
• postTweet(request, tweet)
• DB.insertTweet(request.user, tweet)
• return succes
PUSH
algorithm: every user has a list store his News Feed information
Fanout: User post a tweet, push the tweet into his followers’s News Feed lists
when a user check his News Feed, he just read from his News Feed list
time complexity: 1 DB Read, N times Writes. (Async)
User
<– Post a new tweet -> Web Server
<-Insert the tweet to DB->tweet Table
Web Server
-sent tweets to friends->Async Tasks Server
Async Tasks Server
<-get followers->friendship Table
Async Tasks Server
<-Fanout: insert new tweet to followers news feed->New Feed Table
cons:
huge followers numbers
getNewsFeed(request)
• return DB.getNewsFeed(request.user)
• postTweet(request, tweet_info)
• tweet = DB.insertTweet(request.user, tweet_info)
• AsyncService.fanoutTweet(request.user, tweet)
• return success
AsyncService::fanoutTweet(user, tweet)
• followers = DB.getFollowers(user)
• for follower in followers:
• DB.insertNewsFeed(tweet, follower )
PULL vs PUSH
tweet and facebook are using pull
4. details design show how to solve some specific problem and improve the plan in the future
-
Optimize
-
Scalability Enhancement of Pull Server functions
- the slowest part is happening when user request News Feed
- add Cache before DB Read
- cache every user’s Timeline
- n times cache reqest
- trade off: all the cache, or the recent 1000 in cache
- cache every user’s News Feed
- with Cache: merge users recent 100 tweets, then get the first 100
- without Cache: merger all the tweets after timestamp
- the slowest part is happening when user request News Feed
-
Scalability Enhancement of Push Server functions
- compare Pull model store News Feed in Memory, Push model stores in Disk, and Disk is cheap
- Inactive Users
- Rank followers by weight(last login time)
- followers » following: fanout would take hours
- trade off: Pull + Push vs Pull
-
right Optimize thinking
Try to optimize with minimal changes to the existing model;
Estimate long-term growth and assess whether it is worth switching the entire model
-
Push + Pull
- ordinal user (Push)
- influencer (Pull), followers come to influencer’s timeline fetch the News Feed, merge into their News Feed
User Table
id integer
username varchar
email varchar
password varchar
is_influencer boolean
-
Maintenance
- Robust
- Scalability
unfollow problem(async)
-
follow: merge timeline into News Feed asynchronously
-
unfollow: pick out tweets from News Feed asynchronously
-
Async reduces the processing time
-
async improve user experience
-
however, after unfollowing, the user may still find his News Feed in timeline
store likes
Tweet Table
id integer
user_id Foreign Key
content text
created_time timestamp
nums_of_like * integer
nums_of_comments * integer
nums_of_retweets * integer
Like Table
id integer
user_id Foreign Key
tweet_id Foreign Key
created_time timestamp
Normalize vs Denormalize
Normalize vs Denormalize likes
Normalize likes
SELECT COUNT * FROM like_table where tweet_id=xxx;
-
Pros: Standardized, most accurate
-
cons: slow, would cause O(n) SQL Queries
Denormalize likes
UPDATE like_table SET num_of_likes = num_of_likes + 1 where tweet_id=xxx;
UPDATE like_table SET num_of_likes = num_of_likes - 1 where tweet_id=xxx;
- no need SQL Queries, because num_of_likes stores in tweet
Thundering Herd
The “thundering herd” problem occurs in a highly concurrent environment (typically, many users). When many users make a request to the same piece of data at the same time, and there is a cache miss (the data for the cached element is not present in the cache, data is kicked out due to cache expiration or elimination by the elimination algorithm, etc) the thundering herd problem is triggered.