Methodology: READ MF!
[Originally from the Post: System design interview: how to design a chat system (e.g., Facebook Messenger, WeChat or WhatsApp)]Remind ourselves with the "READ MF!" methodology.
Requirements
Simple text based search with "AND" or "OR" support. E.g., give me all the tweets that have words "Black" AND "Life" in it sorted by published time in descending order.
search(api_dev_key, search_terms, maximum_results_to_return, sort, page_token)
A follow up question would be sort those results in other orders, E.g., give me all the tweets that have words "Black" AND "Life" in it sorted by the most commented tweets in descending order.
Estimation
- Storage: assuming twitter has 1B users, 50% of them post a tweet per day, each tweet is 140 char = 140 Byte. Each day, 1B * 0.5 * 280(roughly 300)Byte = 150GB data
- Write QPS = 1B user * 0.5 post a tweet per day / 86400 = 5787 QPS = 6000 QPS
- Query QPS = assuming 3x than write QPS = 18000 QPS
Key designs and terms
A few key terminologies.
If the follow up question, we just need to build another index given the sort key or we could fetch the results on flight and sort (local optimal). Follow "Index -> Search -> Rank" steps.
nice
ReplyDelete