Saturday, March 21, 2020

System design interview: how to design comments and replies, the like button and total views on YouTube


Methodology: READ MF!

[Originally from the Post: System design interview: how to design a chat system (e.g., Facebook Messenger, WeChat or WhatsApp)]
Let's remind ourselves of the "READ MF!" methodology.

This is a follow-up to the previous post: System design interview: how to design a video platform (e.g., YouTube, Netflix)


First, let's quickly go over the requirements. Likes and views are relatively straightforward: users can tolerate a bit of delay and inaccuracy. In most cases, as long as the count goes up by one when the user clicks the button, that's fine. That isn't always what happens for videos with many likes: if a YouTube video already shows 10K likes and you add one, it still shows 10K, and the UI only shows the button as toggled.

For comments and replies, there are a few different styles. The major platforms such as YouTube, Instagram and TikTok use the following style: comments that reply directly to the video are displayed ordered by likes and timestamp (descending), and replies to a comment are kept flat, one level deep. For example, A writes "@B How are you?" and B answers "@A I am fine, thank you, and you?". No further indentation is needed.

Reddit uses the "block building" reply-to style (known colloquially in Chinese as "building floors"), where each reply shows which reply it responds to, so the UI needs to indent nested replies.


For a viral video, assume around 10M views.
For likes, assuming 20% of viewers like the video: 10M * 20% = 2M likes.
For comments, assuming 1% of viewers leave a comment (we are lazy): 10M * 1% = 100K comments.
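The back-of-envelope numbers above can be checked with a few lines of Python. The 20% like rate and 1% comment rate are the rough guesses used in this post, not measured values, and the variable names are only for illustration.

```python
# Back-of-envelope estimate for one viral video.
# The rates are the rough guesses from this post, not measured data.
VIEWS = 10_000_000
LIKE_PERCENT = 20     # assumed share of viewers who like the video
COMMENT_PERCENT = 1   # assumed share of viewers who leave a comment

# Integer arithmetic avoids floating-point rounding surprises.
likes = VIEWS * LIKE_PERCENT // 100
comments = VIEWS * COMMENT_PERCENT // 100

print(f"likes: {likes:,}, comments: {comments:,}")
```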

The majority of normal videos probably have 1,000 views, 100 likes at most and maybe 10 comments; a relational DB handles that pretty well.


Key designs and terms 

Comments design

If you are just starting to build your product, bootstrap it with a relational DB.
Introduce a comments table sharded by video UUID. Add a reply_to_uuid column that records which comment a reply targets, and leave it null for root comments. Build an index on reply_to_uuid.

SELECT * FROM comments WHERE video_uuid = :video_uuid AND reply_to_uuid IS NULL ORDER BY likes DESC, timestamp DESC;

If you need to see the replies to one of those comments, just run:

SELECT * FROM comments WHERE reply_to_uuid = :target_comment_uuid ORDER BY likes DESC, timestamp DESC;
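As a minimal sketch of the table and the two queries, here is a runnable version using Python's built-in sqlite3 module. Only reply_to_uuid and the ordering come from the description above; the other column names (video_uuid, author, body, likes, created_at) and the helper functions are assumptions made for illustration, and SQLite stands in for whatever relational DB you actually use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE comments (
        uuid          TEXT PRIMARY KEY,
        video_uuid    TEXT NOT NULL,      -- shard key in a real deployment
        reply_to_uuid TEXT,               -- NULL for a root comment
        author        TEXT NOT NULL,
        body          TEXT NOT NULL,
        likes         INTEGER NOT NULL DEFAULT 0,
        created_at    INTEGER NOT NULL    -- Unix timestamp
    )
""")
conn.execute("CREATE INDEX idx_comments_reply_to ON comments (reply_to_uuid)")

# Hypothetical sample data: two root comments and two replies to c1.
rows = [
    ("c1", "v1", None, "alice", "Great video", 5, 100),
    ("c2", "v1", None, "bob", "First!", 5, 200),
    ("c3", "v1", "c1", "carol", "@alice agreed", 1, 300),
    ("c4", "v1", "c1", "alice", "@carol thanks", 3, 400),
]
conn.executemany("INSERT INTO comments VALUES (?, ?, ?, ?, ?, ?, ?)", rows)


def root_comments(video_uuid):
    """UUIDs of comments that reply directly to the video, best first."""
    cur = conn.execute(
        "SELECT uuid FROM comments "
        "WHERE video_uuid = ? AND reply_to_uuid IS NULL "
        "ORDER BY likes DESC, created_at DESC",
        (video_uuid,),
    )
    return [row[0] for row in cur]


def replies(comment_uuid):
    """UUIDs of the replies to one comment, best first."""
    cur = conn.execute(
        "SELECT uuid FROM comments WHERE reply_to_uuid = ? "
        "ORDER BY likes DESC, created_at DESC",
        (comment_uuid,),
    )
    return [row[0] for row in cur]
```

With the sample rows, root_comments("v1") puts c2 ahead of c1 because both have 5 likes and c2 is newer, and replies("c1") puts c4 ahead of c3 because it has more likes.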

Even if your product reaches YouTube scale, a viral video would have around 100K comments, and the solution above would still work fine. Adding capacity and sharding the comments with consistent hashing, plus caching the comments, would do the trick.
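For readers who have not seen consistent hashing before, here is a minimal sketch of a hash ring that maps a video UUID to a shard. The class name, the shard names and the choice of MD5 with 100 virtual nodes per shard are all assumptions for illustration; production systems usually rely on a library or on the datastore's built-in sharding.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self._ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add(node, vnodes)

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node, vnodes=100):
        # Each node is placed on the ring many times to even out the load.
        for i in range(vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def node_for(self, key):
        # Walk clockwise to the first virtual node at or after the key's hash.
        idx = bisect.bisect(self._ring, (self._hash(key), ""))
        if idx == len(self._ring):
            idx = 0  # wrap around the ring
        return self._ring[idx][1]


ring = HashRing(["shard-a", "shard-b", "shard-c"])
print(ring.node_for("video-123"))
```

The useful property is that adding a shard only moves the keys that land on the new shard; every other video stays where it was, so caches on the existing shards remain warm.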

If you need to build the Reddit tree structure, just sort it in memory. If the problem can fit into memory, it becomes much easier.
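A minimal sketch of that in-memory approach: load all comments of one video, group them by parent, then walk the tree depth-first and indent by depth. The dict keys (uuid, reply_to_uuid, likes, timestamp, body) are assumed field names, and the recursion assumes threads are not nested thousands of levels deep.

```python
from collections import defaultdict


def build_thread(comments):
    """Return indented lines for a Reddit-style comment tree.

    comments: list of dicts with uuid, reply_to_uuid, likes, timestamp, body.
    """
    children = defaultdict(list)
    for c in comments:
        children[c["reply_to_uuid"]].append(c)  # None groups the root comments

    def render(parent_uuid, depth):
        lines = []
        ordered = sorted(
            children.get(parent_uuid, []),
            key=lambda c: (c["likes"], c["timestamp"]),
            reverse=True,  # most liked first, newest first on ties
        )
        for c in ordered:
            lines.append("  " * depth + c["body"])
            lines.extend(render(c["uuid"], depth + 1))
        return lines

    return render(None, 0)


sample = [
    {"uuid": "a", "reply_to_uuid": None, "likes": 2, "timestamp": 1, "body": "A"},
    {"uuid": "b", "reply_to_uuid": "a", "likes": 0, "timestamp": 2, "body": "B"},
    {"uuid": "c", "reply_to_uuid": "b", "likes": 0, "timestamp": 3, "body": "C"},
    {"uuid": "d", "reply_to_uuid": None, "likes": 5, "timestamp": 4, "body": "D"},
]
print("\n".join(build_thread(sample)))
```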

The extreme case is a comments section that turns into a live chat. Then we can use something like an append-only in-memory DB or a Redis cache, appending new messages to a queue with asynchronous backup to the DB.
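The append-only idea can be sketched in a single process as follows. Here a plain Python list plays the role of the in-memory store, and a background thread drains a queue into a persist callback that stands in for the DB write. The class and method names are hypothetical, and a real deployment would more likely use Redis or a message broker than in-process structures.

```python
import queue
import threading


class LiveCommentLog:
    """Append-only in-memory log with asynchronous backup."""

    def __init__(self, persist):
        self._messages = []            # append-only, served to readers
        self._lock = threading.Lock()
        self._pending = queue.Queue()  # messages waiting to be backed up
        self._persist = persist        # stand-in for a DB write
        threading.Thread(target=self._backup_loop, daemon=True).start()

    def append(self, message):
        # The write path only touches memory; it never waits on the DB.
        with self._lock:
            self._messages.append(message)
            self._pending.put(message)

    def tail(self, n):
        """Return the n most recent messages (n must be positive)."""
        with self._lock:
            return self._messages[-n:]

    def _backup_loop(self):
        while True:
            message = self._pending.get()
            self._persist(message)
            self._pending.task_done()

    def wait_for_backup(self):
        """Block until every queued message has been persisted."""
        self._pending.join()


backup = []  # pretend this list is the durable DB
log = LiveCommentLog(backup.append)
for i in range(5):
    log.append(f"msg {i}")
log.wait_for_backup()
```

The trade-off is durability: messages that are still in the queue when the process crashes are lost, which is usually acceptable for a fast-moving chat.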

Views and Likes count design

Similarly, when you bootstrap the project, keeping a counter in the DB or in an in-memory cache solves the problem while traffic is low. Within a single machine you don't even need locks: atomic operations such as compare-and-swap (CAS) give you thread-safe counting.

If your product starts to become popular, add more capacity using consistent hashing. Add an in-memory cache like Redis to hold the counts (main memory access takes on the order of 100 ns versus about 10 ms for a disk seek, roughly a 100,000x improvement). This can be further optimized with a distributed counter whose partial results are aggregated on read.
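A distributed (sharded) counter can be sketched like this: each write picks one of N sub-counters at random, so writers rarely contend on the same slot, and a read sums all of them. This in-process version is only an illustration under stated assumptions. The class name and shard count are made up, and the per-shard lock is a stand-in because Python exposes no CAS primitive; with Redis, each shard would typically be its own key incremented with INCR.

```python
import random
import threading


class ShardedCounter:
    """Counter split into shards to spread out write contention."""

    def __init__(self, num_shards=8):
        self._shards = [0] * num_shards
        self._locks = [threading.Lock() for _ in range(num_shards)]

    def increment(self, amount=1):
        # Pick a random shard so concurrent writers rarely collide.
        i = random.randrange(len(self._shards))
        with self._locks[i]:
            self._shards[i] += amount

    def value(self):
        # Aggregate on read; the total may lag slightly behind in-flight writes.
        return sum(self._shards)


views = ShardedCounter()
workers = [
    threading.Thread(target=lambda: [views.increment() for _ in range(1000)])
    for _ in range(4)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(views.value())
```

Reads become a little more expensive (N lookups instead of one), which is a good trade when writes vastly outnumber reads of the exact value.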

If your product reaches YouTube scale, use offline counting. Build a pipeline that promotes videos from cold to hot/viral once the view count hits a certain threshold (say 1M). Use async messaging like Kafka to ingest the logs and pump them into a data warehouse, then query it and update the values on a cron schedule. Of course, on the UI side you still need to toggle the like button and add 1 if needed (sometimes on a video with 100K likes the count does not increase even after you click like).
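The scheduled batch step can be sketched as a function that folds one batch of logged events into running totals and promotes a video to hot once its views cross the threshold. Everything here is a simplification for illustration: a Python list stands in for the messages consumed from Kafka, a Counter stands in for the data warehouse table, and the function name and event shape are assumptions.

```python
from collections import Counter

HOT_THRESHOLD = 1_000_000  # promotion threshold suggested in the text


def run_batch(events, totals, hot_videos, threshold=HOT_THRESHOLD):
    """Fold one batch of (video_id, kind) events into the running totals.

    totals:     Counter keyed by (video_id, kind), updated in place
    hot_videos: set of video ids promoted to hot/viral, updated in place
    """
    batch = Counter(events)   # pre-aggregate so each key is written once
    totals.update(batch)      # Counter.update adds the counts
    for video_id, kind in batch:
        if kind == "view" and totals[(video_id, "view")] >= threshold:
            hot_videos.add(video_id)


totals = Counter()
hot_videos = set()

# Two scheduled runs, using a tiny threshold so the promotion is visible.
run_batch([("v1", "view")] * 3 + [("v1", "like")], totals, hot_videos, threshold=5)
run_batch([("v1", "view")] * 2 + [("v2", "view")], totals, hot_videos, threshold=5)
print(totals[("v1", "view")], sorted(hot_videos))
```

Because the totals are only refreshed once per run, the displayed count is eventually consistent, which matches the requirement that users tolerate some delay and inaccuracy.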

