## Saturday, March 28, 2020

### Problem Statement

Reverse bits of a given 32 bits unsigned integer.

Example 1:
```Input: 00000010100101000001111010011100
Output: 00111001011110000010100101000000
Explanation: The input binary string 00000010100101000001111010011100 represents the unsigned integer 43261596, so return 964176192 which its binary representation is 00111001011110000010100101000000.
```
Example 2:
```Input: 11111111111111111111111111111101
Output: 10111111111111111111111111111111
Explanation: The input binary string 11111111111111111111111111111101 represents the unsigned integer 4294967293, so return 3221225471 which its binary representation is 10111111111111111111111111111111.```

Note:
• Note that in some languages such as Java, there is no unsigned integer type. In this case, both input and output will be given as signed integer type and should not affect your implementation, as the internal binary representation of the integer is the same whether it is signed or unsigned.
• In Java, the compiler represents the signed integers using 2's complement notation. Therefore, in Example 2 above the input represents the signed integer `-3` and the output represents the signed integer `-1073741825`.

If this function is called many times, how would you optimize it?

### Video Tutorial

You can find the detailed video tutorial here

### Thought Process

Use this problem to bring awareness of bit manipulation in interviews, it's rare but sometimes do get asked by some interviewers.

We can directly use Java's Integer.reverse() method, not suitable for interview purpose.

Simple simulation, reading bits from the right most and adding them up while shifting left.

Regarding the follow up, caching is an obvious solution. There are two ways to cache
• Cache all the integers,  2^32 = 4294967296, total memory needed is 2^32*4B = 16GB roughly, easily handled by one server
• Cache each byte, divide the 32 bits into 4 bytes. Memory footprint is only 2^8*1B = 256B, significantly smaller than 16GB
If you are curious on how Java SDK implemented this method, it uses the algorithm in Hackers Delight book section 7-1
``` 1195       /**
1196        * Returns the value obtained by reversing the order of the bits in the
1197        * two's complement binary representation of the specified {@code int}
1198        * value.
1199        *
1200        * @return the value obtained by reversing order of the bits in the
1201        *     specified {@code int} value.
1202        * @since 1.5
1203        */
1204       public static int reverse(int i) {
1205           // HD, Figure 7-1
1206           i = (i & 0x55555555) << 1 | (i >>> 1) & 0x55555555;
1207           i = (i & 0x33333333) << 2 | (i >>> 2) & 0x33333333;
1208           i = (i & 0x0f0f0f0f) << 4 | (i >>> 4) & 0x0f0f0f0f;
1209           i = (i << 24) | ((i & 0xff00) << 8) |
1210               ((i >>> 8) & 0xff00) | (i >>> 24);
1211           return i;
1212       }```

### Solutions

#### Simple simulation with follow up question

Time Complexity: O(N), where N is number of bits in the input integer
Space Complexity: O(1), no extra space needed

## System design interview: how to design comments and reply, likes button and total views on Youtube

### Methodology: READ MF!

[Originally from the Post: System design interview: how to design a chat system (e.g., Facebook Messenger, WeChat or WhatsApp)]
Remind ourselves with the "READ MF!" methodology.

This is a follow up on the previous post: System design interview: how to design a video platform (e.g, Youtube, Netflix)

### Requirements

First, let's quickly about requirements. Likes and views are relatively straightforward, users can torllerate a bit delay and inaccuracy. For the majority case, as long as user clicks the button, it pluses one, then it's fine. Sometimes it's not the case if videos have too many likes, e.g., if a Youtube video already has 10K likes, you plus 1, it still shows 10K, it's just the UI tricks you that it's toggled.

For comments reply, there are a few different styles. The major ones such as Youtube, Instagram and TikTok uses following style. It displays comments (directly reply to video) based on order of likes and timestamp (descending) and any reply to comments are only 1 on 1, meaning A@B How are you, then B@A I am fine thank you and you? There is no more indentation needed.

Reddit uses the "block building" reply-to style (中文俗称"盖楼"), where it shows which reply replies to which reply, and it needs to show the indentations about those replies.

### Estimation

For viral videos, say normally it has around 10M views
For likes, assuming 20% people liked a video, 10M * 20% = 2M likes
For comments, 1% people would leave a comment (we are lazy) 10M * 10% = 100K comments

For the majority normal videos, it would probably has 1000 views and 100 likes top and maybe 10 comments, a relationship DB could solve it pretty well.

### Key designs and terms

If you start building your product, just bootstrap it with a relational DB
Introduce a comments table shard by video UUID, add a reply_to_uuid to know which comment the reply is targeted to and leave it null for root comment. Build an index on the reply_to_uuid

Select * from comments where reply_to_uuid is null order by comment likes desc, timestamp desc

If you need to see the replies to those comments, just

Select * from comments where reply_to_uuid is the_target_comment_uuid order by comment likes desc, timestamp desc

Even if your product becomes Youtube scale, the comments would be around 100K for viral videos, the above solution would still works fine. Simply add more capacity to better shard your comments using consistent hashing, cache the comments would do the trick.

If you need to build the Reddit tree structure, just sort it in memory. If the problem can fit into memory, it becomes much easier.

The extreme case is your comments section becoming a chat, then we can do something like an append only in memory DB or redis cache keep appending the values to the queue with async backup to DB.

#### Views and Likes count design

Similarly, when you bootstrap the project, keep a counter in DB or in memory cache solves your problem when traffic is low. If within one machine, you don't even need locks just use compare and swap (CAS), atomic operations for counting, thread safe.

If your product starts to become popular, add more capacity using consistent hashing. Add in memory cache like Redis to count the values (memory access time 100us vs disk access time 10ms. 100Kx improvement). Could be further optimized using distributed counter, aggregating the results together when read.

If you product becomes YouTube scale, then use offline counting. Build a pipeline to promote the videos from cold to hot/viral once the view counts hit a certain threshold (say 1M). Use async messaging like Kafka to ingest from those logs and pump it to data warehouse, query it and update the values on a cron schedule. Of course on the UI side, you need to toggle the like button, plus 1 if needed (Sometimes you would see a 100K likes video, even if clicked the like, the count would not be increased)

## Tuesday, March 10, 2020

### 美国菜鸟买枪看家护院指南

【作者】 包子小旋风老师  【包子铺里聊IT 微信公众号

### 新冠肺炎来袭, 视频onsite指南

【作者】 包子霹雳火老师  【包子铺里聊IT 微信公众号

•  视频onsite如何更好地做自我介绍
• 很多小伙伴可能把自己的简历准备得很是熟练，尤其是介绍past experience的时候很多时候在onsite中都可以通过画图的方式来体现自己的projectimpactcomplexity. 但是当onsite转到视频之后这些准备好的图可能一下就无法展现出来了，这时候小伙伴就应该换一种思维去想面试官想要从你的自我介绍中得到什么?
• 自我介绍一般都是面试官给面试者一个熟悉环境，放松的时间，所以从面试官本身来讲更多是期待面试者能够自然地总结自己过去的经历。这个时候就需要面试者能够有以下的一些能力
• High-level 讲述自己组或者自己负责serviceresponsibilitybusiness impact
• 指出在整个system中或者整个project中自我的contribution. 这里的contribution不仅仅是写的code, 还可以是很多自己driveprocess
• Detail-oriented to key components, 最后面试者要能够详细的讲出自己contribution里面的technical details。这一步一般在视频onsite的时候会被省略掉，但是如果面试官问起来，面试者还需要能够用最简单明了的语言去描述。
• 总上所述，面试者在准备自我介绍的过程中，可以试着把自我介绍介绍给非CS的小伙伴，看看别人是否能大概知道你在公共做或者学校project里有什么样的作用.

• 视频onsite如何更好地做online coding
• 在视频onsite过程中，对于面试这最熟悉的应该就是coding的部分了但是跟普通onsite不同，在视频过程中你的表情，说话语气，包括神态都会放大般的展现在面试官的面前。所以在此，小编要给大家提醒一下几点
• 不要自言自语，尤其是说中文。因为在小编过往模拟面试的过程中，有些面试者还时不时得自言自语两句中文，这在视频面试中是一个很大的red flag.
• 跟面试官沟通，沟通，沟通！！！重要的事情说三次。Onsite面试的时候有些面试者喜欢那道题之后自己思考，但是在视频面试中，面试者一定要学会think loud and keep talking. 要让面试官知道你的想法是什么，你的思路是否正确，或者你stuck在哪里。对于面试官来讲，面试的过程是一个选择自己潜在同事的过程，而不是一个考试的过程，所以面试官注重的一点就是能否跟面试者有有效的沟通。
• Take feedback. 在一般onsite面试过程中，如果面试者的solution有问题，面试官都会给一些hint或者一些feedback通过各式各样的交流。但是在视频面试过程中，面试官可能就会通过举反例或者提问的方式向面试者提供一些feedback,这时候面试者就应该去jump out然后去想这些feedback是否对你的solution有用.
• 最后，还是要鼓励大家熟悉一些自己coding language. Online coding有些公司是需要跑code的，所以对自己的language一定要熟悉，小编见过很多人喜欢用python只是因为简单好写，但是到了online coding的部分，如果你选了你不熟悉的语言，不知道怎么debug，那么从面试官的角度来讲就是你technical skills不过关了。

• 视频onsite如何更好地做system design
• 对于System design的问题视频onsite确实会有很大的挑战。首先面试者没有自己擅长的白板，这样很多架构图就无法展现出来。这时候对于面试者的挑战就变成了如何清晰地展现自己处理问题的思路和如何利用好bullet point来表达自己的观点了。
• 通常来讲，面试者先要把system designrequirement清晰地记录下来，不明确的要去跟面试官沟通，把需求确定好。
• 其次就是要把整个system分成不同的部分，这时候就要用bullet point把自己的思维一步步得展现给面试官。就当自己在写ppt的概要
• 做完上述两步之后，面试者就可以跟面试官讨论然后着重挑其中的一个componentdetail design. 比如db schema, API signature 然后customer use scenario , error handling. 这些都可以用简单的statement写好
• 最后面试者需要做出一些summary的总结，列举出pros and cons.向面试官展现出一个完整的设计思路

• 视频onsite如何更好地做behavior question
• 对于behavior 来说，应该是对于视频onsite来说最直观的了。重要的一点就是管理好自己的表情，表达方式，毕竟onsite面试官看到的是一个整体的人，而在视频onsite中看到的只是面试者的脸。所以任何小的问题都会被放大。