## Saturday, February 29, 2020

### Problem Statement

Design and implement a data structure for Least Recently Used (LRU) cache. It should support the following operations: `get` and `put`.
`get(key)` - Get the value (will always be positive) of the key if the key exists in the cache, otherwise return -1.
`put(key, value)` - Set or insert the value if the key is not already present. When the cache reached its capacity, it should invalidate the least recently used item before inserting a new item.
The cache is initialized with a positive capacity.
Could you do both operations in O(1) time complexity?
Example:
```LRUCache cache = new LRUCache( 2 /* capacity */ );

cache.put(1, 1);
cache.put(2, 2);
cache.get(1);       // returns 1
cache.put(3, 3);    // evicts key 2
cache.put(4, 4);    // evicts key 1
cache.get(3);       // returns 3
cache.get(4);       // returns 4```

### Video Tutorial

You can find the detailed video tutorial here

### Thought Process

The LRU cache has become an absolute standard interview question that we have to know how Java's LinkedHashMap or Python's OrderedDict works under the cover (A doubly linked list and a hashmap). I would actually recommend implementing a doubly linked list + hashmap solution in real interviews.

Follow up is how to implement a least frequent cache (LFU)?
• A doubly linkedlist + hashmap + A min heap (to count), however, both get and put would be O(LgN) since the heap would be modified due to the frequency increased by 1
• A doubly linkedlist + hashmap + ArrayList<Frequency, doubly linked list), Node structure would also carry a Frequency field.  For each frequency, we keep track of the elements and move them around when accessed. Get would be O(1), Put could potentially be worst case unbounded (we have to loop from frequency 0 to integer.max_value), or could be optimized to O(N)

### Solutions

Time Complexity: O(1) for both get and set
Space Complexity: O(N) Where N is the number of items inserted

#### Using Doubly LinkedList & Hashmap

Time Complexity: O(1) for both get and set
Space Complexity: O(N) Where N is the number of items inserted

## Saturday, February 22, 2020

### Problem Statement

Given two words word1 and word2, find the minimum number of operations required to convert word1 to word2.
You have the following 3 operations permitted on a word:
1. Insert a character
2. Delete a character
3. Replace a character
Example 1:
```Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation:
horse -> rorse (replace 'h' with 'r')
rorse -> rose (remove 'r')
rose -> ros (remove 'e')
```
Example 2:
```Input: word1 = "intention", word2 = "execution"
Output: 5
Explanation:
intention -> inention (remove 't')
inention -> enention (replace 'i' with 'e')
enention -> exention (replace 'n' with 'x')
exention -> exection (replace 'n' with 'c')
exection -> execution (insert 'u')```

### Video Tutorial

You can find the detailed video tutorial here

### Thought Process

Brute force sounds like really overkill because we have to list all the possible ways and it is exponential time complexity. On a high level, the recursion looks like
• if cannot convert A to B, return -1 (the recursion termination function); if A == B, then return 0
• try 3 different ways, insert, remove and replace, cut that character and continue the recursion. Compare the minimum of that 3 methods (exclude -1)
Edit distance is a classic Dynamic Programming (DP) problem, just like Coin Change Problem. It follows the DP template perfectly (asking for extreme values)

The mathematical induction function is
DP[i][j] is the minimum operations needed to convert String[0-i] to String[0-j]
Initial values: dp[0][j] = j and dp[i][0] = i
If (A[i] == B[j]) dp[i][j] =  dp[i-1][j-1]
Else min(dp[i][j-1], dp[i-1][j], dp[i-1][j-1]) +1;

### Solutions

#### DP

Time Complexity: O(M*N) where M is word1 length and N is word2 length
Space Complexity: O(M*N) since we need an extra 2D array

References

## Saturday, February 8, 2020

### Problem Statement

Given a list of words and two words word1 and word2, return the shortest distance between these two words in the list.
Example:
Assume that words = `["practice", "makes", "perfect", "coding", "makes"]`.
```Input: word1 = `“coding”`, word2 = `“practice”`
Output: 3
```
```Input: word1 = `"makes"`, word2 = `"coding"`
Output: 1
```
Note:
You may assume that word1 does not equal to word2, and word1 and word2 are both in the list.

### Video Tutorial

You can find the detailed video tutorial here

### Thought Process

It is a bit confusing at first to be confused with the "Edit Distance" problem. However, there are duplicate words in the array, it's just a matter of finding the minimum distance between two words, where words could be found in multiple places in the array.

Brute force way (for each word, find the closest distance with the other word) would give you O(N^2) time complexity where N is the array size. An O(N) optimization would be having two indices for each word and keep updating the minimum distance. It is greedy because the closer the two elements are, the smaller the distance would be.

You can refer to Leetcode official solution for a detailed explanation.

### Solutions

#### Implementation

Time Complexity: O(N) where N is the array size
Space Complexity: O(1) Constant space

References

## System design interview: how to design a chat system (e.g., Messenger, WeChat or WhatsApp)

update Dec 2022: Extend READ MF to BREAD MF (Bread My Friend) :). B stands for Business here, where even in design interviews, before the requirement, more senior folks tend to ask and want to understand the business use case, why we want to build this. No need to deep dive into it, but get the point checked.

For system design interview questions, normally we should follow the "READ MF!" steps. You can easily remember this as "Read! Mother Fucker!", similar to RTFM, with an attitude to your interviewer (Because if I can design this complicated system in 60 minutes, why would I waste my time interviewing here? JK)

READ MF! (Requirement, Estimation, Architecture, Details, Miscellaneous, Future)

1. Requirement clarification, keep it simple first and make some assumptions to make your life easier, later we can drill down or improve.
2. Estimation on scalability (QPS, storage size, network bandwidth per sec) the scalability determines the final architecture since we should not over engineering things, keep it simple first
3. Design system Architecture & Layers, major services and responsibilities, front-end/edge layer, service layer & storage layer
4. Drill down Details into each component, e.g, data models on storage layers, API interface between micro services
5. Miscellaneous. Talk about design trade-offs, bottlenecks, peak traffic handling, failover plan, monitoring & alerting, security concerns. Talk things that you are most comfortable with
6. Future work on if new features are added, how system would be extended to support them

### Key designs and terms

• WeSocket(On OSI 7, application layer) is more suitable for real time chatting over HTTP due to the bidirectional nature (HTTP long polling is also not efficient)
• Maintaining connection efficiently. We should reuse those socket connections since it's not efficient to recreate them (still under IP&TCP), also utilize each machine in Chat Service to host more connections. In real world, WhatsApp tech stack Erlang + FreeBSD can host 1 Million connections per commodity host.
• Data storage solutions. For large volumn writes and range query (for latest messages), we might not want to go with relational database (write is not super efficient for large scale) and we want to read from memory as well. We could cache over a relational DB but we can also choose HBase (LSM Tree), Cassandra(SSTables) or even Redis
• Sent, Delivered & Seen State are easy given the above graph. Sent is when Chat Service returns success. Delivered when message is written into the main coveration storage, Seen is when user pulls the latest chat history or connection is open chat service successfully pushed the message. Tying is simply a listener on the client side.