Saturday, December 28, 2019

Leetcode solution 59: Sprial Matrix II


Problem Statement 


Given a positive integer n, generate a square matrix filled with elements from 1 to n2 in spiral order.
Example:
Input: 3
Output:
[
 [ 1, 2, 3 ],
 [ 8, 9, 4 ],
 [ 7, 6, 5 ]
]
 Problem link

Video Tutorial

You can find the detailed video tutorial here

Thought Process

Straight forward question, could use Spiral Matrix exact same recursion algorithm to solve this, you can also watch that video tutorial here

Still needs pay attribute to this guy, had another great solution:
https://leetcode.com/problems/spiral-matrix-ii/discuss/22282/4-9-lines-Python-solutions

Solutions

Simulation using Recursion

Time Complexity: O(M*N) where M, N is row and col of matrix
Space Complexity: O(M*N) since we used list to store the result, where M, N is row and col of matrix


References

Friday, December 27, 2019

System design interview: how to design a feeds system (e.g., Twitter, Instagram and Facebook news feed)

System design interview: how to design a feeds system (e.g., Twitter, Instagram and Facebook news feed)

Methodology: READ MF!

Please use this "READ MF!" framework for software engineer system interview purpose.

Key designs and terms 




  • So far the best detailed explanation on designing twitter is from Raffi (used to be VP of Twitter) on QCon. Presentation, Slides [All credits go to QCon). Raffi is very smart and articulate, really solid guy!
  • Read heavy system, not write heavy, optimized for user timeline. To be clear, there are two timelines, one is user's own tweets (easy to do), the other is the main timeline which is an aggregation of all the tweets from the people that user follows. 
  • Pre-calculate all the timelines. This is the interesting part of the design vs using a mysql and having index to query in realtime, which would not be scalable. When a tweet is posted, the tweets service would
    • Store this tweet in memory, and that later would be flushed to a main DB
    • Call the fanout deliver service to publish this tweet to all the users' timeline that followed this particular user. It could simply store a tweet ID (and later the content could be retrieved from the Tweets Cache) or hydrating the entire text content is also fine (note how we want to handle eidt or delete, twitter probably doesn't allow delete)
    • Call search service to index (Lucene). The search index is also hosted in memory on Redis. Note search here needs to fanout to all search clusters but due to the in memory hosting, it's acceptable.
  • Always remember disk access is at least 100k times slower than memory access, e.g, disk is 10ms vs 100ns on memory. https://gist.github.com/jboner/2841832
  • With the pre-caculate timeline design, there might be race conditions when celebrity (people with millions of followers) starts to talking to each other with replies. E.g., celebrity A tweets something, takes 30 sec to deliver to all the followers,  celebrity B replies before deliver finishes, some followers follow both A and B might see B's reply first before A's original post. One cheat workaround is to sort by timestamp or tweet IDs, but they are also experimenting with only pre-calculate non-celebrity tweets, and when generating timeline, realtime fetching the celebrity tweets. It depends which way is better and also the user experience. This is a good stop point to talk to your interviewers in real world about tradeoffs.
  • Since twitter is heavily relying on cache, you might want to checkout how they optimized the caching with twemproxy.

Baozi Youtube Video

Saturday, December 21, 2019

Leetcode solution 54: Sprial Matrix


Problem Statement 

Given a matrix of m x n elements (m rows, n columns), return all elements of the matrix in spiral order.
Example 1:
Input:
[
 [ 1, 2, 3 ],
 [ 4, 5, 6 ],
 [ 7, 8, 9 ]
]
Output: [1,2,3,6,9,8,7,4,5]
Example 2:
Input:
[
  [1, 2, 3, 4],
  [5, 6, 7, 8],
  [9,10,11,12]
]
Output: [1,2,3,4,8,12,11,10,9,5,6,7]
 Problem link

Video Tutorial

You can find the detailed video tutorial here

Thought Process

It's a straight forward implementation question, we can simply just simulate the spiral order and print. You can choose to do it either iteratively (see reference to download the official Leetcode solution) or use recursion.

There is this awesome one line solution from this guy which is pretty insane.
def spiralOrder(self, matrix):
    return matrix and list(matrix.pop(0)) + self.spiralOrder(zip(*matrix)[::-1])
https://leetcode.com/problems/spiral-matrix/discuss/20571/1-liner-in-Python-%2B-Ruby

Solutions

Simulation using Recursion

Time Complexity: O(M*N) where M, N is row and col of matrix
Space Complexity: O(M*N) since we used list to store the result, where M, N is row and col of matrix


References

Saturday, December 14, 2019

Leetcode solution 322: Coin Change


Problem Statement 

You are given coins of different denominations and a total amount of money amount. Write a function to compute the fewest number of coins that you need to make up that amount. If that amount of money cannot be made up by any combination of the coins, return -1. Example 1:
Input: coins = [1, 2, 5], amount = 11
Output: 3 
Explanation: 11 = 5 + 5 + 1
Example 2:
Input: coins = [2], amount = 3
Output: -1
 
Problem link

Video Tutorial

You can find the detailed video tutorial here

Thought Process

This is a classic problem and greedy might be the first thing comes into mind. For example, sort the coins on denomination and always use the largest amount coin. However, the optimal solution might not include the largest denomination.

for example , [1, 3, 4] and target value 6. The optimal is two 3s, not [4, 1, 1].

It seems we have to list out all the different combinations and find the min value, that leads to a brute fore solution as described here using recursion. It would be an exponential time complexity.

We can use a hash map as a lookup to remember for each amount, what would be the min cost. That would significantly reduce the time complexity from exponential to O(S*N) where S is the amount and N is the coins array size. It's not obvious why but once you see the DP solution, you can understand better, it's the same time complexity.

Another way is to use dynamic programming because the problem is asking for a single yes/no, extreme value(min, max), building up a lookup table using formula is the coding pattern. We have similar problems, Wild Card Matching, Regular Expression Matching

The idea is to build a lookup table for each amount, what's the minimal number of coins needed given current  denomination.
lookup[i] = min(lookup[i], lookup[i - coin value] + 1) given i - coin value >= 0

Solutions

Dynamic Programming

Time Complexity: O(S*N) where S is the amount and N is the coins array size
Space Complexity: O(N) since we used a lookup array


References

Saturday, December 7, 2019

Leetcode solution 274: H-Index

包子小道消息@12/07/2019


从growth at all cost 到profitability mindset,都是被Uber Wework ​弄得。软银老板Masa的头发估计又得脱落几根,不过airbnb也许是2020年被大家追捧的一个新星,从cash at hand / total funds raised 的比例来看,就是下一个google,Facebook和zoom的节奏啊!

 

Problem Statement 

Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher's h-index.
According to the definition of h-index on Wikipedia: "A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each."
Example:
Input: citations = [3,0,6,1,5]
Output: 3 
Explanation: [3,0,6,1,5] means the researcher has 5 papers in total and each of them had 
             received 3, 0, 6, 1, 5 citations respectively. 
             Since the researcher has 3 papers with at least 3 citations each and the remaining 
             two with no more than 3 citations each, her h-index is 3.
Note: If there are several possible values for h, the maximum one is taken as the h-index.
Hint 1
An easy approach is to sort the array first.
Hint 2
What are the possible values of h-index?

Hint 3
A faster approach is to use extra space. 


Problem link

Video Tutorial

You can find the detailed video tutorial here

Thought Process

It’s not the most straightforward question to understand I have to admit that. I normally start with small examples to help me understand

The requirement is: his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each

What happens if citation contains only 1 element?
  • Citations = [100] , N = 1, h = 1, N-h = 0
This gives us hint that h has an upper bound of N

What happens if citation contains only 2 elements? And we can assign some large citation values
  • Citations = [1000, 2000] , N = 2, h = 2. It seems h-index has not too much to do with the citation values itself, rather it’s about the index and the values.

We should now have a super brute force way, since h is bounded to be [0, N], we just go through each value and try to see if it could meet the h criteria (e.g., we start from N, then we see if there is at least N papers that have at least N citations) This would result in a O(N^2) time complexity.

Naturally, we normally can think of if we can sort the array to reduce the time complexity. After sort, using the below graph, the problem becomes find the citation more than the citation array index, and that citation index could be the h-index (that's why it's called h-index, not h-value, it's about index). This would give a O(NlgN) time complexity. Code is attached in the References section. 
source: leetcode solution

Finally, we can think about bucket sort since h-index is bounded from [0, N], and if paper citations are larger than N, it's just considered on counting on index N.

You know how to do a bucket sort 😄

Solutions

Use bucket sort

Time Complexity: O(N) since we went through citations only once
Space Complexity: O(N) since we are using an extra array to keep track of the count


References

Saturday, November 30, 2019

Leetcode solution 133: Clone Graph

Problem Statement 

Given a reference of a node in a connected undirected graph, return a deep copy (clone) of the graph. Each node in the graph contains a val (int) and a list (List[Node]) of its neighbors.

Example:
Input:
{"$id":"1","neighbors":[{"$id":"2","neighbors":[{"$ref":"1"},{"$id":"3","neighbors":[{"$ref":"2"},{"$id":"4","neighbors":[{"$ref":"3"},{"$ref":"1"}],"val":4}],"val":3}],"val":2},{"$ref":"4"}],"val":1}

Explanation:
Node 1's value is 1, and it has two neighbors: Node 2 and 4.
Node 2's value is 2, and it has two neighbors: Node 1 and 3.
Node 3's value is 3, and it has two neighbors: Node 2 and 4.
Node 4's value is 4, and it has two neighbors: Node 1 and 3.

Note:
  1. The number of nodes will be between 1 and 100.
  2. The undirected graph is a simple graph, which means no repeated edges and no self-loops in the graph.
  3. Since the graph is undirected, if node p has node q as neighbor, then node q must have node p as neighbor too.
  4. You must return the copy of the given node as a reference to the cloned graph.
Problem link

 

Video Tutorial

You can find the detailed video tutorial here

Thought Process

It's a basic graph problem that can be solved in 3 different ways: BFS using queue, DFS using recursion, DFS using stack(very similar to BFT using queue). The trick is using a map to keep a one-to-one mapping between the old nodes and the copied nodes, meanwhile, we can also use the map to avoid a cycle when performing BFS or DFS.

Solutions

BFS using queue

Time Complexity: O(N)
Space Complexity: O(N) the extra queue and map


DFS using recursion

Time Complexity: O(N)
Space Complexity: O(N) the extra map


DFS using stack

Time Complexity: O(N)
Space Complexity: O(N) the extra queue and stack

The main test method

References

Saturday, November 2, 2019

Leetcode solution 135: Candy

Problem Statement 

There are N children standing in a line. Each child is assigned a rating value.
You are giving candies to these children subjected to the following requirements:
  • Each child must have at least one candy.
  • Children with a higher rating get more candies than their neighbors.
What is the minimum candies you must give?
Example 1:
Input: [1,0,2]
Output: 5
Explanation: You can allocate to the first, second and third child with 2, 1, 2 candies respectively.
Example 2:
Input: [1,2,2]
Output: 4
Explanation: You can allocate to the first, second and third child with 1, 2, 1 candies respectively.
             The third child gets 1 candy because it satisfies the above two conditions.
Problem link

 

Video Tutorial

You can find the detailed video tutorial here

Thought Process

The straight forward idea is go through the ranking array, for each ranking, we try to make sure it has more points than its left and right neighbor. Note that it’s not going to work if you just do that because while you are increasing the values, it’s a chain effect that will affect the previous values as well, therefore we have to go all the way to the beginning.

Use below ranking as an example, assume everyone has 1 point already
Ranking: [5, 3, 1]

Points [1, 0, 0]
When at 5, [1, 0, 0]
When at 3, [2, 1, 0]
When at 1, [2, 2, 1]
you might think 5 points is enough, but it’s wrong. The culprit is when you at 1, and increase child at 3, you have the potential to break the assumption that child with 3 ranking always has less than its left neighbor if that neighbor has more points. Therefore, we have to go all the way to the beginning and comparison, thus resulting O(N^2) time complexity

That said, we have to go all the way to the beginning when we increase any value, so in the end it looks like [3, 2, 1]

Hint: if you cannot solve it in one pass, can we solve it in two passes?

A more clever thought is trying to distribute the candies in two passes. Previously we are trying to ensure at certain point both the left and right constrains should met. With two passes, namely first pass (from left to right) ensure all the left neighbors constrains are satisfied, and second pass ensures all the right neighbors contains are satisfied.
  • Left to right pass: Ensure I have more points than my left neighbor if my ranking is higher. Else I just have 1 candy (this is minimum)
  • Right to left pass: Ensure I have more points than my right neighbor if my ranking is higher.

Solutions

Naive solution 

Time Complexity: O(N^2)
Space Complexity: O(N) the extra candies array

Two pass linear solution

Time Complexity: O(N)
Space Complexity: O(N) the extra candies array

References

Saturday, October 26, 2019

Leetcode solution 201: Bitwise AND of Numbers Range

Problem Statement 

Given a range [m, n] where 0 <= m <= n <= 2147483647, return the bitwise AND of all numbers in this range, inclusive.
Example 1:
Input: [5,7]
Output: 4
Example 2:
Input: [0,1]
Output: 0

Problem link

 

Video Tutorial

You can find the detailed video tutorial here

Thought Process

Normally we just implement as we do at work, which is keep increasing the number and do an AND. It will still pass the OJ, just need to pay attention to overflow situation (using a long would solve the problem)

Now we are pushing ourselves, can we solve it more than linear. Only log/binary is faster than linear. The idea is to find the common left bits of m and n, and later shift n (total number of digits - common length digits) because the right part would end up to be 0.

For example, from 4 to 7, thte common left part is 1, the range and value would be 100 (which is n left shift twice)
  • 1 00
  • 1 01
  • 1 10
  • 1 11

Solutions

Linear solution


Time Complexity: O(N) essentially n - m
Space Complexity: O(1) no extra space is needed

Logarithmic solution


Time Complexity: O(lgN) because we keep dividing 2 (left shift) of n
Space Complexity: O(1) no extra space is needed

References

Saturday, October 12, 2019

Leetcode solution 103: Binary Tree Zigzag Level Order Traversal

Problem Statement 

Given a binary tree, return the zigzag level order traversal of its nodes' values. (ie, from left to right, then right to left for the next level and alternate between).
For example:
Given binary tree [3,9,20,null,null,15,7],
    3
   / \
  9  20
    /  \
   15   7
return its zigzag level order traversal as:
[
  [3],
  [20,9],
  [15,7]
]

Problem link

 

Video Tutorial

You can find the detailed video tutorial here

Thought Process

It’s purely an implementation problem to be honest. Very similar to to binary tree level order traversal, we just need to use some data structure to hold the values while we control the traversal order. I choose to use a deque (i.e., doubled linked list) to keep the order. You can also use a queue like many other online solutions but note when calling add(0, value) to an array list is not very efficient since to have array copy every time.

Solutions


Time Complexity: O(N) since we visit each node once
Space Complexity: O(N) since used a deque

References

Saturday, October 5, 2019

Leetcode solution 199: Binary Tree Right Side View

Problem Statement 


Given a binary tree, imagine yourself standing on the right side of it, return the values of the nodes you can see ordered from top to bottom.
Example:
Input: [1,2,3,null,5,null,4]
Output: [1, 3, 4]
Explanation:

   1            <---
 /   \
2     3         <---
 \     \
  5     4       <---

Problem link

 

Video Tutorial

You can find the detailed video tutorial here

Thought Process

This is a great problem and I highly recommend we solve it in both DFS and BFS for level order traversal because it  covers everything we need to know about trees in interviews.

This would be a better test case because right subtree would not be deep enough to cover the left subtree
   1            <---
 /   \
2     3         <---
 \     
  5             <---
[1, 3, 5] 

Level order traversal using BFS(queue)

It should be relatively natural for us think about using level order traversal and we can simply use a queue and always find the right most element when we pop out element from each level.

Level order traversal using DFS(map)

A pre-order traversal way, just like order level traversal using DFS with a map(key is depth, value is node). This method utilizes a depth to value map to record the right most value on each order while performing a pre-order traversal, namely record the node value, go left then go right. This way, right value can always overwrite the left value thus keeping a “right side view”



I personally don't like the leetcode official solution, where in DFS you don’t need two stacks and in
BFS you don’t need the maps.

Follow up, how about implement a left side view? Trivial right?

Solutions

Level order traversal using BFS(queue)


Time Complexity: O(N) since we visit each node once
Space Complexity: O(N), more precisely the number of element on the last level, aka queue size when it’s a complete tree


Level order traversal using DFS(map)

Time Complexity: O(N) since we visit each node once

Space Complexity: O(lgN) because we are using a map and map size should be tree height, which worst case could be O(N)

References

Saturday, September 28, 2019

Leetcode solution 1176: Diet Plan Performance

Problem Statement 

A dieter consumes calories[i] calories on the i-th day. 
Given an integer k, for every consecutive sequence of k days (calories[i], calories[i+1], ..., calories[i+k-1] for all 0 <= i <= n-k), they look at T, the total calories consumed during that sequence of k days (calories[i] + calories[i+1] + ... + calories[i+k-1]):
  • If T < lower, they performed poorly on their diet and lose 1 point; 
  • If T > upper, they performed well on their diet and gain 1 point;
  • Otherwise, they performed normally and there is no change in points.
Initially, the dieter has zero points. Return the total number of points the dieter has after dieting for calories.length days.
Note that the total points can be negative.

Example 1:
Input: calories = [1,2,3,4,5], k = 1, lower = 3, upper = 3
Output: 0
Explanation: Since k = 1, we consider each element of the array separately and compare it to lower and upper.
calories[0] and calories[1] are less than lower so 2 points are lost.
calories[3] and calories[4] are greater than upper so 2 points are gained.
Example 2:
Input: calories = [3,2], k = 2, lower = 0, upper = 1
Output: 1
Explanation: Since k = 2, we consider subarrays of length 2.
calories[0] + calories[1] > upper so 1 point is gained.
Example 3:
Input: calories = [6,5,0,0], k = 2, lower = 1, upper = 5
Output: 0
Explanation:
calories[0] + calories[1] > upper so 1 point is gained.
lower <= calories[1] + calories[2] <= upper so no change in points.
calories[2] + calories[3] < lower so 1 point is lost.

Constraints:
  • 1 <= k <= calories.length <= 10^5
  • 0 <= calories[i] <= 20000
  • 0 <= lower <= upper
Problem link

 

Video Tutorial

You can find the detailed video tutorial here

Thought Process

This is an easy problem but might be a bit hard to code it up correctly in one pass. We can use a sliding window to keep a rolling sum and compare it with upper and lower bound.

A few caveats in implementation:
  • We can simplify using two pointers start and end by keeping an index i (start) and i - k (end)
  • We can only have one loop instead of having two separate loops
sliding window

Solutions

Sliding window


Time Complexity: O(N) visited the array once
Space Complexity: O(1) No extra space is needed

References

  • None

Saturday, September 21, 2019

Leetcode solution 1169: Invalid Transactions

Problem Statement 

A transaction is possibly invalid if:
  • the amount exceeds $1000, or;
  • if it occurs within (and including) 60 minutes of another transaction with the same name in a different city.
Each transaction string transactions[i] consists of comma separated values representing the name, time (in minutes), amount, and city of the transaction.
Given a list of transactions, return a list of transactions that are possibly invalid.  You may return the answer in any order.

Example 1:
Input: transactions = ["alice,20,800,mtv","alice,50,100,beijing"]
Output: ["alice,20,800,mtv","alice,50,100,beijing"]
Explanation: The first transaction is invalid because the second transaction occurs within a difference of 60 minutes, have the same name and is in a different city. Similarly the second one is invalid too.
Example 2:
Input: transactions = ["alice,20,800,mtv","alice,50,1200,mtv"]
Output: ["alice,50,1200,mtv"]
Example 3:
Input: transactions = ["alice,20,800,mtv","bob,50,1200,mtv"]
Output: ["bob,50,1200,mtv"]

Constraints:
  • transactions.length <= 1000
  • Each transactions[i] takes the form "{name},{time},{amount},{city}"
  • Each {name} and {city} consist of lowercase English letters, and have lengths between 1 and 10.
  • Each {time} consist of digits, and represent an integer between 0 and 1000.
  • Each {amount} consist of digits, and represent an integer between 0 and 2000.

Problem link

 

Video Tutorial

You can find the detailed video tutorial here

Thought Process

It's an implementation problem and it's up to you how fast you can type to determine whether you want to flex your muscles on object oriented design. It's actually quite a small moev, whether you want to convert the comma separated line into a Transaction object or not.

To meet the two conditions, we can simply loop over all the transactions while at the same time, keep a map to store all the transactions under the same name. Then the problem is compare the city for all the transactions under the same name that time is less or equal than 60 mins. At first I was trying to sort the collection based on the time in ascending order. However, that does not really help reduce the O(N^2) complexity since worst case is still all transactions are within the 60 mins bound, and one "bad" transaction can essentially cause all the previous "good" transactions to become bad, so we have to go back and revisit the previous ones. (as shown in below graph)

Example on one transaction make all previous ones invalid


Caveats
  • It's better to use a Set for de-duplication than a list since we might add duplicate transactions to the collection (e.g., a transaction > 1000 amount could also be invalid due to same name with different city under time < 60min)


Solutions


Time Complexity: O(N^2),we have to loop through the transactions in a nested way for worst case all transactions are under the same name and all within 60min.
Space Complexity: O(N),the result list and set and the extra hashmap we used

References

  • None