Friday, August 30, 2019

Leetcode solution 257: Binary Tree Paths

Problem Statement 

Given a binary tree, return all root-to-leaf paths.
Note: A leaf is a node with no children.
Example:
Input:

   1
 /   \
2     3
 \
  5

Output: ["1->2->5", "1->3"]

Explanation: All root-to-leaf paths are: 1->2->5, 1->3
Problem link

 

Video Tutorial

You can find the detailed video tutorial here

Thought Process

Easy problem. Use pre-order traversal from root to leaf and keep the recursion path. The recursion returning condition would be when a node doesn’t have left and right children. Note use a string to keep appending is easier than using a string builder, because we need to pop out and reset the string builder.

Caveat
  • Handle the single node situation when no "->". E.g.,  A vs A->B
There is also an iterative implementation of this using one stack, similar to BST iterator using one stack problem. 

Solutions

Pre-order traversal


Time Complexity: O(N), each node is visited once
Space Complexity: O(N), N is the total nodes since that's the string length you need

References

Friday, August 23, 2019

Leetcode solution 124: Binary Tree Maximum Path Sum

Problem Statement 


Given a non-empty binary tree, find the maximum path sum.
For this problem, a path is defined as any sequence of nodes from some starting node to any node in the tree along the parent-child connections. The path must contain at least one node and does not need to go through the root.
Example 1:
Input: [1,2,3]

       1
      / \
     2   3

Output: 6
Example 2:
Input: [-10,9,20,null,null,15,7]

   -10
   / \
  9  20
    /  \
   15   7

Output: 42
Problem link

 

Video Tutorial

You can find the detailed video tutorial here

Thought Process

When dealing with binary tree related problem, traversals using recursion is our friend. It seems we can perform a post-order traversal, and keep track of the maximum sums.

If the path cannot go across root, then in each post-order step, we will have the max_sum_of_the_left_path, max_sum_of_the_right_path, the current_node_value, we simply return and record

single_path_max = max(the current_node_value, max(max_sum_of_the_left_path, max_sum_of_the_right_path) + current_node_value)

However, the problem allows a path that goes through the root, therefore, we need to also record a max between left + current node value + right, i.e.,

global_max = max(single_path_max, max_sum_of_the_left_path + current_node_value + max_sum_of_the_right_path)

One caveat is in your recursion, we should still return the single_path_max. The reason we should not return the global_max is in that case, it will not be a single node to single node path.

Solutions

Post-order recursion


Time Complexity: O(N), each node is visited once
Space Complexity:No extra space is needed other than the recursion function stack

References

Leetcode solution 173: Binary Search Tree Iterator

Problem Statement 

Implement an iterator over a binary search tree (BST). Your iterator will be initialized with the root node of a BST.
Calling next() will return the next smallest number in the BST.

Example:
BSTIterator iterator = new BSTIterator(root);
iterator.next();    // return 3
iterator.next();    // return 7
iterator.hasNext(); // return true
iterator.next();    // return 9
iterator.hasNext(); // return true
iterator.next();    // return 15
iterator.hasNext(); // return true
iterator.next();    // return 20
iterator.hasNext(); // return false

Note:
  • next() and hasNext() should run in average O(1) time and uses O(h) memory, where h is the height of the tree.
  • You may assume that next() call will always be valid, that is, there will be at least a next smallest number in the BST when next() is called.
Problem link

 

Video Tutorial

You can find the detailed video tutorial here

Thought Process

 

Brute force dump tree into array method

It’s very similar to BST in order traversal. From that train of thought , we can easily have a brute force solution: perform an in-order traversal, dump that order into an array. Then we can have an pointer point to the array and move the pointer forward every time we call next() method. The time complexity of this brute force way is O(N) in the constructor phase (N is the total number of nodes in the tree). Even though we iterate through the tree 2 times, it’s still O(N), hasNext() and Next() would all be O(1). Space complexity is O(N) as well since we need an extra array to store all the elements in the tree.
Brute force way to dump the tree into an array then iterate over the array
Image credit: https://leetcode.com/articles/binary-search-tree-iterator/



In order traversal using stack  

 

In order to reduce memory from O(N) to O(lgN), aka O(h) where h is the height of the tree, an alternative is return the next() inline while we are performing the recursion. The tricky part is if we still perform the in-order traversal using recursion, there is no easier way to plug in the next() method into the recursion call.

We know we can always use stack to simulate recursion (the push function stack part), and by having a tree node in the stack, it’s also easier to track the predecessor  and successor. The starting position should point to the smallest node in the tree, which is the left most node, we keep going left and pushing the nodes we visited to the stack. Whenever next() is called, we return the top of the stack, and we check if there is any right child of the node, if so, continue push the nodes into the stack until we hit the left most one. hasNext() is simply checking if there is still nodes in the stack. The time complexity of next() would be average O(1) even we have a while loop to keep finding the left most child since on high level, each node is still visited only once. Space complexity is O(lgN) since we at most would put the height of tree worth of elements into the stack.

Initial stage
Image credit: https://leetcode.com/articles/binary-search-tree-iterator/
Find the left most child
Image credit: https://leetcode.com/articles/binary-search-tree-iterator/


Solutions

In order traversal using stack

Note in next() method, we don't need to check if there is still element left in the stack. It is the caller's responsibility to check so. It is described in the Java's Iterator interface NoSuchElementException should be thrown.



Time Complexity: next(), hasNext() O(1) on average even with the while loop, think each node is visited essentially only once
Space Complexity: O(lgN), extra stack is needed capped at tree height

References

Sunday, August 18, 2019

Leetcode solution 772: Basic Calculator III

Problem Statement 


Implement a basic calculator to evaluate a simple expression string.
The expression string may contain open ( and closing parentheses ), the plus + or minus sign -non-negative integers and empty spaces .
The expression string contains only non-negative integers, +-*/ operators , open ( and closing parentheses ) and empty spaces . The integer division should truncate toward zero.
You may assume that the given expression is always valid. All intermediate results will be in the range of [-2147483648, 2147483647].
Some examples:
"1 + 1" = 2
" 6-4 / 2 " = 4
"2*(5+5*2)/3+(6/2+8)" = 21
"(2+6* 3+5- (3*14/7+2)*5)+3"=-12

Note: Do not use the eval built-in library function.
Problem link

 

Video Tutorial

You can find the detailed video tutorial here

Thought Process

This is an extension problem to Basic Calculator and Basic Calculator II. The differences are this time it really looks like a real calculator where it can do "+-*/" and with brackets.

It's still the same thought process with exactly the same idea before: having two stacks,  one stack for the operator and the other for the operands, and with exactly the same calculating preferences before: calculate * and / before + and -. When seeing right bracket, continue popping till you see the left bracket.

This works fine assuming all the integers are non-negative, which they are supposed to be based on the problem description, and that's what most of the existing online leetcode solutions did as well.

As of 08/18/2019, this doesn't seem to be the case because leetcode decides to include "non-negative" integer test cases. See below case "-1+4*3/3/3"

An "non-negative" integer case


Since we are practicing interviews, so let's also address the non-negative integer case as well

A few caveats
  • (New) Handle negative integer starting case:
    "-1 + 2" or "-(2+3)*4"
  • (New) Handle negative integer in between expression case:
    "1 - (-7)"
  • (New) Calculate whenever you can calculate since division order matters
    15 / 3 / 5 should be 1, but it won't work 3 / 5 = 0,  then 15/0 invalid
  • Pay attention to the order when popping out operands and calculate, the order matters. 
  • The number might not be just single digit, so need to read the char and convert the numbers

Solutions

Two stacks for non-negative integer solution

Keep two stacks, operator and operands as explained in the above "Thought Process"



Time Complexity: O(N), N is the length of the string
Space Complexity: O(N), extra stack is needed

Two stacks for negative integer solution

Essentially how do address the below two situations given the current code structure
  • (New) Handle negative integer starting case. I simply just check if the first char in trimmed string is "-", and push a -1 into operands and "*" into operator. (e.g.,  -(a+b) = -1 * (a+b) and -a = -1 * a)
    "-1 + 2" or "-(2+3)*4"
  • (New) Handle negative integer in between expression case. This is a bit hacky because a  "-" in the middle of the expression could mean two things: a normal minus sign or a negative sigh denoting negative integer. Luckily "1 - - 2" would not be a valid case which means there should always be a left bracket before the negative sign for negative integer. What I did was using isNegativeNumberFollowingLeftBracket to find those cases and put a -1 and * into the operator and operands respectively
    "1 - (-7)"


Time Complexity: O(N), N is the length of the string
Space Complexity: O(N), extra stack is needed

References

Friday, August 9, 2019

Leetcode solution 227: Basic Calculator II

Problem Statement 

Implement a basic calculator to evaluate a simple expression string.
The expression string contains only non-negative integers, +-*/ operators and empty spaces . The integer division should truncate toward zero.
Example 1:
Input: "3+2*2"
Output: 7
Example 2:
Input: " 3/2 "
Output: 1
Example 3:
Input: " 3+5 / 2 "
Output: 5
Note:
  • You may assume that the given expression is always valid.
  • Do not use the eval built-in library function.

Problem link

 

Video Tutorial

You can find the detailed video tutorial here

Thought Process

This problem is very similar to Basic Calculator. The difference is there is no parenthesis in this one, but there is * and / . We can use similar thought process by having two stacks,  one stack for the operator and the other for the operands. We just need to pay attention the differentiate the operator's priority. For example, when you currently see a "+" or "-"  and previous operator is "*" or "/", you need to pop up the operator stack and calculate. When you currently see a "*" or "/" and previous operator is "+" or "-", you should just keep pushing operator to stack. Else, they are at the same priority, we just do the normal calculation.


A few caveats
  • Notice number overflow. "0- 2147483648". I don't think leetcode has this test case but it is a valid one. We should use Long.
  • Pay attention to the order when popping out operands and calculate, the order matters. 
  • The number might not be just single digit, so need to read the char and convert the numbers

Solutions

Standard generic way

Keep two stacks, operator and operands as explained in the above "Thought Process"


Time Complexity: O(N), N is the length of the string
Space Complexity: O(N), extra stack is needed

Use one stack with two passes

Another neat and clean way to solve this problem is also similar to the "Use the sign method with one stack" in Basic Calculator. The idea is in the first pass, only calculate "*" and "/" and push the values into the stack, then have a 2nd pass to do the "+" and "-" calculations.

Time Complexity: O(N), N is the length of the string
Space Complexity: O(N), extra stack is needed

References

Thursday, August 8, 2019

Leetcode solution 224: Basic Calculator

Problem Statement 

Implement a basic calculator to evaluate a simple expression string.
The expression string may contain open ( and closing parentheses ), the plus + or minus sign -non-negative integers and empty spaces .
Example 1:
Input: "1 + 1"
Output: 2
Example 2:
Input: " 2-1 + 2 "
Output: 3
Example 3:
Input: "(1+(4+5+2)-3)+(6+8)"
Output: 23
Note:
  • You may assume that the given expression is always valid.
  • Do not use the eval built-in library function.
Problem link

 

Video Tutorial

You can find the detailed video tutorial here

Thought Process

A generic way for solving those calculating numbers problems is to use stack. More specifically, use two stacks: one stack for the operator and the other for the operands.

A few caveats
  • Pay attention to the order when popping out operands and calculate, the order matters. 
  • The parenthesis matters, 2 - 1 + 2 = 3, while 2 - (1+2) = -1 = 2 - 1 - 2 if you want to remove the bracket you need to change the sign 
  • "1000" is a valid expression without any operators 
  • The number might not be just single digit, so need to read the char and convert the numbers

Solutions

Standard generic way

Keep two stacks, operator and operands. When we see left bracket, keep pushing to the stack. We calculate the values as normal within the inner most bracket. When we see right bracket, calculate and pop out the left bracket

Time Complexity: O(N), N is the length of the string
Space Complexity: O(N), extra stack is needed

Use the sign method with one stack

The thought process is very similar to use the stacks, in this method, the clever part is it uses only one stack and also pushed a sign. +1 for "+" and -1 for "-1". Whenever there is a left bracket, you push the existing result and the sign into the stack. Whenever there is a right bracket, you pop up the sign and the value. Pretty neat!

Time Complexity: O(N), N is the length of the string
Space Complexity: O(N), extra stack is needed

References

Monday, August 5, 2019

Leetcode solution 110: Balanced Binary Tree

Problem Statement 

Given a binary tree, determine if it is height-balanced.
For this problem, a height-balanced binary tree is defined as:
a binary tree in which the depth of the two subtrees of every node never differ by more than 1.
Example 1:
Given the following tree [3,9,20,null,null,15,7]:
    3
   / \
  9  20
    /  \
   15   7
Return true.

Example 2:
Given the following tree [1,2,2,3,3,null,null,4,4]:
       1
      / \
     2   2
    / \
   3   3
  / \
 4   4
Return false.
Problem link

 

Video Tutorial

You can find the detailed video tutorial here

Thought Process

As described in the problem, it is intuitive to solve this problem recursively, especially given this is a tree related problem. What we can do is get the height of the left sub tree, compared with the right sub tree, then do the logics to see if it’s balanced or not. If at certain level either the left or right sub tree is not balanced, then entire Tree is not balanced. Classic usage for post order traversal.

Solutions

Post order traversal using recursion

Calculate the height of the left subtree, calculate the height of the right subtree, then compare. If it's already not balanced, return -1 and directly return

Time Complexity: O(N), N is the total tree nodes since it is visited once and only once
Space Complexity: O(1), No extra space is needed

References

Wednesday, July 24, 2019

Leetcode solution 258: Add Digits

Problem Statement 

Given a non-negative integer num, repeatedly add all its digits until the result has only one digit.
Example:
Input: 38
Output: 2 
Explanation: The process is like: 3 + 8 = 11, 1 + 1 = 2. 
             Since 2 has only one digit, return it.
Follow up:
Could you do it without any loop/recursion in O(1) runtime?
Problem link

 

Video Tutorial

You can find the detailed video tutorial here

Thought Process

Just do a simple brute force simulation. Unless you know what a digital root is, beforehand solving it in O(1) cannot collect any useful signals from the candidate.

                                   
The triple bar equal is quite interesting. It means congruence. E.g., if you have two exactly same figures but one is rotated or clock points to 13 and 1, they are considered congruence (just like 13 and 1 mod by 12 are equal). In our example,  it's about the mod result, e.g.,

Side note:
I am sharing with everyone on this problem just to express how useless this problem is... Honestly if someone ask you to solve this problem in O(1) time in a real interview, rethink joining that company. If you are an interviewer, please do yourself a favor not asking your candidate to solve it in O(1). It's an okay question to ask as a warm up just to calm your candidate down for the brute force solution. 

Solutions

Brute force

Keep summing the each digit until the final number is < 10

Time Complexity: O(N), N is the length of the digit
Space Complexity: O(1), No extra space is needed

Use the digital root formula

                                              return (num - 1) % 9 + 1
Time Complexity: O(1)
Space Complexity: O(1)

References

Sunday, July 21, 2019

分布式Streaming Data Processing - Samza

作者:包子黑旋风老师

现在的主流的互联网应用越来越依赖streaming data来提供用户一些interesting statistics insights。以linkedin为例,最近90天有多少人看过你的linkedin profile。看过你profile的人都是什么job title,他们都在那些公司工作。如下图,你应该如何实现这个功能呢?

相信大家都听说过page view event,就是用户每次打开网站上的某个页面发出来的tracking event,各个大公司一般用这些event来做一些统计分析,business analysis。大家一般会利用一些吞吐量大的分布式消息系统来存储这些event,例如kafka。这是因为对于一些popular的网站,每天可能会有上亿或者10亿的page view event。我们可以利用对这个event的处理来实现我们之前提到的功能。


通常有两种方法可以实现以上的功能,一个是通过hadoop map reduce job,或者更抽象的hive pig query来实现这样的统计功能。但是这个方法有一个明显的劣势,就是处理速度慢,很难做到事实更新。对于我们以上的功能要求或许这个方法没有任何问题,因为我们只关注过去90天的统计信息而且不要求显示当天信息。但是今天我们要探讨另一个实现方法,利用多streaming data processing做到实时统计更新。其实有好多功能是需要事实更新的,例如search index update,twitter或者facebook一些hot topic/trent的发现。
  1. Stream Data Repartition

我们可以通过对streaming data的repartition来实现同一个用户的page view events都聚集到了同一个机器上去处理,这样我们可以做到每个用户的统计数据都是准确的。这个功能基本所有主流的streaming data处理框架都支持,例如,kafka + samza,aws kinesis,storm。

  1. Streaming Data Join

我们可以看到我们需要根据viewer的职位名称或者公司名称来做统计,但是我们的page view event只有viewer的id,没有职位或者公司这些信息,那我们改怎么实现呢?

一个非常简单的思路就是让我们的streaming processor去call profile的api来拿到职位或者公司名称的信息。这样子做有几个非常明显的劣势。1. 如果streaming processor停止工作半个小时或者更长时间,在重启streaming processor的时候由于积累了大量的未处理的events,streaming processor会flood我们之前说过的profile api。2. Streaming processor每次通过network来call另外一个api会增加额外的latency。3. 很难做到online和offline的isolation,因为这个统计功能还是属于offline或者nearline data processing,我们不希望因为这个功能影响了用户查询或者修改profile信息。比如第一个case发生的时候。

另一个思路就是可以加cache,来cache profile的查询request。但是这样子也有一个劣势,如果TTL设的很大,很难做到cache的数据是事实更新的,如果TTL设的特别短,cahe又基本不起什么作用,而且增加额外的network cost。

这里我们介绍一个samza引进的一个新功能,stream joining。我们可以join page view event和profile edit event,然后解决以上两个方案的劣势。我们的stream processor需要同时听两种events(PageViewEvent and ProfileEditEvent),然后对这两种event进行同样的partition both by viewer id,对于profile edit events,我们可以在stream processing机器上建立一个小的数据库来存储profile的实时数据,这样子我们可以对viewer进行快速查询来enrish page view event with viewer job title和company information。然后我们再将enriched的page view event重新partition by user id。然后进行统计。这样子我们就做的了profile数据的isolation,也解决了network call的latentcy cost。