Sunday, April 21, 2019

Leetcode solution 10: Regular Expression Matching

Problem Statement 

Given an input string (s) and a pattern (p), implement regular expression matching with support for '.' and '*'.
'.' Matches any single character.
'*' Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
Note:
  • s could be empty and contains only lowercase letters a-z.
  • p could be empty and contains only lowercase letters a-z, and characters like . or *.
Example 1:
Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".
Example 2:
Input:
s = "aa"
p = "a*"
Output: true
Explanation: '*' means zero or more of the precedeng element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".
Example 3:
Input:
s = "ab"
p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".
Example 4:
Input:
s = "aab"
p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore it matches "aab".
Example 5:
Input:
s = "mississippi"
p = "mis*is*p*."
Output: false
Problem link

Video Tutorial

You can find the detailed video tutorial here


Thought Process

The thought process is very similar to Leetcode Solution 44: Wildcard Matching, you can find the blog here and the video tutorial here.

Step 0 really is to understand what the "*" really means.
'*' Matches zero or more of the preceding element.
A few examples,
  • "" and "a*": True, since * could match zero occurrence of the preceding char 'a', not here zero means deleting the original 'a'
  •  "a" and "a*": True, since * could match "one" occurrence in total
  • "aa" and "a*": True since * could match an extra one occurrence of the preceding char 'a', you can also think of it as "two" occurrences of 'a' in total
  • "" and "*": A very tricky case, my code below returns True but leetcode expects False. My code still passes OJ so this is really not a valid case. I argue this should be True since it is one occurrence of empty "" in total

The first attempt (if you haven't seen this question before) is trying to solve it using recursion. The idea is simple. While comparing the s and p char by char, as soon as you see a "*", you will start recurse
  • remove the preceding char and the "*", recurse and compare 
  • for every char equals to the preceding char in the pattern, as long as they are equal to the current char in s, recurse and compare
The time complexity in a ballpark looks like exponential to the P length, but a more detailed analysis from leetcode's original solution can be found here.

If you recall what problems can be solved using DP, DP (Dynamic Programming) is normally a great way when trying to answer below two types of questions, and string related questions are a majority use case of DP
  • Only Yes/No, True/False question without returning detailed step by step results 
  • Extreme values, e.g., Max, Min, Best optimal value
 I have posted several videos on how to solve DP problems using a template, essentially
  1. Initialize your lookup array, could be 2 dimensional or 1 dimensional using rolling arrays
  2. Initialize the init values in your array
  3. Get the math formula for a[i] = a[i-1] * m - n or whatever condition you need
  4. Code it up
This problem can be solved using DP by having formulas like below when a "*" is seen


Once you have the formula, coding up DP problem should be a piece of cake. You can find the video tutorial on this problem here

Solutions

DP solution

Time Complexity: assuming S length is M, P length is N, this is O(M*N) because all we need to do is build up that lookup matrix
Space Complexity: assuming S length is M, P length is N, this is O(M*N), again, that lookup matrix

References

No comments:

Post a Comment