Problem Statement
Given an input string (s
) and a pattern (p
), implement regular expression matching with support for '.'
and '*'
.
'.' Matches any single character.
'*' Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
Note:
s
could be empty and contains only lowercase letters a-z
.
p
could be empty and contains only lowercase letters a-z
, and characters like .
or *
.
Example 1:
Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".
Example 2:
Input:
s = "aa"
p = "a*"
Output: true
Explanation: '*' means zero or more of the precedeng element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".
Example 3:
Input:
s = "ab"
p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".
Example 4:
Input:
s = "aab"
p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore it matches "aab".
Example 5:
Input:
s = "mississippi"
p = "mis*is*p*."
Output: false
Problem
link
Video Tutorial
You can find the detailed
video tutorial here
Thought Process
The thought process is very similar to Leetcode Solution 44: Wildcard Matching, you can find the blog
here and the video tutorial
here.
Step 0 really is to understand what the "*" really means.
'*' Matches zero or more of the preceding element.
A few examples,
- "" and "a*": True, since * could match zero occurrence of the preceding char 'a', not here zero means deleting the original 'a'
- "a" and "a*": True, since * could match "one" occurrence in total
- "aa" and "a*": True since * could match an extra one occurrence of the preceding char 'a', you can also think of it as "two" occurrences of 'a' in total
- "" and "*": A very tricky case, my code below returns True but leetcode expects False. My code still passes OJ so this is really not a valid case. I argue this should be True since it is one occurrence of empty "" in total
The first attempt (if you haven't seen this question before) is trying to solve it using recursion. The idea is simple. While comparing the s and p char by char, as soon as you see a "*", you will start recurse
- remove the preceding char and the "*", recurse and compare
- for every char equals to the preceding char in the pattern, as long as they are equal to the current char in s, recurse and compare
The time complexity in a ballpark looks like exponential to the P length, but a more detailed analysis from leetcode's original solution can be found
here.
If you recall what problems can be solved using DP, DP (Dynamic Programming) is normally a
great way when trying to answer below two types of questions, and
string related questions are a majority use case of DP
- Only Yes/No, True/False question without returning detailed step by step results
- Extreme values, e.g., Max, Min, Best optimal value
I have posted several videos on how to solve DP problems using a template, essentially
- Initialize your lookup array, could be 2 dimensional or 1 dimensional using rolling arrays
- Initialize the init values in your array
- Get the math formula for a[i] = a[i-1] * m - n or whatever condition you need
- Code it up
This problem can be solved using DP by having formulas like below when a "*" is seen
Once you have the formula, coding up DP problem should be a piece of cake. You can find the video tutorial on this problem here
Solutions
DP solution
Time Complexity: assuming S length is M, P length is N, this is O(M*N) because all we need to do is build up that lookup matrix
Space Complexity: assuming S length is M, P length is N, this is O(M*N), again, that lookup matrix
References