Regular Expression Matching - Dynamic Programming Top-Down Memoization - Leetcode 10

Regular Expression Matching - Dynamic Programming Top-Down Memoization - Leetcode 10

Regular Expression Matching

In this video, the presenter explains how to implement regular expression matching in Python. The video covers the special characters used in regular expressions, including the period and star, and how they affect string matching.

Introduction to Regular Expression Matching

  • The goal is to implement regular expression matching in Python.
  • Special characters include the period and star.
  • If there are no special characters, it's just string matching.
  • Examples of string matching with and without special characters.

Understanding Period and Star Characters

  • The period matches any character.
  • The star means that any character before it can be repeated zero or infinite times.
  • Examples of using a period and star for pattern matching.

Brute Force Approach to Regular Expression Matching

  • The complexity of regular expression matching comes from the star character.
  • A decision tree can be used to brute force pattern matching with a star character.
  • Examples of using a decision tree for pattern matching with a star character.

Conclusion

  • Regular expression matching can be complex due to special characters like the period and star. However, by understanding these characters and using a brute force approach when necessary, we can successfully implement regular expression matching in Python.

Matching Regular Expressions with Dynamic Programming

In this section, the speaker explains how to match regular expressions using dynamic programming. They demonstrate how to use a decision tree to find all possible matches and explain how caching can be used to improve efficiency.

Using a Decision Tree for Matching Regular Expressions

  • The speaker demonstrates how to use a decision tree to match regular expressions.
  • Each time a star is encountered in the pattern, there are two possible decisions: repeat the previous character or don't repeat it.
  • The decision tree can become very large and inefficient, especially when there are many stars in the pattern.

Improving Efficiency with Dynamic Programming

  • The speaker explains that this is a dynamic programming problem and that caching can be used to improve efficiency.
  • With caching, the time complexity can be reduced from exponential to n squared or n times m where n is the length of the input string and m is the length of the pattern.
  • There are two ways to implement dynamic programming: top-down memoization with a cache or bottom-up full dynamic programming solution.

Example of Matching Regular Expressions

  • The speaker provides an example of matching regular expressions using dynamic programming.
  • They use two variables i and j to keep track of where they are in each string.
  • When encountering a star in the pattern, there are again two possible decisions: repeat the previous character or don't repeat it.
  • By making these decisions and shifting i and j accordingly, they are able to find all possible matches between the input string and pattern.

Updating Pointers When Using a Star

In this section, the speaker discusses how to update pointers when using a star in regular expressions.

Shifting i and j

  • If we use the star in our regular expression and match a character, we can shift i by 1.
  • We can leave j where it is since we used the star once.
  • When using the star again, we can add 1 to i and leave j as it is.

Choosing to Repeat or Not Repeat

  • When there's still a star after j, we have two choices: repeat or not repeat.
  • If we repeat, we get double "a," if not, then "a" followed by "b."
  • The new character matches what was expected so we can shift i by one and leave j as it is.

Continuing with No Star

  • With no star after j, there's only one choice left.
  • We check if the characters at i and j match. If they do, we shift both pointers by one.
  • If they don't match, then we shift j by 2 since there's no option to use the previous character again.

Matching Perfectly

In this section, the speaker discusses how to know when two strings match perfectly.

Out of Bounds Check

  • When both i and j are out of bounds (i >= length of s and j >= length of p), then they match perfectly.
  • If only j is out of bounds, then they do not match since there's still some string in the input that hasn't been matched.

i Out of Bounds

  • If only i is out of bounds, then it depends on the pattern.
  • If there's a star after the last character in p, then they match perfectly.
  • Otherwise, if there's no star after the last character in p, then they do not match.

[t=0:18:58s] Top-Down Memoization Dynamic Programming Solution

In this section, the speaker explains how to solve a problem using top-down memoization dynamic programming solution. They explain that they will not be using the bottom-up approach as it is more confusing.

Recursive Function

  • The speaker starts by explaining that they will use a nested function depth-first search inside of their "is match" function.
  • They call this backtracking or dfs and pass in i and j which tell us what position we're at in the input string s for i and j is going to tell us where we're at in the input string p.
  • The base case is when i is out of bounds and j is out of bounds, meaning we have found our solution, so we can return true.
  • If i is not out of bounds but j is out of bounds, then there's some string in s that we haven't matched, so we have to return false.

Checking for Matches

  • We want to check if there's a match between the first character of each string.
  • We also check if p[j] has a period in that position meaning the period matches any character.
  • If these conditions are satisfied, then it means the first character of each string matches exactly.
  • We check if p[j+1] matches with "*" because it has higher precedence than other characters.
  • If p[j+1] exists and matches "*", then we have two choices:
  • Don't use "*": leave i as it is and add 2 to j
  • Use "*": increment i by one and leave j where it is
  • If either choice evaluates to true, then return true
  • If neither choice evaluates to true, then return false
  • If there's no "*" character present, then just check if both characters match.
  • If they match, increment i and j by one and return the result.
  • If they don't match, then return false.

Brute Force Solution

  • The speaker explains that this is the brute force solution.
  • They show how inefficient it is by calling their dfs function starting at zero in both strings.

Adding Memoization to Improve Efficiency

In this section, the speaker discusses how to improve the efficiency of a function by adding memoization. The speaker creates a cache using a hash map and checks if the pair i j has already been added to the cache before executing the function. If it has, then it returns the value from the cache instead of repeating work.

Creating a Cache

  • Create a cache using a hash map.
  • Check if the pair i j has already been added to our cache before executing the function.
  • Add values returned by each execution of the function to their corresponding index in the cache.

Improving Efficiency

  • By caching previous results, we can avoid repeating work and improve efficiency.
  • After adding values to the cache, return them instead of repeating work.

Results and Conclusion

In this section, the speaker discusses how much faster their code runs after implementing memoization. They also acknowledge that there are better ways to write this code but hope that viewers were able to understand what was being done.

Results

  • After implementing memoization, code runs much faster (44 milliseconds vs 81 milliseconds).
  • Dynamic programming solution is encouraged for viewers who want an even more efficient solution.

Conclusion

  • The meat of this work is done in two lines of code where we handle star cases.
  • Memoization is an easy way to improve efficiency but there are better ways to write this code.
Video description

šŸš€ https://neetcode.io/ - A better way to prepare for Coding Interviews 🐦 Twitter: https://twitter.com/neetcode1 🄷 Discord: https://discord.gg/ddjKRXPqtk 🐮 Support the channel: https://www.patreon.com/NEETcode Twitter: https://twitter.com/neetcode1 Discord: https://discord.gg/ddjKRXPqtk šŸ’” CODING SOLUTIONS: https://www.youtube.com/playlist?list=PLot-Xpze53leF0FeHz2X0aG3zd0mr1AW_ šŸ’” DYNAMIC PROGRAMMING PLAYLIST: https://www.youtube.com/watch?v=73r3KWiEvyk&list=PLot-Xpze53lcvx_tjrr_m2lgD2NsRHlNO&index=1 🌲 TREE PLAYLIST: https://www.youtube.com/watch?v=OnSn2XEQ4MY&list=PLot-Xpze53ldg4pN6PfzoJY7KsKcxF1jg&index=2 šŸ’” GRAPH PLAYLIST: https://www.youtube.com/watch?v=EgI5nU9etnU&list=PLot-Xpze53ldBT_7QA8NVot219jFNr_GI šŸ’” BACKTRACKING PLAYLIST: https://www.youtube.com/watch?v=pfiQ_PS1g8E&list=PLot-Xpze53lf5C3HSjCnyFghlW0G1HHXo šŸ’” LINKED LIST PLAYLIST: https://www.youtube.com/watch?v=G0_I-ZF0S38&list=PLot-Xpze53leU0Ec0VkBhnf4npMRFiNcB&index=2 Problem Link: https://neetcode.io/problems/regular-expression-matching Code on Github: https://github.com/neetcode-gh/leetcode/blob/main/10-Regular-Expression-Matching 0:00 - Read the problem 3:55 - Drawing Solution 19:00 - Coding Solution leetcode 10 This question was identified as a google interview question from here: https://github.com/xizhengszhang/Leetcode_company_frequency #regular #expression #python Disclosure: Some of the links above may be affiliate links, from which I may earn a small commission.