Regular Expression Matching
To solve this coding challenge, we need to implement a regular expression matching algorithm that supports two special characters:
matches the pattern
entirely, rather than partially. This algorithm can be efficiently solved using dynamic programming.
), where
indicates whether the substring
matches the pattern
. Here, we use bottom-up dynamic programming approach.
- ‘.’ which matches any single character.
- ‘*’ which matches zero or more of the preceding element.
s
p
Explanation
We will utilize a 2D boolean array (dp
dp[i][j]
s[0...i-1]
p[0...j-1]
-
Initialize the
array:
dp -
is
dp[0][0]because an empty string matches an empty pattern.True -
For patterns where `j` is greater than `0` and the pattern contains `
` to match the preceding character zero times, thus setting values for `dp[0][j]`.
, we need to use the property of -
Filling up the
array:
dp -
Iterate through each character in the string
and the pattern
s.p - For each character:
-
If the current characters in both string and pattern match (
or
s[i-1] == p[j-1]), setp[j-1] == '.'todp[i][j].dp[i-1][j-1] -
If the current character in the pattern is
, we check two conditions:
'*' -
If the pattern without this
and its preceding character matches (i.e.,
'*').dp[i][j-2] -
If the preceding character of
in pattern matches the current character in the string or the preceding character itself is a ‘.’ and the substring
'*'excluding this character matches the patterns[0...i-1](i.e.,p[0...j]).dp[i-1][j] - Final check:
-
Return
which shows whether the entire string matches the pattern.
dp[len(s)][len(p)]
# Initialize a 2D array dp where dp[i][j] will be True if s[0...i-1] matches p[0...j-1]
Initialize dp array of size (len(s) + 1) x (len(p) + 1) to False
# An empty string matches an empty pattern
dp[0][0] = True
# Handle patterns with '*' that can match zero elements
for j from 1 to length of p:
if p[j - 1] == '*':
dp[0][j] = dp[0][j - 2]
# Iterate over each character of s and p
for i from 1 to length of s:
for j from 1 to length of p:
# When characters match or when there is a '.'
if p[j - 1] == s[i - 1] or p[j - 1] == '.':
dp[i][j] = dp[i - 1][j - 1]
# When there is a '*' character in pattern
elif p[j - 1] == '*':
# Case when '*' matches zero preceding characters
dp[i][j] = dp[i][j - 2]
# Case when '*' matches one or more preceding characters
if p[j - 2] == s[i - 1] or p[j - 2] == '.':
dp[i][j] = dp[i][j] or dp[i - 1][j]
# Return whether the entire string matches the pattern
return dp[len(s)][len(p)]
Step-by-Step Explanation/Detailed Steps in Pseudocode
Step 1: Initialize the dp array
-
Create a 2D array
of size
dpinitialized to(len(s) + 1) x (len(p) + 1).False -
Set
to
dp[0][0]because an empty string matches an empty pattern.True
Step 2: Handle patterns with '*'
-
Loop through the pattern
from index
pto1:length of p -
If the current character in the pattern is
, it can either match zero elements of the preceding character. This means for
*, checkdp[0][j].dp[0][j - 2]
Step 3: Iterate through each character of s and p to fill dp array
-
Loop through the string
from index
sto1.length of s -
Loop through the pattern
from index
pto1.length of p -
If the character at
matches
s[i-1]orp[j-1]isp[j-1]:. -
Update
as
dp[i][j].dp[i-1][j-1] -
If the character at
is
p[j-1]:* -
Initially, assume
matches zero characters, i.e.,
*.dp[i][j] = dp[i][j-2] -
Check if the preceding character in
matches the current character in
por if it is as, then update.todp[i][j]ordp[i][j].dp[i-1][j]
Step 4: Return final result
-
Return the value
, indicating whether the entire string
dp[len(s)][len(p)]matches the patterns.p