Substring With Concatenation Of All Words
To solve this coding challenge, we need to identify all starting indices in string
from which a concatenated substring of
begins. The concatenated substring must include every string in
exactly once and in any order.
s
words
words
Explanation:
To tackle this problem, we need to employ a sliding window technique combined with hash maps to efficiently check for the required concatenation of substrings.Steps to Solve:
- Input Checks and Early Exit :
-
First, check if the input string
or the list of words
sis empty. If so, immediately return an empty list as there can be no valid substrings.words - Initial Setup :
-
Calculate the length of an individual word (
).
word_len -
Find the total count of words (
).
words_count -
Calculate the length of the entire concatenated substring (
), which is simply
concat_lenmultiplied byword_len.words_count - Frequency Dictionary for Words :
-
Build a frequency dictionary
that stores how many times each word appears in the list
word_freq.words - Sliding Window :
-
Iterate through the string
from index
sto0. This ensures that we only consider starting points of substring windows within bounds.(len(s) - concat_len + 1) -
For each starting point, initialize an empty dictionary
to keep track of the words seen in the current window.
seen - Check Words in Window :
-
For each starting point, check fixed-length segments of the string
that match the length of words in
s.words -
Update the
dictionary with the counts of each segment.
seen - If a segment isn't part of the words list or if any word appears more times than it should, break out of the loop early.
- Validity Check :
- If the segment perfectly matches the required frequency of words, record the start index in the result list.
- Return the Result :
- After we have checked all possible starting points, return the list of starting indices.
-
Start by checking if the input string
and
slist are empty. Return an empty list if true.words -
Calculate the length of the words and the total expected length of the concatenated substring (
).
concat_len -
Build a frequency dictionary to count occurrences of each word in
.
words -
Use a for loop to iterate through the string
from the start to
s.(length of s - concat_len + 1) -
For each starting index
, initialize an empty
idictionary.seen -
Use an inner for loop to check fixed-length segments from the current starting index
.
i - Extract the segment and check if it exists in the frequency dictionary.
-
Update the
dictionary and ensure the word count doesn't exceed what's expected.
seen - If all words match perfectly in the current window, record the starting index.
- Finally, return the resultant list of valid starting indices.
Pseudocode:
# Check if input string or words list is empty
if s is empty or words is empty:
return empty list
# Initialize required variables
word_len = length of the first word in words
words_count = number of words in words
concat_len = word_len * words_count
# Build frequency dictionary for words
word_freq = empty dictionary
for each word in words:
if word is in word_freq:
increment word_freq[word] by 1
else:
set word_freq[word] to 1
result = empty list
# Iterate through each possible starting index in s
for i from 0 to (length of s - concat_len + 1):
seen = empty dictionary
for j from 0 to words_count - 1:
word_start = i + j * word_len
word = substring of s from word_start to word_start + word_len
# Check if word is in word_freq
if word is in word_freq:
if word is in seen:
increment seen[word] by 1
else:
set seen[word] to 1
# If seen word count exceeds expected count in word_freq, break
if seen[word] > word_freq[word]:
break
else:
break
# If we have checked all words and counts match, add start index to result
if j + 1 == words_count:
append i to result
# Return the result list of starting indices
return result