Finding the Smallest Window in a String: A Sliding Window Masterclass
As a seasoned software engineer specializing in algorithm design and optimization, I’ve spent years solving complex string manipulation problems.
One such problem is finding the smallest substring in a string `s` that contains all characters of another string `t.`
The sliding window technique comes in handy in this problem.
The solutions to most substring-related problems use this powerful strategy efficiently.
In this article, I’ll walk you through an optimized Python solution, breaking it down into simple, easy-to-understand steps.
Whether you're preparing for coding interviews or building real-world applications, this guide will help you master the sliding window approach.
The Problem: Minimum Window Substring
Given two strings, `s` and `t`, we need to find the smallest substring in `s` having all characters of `t`. For example:
- If `s = "ADOBECODEBANC"` and `t = "ABC"`, the smallest window is `"BANC"`.
- If `s = "a"` and `t = "aa"`, there’s no valid window, so the answer is an empty string `""`.
A brute-force approach would involve checking all possible substrings of `s`, but this is highly inefficient, with a time complexity of O(N²).
Instead, we’ll use the sliding window technique to solve this problem in O(N) time.
The Solution: Sliding Window with Frequency Counting
In the sliding window, we maintain a window which is a substring of `s`.
That window dynamically shrinks or expands depending on particular conditions at play.
Here’s how it works:
- Frequency Counters: We use two counters, `t_counter` and `s_counter`, to store the frequency of characters in `t` and the current window of `s`, respectively.
- Expand the Window: We expand the window by moving the right pointer (`r`) and adding characters to `s_counter`.
- Shrink the Window: Once the window contains all characters of `t`, we shrink it from the left (`l`) to find the smallest valid window.
- Track the Result: We keep track of the smallest valid window and return it at the end.
Here’s the optimized Python code with detailed comments and docstrings:
from collections import Counter
class Solution:
def minWindow(self, s: str, t: str) -> str:
"""
Find the smallest substring in `s` that contains all characters of `t`.
Args:
s (str): The string to search within.
t (str): The string containing characters to match.
Returns:
str: The smallest valid window or an empty string if no valid window exists.
"""
# Base case: If `t` is longer than `s`, no valid window exists
if len(t) > len(s):
return ""
# If `s` and `t` are the same length, check if they match
if len(t) == len(s) and self.contains_all_chars(Counter(s), Counter(t)):
return s
l = 0 # Left pointer of the window
t_counter = Counter(t) # Frequency counter for `t`
s_counter = Counter() # Frequency counter for the current window in `s`
shortest_substr = "" # Stores the smallest valid window
s_len = float("inf") # Length of the smallest valid window
# Expand the window by moving the right pointer
for r in range(len(s)):
s_counter[s[r]] += 1 # Add the current character to `s_counter`
# Shrink the window if it contains all characters of `t`
while self.contains_all_chars(s_counter, t_counter):
win = s[l:r + 1] # Current window
# Update the smallest window if the current one is smaller
if len(win) < s_len:
s_len = len(win)
shortest_substr = win
# Shrink the window from the left
s_counter[s[l]] -= 1
if s_counter[s[l]] == 0:
del s_counter[s[l]] # Clean up to prevent unnecessary keys
l += 1
return shortest_substr
def contains_all_chars(self, s_counter, t_counter):
"""
Check if `s_counter` contains all characters of `t_counter`.
Args:
s_counter (Counter): Frequency counter for the current window in `s`.
t_counter (Counter): Frequency counter for `t`.
Returns:
bool: True if `s_counter` contains all characters of `t_counter`, False otherwise.
"""
return all(s_counter[char] >= count for char, count in t_counter.items())
Why This Approach Works
- Efficiency: The sliding window ensures we traverse `s` only once. As a result, we have an O(N) complexity in time. Where N is the length of `s`.
- Space Optimization: We use frequency counters to track characters, keeping the space complexity at O(M). Where M is t's unique characters.
- Dynamic-Window Adjustment: We realize the smallest valid substring efficiently without redundant checks. We do so by shrinking or expanding the window.
Next Steps
Try out the following edge cases:
1) `t` > `s` situations.
2) Empty strings
3) Concatenation of substrings
4) Strings/substring with/without repeated characters (longest is also applicable)
So, what are you waiting for? Start coding, experiment with the solution, and watch your problem-solving skills soar!