Word Break: From Brute Force to Optimal Dynamic Programming
When tackling problems involving strings and word dictionaries, the Word Break problem is a common and critical example used in algorithm interviews and real-world natural language processing tasks.
As a domain expert in algorithms and systems engineering, I will walk you through the Word Break problem step-by-step—from naive brute-force to top-down memoization and finally to optimized bottom-up dynamic programming.
Along the way, I will simplify complex concepts without sacrificing depth, to ensure you truly understand how each approach works and how to build intuition for similar problems.
The Word Break problem asks: Given a string s
and a dictionary of words, can s
be segmented into a space-separated sequence of one or more dictionary words?
Brute Force: Trying Every Break Point
The most naive approach is to try splitting the string at every possible point and recursively check if the resulting parts are in the dictionary.
def word_break(s, wordDict):
def backtrack(start):
if start == len(s):
return True
for end in range(start + 1, len(s) + 1):
if s[start:end] in wordDict and backtrack(end):
return True
return False
return backtrack(0)
print(f"Expected:True, Actual:", word_break("leetcode", ["leet", "code"]))
print(f"Expected:True, Actual:", word_break("applepenapple", ["apple", "pen"]))
print(f"Expected:False, Actual:", word_break("catsandog", ["cats", "dog", "sand", "and", "cat"]))
print(f"Expected:False, Actual:", word_break("a"*500 + "b", ["a", "aa", "aaa", "aaaaaaa"]))
print(f"Expected:True, Actual:", word_break("a"*100000, ["a", "aa", "aaa", "aaaaaaa"]))
Why It's Bad:
-
Time Complexity: Exponential,
O(2^n)
in the worst case. -
Problem: It recalculates results for the same substring multiple times.
Top-Down Dynamic Programming with Memoization
We can reduce redundant work by storing the results of subproblems. This approach combines recursion with caching.
def word_break(s: str, wordDict: List[str]) -> bool:
word_set = set(wordDict)
memo = {}
def can_segment(index: int) -> bool:
if index == len(s):
return True
if index in memo:
return memo[index]
for end in range(index + 1, len(s) + 1):
if s[index:end] in word_set and can_segment(end):
memo[index] = True
return True
memo[index] = False
return False
return can_segment(0)
# --- Example Usage ---
# Example 1:
s1 = "leetcode"
wordDict1 = ["leet", "code"]
print(f"'{s1}' with {wordDict1}: {word_break(s1, wordDict1)}") # Expected: True
# Example 2:
s2 = "applepenapple"
wordDict2 = ["apple", "pen"]
print(f"'{s2}' with {wordDict2}: {word_break(s2, wordDict2)}") # Expected: True
# Example 3:
s3 = "catsandog"
wordDict3 = ["cats", "dog", "sand", "and", "cat"]
print(f"'{s3}' with {wordDict3}: {word_break(s3, wordDict3)}") # Expected: False
# Example 4: Long string to highlight memoization benefits
s4 = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab"
wordDict4 = ["a", "aa", "aaa", "aaaa", "aaaaa", "aaaaaa", "aaaaaaa", "aaaaaaaa", "aaaaaaaaa", "aaaaaaaaaa"]
print(f"'{s4[:20]}...' with {wordDict4}: {word_break(s4, wordDict4)}") # Expected: False (due to 'b')
Why It's Better:
-
Time Complexity:
O(n^2)
-
Space Complexity:
O(n)
for the memo and call stack -
Trick Used: Memoization to avoid redundant computations
Bottom-Up Dynamic Programming
Instead of recursion, we build a solution iteratively using a DP array where dp[i]
is True if s[:i]
can be segmented.
def word_break(s, wordDict):
word_set = set(wordDict)
dp = [False] * (len(s) + 1)
dp[0] = True
for i in range(1, len(s) + 1):
for j in range(i):
if dp[j] and s[j:i] in word_set:
dp[i] = True
break
return dp[-1]
Optimization Insight:
-
Time Complexity:
O(n^2)
-
Space Complexity:
O(n)
-
Data Structures:
Set
for fast lookups and adp[]
array
Recognizing the Pattern
This problem belongs to the Dynamic Programming - 1D DP with index tracking pattern. Here's how to spot it:
You're trying to solve a problem where the current solution depends on previous subproblems.
You're asked if something can be constructed from parts (words, substrings, coins, etc.).
Recursive backtracking leads to recalculating the same values repeatedly.
Substrings can be checked for validity independently.
How to Come Up With the Solution
Ask yourself:
-
Can I divide the problem into subproblems?
-
Do those subproblems overlap?
-
Can the solution to the current state be built from past results?
Once you realize the answer is yes, you’re staring at a dynamic programming problem.
Use a dp[]
array to track what you know, and build it step-by-step.
Visual Example
Given s = "applepenapple"
, wordDict = ["apple", "pen"]
Step-by-step dp[]
filling:
Initial: dp = [True, False, False, ..., False]
Check every i:
- i=5, s[0:5] = 'apple' in dict, dp[5]=True
- i=8, s[5:8] = 'pen' in dict and dp[5]=True -> dp[8]=True
- i=13, s[8:13] = 'apple' in dict and dp[8]=True -> dp[13]=True
Final dp = [True, ..., True]
A Real Test of Robustness
Test Case:
s = "a" * 100000
wordDict = ["a", "aa", "aaa", ..., "aaaaaaa"]
-
Brute-force dies
-
Top-down survives due to memo
-
Bottom-up thrives with linear performance
Take This with You
If you're solving problems that involve choosing whether to include parts of a sequence, suspect dynamic programming. If recursion is involved and you're repeating work, think memoization. If you can track progress linearly, think bottom-up.
For Word Break, remember:
-
Convert word list to a
set
for fast lookup -
Use a
dp[]
array to store valid breakpoints -
Avoid string slicing unless necessary
Ready to master similar problems like Coin Change, Decode Ways, or Sentence Segmentation?
Apply this exact flow, and soon you'll solve them on instinct.
What You Should Do Next
-
Try rewriting the
word_break
problem using your own logic. -
Tackle similar problems from LeetCode using the same bottom-up approach.
-
Subscribe to algorithm newsletters or problem-of-the-day challenges.
-
Revisit this code after a week and explain it to someone else (or your rubber duck).
You’re not just memorizing code—you’re training your pattern recognition and algorithmic thinking.
That’s how great engineers grow!
Join the Discussion
No comments yet. Be the first to share your thoughts!