Word Break: From Brute Force to Optimal Dynamic Programming

 

When tackling problems involving strings and word dictionaries, the Word Break problem is a common and critical example used in algorithm interviews and real-world natural language processing tasks.

As a domain expert in algorithms and systems engineering, I will walk you through the Word Break problem step-by-step—from naive brute-force to top-down memoization and finally to optimized bottom-up dynamic programming.

Along the way, I will simplify complex concepts without sacrificing depth, to ensure you truly understand how each approach works and how to build intuition for similar problems.

The Word Break problem asks: Given a string s and a dictionary of words, can s be segmented into a space-separated sequence of one or more dictionary words?


Brute Force: Trying Every Break Point

The most naive approach is to try splitting the string at every possible point and recursively check if the resulting parts are in the dictionary.

def word_break(s, wordDict):
    def backtrack(start):
        if start == len(s):
            return True
        for end in range(start + 1, len(s) + 1):
            if s[start:end] in wordDict and backtrack(end):
                return True
        return False
    return backtrack(0)

print(f"Expected:True, Actual:", word_break("leetcode", ["leet", "code"]))
print(f"Expected:True, Actual:", word_break("applepenapple", ["apple", "pen"]))
print(f"Expected:False, Actual:", word_break("catsandog", ["cats", "dog", "sand", "and", "cat"]))
print(f"Expected:False, Actual:", word_break("a"*500 + "b", ["a", "aa", "aaa", "aaaaaaa"]))
print(f"Expected:True, Actual:", word_break("a"*100000, ["a", "aa", "aaa", "aaaaaaa"]))

Why It's Bad:

  • Time Complexity: Exponential, O(2^n) in the worst case.

  • Problem: It recalculates results for the same substring multiple times.


Top-Down Dynamic Programming with Memoization

We can reduce redundant work by storing the results of subproblems. This approach combines recursion with caching.

def word_break(s: str, wordDict: List[str]) -> bool:
    word_set = set(wordDict)
    memo = {}

    def can_segment(index: int) -> bool:
        if index == len(s):
            return True
        if index in memo:
            return memo[index]

        for end in range(index + 1, len(s) + 1):
            if s[index:end] in word_set and can_segment(end):
                memo[index] = True
                return True

        memo[index] = False
        return False

    return can_segment(0)


# --- Example Usage ---
# Example 1:
s1 = "leetcode"
wordDict1 = ["leet", "code"]
print(f"'{s1}' with {wordDict1}: {word_break(s1, wordDict1)}") # Expected: True

# Example 2:
s2 = "applepenapple"
wordDict2 = ["apple", "pen"]
print(f"'{s2}' with {wordDict2}: {word_break(s2, wordDict2)}") # Expected: True

# Example 3:
s3 = "catsandog"
wordDict3 = ["cats", "dog", "sand", "and", "cat"]
print(f"'{s3}' with {wordDict3}: {word_break(s3, wordDict3)}") # Expected: False

# Example 4: Long string to highlight memoization benefits
s4 = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab"
wordDict4 = ["a", "aa", "aaa", "aaaa", "aaaaa", "aaaaaa", "aaaaaaa", "aaaaaaaa", "aaaaaaaaa", "aaaaaaaaaa"]
print(f"'{s4[:20]}...' with {wordDict4}: {word_break(s4, wordDict4)}") # Expected: False (due to 'b')

Why It's Better:

  • Time Complexity: O(n^2)

  • Space Complexity: O(n) for the memo and call stack

  • Trick Used: Memoization to avoid redundant computations


Bottom-Up Dynamic Programming

Instead of recursion, we build a solution iteratively using a DP array where dp[i] is True if s[:i] can be segmented.

def word_break(s, wordDict):
    word_set = set(wordDict)
    dp = [False] * (len(s) + 1)
    dp[0] = True

    for i in range(1, len(s) + 1):
        for j in range(i):
            if dp[j] and s[j:i] in word_set:
                dp[i] = True
                break

    return dp[-1]

Optimization Insight:

  • Time Complexity: O(n^2)

  • Space Complexity: O(n)

  • Data Structures: Set for fast lookups and a dp[] array


Recognizing the Pattern

This problem belongs to the Dynamic Programming - 1D DP with index tracking pattern. Here's how to spot it:

  1. You're trying to solve a problem where the current solution depends on previous subproblems.

  2. You're asked if something can be constructed from parts (words, substrings, coins, etc.).

  3. Recursive backtracking leads to recalculating the same values repeatedly.

  4. Substrings can be checked for validity independently.


How to Come Up With the Solution

Ask yourself:

  • Can I divide the problem into subproblems?

  • Do those subproblems overlap?

  • Can the solution to the current state be built from past results?

Once you realize the answer is yes, you’re staring at a dynamic programming problem.

Use a dp[] array to track what you know, and build it step-by-step.


Visual Example

Given s = "applepenapple", wordDict = ["apple", "pen"]

Step-by-step dp[] filling:

Initial: dp = [True, False, False, ..., False]

Check every i:
- i=5, s[0:5] = 'apple' in dict, dp[5]=True
- i=8, s[5:8] = 'pen' in dict and dp[5]=True -> dp[8]=True
- i=13, s[8:13] = 'apple' in dict and dp[8]=True -> dp[13]=True

Final dp = [True, ..., True]


A Real Test of Robustness

Test Case:

s = "a" * 100000
wordDict = ["a", "aa", "aaa", ..., "aaaaaaa"]
  • Brute-force dies

  • Top-down survives due to memo

  • Bottom-up thrives with linear performance


Take This with You

If you're solving problems that involve choosing whether to include parts of a sequence, suspect dynamic programming. If recursion is involved and you're repeating work, think memoization. If you can track progress linearly, think bottom-up.

For Word Break, remember:

  • Convert word list to a set for fast lookup

  • Use a dp[] array to store valid breakpoints

  • Avoid string slicing unless necessary

Ready to master similar problems like Coin Change, Decode Ways, or Sentence Segmentation?

Apply this exact flow, and soon you'll solve them on instinct.


What You Should Do Next

  1. Try rewriting the word_break problem using your own logic.

  2. Tackle similar problems from LeetCode using the same bottom-up approach.

  3. Subscribe to algorithm newsletters or problem-of-the-day challenges.

  4. Revisit this code after a week and explain it to someone else (or your rubber duck).

You’re not just memorizing code—you’re training your pattern recognition and algorithmic thinking.

That’s how great engineers grow!

Join the Discussion

No comments yet. Be the first to share your thoughts!