Data Structures and Algorithms

The Frequency Counter Technique

21 hours, 37 minutes ago ; F(visit_count) + Value(1) views
Share this

Understanding the Frequency Counter Technique in Algorithm Design

The Frequency Counter technique is a powerful and efficient approach to solving problems requiring counting element occurrences in a dataset.

It is beneficial when dealing with arrays, strings, and other collections where we need to track element frequencies.

Instead of using nested loops (resulting in O(n^2) time complexity), this technique leverages hash maps (dictionaries in Python, objects in JavaScript) to achieve an optimal O(n) complexity.

 

How the Frequency Counter Works

  • Use a hash map (dictionary in Python) to count occurrences of elements.
  • Compare frequencies when needed (e.g., checking for anagrams and unique occurrences).
  • Avoid nested loops by utilizing the constant-time lookups of hash maps.

Example 1: Checking if Two Arrays Have the Same Frequency of Elements

Given two lists, check if one is a squared version of the other.

Examples:

  • [1, 2, 3, 2] and [4, 1, 9, 4] → True
  • [1, 2, 3] and [1, 9] → False

Naïve Approach (O(n^2) Complexity)

# Uses nested loops and remove() which makes it inefficient

# Time Complexity: O(n^2)

def same(arr1, arr2):

    if len(arr1) != len(arr2):

        return False

    for num in arr1:

        if num ** 2 in arr2:

            arr2.remove(num ** 2)

        else:

            return False

    return True

Optimized Approach (O(n) Complexity)

from collections import Counter

def same(arr1, arr2):

    if len(arr1) != len(arr2):

        return False
    

    freq1 = Counter(arr1)

    freq2 = Counter(arr2)
    

    # Check if squared numbers have the same frequency

    return all(freq1[num] == freq2[num * num] for num in freq1)

Example 2: Checking for Anagrams

Determine whether two strings are anagrams (contain the same characters in the same frequency).

Examples:

  • "listen" and "silent" → True
  • "hello" and "ollhe" → True
  • "rat" and "car" → False

Optimized Solution (O(n) Complexity)

def is_anagram(str1, str2):

    if len(str1) != len(str2):

        return False

    freq = {}

    for char in str1:

        freq[char] = freq.get(char, 0) + 1
    

    for char in str2:

        if char not in freq or freq[char] == 0:

            return False

        freq[char] -= 1
    

    return True

Example 3: Finding the First Non-Repeating Character

Find the first character in a string that does not repeat.

Examples:

  • "aabbccdee" → 'd'
  • "aabb" → None

Optimized Solution (O(n) Complexity)

def first_unique_char(s):

    freq = {}

    for char in s:

        freq[char] = freq.get(char, 0) + 1
    

    for char in s:

        if freq[char] == 1:

            return char

    return None

Example 4: Longest Substring with K Distinct Characters

Consider a string ,'s', and an integer, 'k', find the length of the longest substring having at most k distinct characters.

Optimized Solution (Sliding Window + Frequency Counter)

The optimized solution results in a time complexity of O(n)

def longest_substr_with_k_distinct_chars(s, k):

    """
    Finds the length of the longest substring in `s` having at most `k` distinct characters.

    Args:
        s (str): The input string.
        k (int): The maximum number of distinct characters allowed in the substring.

    Returns:
        int: The length of the longest substring with at most `k` distinct characters.

    Example:

        >>> longest_substr_with_k_distinct_chars("eceba", 2) 3 

        >>> longest_substr_with_k_distinct_chars("aabbcc", 2) 4 

        >>> longest_substr_with_k_distinct_chars("abcabcabc", 3) 9

    """

    freq = {} # Dictionary to store the frequency of characters in the current window

    left = 0 # Left pointer of the sliding window

    max_len = 0 # Variable to store the maximum length of the valid substring


    for right in range(len(s)):

        # Add the current character to the frequency dictionary
        freq[s[right]] = freq.get(s[right], 0) + 1


        # If the number of distinct characters exceeds `k`, move the left pointer
        while len(freq) > k:
            freq[s[left]] -= 1
            if freq[s[left]] == 0:
                del freq[s[left]] # Remove the character if its frequency drops to 0
            left += 1

        # Update the maximum length if the current window is valid
        max_len = max(max_len, right - left + 1)

    return max_len

#Example 1
print(f"Expected: 9->'abcabcabc' , Actual:", longest_substr_with_k_distinct_chars("abcabcabc", 3))


#Example 2
print(f"Expected: 3 ->'ece', Actual:",longest_substr_with_k_distinct_chars("eceba", 2))


#Example 3
print(f"Expected: 2->'aa' , Actual:", longest_substr_with_k_distinct_chars("aa", 1))


#Example 4
print(f"Expected: 4->'aabb' or 'bbcc' , Actual:", longest_substr_with_k_distinct_chars("aabbcc", 2))

 

Solution Walk-through

We aim to establish the length of the longest substring with at most k characters that are unique in a given string "s."

The method employs a sliding window technique with two pointers (left and right) to preserve the string's character window.

The window dynamically expands and contracts to guarantee that it has no more than k unique characters.

Key Steps

Frequency Dictionary:

  • A dictionary freq stores the frequency of characters in the current window.
  • This helps in tracking the number of distinct characters in the window.

Expand the Window:

  • The right pointer moves forward, adding characters to the window and updating their frequency in the dictionary.

Contract the Window:

  • If the number of distinct characters in the window exceeds k, the left pointer moves forward to shrink the window.
  • The frequency of the character at the left pointer is decremented, and if it drops to 0, the character is removed from the dictionary.

Update Maximum Length:

  • After ensuring the window is valid (contains at most k distinct characters), the length of the current window (right - left + 1) is compared with max_len, and max_len is updated if necessary.

Return the Result:

  • The function returns max_len, which holds the length of the longest valid substring.

Thought Process and Evaluation

Why Use a Sliding Window?

  • The sliding window technique is ideal for problems involving subarrays or substrings with specific constraints (e.g., maximum distinct characters).
  • It ensures the solution is efficient, with an O(n) time complexity. In the latter, n is the length of the string.

Why Use a Frequency Dictionary?

  • The frequency dictionary (freq) helps to track efficiently the count of characters that are distinct in the current window.
  • We can quickly determine if the window is valid by updating the frequency of characters as the window expands and contracts.

Time Complexity

  • The right pointer traverses the string once, and the left pointer moves forward only when necessary.
  • Each character is processed at most twice (once by the right pointer and once by the left pointer), resulting in an O(n) time complexity.

Space Complexity

  • The frequency dictionary stores k + 1 distinct characters at most, so the space complexity is O(k).

Example 5: Minimum Window Substring

Find the smallest substring in s that contains all characters of t.

Optimized Solution (Sliding Window + Frequency Counter)

This optimized version of the solution results in time complexity of O(n)

from collections import Counter

def min_window(s: str, t: str) -> str:

    """
    Finds the minimum window substring in `s` that contains all the characters of `t`.
    Uses 'sliding window' in determining the smallest substring in `s` that efficiently
    includes all characters from `t` (including duplicates).

    Args:
        s (str): The input string we search for the substring.
        t (str): The target string containing the required characters.

    Returns:
        str: The minimum window substring that contains all characters of `t`, or an empty string if no such substring exists.

    Example:
        >>> min_window("ADOBECODEBANC", "ABC")
        'BANC'
        >>> min_window("a", "a")
        'a'
        >>> min_window("a", "aa")
        ''
        >>> min_window("abc", "d")
        ''
        >>> min_window("aa", "aa")
        'aa'
        >>> min_window("ABAACBAB", "ABC")
        'ACB'


    Edge Cases Considered:
        - If `t` contains characters not in `s`, return `""`.
        - If `s` or `t` is empty, return `""`.
        - If `s` and `t` are identical, return `s`.
        - Handles cases where characters in `t` are repeated.

    """    

    if not s or not t:
        return ""

    target_freq = Counter(t) # Frequency of characters in t
    window_freq = {} # Frequency of characters in the current window
    left = 0 # Left boundary of the sliding window
    min_len = float('inf') # Track minimum window length
    min_window = "" # Store the minimum window substring
    required = len(target_freq) # Number of unique characters needed

    formed = 0 # Number of characters matched with required frequency

    for right, char in enumerate(s):
        # Expand the window by including the current character
        window_freq[char] = window_freq.get(char, 0) + 1
        
        # If the character is in t and its count in the window matches t's count, increase formed
        if char in target_freq and window_freq[char] == target_freq[char]:
            formed += 1

        # Try to shrink the window when all characters match
        while formed == required:
            # Update minimum window if a smaller valid substring is found
            if (right - left + 1) < min_len:
                min_len = right - left + 1
                min_window = s[left:right+1]            

            # Remove the leftmost character and move the window
            window_freq[s[left]] -= 1

            if s[left] in target_freq and window_freq[s[left]] < target_freq[s[left]]:
                formed -= 1 # Reduce matched character count

            left += 1 # Move the left boundary to shrink the window


    return min_window


assert min_window("ADOBECODEBANC", "ABC") == "BANC" # Standard case

assert min_window("a", "a") == "a" # Single character match

assert min_window("a", "aa") == "" # Not enough 'a' characters in `s`

assert min_window("xyzxyzxyzxyz", "xyz") == "xyz" # Repeated patterns in `s`

assert min_window("aaaaaaaaaaaabbbbbcdd", "abcdd") == "abbbbbcdd" # Large input case

assert min_window("abc", "d") == "" # `t` contains a character not in `s`

assert min_window("aa", "aa") == "aa" # `s` and `t` are identical

assert min_window("ABAACBAB", "ABC") == "ACB" # Smallest valid window



print("All test cases passed!")

Summary Table

Problem                                                                                           Approach                                                         Complexity

Longest Substring with K Distinct Chars                                   Sliding Window + Frequency Counter        O(n)

Checking for Anagrams                                                                Frequency Counter                                        O(n)

Finding First Non-Repeating Character                                     Frequency Counter                                        O(n)

Minimum Window Substring                                                       Sliding Window + Frequency Counter        O(n)

Conclusion

The Frequency Counter technique is a crucial optimization strategy that transforms brute-force O(n^2) solutions into more efficient O(n) solutions.

We can efficiently count occurrences, compare datasets, and solve complex problems like anagrams, unique elements, substrings, and windowed searches by leveraging hash maps.

Mastering this technique, sliding windows, prefix sums, and in-place hashing will significantly improve your problem-solving skills and optimize your algorithmic approach!

 

Become a member
Get the latest news right in your inbox. We never spam!

Read next

Implementing a Min Stack in Python

Implementing a Min Stack in Python: From Basic to Optimized &nbs;… Read More

1 day, 6 hours ago . 12 views

Python's any() Function

1 day, 7 hours ago . 10 views