Mastering the Sliding Window: Count Distinct Elements in Every Window of Size K
When it comes to efficient algorithm design, few concepts are as powerful and widely used as the sliding window pattern. In this article, we break down the problem of counting distinct elements in every window of size K using a brute-force approach, and then we optimize it for real-world scalability. Let’s get to the code — and the clarity.
Problem Statement
Input: An array of integersarr
, and an integerk
.
Output: For each contiguous subarray (or window) of sizek
, return the number of distinct elements in that window.
Example:
Input: arr = [1,2,1,3,4,2,3], k = 4
Output: [3, 4, 4, 3]
Brute Force Approach (Simple but Slow)
from typing import List
def count_distinct_elements(nums: List[int], k: int) -> List[int]:
res = []
for i in range(len(nums) - k + 1):
number = len(set(nums[i:i+k])) # create a set to count distincts in each window
res.append(number)
return res
Time Complexity:
-
Creating a new
set
in every iteration takesO(k)
time. -
Loop runs
n - k + 1
times --- Overall: O(n * k)
Why it's inefficient:
-
Recounts from scratch in each window.
-
Not suitable for large arrays (slow for
n ~ 10^6
).
Optimized Sliding Window with Hash Map
Instead of re-counting everything, we slide the window by 1 element and update a hash map (dictionary) to keep track of frequencies efficiently.
from typing import List
def count_distinct_elements(nums: List[int], k: int) -> List[int]:
res = []
freq = {}
# Initialize the first window
for i in range(k):
freq[nums[i]] = freq.get(nums[i], 0) + 1
res.append(len(freq))
# Slide the window
for r in range(k, len(nums)):
left_elem = nums[r - k] # element leaving the window
right_elem = nums[r] # element entering the window
# Remove the left element
freq[left_elem] -= 1
if freq[left_elem] == 0:
del freq[left_elem]
# Add the right element
freq[right_elem] = freq.get(right_elem, 0) + 1
res.append(len(freq))
return res
print(count_distinct_elements([1,1,1,1], 2))
print(count_distinct_elements([1,2,1,3,4,2,3], 4))
print(count_distinct_elements(list(range(10**5)), 100))
print(count_distinct_elements([1]*500000 + list(range(500000)), 1000))
print(count_distinct_elements(list(range(10**6)), 100))
Time Complexity:
-
Each insert and delete is O(1) using a hash map.
-
Total: O(n)
Trigger Questions to Recognize This Pattern
-
Are you asked to compute something in every window of size K?
-
Can the result be updated by adding 1 item and removing 1 item?
-
Are duplicate elements involved (e.g. count, frequency)?
If YES to the above = Sliding Window + Hash Map is likely the way to go.
Toolbox: Why the Optimal Solution Works
1. Key Data Structure:
-
A dictionary (hash map) to track counts of elements inside the window.
2. Pattern Used:
-
Sliding Window — adjust a fixed-size window while updating state incrementally.
3. How to Recognize It:
-
Any question that needs a moving summary/statistic across
K
elements.
4. How to Come Up With This:
-
Think: "Can I avoid recomputing the whole thing each time by just adjusting for one in and one out?"
A Better Way to Count Every Time
The brute-force method teaches us what’s happening — but the optimized hash-map approach shows us how to scale. Mastering the sliding window and hash frequency maps is key to efficient subarray analysis.
Next time you see a problem asking about subarrays of size k
, ask yourself: "Can I slide my way to the answer?"
What to do next
-
Implement this sliding window pattern in 3 similar problems.
-
Memorize the hash map update logic.
-
Practice with variations like max sum in window, longest unique substring, or count of anagrams.
You’re now sliding smarter!
Join the Discussion
No comments yet. Be the first to share your thoughts!