The Prefix Sums Technique

1 week, 5 days ago

When a large dataset has to answer many range-sum queries, re-adding the same elements for every query quickly becomes a bottleneck.

The Prefix Sums technique addresses this by precomputing cumulative sums once, in O(n) time.

After that, each range-sum query drops from linear time O(n) to constant time O(1).

The technique is widely used in financial data analysis, database query optimization, and competitive programming.

As an expert in performance optimization and algorithmic problem-solving, I’ll break down Prefix Sums in a simple, structured way and provide optimized implementations in Python with clear explanations.

What is the Prefix Sums Technique?

Prefix Sums answer range sum queries efficiently.

We store a prefix sum array in which each entry holds the sum of all elements before that position.

As a result, we never recompute a sum from scratch.

Instead, any subarray sum is the difference of two stored entries, which takes O(1) time.

How it Works

Given an array of n elements, we define the prefix sum array, which has n + 1 entries, as:

$$\text{prefix}[j] = \text{array}[0] + \text{array}[1] + \dots + \text{array}[j-1], \qquad \text{prefix}[0] = 0$$

The sum of any subarray from the leftmost index L to the rightmost index R (both inclusive) is then:

$$\text{sum}(L, R) = \text{prefix}[R+1] - \text{prefix}[L]$$
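
As a small worked example of the two formulas, the sketch below builds the prefix array with itertools.accumulate from the Python standard library. The sample values and variable names are illustrative assumptions, not data from a real workload.

```python
from itertools import accumulate

array = [3, 1, 4, 1, 5]

# prefix[j] holds array[0] + ... + array[j - 1]; prefix[0] is the empty sum.
prefix = [0] + list(accumulate(array))
print(prefix)  # [0, 3, 4, 8, 9, 14]

# Sum of array[1..3] = 1 + 4 + 1
L, R = 1, 3
print(prefix[R + 1] - prefix[L])  # 6
```

Note that the prefix array is one element longer than the input, which is what lets the formula work for L = 0 without a special case.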

Implementing Prefix Sums in Python

Let's dive in.

Step 1: Building the Prefix Sum Array

from typing import List


def compute_prefix_sums(arr: List[int]) -> List[int]:
    """
    Compute the prefix sums of a given array.

    Args:
      arr (List[int]): List of integers
    Returns:
      List of n + 1 prefix sums, where prefix[i] is the sum of arr[:i]
    """
    n = len(arr)
    # One extra slot so that prefix[0] = 0 represents the empty sum
    prefix = [0] * (n + 1)

    for i in range(n):
        prefix[i + 1] = prefix[i] + arr[i]

    return prefix

Step 2: Answering Range Sum Queries

def get_range_sum(prefix: List[int], L: int, R: int) -> int:
    """
    Calculate the sum of elements from index L to R (inclusive).

    Args:
      prefix (List[int]): Precomputed prefix sums array
      L (int): Starting index of the range
      R (int): Ending index of the range

    Returns:
      The sum of elements from index L to R
    """
    # Assumes 0 <= L <= R < len(prefix) - 1
    return prefix[R + 1] - prefix[L]

 

Step 3: Testing the Implementation

# Example Usage
arr = [8, 1, 7, 0, 1, 8, 2, 3]

prefix = compute_prefix_sums(arr)

# Query sum of elements from index 2 to 5
result = get_range_sum(prefix, 2, 5)

print(result) # Output: 16 (7 + 0 + 1 + 8)
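
One way to gain confidence in the lookup formula is to compare it against a direct slice sum for every possible range. The following is a minimal self-contained sketch of such a check; range_sum_naive is a hypothetical helper introduced here for comparison only, and the prefix array is rebuilt with itertools.accumulate so the block runs on its own.

```python
from itertools import accumulate
from typing import List


def range_sum_naive(arr: List[int], left: int, right: int) -> int:
    """Sum arr[left..right] directly; costs O(right - left + 1) per query."""
    return sum(arr[left:right + 1])


arr = [8, 1, 7, 0, 1, 8, 2, 3]
prefix = [0] + list(accumulate(arr))  # same layout as compute_prefix_sums

# Compare every valid (left, right) pair against the O(1) lookup.
mismatches = [
    (left, right)
    for left in range(len(arr))
    for right in range(left, len(arr))
    if prefix[right + 1] - prefix[left] != range_sum_naive(arr, left, right)
]
print(mismatches)  # []
```

An empty list means the two approaches agree on every range of this array.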

Applications of Prefix Sums

1. Efficient Range Sum Queries

Prefix sums suit financial applications in which cumulative transactions must be aggregated quickly, for example the net amount moved between two dates.
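
To make this concrete, here is a minimal sketch of aggregating transactions over a date range. The daily figures, the choice of integer cents, and the names daily_totals, running_balance, and net_flow are all assumptions made for illustration.

```python
from itertools import accumulate
from typing import List

# Hypothetical daily net transaction amounts in cents
# (integers avoid floating-point rounding).
daily_totals: List[int] = [12_000, -3_500, 8_250, 0, -1_000, 4_400, 9_900]

# running_balance[d] is the net amount over days 0 .. d - 1.
running_balance = [0] + list(accumulate(daily_totals))


def net_flow(first_day: int, last_day: int) -> int:
    """Net amount moved from first_day to last_day (0-based, inclusive)."""
    return running_balance[last_day + 1] - running_balance[first_day]


print(net_flow(0, 6))  # whole week: 30050
print(net_flow(1, 3))  # days 1 to 3: 4750
```

Each call is a single subtraction, regardless of how many days the range covers.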

2. Finding Subarrays with a Given Sum

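The goal is to count how many contiguous subarrays sum to a given target. A subarray ending at the current position sums to the target exactly when some earlier running prefix equals the current prefix minus the target, so we keep a count of every prefix value seen so far.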

from collections import defaultdict
from typing import List


def count_subarrays_with_sum(arr: List[int], target: int) -> int:
    """
    Count the number of subarrays whose sum equals the target value.

    Args:
      arr (List[int]): List of integers
      target (int): Target sum
    Returns:
      Number of subarrays with the given sum
    """
    prefix = 0
    # Maps a prefix sum to the number of times it has occurred so far
    prefix_count = defaultdict(int)
    # Base case: the empty prefix has sum 0
    prefix_count[0] = 1
    count = 0

    for num in arr:
        prefix += num
        # Every earlier prefix equal to (prefix - target) closes a valid subarray
        count += prefix_count[prefix - target]
        prefix_count[prefix] += 1

    return count
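
The hash-map version relies on the fact that the subarray arr[i..j-1] sums to the target exactly when prefix[j] - prefix[i] equals the target. The sketch below checks that identity pair by pair. It runs in O(n²), so it is meant for understanding and for testing the faster version, not as a replacement; the function name count_subarrays_brute_force is illustrative.

```python
from itertools import accumulate
from typing import List


def count_subarrays_brute_force(arr: List[int], target: int) -> int:
    """Count (i, j) pairs with i < j and prefix[j] - prefix[i] == target."""
    prefix = [0] + list(accumulate(arr))
    n = len(arr)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n + 1)
        if prefix[j] - prefix[i] == target
    )


print(count_subarrays_brute_force([1, 1, 1], 2))          # 2
print(count_subarrays_brute_force([1, 2, 3, -3, 3], 3))   # 5
```

The second call includes a negative number, which is a case where sliding-window approaches break down but prefix-based counting still works.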

 

3. Checking for Equilibrium Index

An equilibrium index is an index at which the sum of the elements to its left equals the sum of the elements to its right.

def find_equilibrium_index(arr: List[int]) -> List[int]:
    """
    Find all indices where the left sum equals the right sum.

    Args:
      arr (List[int]): List of integers

    Returns:
      List of equilibrium indices
    """
    total_sum = sum(arr)
    left_sum = 0
    equilibrium_indices = []

    for i, num in enumerate(arr):
        # Right sum is everything except the left part and the current element
        if left_sum == total_sum - left_sum - num:
            equilibrium_indices.append(i)
        # Update the running left sum on every iteration, not only on a match
        left_sum += num

    return equilibrium_indices
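
The same check can be phrased directly in terms of a prefix array, which makes the link to the technique explicit: the left sum of index i is prefix[i], and the right sum is the total minus prefix[i + 1]. A minimal sketch follows, with equilibrium_indices_with_prefix as an illustrative name and a sample array chosen for the example.

```python
from itertools import accumulate
from typing import List


def equilibrium_indices_with_prefix(arr: List[int]) -> List[int]:
    """Return every index whose left sum equals its right sum."""
    prefix = [0] + list(accumulate(arr))
    total = prefix[-1]
    # Left of i: prefix[i]. Right of i: total - prefix[i + 1].
    return [i for i in range(len(arr)) if prefix[i] == total - prefix[i + 1]]


print(equilibrium_indices_with_prefix([-7, 1, 5, 2, -4, 3, 0]))  # [3, 6]
```

This variant uses O(n) extra space for the prefix array, whereas the running-sum version needs only a few variables.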

 

Advantages of Using Prefix Sums

  • Pre-processing time is O(n), paid once
  • Query time is O(1) per range sum
  • Efficient for large datasets: ideal for answering many queries on a large array that does not change between queries.

Where to Go from Here?

The Prefix Sums technique is at the core of optimizing array computations. 

Now that you understand its power, try applying it to problems like range updates, histogram analysis, and rolling averages.
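
As one example, rolling averages follow almost directly from range sums: each window is a range, so its sum is a difference of two prefix entries. The sketch below is one possible approach, with rolling_averages and its parameters chosen for illustration. With floating-point inputs, long prefix arrays can accumulate rounding error, so treat this as a starting point.

```python
from itertools import accumulate
from typing import List


def rolling_averages(values: List[float], window: int) -> List[float]:
    """Average of every contiguous block of `window` values."""
    if window <= 0:
        raise ValueError("window must be positive")
    prefix = [0] + list(accumulate(values))
    # The window ending just before index `end` covers values[end - window:end].
    return [
        (prefix[end] - prefix[end - window]) / window
        for end in range(window, len(values) + 1)
    ]


print(rolling_averages([2, 4, 6, 8, 10], 3))  # [4.0, 6.0, 8.0]
```

If the window is longer than the input, the function returns an empty list.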

Next Steps: Explore Fenwick Trees (Binary Indexed Trees) and Segment Trees for even more advanced query handling!

 
