PythonHaven | Removing Duplicates from a Sorted Array

Removing Duplicates from a Sorted Array in Python — From Brute Force to Optimal

As a seasoned software engineer and systems designer, I’ve solved thousands of algorithm problems and designed scalable systems across industries.

One deceptively simple problem that often appears in coding interviews and real-world data preprocessing is removing duplicates from a sorted array in-place.

Today, I’ll walk you through brute-force to optimized approaches — step-by-step — so that even a junior developer or a curious learner can grasp it easily.

Let’s dive into the best techniques for removing duplicates from a sorted array using Python.

Brute Force: Start Simple, Think Later

from typing import List

def remove_duplicates(nums: List) -> List:
    for i in range(len(nums) - 1, 0, -1):
        if nums[i] == nums[i - 1]:
            nums.pop(i)
    return nums

What it does

This code walks backward through the array and removes any element that matches its previous neighbor. Because the array is sorted, duplicates are always adjacent — perfect!

Why it's not great:

• Time Complexity: O(n²) in worst case due to repeated use of pop(i), which shifts elements.
• Not scalable: It will hang or slow down significantly for large arrays.

A More Optimized Version
Let’s make this better by avoiding costly pop() operations.

def remove_duplicates(nums: List) -> List:
    """
    Removes duplicates from a sorted list in-place and returns the new list of unique elements.
    
    Args:
        nums (List): A sorted list of elements.
    
    Returns:
        List: The list with duplicates removed in-place.
    """
    if not nums:
        return []

    # Pointer for where the next unique element should go
    j = 1

    for i in range(1, len(nums)):
        if nums[i] != nums[j - 1]:
            nums[j] = nums[i]
            j += 1

    return nums[:j]

How it works

    • We maintain a j pointer to track where the next unique element should go.
    • Every time we see a new number (i.e., not equal to the last inserted unique one), we copy it forward.
    • This ensures all unique values are moved to the beginning of the list, preserving order and using constant space.

Benefits

    • Time Complexity: O(n)
    • Space Complexity: O(1) extra space
    • In-place: No extra memory used for duplicate removal

Comprehensive Testing Using unittest

import unittest

class TestRemoveDuplicates(unittest.TestCase):
    
    def test_1(self):
        self.assertEqual([1, 2], remove_duplicates([1, 1, 2]))

    def test_2(self):
        self.assertEqual([1, 2, 3, 4], remove_duplicates([1, 2, 3, 4]))

    def test_3(self):
        self.assertEqual([2], remove_duplicates([2] * 10**5))

    def test_4(self):
        self.assertEqual(list(range(0, 10**6)), remove_duplicates(list(range(0, 10**6))))

    def test_5(self):
        self.assertEqual(list(range(10**6)), remove_duplicates([i // 2 for i in range(2 * 10**6)]))

if __name__ == "__main__":
    unittest.main()

What these tests check

    • Repeating numbers
    • Already unique arrays
    • Extremely large datasets (to stress-test performance)

Visual Breakdown of the Optimal Algorithm

Step                      i                 j                         nums                                        Action
1                           1                1                        [1,1,2]                                     Duplicate, skip
2                           2                1                        [1,1,2]                                     Unique, move 2 to index 1
Final                     -                 -                         [1,2]                                        Return first 2 elements

What to Do Next

Now that you’ve mastered removing duplicates in a sorted array:

    • Apply this logic to other in-place array manipulations.
    • Practice problems like "Remove Element", "Move Zeroes", or "Merge Sorted Arrays".
    • Optimize your algorithms for interviews and large-scale systems.

This pattern is foundational — not just for coding challenges, but also for clean, efficient data processing in real-world applications.

Removing Duplicates from a Sorted Array