Algorithms

Longest Substring Without Duplicates

7 months, 4 weeks ago ; 108 views
Share this

Longest Substring Without Duplicates

In this question, I hope to help you find the longest substring without duplicates by first getting the conceptual overview right. I will then proceed to explore how I arrived at the solution and finally present the outcome in the form of a working code.

Conceptual Overview

The goal of this problem is to find the longest substring in a given string without repeating characters. In other words, we want to identify the maximum length substring where each character appears only once.

Approach to Solving the Problem

Sliding Window Technique

Use a sliding window approach to keep track of the current substring without duplication.  Maintain two pointers, startIdx and i, to represent the start and end of the current substring.

Hash Map (lastSeen)

Utilize a hash map (lastSeen) to store the index of the last seen occurrence of each character.

Iterate Through the String

Iterate through each character in the string.

If the character is already in lastSeen, update startIdx to be the maximum of its current value and the next index. Subsequently, check if the length of the current substring is longer than the longest found so far. If yes, update the longest substring.

Return the Result

Return the longest substring found.

Here, I present the actual implementation of this solution.

# O(n) time | O(n, a) space -a is length of alphabet represented by the input string
def longest_substring_without_duplication(string):
    """
    Find the longest substring without repeating characters in a given string.
    
    Parameters:
    - string (str): The input string.

    Returns:
    - str: The longest substring without repeating characters.
    
    """
    # Dictionary to store the index of the last seen occurrence of each character
    lastSeen = {}
    # List to store the start and end indices of the longest substring
    longest = [0, 1]
    # Variable to track the start index of the current substring
    startIdx = 0
    
    # Iterate through each character and its index in the string
    for i, char in enumerate(string):
        # Update start index if the character is already seen, ensuring no repetition
        startIdx = max(startIdx, lastSeen.get(char, -1) + 1)
        
        # Check if the current substring is longer than the longest found so far
        if longest[1] - longest[0] < i + 1 - startIdx:
            longest = [startIdx, i + 1]
        
        # Update the last seen index of the current character
        lastSeen[char] = i
    
    # Return the longest substring found
    return string[longest[0]:longest[1]]
    

print(longest_substring_without_duplication("clementisa cap"))

 

Code Overview

lastSeen Dictionary

  • Keeps track of the last seen index of each character to avoid repetition.

longest List

  • Stores the start and end indices of the longest substring found.

startIdx Variable

  • Represents the start index of the current substring without repetition.

Main Loop

  • Iterates through each character in the string.
  • Updates startIdx to avoid repetition.
  • Checks and updates the longest substring if needed.
  • Updates lastSeen with the current index of the character.

Result

  • Returns the longest substring found in the given string.


This solution efficiently finds the longest substring without repeating characters using a sliding window approach. The hash map helps optimize the process of checking for repeated characters

 

Time & Space Complexity

Let's analyze the time and space complexity of the solution above.


Time Complexity

The time complexity of the solution is O(N), where N is the length of the input string. Here's the breakdown:

The solution iterates through each character in the string exactly once, and for each character, it performs constant-time operations. Therefore, the time complexity is proportional to the length of the input string, O(N).

Space Complexity

The space complexity is O(min(N, M)), where N is the length of the input string, and M is the size of the character set (number of unique characters). Let me break it down for you:

lastSeen Dictionary

The space required for the lastSeen dictionary is proportional to the size of the character set (M). In the worst case, where all characters are unique, the size of lastSeen is O(M).

Other Variables

The space required for other variables (longest, startIdx, i, char) is constant and does not depend on the size of the input string.

Since M represents the number of unique characters, it is bounded by a constant, and in practice, it is often smaller than N. Therefore, the space complexity is effectively O(min(N, M)).

This solution is efficient and linear in terms of time complexity, making it suitable for handling strings of various lengths. The space complexity is also reasonable, especially when the number of unique characters is limited.

 

 

Become a member
Get the latest news right in your inbox. We never spam!

Read next

Island Perimeter

&nbsp;This solution seeks to find the perimeter of an island in a grid.&nbsp; The problem considers a grid where each cell represents land (1) or … Read More

Kibsoft 3 months, 3 weeks ago . 76 views

Pacific Atlantic Waterflow

3 months, 3 weeks ago . 82 views