Suffix Trie Construction
This solution implements a data structure called a Suffix Trie, which is used for efficiently storing and searching for substrings within a given string.
Problem Solution
The Suffix Trie is designed to solve the problem of efficiently storing all the suffixes of a given string and quickly determining if a given substring exists within the original string.
class suffixTrie:
def __init__(self, string):
self.root ={}
self.endSymbol="*"
self.populateSuffixTrieFrom(string)
# O(n^2) time | O(n^2) space
def populateSuffixTrieFrom(self, string):
for i in range(len(string)):
self.insertSubstringStartingAt(i, string)
def insertSubstringStartingAt(self, i, string):
node = self.root
for j in range(i, len(string)):
letter =string[j]
if letter not in node:
node[letter] ={}
node =node[letter]
node[self.endSymbol] = True
# O(m) time | O(1) space
def contains(self, string):
node = self.root
for letter in string:
if letter not in node:
return False
node =node[letter]
return self.endSymbol in node
Solution Analysis
I will now walk you through how this solution works.
Initialization (__init__)
Initializes the Suffix Trie with an empty root node and an end symbol to mark the end of each substring. The constructor then calls the populateSuffixTrieFrom method to populate the trie with all the suffixes of the input string.
Populate Suffix Trie (populateSuffixTrieFrom)
This method iterates over each character in the input string and calls the insertSubstringStartingAt method to insert all suffixes starting from that character into the trie.
Insert Substring (insertSubstringStartingAt)
This method inserts a substring starting from a given index i into the trie. It iterates over the characters of the substring and creates new nodes in the trie as necessary.
Contains (contains)
This method checks whether a given substring exists in the trie. It iterates over the characters of the substring and traverses the trie accordingly. If the substring is found in the trie, it returns True; otherwise, it returns False.
Time Complexity
The time complexity of populating the suffix trie(populateSuffixTrieFrom) is O(n^2), where n is the length of the input string. This is because each character of the string is considered as the starting point of a suffix, and for each starting point, all subsequent characters are inserted into the trie.
The time complexity of inserting a substring(insertSubstringStartingAt) into the trie is O(m), where m is the length of the substring being inserted.
The time complexity of searching for a substring(contains) in the trie is O(m), where m is the length of the substring being searched for.
Space Complexity
The space complexity of the suffix trie is O(n^2), where n is the length of the input string. This is because the trie may store up to n^2 nodes in the worst case, where each node represents a character in a substring of the input string.
Application of the Suffix Trie
The Suffix Trie data structure provides an efficient way to store all suffixes of a string and quickly search for substrings within the original string. It is particularly useful in applications such as text processing, pattern matching, and bioinformatics.
Unit Tests
Here are some unit tests to test the robustness of this solution. These tests cover various scenarios including:
Testing if existing substrings are found in the trie.
Testing if non-existing substrings are not found in the trie.
Testing for edge cases such as empty strings, single character strings, and large strings.
import unittest
class TestSuffixTrie(unittest.TestCase):
def setUp(self):
self.trie = suffixTrie("banana")
def test_contains_existing_substrings(self):
self.assertTrue(self.trie.contains("banana"))
self.assertFalse(self.trie.contains("ban"))
self.assertTrue(self.trie.contains("ana"))
self.assertTrue(self.trie.contains("nana"))
self.assertTrue(self.trie.contains("a"))
def test_contains_non_existing_substrings(self):
self.assertFalse(self.trie.contains("apple"))
self.assertFalse(self.trie.contains("bana"))
self.assertFalse(self.trie.contains("ananas"))
self.assertFalse(self.trie.contains(""))
def test_contains_empty_string(self):
empty_trie = suffixTrie("")
self.assertFalse(empty_trie.contains("")) # Empty string should be present in the trie
self.assertFalse(empty_trie.contains("a")) # Any non-empty string should not be present in the trie
def test_contains_edge_cases(self):
# Test for single character
single_char_trie = suffixTrie("x")
self.assertTrue(single_char_trie.contains("x"))
self.assertFalse(single_char_trie.contains("y"))
# Test for large string
large_str_trie = suffixTrie("abcdefghijklmnopqrstuvwxyz")
self.assertTrue(large_str_trie.contains("wxyz"))
self.assertTrue(large_str_trie.contains("xyz"))
self.assertTrue(large_str_trie.contains("z"))
self.assertFalse(large_str_trie.contains("aaaaa"))
if __name__ == '__main__':
unittest.main()