DSA: Trie

Categories: Data Structure

What is a Trie in DSA?

A trie, often known as a prefix tree, is a tree-like data structure used for fast string searching and prefix matching.
It allows for the quick retrieval of words or keys connected with each node.
The word Trie is derived from reTRIEval, which means finding something or obtaining it.
In a trie, each node represents a character, and the edges connecting the nodes represent the characters in the words.
The root node represents an empty string, and each level of the trie corresponds to a character in the words.

Properties of the Trie for a set of the string:

Root Node: The root node of the trie represents the null node or an empty string. It does not store any character.
Sorted Children: Each child of a node in the trie is typically sorted alphabetically. This means that the children nodes are organized based on the characters they represent, usually in ascending order.
Maximum of 26 Children: Since a trie is commonly used for working with lowercase English letters, each node (except the root) can have a maximum of 26 children. These 26 children correspond to the 26 letters of the alphabet (A to Z).
Node Storage: Each node, except the root, stores one letter of the alphabet. As a string is inserted into the trie, each character in the string is represented by a separate node in the trie. The sequence of nodes from the root to a particular node represents a specific word or prefix.

Advantages of Trie Data Structure:

The trie data structure offers several advantages that make it a valuable choice in various scenarios:

Fast String Operations: Tries provide efficient operations for searching, insertion, deletion, and prefix matching.
Prefix Matching: Tries excel at prefix matching. Given a prefix, we can quickly find all the words in the trie that have that prefix.
Space Efficiency: Tries can be memory-efficient when dealing with numerous words that share common prefixes.
Efficient Pattern Matching: Tries can efficiently handle pattern-matching tasks. By representing a set of patterns in a trie, we can quickly search for words or strings that match any of the patterns.
Versatility: Tries are versatile and can handle a wide range of string-related tasks.
Ease of Implementation: Implementing a trie is relatively straightforward and does not require complex algorithms or data structures. The basic structure of a trie is intuitive and can be implemented using arrays, linked lists, or dictionaries.
Time Complexity Stability: The time complexity of trie operations is typically stable and does not depend on the size of the data set being stored.

Basic operations of Trie:

There are three operations in the Trie:

Insertion of a node :

In the Trie_node, each letter of the input key (word) is entered as an individual. It is worth noting that children point to the next level of Trie nodes.
The key character array serves as a child index.
Set the present node to the referred node if the present node already has a reference to the present letter. Otherwise, make a new node, set the letter to the current letter, and even begin the current node with this new node.
The depth of the trie is determined by the length of the character.

class TrieNode {
    TrieNode[] children;
    boolean isEndOfWord;

    public TrieNode() {
        children = new TrieNode[26]; // Assuming lowercase English letters
        isEndOfWord = false;
    }
}

class Trie {
    TrieNode root;

    public Trie() {
        root = new TrieNode();
    }

    public void insert(String word) {
        insertRecursive(root, word, 0);
    }

    private void insertRecursive(TrieNode node, String word, int index) {
        // Base case: If all characters of the word are processed
        if (index == word.length()) {
            node.isEndOfWord = true;
            return;
        }

        char currentChar = word.charAt(index);
        int childIndex = currentChar - 'a';

        // If the current character is not present in the trie, create a new node
        if (node.children[childIndex] == null) {
            node.children[childIndex] = new TrieNode();
        }

        // Recursively insert the remaining characters
        insertRecursive(node.children[childIndex], word, index + 1);
    }
}

The time complexity of inserting a word into a trie is O(L), where L is the length of the word being inserted. This is because we need to iterate through each character of the word and perform constant-time operations to traverse or create nodes in the trie.
The space complexity of the trie depends on the total number of nodes created to represent the inserted words. In the worst-case scenario, where no common prefixes exist among the words, the space complexity of the trie is O(N*M), where N is the number of inserted words and M is the average length of the words.

Searching a node:

The searching operation checks if a given word exists in the trie.
It involves traversing the trie from the root, following the edges corresponding to the characters in the word being searched.
If at any point a character is not found in the current node’s children, or the traversal reaches the end of the word but the flag indicating the end of the word is not set, it means that the word does not exist in the trie.
Otherwise, if the traversal reaches the end of the word and the end-of-word flag is set, the word is found.

class TrieNode {
    TrieNode[] children;
    boolean isEndOfWord;

    public TrieNode() {
        children = new TrieNode[26]; // Assuming lowercase English letters
        isEndOfWord = false;
    }
}

class Trie {
    TrieNode root;

    public Trie() {
        root = new TrieNode();
    }

    public boolean search(String word) {
        return searchRecursive(root, word, 0);
    }

    private boolean searchRecursive(TrieNode node, String word, int index) {
        // Base case: If all characters of the word are processed
        if (index == word.length()) {
            return node.isEndOfWord;
        }

        char currentChar = word.charAt(index);
        int childIndex = currentChar - 'a';

        // If the current character is not present in the trie, the word does not exist
        if (node.children[childIndex] == null) {
            return false;
        }

        // Recursively search for the remaining characters
        return searchRecursive(node.children[childIndex], word, index + 1);
    }
}

The time complexity of searching a word in a trie is O(L), where L is the length of the word being searched. This is because we need to traverse the trie by following the edges corresponding to the characters in the word. In the worst case, we need to visit each character in the word and perform constant-time operations for each traversal.
The space complexity of the trie searching code is O(1) because it does not use any additional space that scales with the input size. The space usage is constant regardless of the number of words or the length of the words being searched.

Deletion of a node:

A word is removed from the trie through the deletion procedure.
It entails traversing the trie to discover the node matching to the word’s last character. When a word is found, the end-of-word flag is unset to signify that it is no longer present.
If the nodes of the deleted word do not lead to any new words, they can be pruned (removed) to save space.
Pruning is usually done recursively, beginning with the final character of the word and moving up until you find a node with children or another word.

class TrieNode {
    TrieNode[] children;
    boolean isEndOfWord;

    public TrieNode() {
        children = new TrieNode[26]; // Assuming lowercase English letters
        isEndOfWord = false;
    }
}

class Trie {
    TrieNode root;

    public Trie() {
        root = new TrieNode();
    }

    public void delete(String word) {
        deleteRecursive(root, word, 0);
    }

    private boolean deleteRecursive(TrieNode node, String word, int index) {
        // Base case: If all characters of the word are processed
        if (index == word.length()) {
            if (!node.isEndOfWord) {
                // The word does not exist in the trie
                return false;
            }
            node.isEndOfWord = false;
            return isEmptyNode(node);
        }

        char currentChar = word.charAt(index);
        int childIndex = currentChar - 'a';

        // If the current character is not present in the trie, the word does not exist
        if (node.children[childIndex] == null) {
            return false;
        }

        boolean shouldDeleteChildNode = deleteRecursive(node.children[childIndex], word, index + 1);

        // If child node should be deleted, remove the reference from the current node
        if (shouldDeleteChildNode) {
            node.children[childIndex] = null;
            return isEmptyNode(node);
        }

        return false;
    }

    private boolean isEmptyNode(TrieNode node) {
        // Check if the node has any non-null children
        for (TrieNode child : node.children) {
            if (child != null) {
                return false;
            }
        }
        return true;
    }
}

The time complexity of deleting a word from a trie is O(L), where L is the length of the word being deleted. This is because we need to traverse the trie by following the edges corresponding to the characters in the word.
The space complexity of the trie deletion code is O(1) because it does not use any additional space that scales with the input size. The space usage is constant regardless of the number of words or the length of the words being deleted.

Note: also read about Trees: Lowest Common Ancestor

Follow Me

If you like my post please follow me to read my latest post on programming and technology.

https://www.instagram.com/coderz.py/

https://www.facebook.com/coderz.py

Rabecca Fatima

Staying up to the mark is what defines me. Hi all! I’m Rabecca Fatima a keen learner, great enthusiast, ready to take new challenges as stepping stones towards flying colors.