Chapter 9: Trees

Introduction to Hierarchical Data

In our journey through data structures, we have primarily focused on linear arrangements like arrays and linked lists. While these are excellent for simple sequences, they struggle to represent the complex, multi-level relationships we see in the real world. A tree is a non-linear data structure that organizes information into a hierarchy. Imagine a family tree, a company's organizational chart, or the folder structure on your computer; these are all natural examples of trees. In computer science, a tree consists of entities called nodes, which are connected by links called edges. Every tree starts with a single starting point called the root, and from there, it branches out into children, grandchildren, and so on, forming a structure that looks like an upside-down biological tree.

Trees are essential because they combine the best of both worlds: the ordered nature of an array and the flexibility of a linked list. When properly organized, trees allow us to search for data, insert new items, and delete old ones much faster than a simple list would. For instance, in a well-balanced tree containing a million items, you might only need to look at 20 nodes to find exactly what you are looking for. This efficiency is why trees are the backbone of modern databases, file systems, and the complex algorithms used in artificial intelligence and network routing.

Essential Tree Terminology

To work effectively with trees, you must master a specific vocabulary that describes the relationships between nodes. The Root is the absolute top node of the tree and is the only node that has no parent. Any node directly below another node is called its Child, and the node above it is its Parent. Nodes that share the same parent are known as Siblings. As we move down the tree, we eventually reach nodes that have no children at all; these are called Leaf Nodes.

Beyond these basic relationships, we also measure the size and depth of a tree. The Depth of a node is the number of edges from the root to that specific node. The Height of a node is the number of edges on the longest path from that node down to a leaf. The Height of the Tree itself is the height of the root node. We also talk about a Path, which is a sequence of nodes and edges connecting a starting node to a destination. Understanding these measurements is critical because the performance of tree algorithms often depends directly on the tree's height.

Binary Trees: The Power of Two

A Binary Tree is a specialized type of tree where each node can have at most two children, typically referred to as the "left child" and the "right child." This simple restriction makes binary trees incredibly easy to implement and analyze. In code, a binary tree node is usually represented as an object or a structure containing a data value and two pointers (or references) to its children. If a child doesn't exist, the pointer is set to null.

Binary trees come in several important "shapes." A Full Binary Tree is one where every node has either zero or two children—never just one. A Complete Binary Tree is entirely filled at every level, except possibly the very last level, which must be filled from left to right. Finally, a Perfect Binary Tree is a mathematical beauty where all interior nodes have exactly two children and all leaves are at the exact same level. The total number of nodes in a perfect binary tree of height h is always $2^{h+1} - 1$ .

# Python Implementation of a Binary Tree Node
class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

# Building a simple tree:
#       1
#      / \
#     2   3
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)

// TypeScript Implementation
class BinaryNode<T> {
    value: T;
    left: BinaryNode<T> | null = null;
    right: BinaryNode<T> | null = null;

    constructor(value: T) {
        this.value = value;
    }
}

Binary Search Trees (BST): Intelligent Ordering

While a regular binary tree can store data in any order, a Binary Search Tree (BST) enforces a strict rule that makes it incredibly fast for searching. For any given node in a BST, every value in its left subtree must be smaller than the node's value, and every value in its right subtree must be larger. This property must hold true for every single node in the entire tree. Because of this ordering, searching for a value in a BST is very similar to using a physical dictionary: you look at the middle, decide if your target is "smaller" or "larger," and immediately discard half of the remaining possibilities.

The efficiency of a BST is directly tied to its "balance." If you insert numbers in a sorted order (like 1, 2, 3, 4, 5), the tree will grow into a long, thin line that acts just like a slow linked list. However, if the tree is "Balanced"—meaning the left and right sides are roughly equal in height—operations like searching, inserting, and deleting all take $O(\log n)$ time. This means that even with a billion items, you can find your target in about 30 steps.

Core Operations: Search and Insert

Searching in a BST is a recursive process. You compare the value you want with the root. If they match, you're done. If your value is smaller, you repeat the search in the left subtree; if it's larger, you go right. This continues until you either find the node or reach a null pointer, which means the value isn't in the tree. Insertion follows the exact same logic: you "search" for the value until you hit an empty spot (null), and that is exactly where the new node should be placed to maintain the BST property.

// JavaScript Implementation of BST Search and Insert
class BST {
    constructor() {
        this.root = null;
    }

    insert(value) {
        const newNode = { value, left: null, right: null };
        if (!this.root) {
            this.root = newNode;
            return;
        }
        let current = this.root;
        while (true) {
            if (value < current.value) {
                if (!current.left) {
                    current.left = newNode;
                    break;
                }
                current = current.left;
            } else {
                if (!current.right) {
                    current.right = newNode;
                    break;
                }
                current = current.right;
            }
        }
    }

    search(value) {
        let current = this.root;
        while (current) {
            if (value === current.value) return current;
            current = value < current.value ? current.left : current.right;
        }
        return null;
    }
}

The Complexity of Deletion

Deleting a node from a BST is more complex than inserting one because you must ensure the tree remains a valid BST after the node is gone. There are three scenarios to handle. First, if the node is a Leaf, you can simply remove it. Second, if the node has One Child, you just "skip" the node by connecting its parent directly to its child. The third and most difficult case is when the node has Two Children. In this situation, you cannot just remove it without creating a "hole" in the hierarchy. Instead, you find the In-order Successor (the smallest value in the right subtree), copy its value into the node you wanted to delete, and then delete that successor node instead. This clever swap ensures the ordering rules are never broken.

def delete_node(root, key):
    if not root:
        return root
    
    # 1. Navigate to the node
    if key < root.value:
        root.left = delete_node(root.left, key)
    elif key > root.value:
        root.right = delete_node(root.right, key)
    else:
        # 2. Found the node! Handle the cases:
        if not root.left: # No left child or no children
            return root.right
        elif not root.right: # Only left child
            return root.left
        
        # 3. Two children: Find the smallest in the right subtree
        temp = find_min(root.right)
        root.value = temp.value
        root.right = delete_node(root.right, temp.value)
        
    return root

def find_min(node):
    current = node
    while current.left:
        current = current.left
    return current

Advanced Tree Logic: LCA and Balance

As you become more comfortable with trees, you will encounter problems that require looking at the tree's structure as a whole. The Lowest Common Ancestor (LCA) of two nodes is the deepest node in the tree that has both nodes as descendants. In a BST, finding the LCA is easy: it is the first node you encounter whose value lies between the two nodes you are looking for. Another vital concept is Tree Balancing. Because skewed trees perform poorly ( $O(n)$ ), professional developers use specialized "Self-Balancing" trees like AVL Trees or Red-Black Trees. these structures automatically perform "Rotations" whenever an insertion or deletion makes one side significantly heavier than the other, keeping the tree's height—and your code's performance—at an optimal level.

Array Representation: The Compact Alternative

For a specific type of tree called a Complete Binary Tree, we can actually avoid using nodes and pointers entirely and store the data in a simple Array. This is incredibly memory-efficient. In this format, the root is at index 0. For any node at index $i$ , its left child is always at index $2i + 1$ and its right child is at $2i + 2$ . Its parent can be found at the floor of $(i-1) / 2$ . This clever mathematical mapping is the secret behind the Heap data structure, which is used to implement high-priority queues and the efficient Heap Sort algorithm.

// Finding parents and children in an array-based tree
const getLeftChild = (index: number) => 2 * index + 1;
const getRightChild = (index: number) => 2 * index + 2;
const getParent = (index: number) => Math.floor((index - 1) / 2);

// Array: [10, 20, 30, 40, 50]
// 10 is root (0)
// 20, 30 are children of 10
// 40, 50 are children of 20

Key Takeaway: Trees transform linear lists into efficient hierarchies. By mastering the Binary Search Tree property and understanding the importance of balance, you gain the ability to manage massive amounts of data with lightning-fast performance. In the next chapter, we will learn how to "walk" through these structures using tree traversals.