Contents: Trees, Binary Trees, Tree Traversal, Search, Binary Search Tree, AVL Trees, Complexity Analysis
University of Michigan at Ann Arbor
Last Edit Date: 05/31/2023
Disclaimer and Terms of Use:
We do not guarantee the accuracy and completeness of the summary content. Some of the course material may not be included, and some of the content in the summary may not be correct. You should use this file properly and legally. We are not responsible for any results from using this file.
This personal note is adapted from Professors Marcus Darden and David Paoletti. Please contact us to delete this file if you believe your rights have been violated.
This work is licensed under a Creative Commons Attribution 4.0 International License.
Trees
i. Introduction
A graph consists of nodes (sometimes called vertices) connected together by edges.
Each node can contain some data.
A tree is:
a connected graph (nodes + edges) without cycles
a graph where any two nodes are connected by a unique simple path (the two definitions are equivalent)
In a directed tree, we can identify child and parent relationships between nodes.
In a binary tree, a node has at most two children.
ii. Types of Trees
(Simple) Tree
Acyclic connected graph
Considered undirected
Rooted Tree
A simple tree with a selected node (root)
All edges are directed away from root
Any node could be selected as root.
iii. Tree Terminology
iv. Trees are Recursive Structures
Any subtree is just as much a "tree" as the original!
v. Tree Properties
Height
height(empty) = 0
height(node) = max(height(left_child), height(right_child)) + 1
Size
size(empty) = 0
size(node) = size(left_child) + size(right_child) + 1
Depth
depth(empty) = 0
depth(node) = depth(parent) + 1
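These recurrences translate directly into short recursive functions. Below is a minimal sketch, assuming a pointer-based node like the one used later in these notes (the struct and function names here are illustrative):

#include <algorithm>  // std::max

template <class KEY>
struct Node {
    KEY key;
    Node *left = nullptr;
    Node *right = nullptr;
};

// height(empty) = 0; height(node) = max(height(left), height(right)) + 1
template <class KEY>
int height(const Node<KEY> *n) {
    if (!n) return 0;
    return std::max(height(n->left), height(n->right)) + 1;
}

// size(empty) = 0; size(node) = size(left) + size(right) + 1
template <class KEY>
int size(const Node<KEY> *n) {
    if (!n) return 0;
    return size(n->left) + size(n->right) + 1;
}

Computing depth(node) = depth(parent) + 1 requires a parent pointer (see the parent-pointer Node variant later in these notes) or can be passed down as an argument while descending from the root.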
Binary Trees
i. Introduction
Ordered Tree: linear ordering for the children of each node
Binary Tree: ordered tree in which every node has at most two children
Multiple binary tree implementation styles
ii. Complete Binary Tree Property
Definition: A binary tree with depth d where
Tree depths 1, 2, ..., d - 1 have the max number of nodes possible
All internal nodes are to the left of the external nodes at depth d - 1
That is, all leaves are at depth d - 1 or are leftmost at depth d
iii. Binary Tree Implementation
Binary Tree Array Implementation
Root at index 1
Left child of node i at 2 * i
Right child of node i at (2 * i) + 1
Some indices may be skipped
Can be space prohibitive for sparse trees
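The index arithmetic above can be captured in a few one-line helpers; a minimal sketch using the 1-based scheme described here (the function names are illustrative):

#include <cstddef>

std::size_t left_child(std::size_t i)  { return 2 * i; }      // left child of node i
std::size_t right_child(std::size_t i) { return 2 * i + 1; }  // right child of node i
std::size_t parent_of(std::size_t i)   { return i / 2; }      // parent of node i (root is i == 1)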
Complexity Analysis
Operation | Big-O |
---|---|
Insert Key (best case) | $$O(1)$$ |
Insert Key (worst case) | $$O(n)$$ |
Remove Key (worst case) | $$O(n)$$ |
Parent | $$O(1)$$ |
Child | $$O(1)$$ |
Space (best case) | $$O(n)$$ |
Space (worst case) | $$O(2^n)$$ |
Binary Tree Pointer-based Implementation
template <class KEY>
struct Node {
KEY key;
Node *left = nullptr;
Node *right = nullptr;
Node(const KEY &k) : key{k} {}
}; // Node
A node contains some information, and points to its left child node and right child node
Efficient for moving down a tree from parent to child
Complexity Analysis
Operation | Big-O |
---|---|
Insert Key (best case) | $$O(1)$$ |
Insert Key (worst case) | $$O(n)$$ |
Remove Key (worst case) | $$O(n)$$ |
Parent | $$O(n)$$ |
Child | $$O(1)$$ |
Space (best case) | $$O(n)$$ |
Space (worst case) | $$O(n)$$ |
Another way to do it (not common)
template <class KEY>
struct Node {
KEY key;
Node *parent, *left, *right;
}; // Node
If the node is the root, then parent is nullptr
If the node is external (a leaf), then left and right are nullptr
iv. Translating General Trees into Binary Trees
T: General tree
T': Binary tree
Intuition: recurse through the general tree; in T', the left pointer leads to a new "generation" (the node's first child) and the right pointer leads to the next sibling, as sketched below.
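This encoding is often called the left-child / right-sibling representation and can be written as an ordinary binary node; a minimal sketch (struct and field names are illustrative):

// Left-child / right-sibling encoding of a general tree node
template <class KEY>
struct LCRSNode {
    KEY key;
    LCRSNode *left = nullptr;   // first child (next "generation")
    LCRSNode *right = nullptr;  // next sibling under the same parent
};

Following left pointers walks down a generation; following right pointers walks across the siblings of one parent.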
Tree Traversal
A systematic method of processing every node in a tree
Preorder
Visit node
Recursively visit left subtree
Recursively visit right subtree
template <class KEY>
struct Node {
    KEY key;
    Node *left = nullptr;
    Node *right = nullptr;
};

void preorder(Node *p) {
    if (!p) return;
    visit(p->key);
    preorder(p->left);
    preorder(p->right);
} // preorder()
Inorder
Recursively visit left subtree
Visit node
Recursively visit right subtree
template <class KEY>
struct Node {
    KEY key;
    Node *left = nullptr;
    Node *right = nullptr;
};

void inorder(Node *p) {
    if (!p) return;
    inorder(p->left);   // recurse left first
    visit(p->key);      // then visit the node
    inorder(p->right);  // then recurse right
} // inorder()
Postorder
Recursively visit left subtree
Recursively visit right subtree
Visit node
template <class KEY>
struct Node {
    KEY key;
    Node *left = nullptr;
    Node *right = nullptr;
};

void postorder(Node *p) {
    if (!p) return;
    postorder(p->left);   // recurse left first
    postorder(p->right);  // then recurse right
    visit(p->key);        // visit the node last
} // postorder()
Level order
Summary:
Depth-first traversal (uses a stack or recursion; always goes left before right)
Preorder
Inorder
Postorder
Breadth-first traversal (uses a queue)
Iterative Traversal
void levelorder(Node *p) {
    if (!p) return;
    queue<Node *> q;                      // requires <queue>
    q.push(p);                            // start from the root
    while (!q.empty()) {
        Node *n = q.front();
        q.pop();
        cout << n->key << " ";            // visit the node
        if (n->left) q.push(n->left);     // enqueue children left to right
        if (n->right) q.push(n->right);
    }
} // levelorder()
Search
Retrieval of a particular piece of information from large volumes of previously stored data
Purpose is typically to access information within the item (not just the key)
Recall that arrays and linked lists are worst-case $O(n)$ for either searching or inserting
Even a hash table has worst-case $O(n)$
Need a data structure with optimal efficiency for searching and inserting
Symbol Table: an ADT supporting search and insert
Also may want construct, test if empty, destroy, copy, ...
Binary Search Tree
The keys in a binary search tree satisfy the Binary Search Tree Property
The key of any node is:
$>$ keys of all nodes in its left subtree
$\le$ keys of all nodes in its right subtree
insert() is as easy to implement as search()
Traversal
Sort
Just do an inorder traversal; it visits the keys in ascending order.
Search
// return a pointer to the node with key k if
// one exists; otherwise, return nullptr
Node *tree_search(Node *x, KEY k) {
    while (x != nullptr && k != x->key) {
        if (k < x->key)
            x = x->left;
        else
            x = x->right;
    } // while
    return x;
} // tree_search()
To search for a key k, we trace a downward path starting at the root
The next node visited depends on the outcome of the comparison of k with the key of the current node
Complexity Analysis:
Complexity is $O(h)$, where h is the (maximum) height of the tree
Average case complexity: $O(\log n)$ (balanced tree)
Worst case complexity: $O(n)$ ("stick" tree)
Insert
Start at the root and trace a path downwards, looking for a null pointer at which to append the new node.
When we have duplicates, we need a deterministic policy for which side they go: branch with (<=, >) or with (<, >=). The STL, and the sample code below, use (<, >=), so duplicate keys go to the right.
Sample Insert Code:
void tree_insert(Node *&x, KEY k) {
    if (x == nullptr)
        x = new Node(k);          // found the null link: attach the new leaf here
    else if (k < x->key)
        tree_insert(x->left, k);
    else
        tree_insert(x->right, k);
} // tree_insert()
New node inserted at leaf
Note the use of reference-to-pointer-to-Node: assigning to x when it is null updates the parent's (or the root's) actual pointer
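As a usage sketch of the two ideas above ("BST sort" via an inorder traversal, and inserting through a reference-to-pointer), the following self-contained example follows the same insertion policy as tree_insert; the names here are illustrative, not the course's code:

#include <iostream>

struct IntNode {
    int key;
    IntNode *left = nullptr;
    IntNode *right = nullptr;
};

// Same policy as tree_insert above: duplicates go right;
// the reference-to-pointer lets us overwrite the parent's null link.
void insert(IntNode *&x, int k) {
    if (x == nullptr) x = new IntNode{k};
    else if (k < x->key) insert(x->left, k);
    else insert(x->right, k);
}

// Inorder traversal prints the keys in ascending order.
void print_inorder(const IntNode *x) {
    if (!x) return;
    print_inorder(x->left);
    std::cout << x->key << ' ';
    print_inorder(x->right);
}

int main() {
    IntNode *root = nullptr;
    int keys[] = {5, 2, 8, 1, 3};
    for (int k : keys) insert(root, k);
    print_inorder(root);   // prints: 1 2 3 5 8
    std::cout << '\n';
    return 0;              // nodes are intentionally leaked in this tiny demo
}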
Complexity Analysis
The complexity of insert (and many other tree functions) depends on the height of the tree
Average-case (balanced): $O(\log n)$
Random data
Likely to be well-balanced
Worst-case (unbalanced "stick"): $O(n)$ (there are about $2^{n-1}$ possible worst cases, where $n$ is the number of keys)
Remove
To remove node z:
z has no children (trivial)
z has no left child: replace z by its right child
z has no right child: replace z by its left child
z has two children
Replace with a "combined" tree of both
Key observation
All in LHS subtree < all in RHS subtree
Transplant smallest RHS node to root
Called the inorder successor
Must be some such node, since RHS is not empty
New root might have a right child, but no left child
Make new root's left child the LHS subtree
Sample Code
template <class T>
void BinaryTree<T>::remove(Node *&tree, const T &val) {
    Node *nodeToDelete = tree;
    Node *inorderSuccessor;
    // Recursively find the node containing the value to remove
    if (tree == nullptr)
        return;
    else if (val < tree->value)
        remove(tree->left, val);
    else if (tree->value < val)
        remove(tree->right, val);
    else {
        // Check for simple cases where at least one subtree is empty
        if (tree->left == nullptr) {
            tree = tree->right;
            delete nodeToDelete;
        } // if
        else if (tree->right == nullptr) {
            tree = tree->left;
            delete nodeToDelete;
        } // else if
        else {
            // Node to delete has both left and right subtrees
            inorderSuccessor = tree->right;
            while (inorderSuccessor->left != nullptr)
                inorderSuccessor = inorderSuccessor->left;
            // Replace value with the inorder successor's value
            nodeToDelete->value = inorderSuccessor->value;
            // Remove the inorder successor from the right subtree
            remove(tree->right, inorderSuccessor->value);
        } // else
    } // else
} // BinaryTree<T>::remove()
AVL Trees
Self-balancing Binary Search Tree
Named for Adelson-Velsky and Landis
Start with a BST
Add the Height Balance Property
For every internal node v of T, the heights of the children of v differ by at most 1
Use rotations to correct imbalance
Worst-case search / insert is now $O(\log n)$
Tree Height
Measured upward from the leaf nodes, all of which have height equal to 1
Independent from depth
Recursive formula for a recursive data structure
height(empty) = 0
height(node) = max(height(children)) + 1
Proof:
$h$ is the height of tree
$n(h)$ is the minimum number of nodes in an AVL tree of height $h$
$n(0) = 0$ (empty tree)
$n(1) = 1$ (root-only tree)
For $h > 1$, an AVL tree of height $h$ contains:
Root Node
AVL subtree of height $h - 1$
AVL subtree of height $h - 2$
Thus for $h > 1$, $n(h) = 1 + n(h - 1) + n(h - 2)$
Since $n(h - 1) > n(h - 2)$, we have $n(h) > 2n(h - 2)$, and by induction $n(h) > 2^i\, n(h - 2i)$
Closed-form bound: $n(h) > 2^{h/2 - 1}$
Taking logarithms: $h < 2 \log n(h) + 2$
Thus the height of the tree is $O(\log n)$
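Filling in the last two steps explicitly (a sketch, using $n(h - 2i) \ge 1$ when $h - 2i \ge 1$):

$$n(h) > 2\,n(h - 2) > 2^2\,n(h - 4) > \cdots > 2^i\,n(h - 2i)$$

Choosing $i \approx h/2 - 1$ so that $h - 2i \le 2$ gives $n(h) > 2^{h/2 - 1}$; taking base-2 logarithms, $\log_2 n(h) > h/2 - 1$, i.e. $h < 2 \log_2 n(h) + 2$, so $h$ is $O(\log n)$.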
AVL Tree Algorithm
Search is the same as a BST
Sort (same as BST)
Insert nodes one at a time
Perform an inorder traversal
Insert
Basic idea:
Insert like a BST
Rearrange tree to balance height
Each node records its height
Can compute a node's balance factor
balance(n) = height(n->left) - height(n->right)
A node that is AVL-balanced:
balance(n) = 0: both subtrees equal height
balance(n) = 1: left subtree taller by one
balance(n) = -1: right subtree taller by one
|balance(n)| > 1: node is out of balance
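A minimal sketch of a node that stores its height so that balance(n) is an O(1) computation (the AVLNode struct and helper names are illustrative, not the course's implementation):

#include <algorithm>  // std::max

template <class KEY>
struct AVLNode {
    KEY key;
    AVLNode *left = nullptr;
    AVLNode *right = nullptr;
    int height = 1;            // a leaf has height 1; an empty subtree has height 0
};

template <class KEY>
int height(const AVLNode<KEY> *n) { return n ? n->height : 0; }

// balance(n) = height(n->left) - height(n->right)
template <class KEY>
int balance(const AVLNode<KEY> *n) {
    return n ? height(n->left) - height(n->right) : 0;
}

// Refresh the stored height after a child subtree changes.
template <class KEY>
void update_height(AVLNode<KEY> *n) {
    if (n) n->height = std::max(height(n->left), height(n->right)) + 1;
}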
Rotation
We use rotations to rebalance the binary tree:
Interchange the role of a parent and one of its children
BUT preserve the BST ordering property among the keys in the nodes
Rotation is a local change involving only three pointers and two nodes
The second part (preserving the ordering) is the tricky part:
Right rotation: copy the right pointer of the left child to be the left pointer of the old parent
Left rotation: copy the left pointer of the right child to be the right pointer of the old parent
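A sketch of the two single rotations, written against the AVLNode and update_height() helpers from the sketch above (illustrative, not the course's code); the node is passed as a reference-to-pointer so the parent's link is rewired in place:

// Right rotation at n: n's left child becomes the new subtree root.
template <class KEY>
void rotateR(AVLNode<KEY> *&n) {
    AVLNode<KEY> *newRoot = n->left;
    n->left = newRoot->right;   // right pointer of the left child -> left pointer of old parent
    newRoot->right = n;         // old parent becomes the right child
    update_height(n);
    update_height(newRoot);
    n = newRoot;                // parent link now points at the new subtree root
}

// Left rotation at n: mirror image of rotateR.
template <class KEY>
void rotateL(AVLNode<KEY> *&n) {
    AVLNode<KEY> *newRoot = n->right;
    n->right = newRoot->left;   // left pointer of the right child -> right pointer of old parent
    newRoot->left = n;          // old parent becomes the left child
    update_height(n);
    update_height(newRoot);
    n = newRoot;
}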
Insert
Four cases
Single left rotation
Single right rotation
Double rotation right-left: RR(c), then RL(a)
Double rotation left-right: RL(a), then RR(c)
Checking and Balancing
Pseudocode to check and balance
Algorithm checkAndBalance(Node *n)
    if balance(n) > +1
        if balance(n->left) < 0
            rotateL(n->left)
        rotateR(n)
    else if balance(n) < -1
        if balance(n->right) > 0
            rotateR(n->right)
        rotateL(n)
Outermost if:
Question: Is node out of balance?
Answer: > +1: left too big; < -1: right too big
Inner ifs:
Question: Do we need a double rotation?
Answer: Only if signs disagree
As insert finishes, after every recursive call, update the height of the current node, then call checkAndBalance() on every node along the insertion path
Time complexity: Constant time at every node
Number of nodes that need to be fixed after an insertion: at most one (sometimes zero)
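Following the pseudocode above, a C++ sketch built on the AVLNode, balance(), update_height(), rotateL(), and rotateR() helpers from the earlier sketches (again illustrative):

// Rebalance one node whose balance factor has left {-1, 0, +1}.
template <class KEY>
void checkAndBalance(AVLNode<KEY> *&n) {
    if (!n) return;
    if (balance(n) > 1) {            // left subtree too tall
        if (balance(n->left) < 0)    // signs disagree: double rotation needed
            rotateL(n->left);
        rotateR(n);
    } else if (balance(n) < -1) {    // right subtree too tall
        if (balance(n->right) > 0)   // signs disagree: double rotation needed
            rotateR(n->right);
        rotateL(n);
    }
}

As the recursive insert unwinds, each node on the insertion path gets update_height() followed by checkAndBalance().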
AVL Tree Removal
Remove like a BST
Key observation: All keys in LHS $\le$ all keys in RHS
Rearrange RHS so that its smallest node is root
Must be some such node, since RHS is not empty
New RHS root has a right child, but no left child
Make the RHS root's left child the LHS root
Rearrange tree to balance height
Travel up the tree from the parent of the removed node
At every unbalanced node encountered, rotate as needed
This restructuring may unbalance multiple ancestors, so continue check and rebalance up to the root.
Rebalancing: 1 or More?
A “fix” (single or double rotation) shortens a subtree that is too tall
When inserting, the new node is the source of any imbalance, and a single fix will counteract it and repair the entire tree
When removing, a deleted node can only create an imbalance by making subtrees shorter
If a shortened subtree was already shorter than its sibling, a fix is needed at a higher level, so multiple fixes may be required
Complexity Analysis
Binary Search Tree
Worst case insert or search is $O(n)$ (unbalanced "stick" tree)
AVL Tree
Worst case insert or search is $O(\log n)$
Must guarantee height balance property
Operations
Search: $O(\log n)$ (same algorithm as BST, but faster)
Insert: $O(\log n)$ (starts like BST, then may rebalance)
Remove: $O(\log n)$ (starts like BST, then may rebalance)
Sort: $O(n \log n)$ to build the tree, $O(n)$ to do an inorder traversal
Rotation
$O(1)$ cost for single rotation
$O(1)$ cost for double rotation