Binary Search Tree: Insert and Contains Implementation video (41 minutes) (Spring 2021)
A Tree is a graph of nodes with edges to child nodes.
Trees are acyclic and hierarchical.
Simple file systems (without symlinks) and family trees are two examples of trees while city intersections and the roads that connect them are not trees.
A binary tree is a tree in which each node has 0, 1, or 2 child nodes (0, 1, or 2 edges further down in the tree).
A linked list can be thought of a unary tree, which each node having a number of 0 or 1 child nodes.
A Binary Search Tree (BST) is a binary tree with a certain organization characteristic.
Everything to the left of a node is smaller. Everything to the right of a node is bigger.
Searching a balanced BST is efficient. In the worst case, you travel to the bottom of a tree. Similar to binary search, since each node in a balanced BST has 2 children, the height of a BST having n elements is log2(n).
If a BST is un-balanced, search is less efficient. If a BST has only right children, the height of the BST with n elements is n. And you would have to visit each element to search in the worst case, making search O(n).
This is a worst case for BST insertion! If you are inserting elements that are pre-sorted, the BST will become an expensive linked list, with all right children, or all left children. The height of the BST having n elements would be n. Both .insert()
and .search()
would be O(n).
A balanced tree has an enforcement that its height, with n elements, is close to log2(n). In other words, the size of a left subtree is roughly the size of a right subtree.
A balanced BST underlies the implementations of std::set
and std::map
. This gives std::set
and std::map
efficient insert and contains operations, that are O(log(n)).
The below code examples are the simplest implementations of BST in this set of videos. This code will be expanded, for Nodes to contain both a Key and a Value (and thus, mimic the API of std::map
), in later videos. This code can be thought to mimic the API of std::set
.
.print()
, .contains()
, and .insert()
follow a common pattern. For the BinarySearchTree class itself, there is an entry point function, that calls a Node-scoped version of the function, on the root node of the tree.
.contains()
and .insert()
use a pre-order processing order. Data in the node is examined first. If needed, recursive calls to the left, then right, subtrees are made.
.print()
prints the values in a BST out in order. Thus, .print()
uses an in-order processing order. First, the left subtree is recursively examined. Then, data in the node is examined. Finally, the right subtree is recursively examined.
#ifndef BST_H_
#define BST_H_
#include <iostream>
using std::cout;
using std::endl;
template <typename T>
class BinarySearchTree {
struct Node {
T value;
Node *left, *right;
Node(const T& value) : value(value), left(nullptr), right(nullptr) {}
void print() {
if (left != nullptr) {
left->print();
}
cout << value << endl;
if (right != nullptr) {
right->print();
}
}
bool contains(const T& v) {
if (v == value) {
return true;
} else if (v < value) {
if (left == nullptr) {
return false;
} else {
return left->contains(v);
}
} else { // if (v > value)
if (right == nullptr) {
return false;
} else {
return right->contains(v);
}
}
}
bool insert(const T& new_value) {
if (new_value == value) {
return false;
} else if (new_value < value) { // Need to insert to the left
if (left == nullptr) {
left = new Node(new_value);
return true;
} else {
return left->insert(new_value);
}
} else { // if (new_value > value) Need to insert to the right
if (right == nullptr) {
right = new Node(new_value);
return true;
} else {
return right->insert(new_value);
}
}
}
};
Node *root;
public:
BinarySearchTree() : root(nullptr) {}
~BinarySearchTree() {
delete root;
}
void print() {
if (root != nullptr) {
root->print();
}
}
bool insert(const T& value) {
if (root == nullptr) {
root = new Node(value);
return true;
} else {
return root->insert(value);
}
}
bool contains(const T& value) {
if (root == nullptr) {
return false;
} else {
return root->contains(value);
}
}
};
#endif /* BST_H_ */
bst_insert_contains_demo.cpp#include <iostream>
#include "bst.h"
using namespace std;
int main() {
BinarySearchTree<int> bst;
bst.insert(11);
bst.insert(55);
bst.insert(99);
bst.print();
cout << endl;
for (int x : {0, 11, 33, 55, 88, 99, 1000}) {
cout << "bst.contains(" << x << ") == " << bst.contains(x) << endl;
}
return 0;
}
$ clang++ -pedantic -Wall -lm -std=c++20 -o bst_insert_contains_demo bst_insert_contains_demo.cpp
$ ./bst_insert_contains_demo
11 55 99 bst.contains(0) == 0 bst.contains(11) == 1 bst.contains(33) == 0 bst.contains(55) == 1 bst.contains(88) == 0 bst.contains(99) == 1 bst.contains(1000) == 0
Thanks to Brian Foster for typing up the rough draft of these notes based on the lecture video.