Binary Search Tree Erase Implementation

Binary Search Tree Erase Implementation video (34 minutes) (Spring 2021)

Review of Cases to Consider

Our BinarySearchTree class mimics the std::map data structure, with value storage/retrieval by key. So, in addition to the 3 cases that were identified in the previous module (deleted node has 0, 1, or 2 children), there is a 4th case, where the key to be deleted is not found.

Cases when deleting a certain key:

Node with this key does not exist
Node has 0 children
Node has 1 child
Node has 2 children

Review of General Algorithm

From the previous module, we will replace a deleted node with the largest element in its left subtree.

By definition, this replacement node won't have any right subtree. (Since it's the largest element in the left subtree)

But it may have a left subtree!

This may lead to recursive deletes and replacements.

Example Tree

Here is an example of recursive deletes/replacements happening, as we recursively replace a delete node with the largest element in its left subtree.

We will delete W, the root node.

The original tree:

   W
 /   \
B     Z
 \
  E
 / \
C   P
   /
  G
   \
    J
     \
      L
     /
    K

In the above tree, several recursive deletes/replacements would happen if we were to delete the root node, W. 1. Delete W. W must be replaced with the largest node in its left subtree, which is P. 2. P must be replaced with the largest node in its left subtree, which is L. 3. L must be replaced with the largest node in its left subtree, which is K. 4. Replace L with K. 5. Replace P with L. 6. Replace W with P.

The resulting tree, after having deleted W:

   P
 /   \
B     Z
 \
  E
 / \
C   L
   /
  G
   \
    J
     \
      K

Example code

Full code files are attached to the module.

First, we have a helper function, to identify the largest node in a subtree.

const Node* largest_node() {
  Node *largest;
  for (largest = this; largest->right != nullptr; largest = largest->right) { }
  return largest;
}

Since we are replacing deleted nodes with the largest node in their LEFT subtree, the below function largest_node() is used as follows in the erase algorithm:

const Node* largest = node->left->largest_node();

The complete erase function:

The initial if/else if/ else if blocks handle key not found cases, as well as recursive calls deeper into the tree.

Once we have found a key match (in the final top-level else block), we need to handle the cases of 0, 1, or 2 child nodes. For the case of 1 child, there is additional branching to identify if the single child is the right or left child.

The last case, 2 children, is the most complex, and involves "largest node in left subtree" logic, with recursive deletes and replacements.

static int _erase(const Key& k, Node** node_ptr) {
  Node* node = *node_ptr;
  if (node == nullptr) {
    return 0;
  } else if (k < node->key) {
    return _erase(k, &node->left);
  } else if (k > node->key) {
    return _erase(k, &node->right);
  } else {  // if (k == node->key)
    if (node->left == nullptr && node->right == nullptr) {
      *node_ptr = nullptr;
      delete node;
      return 1;
    } else if (node->left == nullptr) {
      *node_ptr = node->right;
      delete node;
      return 1;
    } else if (node->right == nullptr) {
      *node_ptr = node->left;
      delete node;
      return 1;
    } else {  // node->left and node->right aren't nullptr
      const Node* largest = node->left->largest_node();
      node->key = largest->key;
      node->value = largest->value;
      return _erase(largest->key, &node->left);
    }
  }
}

Thanks

Thanks to Brian Foster for writing up these notes based on the video lectures.