AVL Trees Big-O video (27 minutes) (Spring 2021)
The Big-O of BST operations correlates with the height of the BST.
"What is the worst case (tallest) height for n nodes?"
This is a difficult question to answer. But, the inverse question can be asked:
"For some height, what is the smallest number of nodes that a tree of that height can have?"
This will give the worst case, or tallest, height for that number of nodes.
To be an AVL tree, each node's subtree must have a left and right subtree that differ in height by no more than 1.
With that constraint in mind, we can plot out a few AVL trees, answering the question that we have posed:
"For some height, what is the smallest number of nodes that a tree of that height can have?"
height = 1 An AVL tree of height 1 can be built with a single node.
O
height = 2 An AVL tree of height 2 can be built with 2 nodes.
O
\
O
height = 2 An AVL tree of height 2 can be built with 2 nodes.
O
\
O
height = 3 An AVL tree of height 3 can be built with 4 nodes.
O
/ \
O O
\
O
height = 4 An AVL tree of height 4 can be built with 7 nodes.
O
/ \
O O
/ / \
O O O
\
O
"For some height, what is the smallest number of nodes that a tree of that height can have?"
We have started to collect data points to answer this question.
height | Nodes for this height
------------------------------
0 | 0
1 | 1
2 | 2
3 | 4
4 | 7
There is also a pattern to be seen in the graphs of these AVL trees. Look at the trees of height 4, 3, and 2.
The tree of height 4 consists of a root node, and a subtree of height 3, and a subtree of height 2. Both subtrees (of height 2 and 3), are the trees that we have graphed, that have the minimum number of nodes to attain this height.
An AVL tree of height h, consists of a root node, an AVL subtree of height h-1, and an AVL subtree of height h-2
Phrasing this as N(h), for, number of nodes for a given height, this can be termed as
N(h) = 1 + N(h - 1) + N(h - 2)
This is a recurrence relation! And it's a hard one to solve.
Solving it for several values:
h | N(h)
---------
0 | 0
1 | 1
2 | 2
3 | 4
4 | 7
5 | 12
6 | 20
7 | 33
8 | 54
There are two things to notice about this trend.
Similarly to the Fibonacci recursive function, the number of nodes for a given height grows by 1.6^h.
(The recurrence relation is expanded to prove/demonstrate this at 18:12 in the lecture video.)
So, n = 1.6^h
So, inversely, h = log(n).
The height of an AVL tree having n nodes is O(log(n)).
Thanks to Brian Foster for writing up these notes based on the video lectures.