Merge Sort: Recurrence Relation

Merge Sort: Implementation and Recurrence Relation video (41 minutes) (Spring 2021)

Two algorithms are described in this video: Merge, which merges two sorted lists into a new sorted list, and Merge Sort, which uses Merge to sort an unsorted list.

Merge

The Merge algorithm merges two sorted lists into a new sorted list.

The heads of the two lists are compared, and lowest value is taken.

This repeats until one of the lists is depleted.

Then, the other list is copied into the result list.

Merge is not a recursive algorithm. It has linear running time, or O(n), where n is "the size of the two lists"

// Overall, this algorithm is O(L + R) = O(n)
vector<string> merge(
    const vector<string>::iterator left_low, const vector<string>::iterator left_high,
    const vector<string>::iterator right_low, const vector<string>::iterator right_high) {
  // O(1)
  vector<string> result;
  auto left_position = left_low;
  auto right_position = right_low;
  // If L is the size of the left range (left_high - left_low)
  // and R is the size of the right range (right_high - right_low),
  // then this loop is O(L + R) = O(n)
  while (left_position != left_high && right_position != right_high) {
    if (*left_position < *right_position) {
      result.push_back(*left_position);
      ++left_position;
    } else {
      result.push_back(*right_position);
      ++right_position;
    }
  }
  // O(L)
  while (left_position != left_high) {
    result.push_back(*left_position);
    ++left_position;
  }
  // O(R)
  while (right_position != right_high) {
    result.push_back(*right_position);
    ++right_position;
  }
  return result;
}

Merge Sort

Merge Sort is a recursive algorithm that uses the Merge algorithm to sort a SINGLE list. It does this by recursively splitting the single list in half, recursively calling itself on the two halves. Then, each call to merge sort (on progressively larger lists, as the recursive stack closes out) merges the left and right halves of its range.

// T(0) = 1
// T(1) = 1
// T(N) = 1 + N + 2 * T(N / 2)
void merge_sort(vector<string>::iterator low, vector<string>::iterator high) {
  // The few lines below are O(1)
  if (high - low <= 1) {
    return;
  }
  auto middle = low + (high - low) / 2;
  // The recursive calls to merge_sort add "2 * T(N / 2)" to T(N)
  merge_sort(low, middle);
  merge_sort(middle, high);
  // The merge step below adds O(N) steps to T(N)
  // Merge the left and right halves.
  vector<string> merged = merge(low, middle, middle, high); // O(N)
  // The copy step below adds O(N) steps to T(N)
  // Copy merged elements into [low, high); O(N)
  auto it = low;
  for (const string& value : merged) {
    *it = value;
    ++it;
  }
}

For an empty or single element list, merge_sort simply returns. so T(0) = T(1) = 1.

In the general recursive case, there is a constant time middle iterator calculation.

There are two recursive calls, to inputs half the size of original input.

There is a linear time section, of calling merge (O(n)) and copying the values returned from merge to the original input range, via iterator dereferencing.

So, T(n) = 1 + n + 2 * T(n / 2)

Base cases

T(0) = 1

T(1) = 1

General case

T(n) = N + 2 * T(n / 2) (** in this relation, we dropped the " + 1" constant term for simplicity - it won't be the dominating term.)

Expanding this relation out a few iterations to spot a pattern:

T(n) = n + 2T(n / 2)
T(n) = n + 2((n / 2) + T(n / 2) / 2)
T(n) = n + n + 4T(n / 4)
T(n) = 2n + 4T(n / 4)
T(n) = 2n + 4((n / 4) + 2T((n / 4) / 2))
T(n) = 2n + n + 8T(n / 8)
T(n) = 3n + 8T(n / 8)
T(n) = 3n + 8((n / 8) + 2T((n / 8) / 2))
T(n) = 3n + n + 16T(n / 16)
T(n) = 4n + 16T(n / 16)
T(n) = 4n + 16((n / 16) + 2T((n / 16) / 2))
T(n) = 4n + n + 32T(n / 32)
T(n) = 5n + 32T(n / 32)

The leading term seems to be increasing with k. The denominator under n seems to be increasing with 2^k.

T(n) = kn + (2^k)T(n / (2^k))

Now we try to write T(n) in terms of things we already know, like T(0) or T(1). We can do this by solving for k such that T(n / (2^k)) = T(0) or T(n / (2^k)) = T(1). Solving for n / (2^k) = 1 is easier, so we do that:

n / (2^k) = 1
n = (2^k)
log2(n) = k

We can substitute this into the relation:

T(n) = kn + (2^k)T(n/(2^k))
T(n) = (log2(n))n + (2^(log2(n)))T(n/(2^(log2(n))))
(Exponential Identity: 2^(log2(n)) = n)
T(n) = (log2(n))n + (n)T(n/(n))
T(n) = (log2(n))n + (n)T(1)
T(n) = (log2(n))n + (n) * 1
T(n) = n * log2(n) + n
O(T(n)) = O(n * log2(n) + n)
O(T(n)) = O(n * log2(n))
O(T(n)) = O(nlog(n))

So merge sort is O(nlog(n)), or linearithmic complexity.

Example code

The complete example code, with merge, merge_sort, and client code:

simple_merge_sort.cpp

#include <iostream>
#include <vector>

using std::cout;
using std::endl;
using std::string;
using std::vector;


// Overall, this algorithm is O(L + R) = O(n)
vector<string> merge(
    const vector<string>::iterator left_low, const vector<string>::iterator left_high,
    const vector<string>::iterator right_low, const vector<string>::iterator right_high) {
  // O(1)
  vector<string> result;
  auto left_position = left_low;
  auto right_position = right_low;
  // If L is the size of the left range (left_high - left_low)
  // and R is the size of the right range (right_high - right_low),
  // then this loop is O(L + R) = O(n)
  while (left_position != left_high && right_position != right_high) {
    if (*left_position < *right_position) {
      result.push_back(*left_position);
      ++left_position;
    } else {
      result.push_back(*right_position);
      ++right_position;
    }
  }
  // O(L)
  while (left_position != left_high) {
    result.push_back(*left_position);
    ++left_position;
  }
  // O(R)
  while (right_position != right_high) {
    result.push_back(*right_position);
    ++right_position;
  }
  return result;
}


// T(0) = 1
// T(1) = 1
// T(N) = 1 + N + 2 * T(N / 2)
void merge_sort(vector<string>::iterator low, vector<string>::iterator high) {
  // The few lines below are O(1)
  if (high - low <= 1) {
    return;
  }
  auto middle = low + (high - low) / 2;
  // The recursive calls to merge_sort add "2 * T(N / 2)" to T(N)
  merge_sort(low, middle);
  merge_sort(middle, high);
  // The merge step below adds O(N) steps to T(N)
  // Merge the left and right halves.
  vector<string> merged = merge(low, middle, middle, high); // O(N)
  // The copy step below adds O(N) steps to T(N)
  // Copy merged elements into [low, high); O(N)
  auto it = low;
  for (const string& value : merged) {
    *it = value;
    ++it;
  }
}


int main() {
  vector<string> words;
  words.push_back("cherry");
  words.push_back("durian");
  words.push_back("apple");
  words.push_back("egg plant");
  words.push_back("banana");

  merge_sort(words.begin(), words.end());
  for (const string& w : words) {
    cout << w << endl;
  }
  cout << endl;
  return 0;
}
$ clang++ -pedantic -Wall -lm -std=c++20 -o simple_merge_sort simple_merge_sort.cpp
$ ./simple_merge_sort
apple
banana
cherry
durian
egg plant

Thanks

Thanks to Brian Foster for writing up the initial draft of these notes.