Truly Understanding Mergesort

How it Works – Overview

Merge-sort is typically implemented as a recursive algorithm. Recursive call sort sub-ranges of the input array.

Initially, merge-sort splits the input array into two ranges of roughly the same length:

    sub-range 1
    vvvvv 
a = 7 2 6 4 3 8 9
          ^^^^^^^
          sub-range 2

Then it calls itself on the sub-range on the left-hand side and on the sub-range on the right-hand side in turn. Recursion magic makes sure each of the two sub-ranges is sorted in increasing order of values:

    sub-range 1 
    vvvvv 
a = 2 6 7 3 4 8 9
          ^^^^^^^
            sub-range 2

We now have two ranges that are sorted and we need to merge them together. This is the main part of merge-sort and it is what gives the algorithm its name.

To merge the two ranges, we use additional memory. We copy each range in its own temporary array. We forget about the contents of the array a, as it will be overwritten.

a = * * * * * * *
l = 2 6 7
r = 3 4 8 9 

We fill in the array a from left to right with the elements in increasing order. To do this, we need three pointers, each into one of the three arrays.

          v
a = x x x * * * *
l = x 6 7
      ^
r = x x 8 9
        ^

Everything strictly before the pointers into the temporary arrays has already been processed and saved into the array a. The pointer into a tells us the next position in a that we should fill in. The number of elements processed in l and in r always adds up to the number of elements in a that have been filled in.

The idea is that all elements that have already been placed into a are smaller than all elements that remain to be processed in both l and r. What remains to be done is to merge the remaining elements of l and r into the space remaining in a.

The current elements in l and r, respectively, compete for their place in the array a. In each step, the smallest of them wins and is placed in the next place avaiable in the array a and the pointers are updated accordingly.

            v
a = x x x 6 * * *
l = x 6 7
        ^
r = x x 8 9
        ^

Initially, all three pointers point to the first element into each array.

    v
a = * * * * * * *
l = 2 6 7
    ^
r = 3 4 8 9
    ^

We compare the current elements in l and r. As l and r are sorted, the first elements are the smallest in l and r, respectively. Therefore the smallest of the two is the smallest value overall and needs to go into the first position in a. We copy the smaller value into a and update the two pointers as needed.

      v
a = 2 * * * * * *
l = * 6 7
      ^
r = 3 4 8 9
    ^

We keep going for as long as the pointers into l and r point at valid elements:

        v          (choose 3 instead of 6)
a = 2 3 * * * * *
l = * 6 7
      ^
r = * 4 8 9
      ^

          v        (choose 4 instead of 6)
a = 2 3 4 * * * *
l = * 6 7
      ^
r = * * 8 9
        ^

            v      (choose 6 instead of 8)
a = 2 3 4 6 * * *
l = * * 7
        ^
r = * * 8 9
        ^


              v    (choose 7 instead of 8)
a = 2 3 4 6 7 * *
l = * * *
          ^
r = * * 8 9
        ^ 

At this point, the pointer into l points to an element just past the contents of l. This means that we have finished processing l. It remains to copy the rest of the values in r into the array a.

                  v 
a = 2 3 4 6 7 8 9
l = * * *
          ^
r = * * * *
            ^

We are done with merging the two sub-ranges and we have obtained the sorted array.

Page 1 – Introduction

Page 2 – How it Works

Page 3 – Implementation

Page 4 – Time Complexity