Modified Quicksort in D


Posted:

In this post , we saw a simple textbook quicksort. Here, we add two improvements. First, we use insertion sort for sorting small subfiles instead of making as many recursive calls to quicksort. Second, we select the pivot using the "median of three" method instead of picking the middle element as the pivot.

import std.algorithm;
import std.stdio;

For the partioning, we first select the median of the first, last, and middle elements of the target array. The steps of doing this affect how the algorithm performs. First, we move the middle element to the next to last location of the array. Then, we swap the elements if needed so that the first is the smallest of the three, the last is the largest, and the next to last is the median.

Then, the partitioning mostly is the same as it was before, except we only partition the subarray between the first and last element, since they are already in valid positions.

We can reuse the old partition code, but we have to remember to offset the return value by one, since we are sending array[1 .. k] into partition, not the full array slice.

This is a place where the array slicing may be more confusing then using explicit indices for the range we want to partition. Trying that approach may be an interesting experiment to learn which is easier.

ulong medianOfThreePartition(T)(T[] array) {
  assert(array.length > 2);

  ulong i = 0;
  ulong j = array.length - 2;
  ulong k = array.length - 1;

  swap(array[array.length/2], array[j]);

  if(array[i] > array[j]) { swap(array[i], array[j]); }
  if(array[j] > array[k]) { swap(array[j], array[k]); }
  if(array[i] > array[j]) { swap(array[i], array[j]); }

  return 1 + partition(array[1 .. k]);
}

Also, we need an insertion sort for the small subfile sorting. This is pretty straightforward, but we are swapping the item to move it into place. If we knew we were sorting something cheap to copy, there may be a more efficient way to do that. Maybe.

void insertionsort(T)(T[] array) {
  foreach(j; 1 .. array.length) {
    // array[0..j] is already sorted, so insert array[j] into the
    // right place.
    ulong i = j;
    while(i > 0 && array[i-1] > array[i]) {
      swap(array[i], array[i-1]);
      --i;
    }
  }
}

Here is the new quicksort function. We have a lower limit on the size of arrays that we will pass on to the medianOfThree function. For anything smaller, we call insertionsort.

T[] quicksort(T)(T[] array) {
  const ulong RECURSION_LIMIT = 30;
  if(array.length > RECURSION_LIMIT) {
    // Move the middle element to the end to act as the pivot.
    swap(array[$/2], array[$-1]);
    auto p = medianOfThreePartition(array);
    // Index p is in position, so now recursively sort the other two
    // halves of the array.
    quicksort(array[0 .. p]);
    quicksort(array[p + 1 .. $]);
  } else {
    insertionsort(array);
  }
  return array;
}

The old test drivers and timing code work well with this code. The only change is renaming the partition function.

This method is slower than the overly simple method of always partitioning about the middle element of the array. I need to look into it more, but I have a few other things to do first. The biggest thing is to implement a fat partitioning function for handling equal items efficiently. Also, making the sort generic on the comparison function looks like it should be easy too.