C quicksort

I have written quicksort in the past mostly to learn how it works and to play with optimization. (Because most of the time... I just want to call the system sort.)

Sometimes, knowing how this works helps with other problems, like sorting vectors, where we can do better than a full comparison on each step.

All of the other sorts I have written have been in a language that has at least some generic support. This time, though, it is in C, using void *'s.

And it wasn't that bad. This code is not fit for a system library, or anything like that, but it was fun(?) to write and test.

The code stops quicksorting when the subfile gets shorter than 10 lines, and it uses a median-of-three pivot selection.

In the testing I did, an iterative version of this algorithm showed no improvement.

void q4(void   *array,
        size_t  length,
        size_t  size,
        int(*compare)(const void *, const void *))
  if(length < 10) {
    for(void *a = array + size; a < array + size*length; a += size) {
      for(void *b = a;
          b > array && compare(b, b-size) < 0;
          b -= size) {
        swap(b, b-size, size);
  } else {
    void *middle = array + (length / 2) * size;
    void *outer = array + (length-1) * size;
    void *pivot = array + (length-2) * size;
    swap(middle, pivot, size);
    median_of_three(array, pivot, outer, size, compare);

    void *a = array - size;
    void * b = pivot;

    while(a < b) {
      a += size;
      for(;a < b && compare(a, pivot) < 0; a += size){}
      b -= size;
      for(;a < b && compare(pivot, b) < 0; b -= size){}
      if(a < b) {
        swap(a, b, size);

    assert(compare(a, pivot) >= 0);
    swap(a, pivot, size);
    q4(array, (a-array)/size, size, compare);
    q4(a + size, length - (a-array)/size-1, size, compare);

The median of three code leaves the smallest value at a, the middle at b, and the largest at c.

void median_of_three(void *a, void*b, void*c, size_t size,
                     int (*compare)(const void*, const void*)) {
  if(compare(a, b) > 0) { swap(a, b, size); }
  if(compare(b, c) > 0) {
    swap(b, c, size);
    if(compare(a, b) > 0) { swap(a, b, size); }
  assert(compare(a, b) <= 0);
  assert(compare(b, c) <= 0);

And the swap is not efficiency minded...

void swap(char *a, char *b, int size) {
  while(size > 0) {
    char t = *a;
    *a = *b;
    *b = t;
    ++a, ++b, --size;

In most cases, the system qsort on my machine crushes this code.

Customizing Nikola

This “blog” is written using Restructured Text and generated by Nikola. Simple customizations to Nikola are done by modifying files in files/assets/. The files in the files directory are moved verbatim into the output.

The CSS file files/assets/css/custom.css is loaded by the default theme, so simple customizations go there.

The files for the theme, set in the conf.py file, are also copied into the assets directory, as documented in the theme reference. Nikola has an interesting theme system, where a theme can have a parent theme. The files from the parent theme are used, unless the current theme defines the same file.

The base.tmpl file is where the html pages are defined, then the other specialized templates fill in the details.

To get information from the specific templates, like the post.tmpl, into the base, we can put a block in base.tmpl, and then set the values in the block in post.tmpl.

In base.tmpl:

<%block name="extra_information"></%block>

In post.tmpl:

<%block name="extra_information">
% if post.meta('use_extra_data'):
   <h2>  We have extra data!! </h2>
   Doing some special stuff here.
% endif

The data in post.meta comes from the meta data in the post file:

.. use_extra_data: Yes!

Something that tripped me up, is that the if test in the block above is testing if the string exists, not if it is equal to True.

The main reason I was interested in this was to customize on a post-by-post basis.

Pretty print logging in Python

From the Python logging documentation, we can send an arbitrary object into a log message call.

import pprint

class PrettyLog():
    def __init__(self, obj):
        self.obj = obj
    def __repr__(self):
        return pprint.pformat(self.obj)

Then, in the code, we can use it with logging.

import logging

# ...


The string isn't created until it is needed, so if the logging level is not high enough, the only method call is the creation of the PrettyLog object.

My blog process with org

I keep forgetting the steps I need to do for the blogging from this org file, so I am logging the steps here.

The directory organization looks like this:

  • posts

    • 2011

    • 2012

    • 2013

  • pages

  • static

The dates are mainly an organizational tool for me. Inside the dated folders, there are files named like 2012-01-01-an-example-post.org. I am currently using the date in the filename as the date of the post, without a time variable. The rest of the text in the name is again mainly for my organization. The title of the post is kept inside the org file.

The pages folder is for stuff that I don't really consider a post, things that are references that I may go back and update, or eventually get rid of.

For now, I am using some elisp in my emacs config file to create the post index from the files containing the posts. The code creates the listing on demand, instead of every time I publish -- that way I can review the changes before publishing.

Here is the code I use for creating the indices. It takes as a parameter the path to the directory that I want to index. Then, it finds all of the .org files, grabs their titles, and inserts them into the current buffer.

(defun org-dblock-write:my-get-posts (params)
  (let ((path (plist-get params :path)))
    (let ((files
           (mapcar (lambda (f)
                     (concat (file-name-as-directory path)
                   (directory-files path nil "\.*org$"))))
      (setq files (nreverse files))
      (dolist (file files)
        (let ((title (org-publish-find-title file)))
          (message file)
          (message title)
          (insert (format " - [[file:%s][%s]]" file title))

To use the code, I put blocks that look like this one into my index.org file.

** 2013
#+BEGIN: my-get-posts :path "posts/2013"

To run the snippet, inside of the block I use the command org-dblock-update, bound to C-c C-x C-u. Manually calling this is a pain, but then I can verify that the links/text look the way I want.

When I am all done, I call org-publish-project to create the output. Then, next, I look in the publishing directory, with git status to see what has changed. Finally, I commit, push to github, and view.

Stuff to do:
  • make the generated timestamp match the file name, instead of constantly updating the stamp.

  • get more details of the config it here.

Updating Emacs packages

I use ELPA, Marmalade, and MELPA to manage Emacs extensions.

This block of elisp in my config file adds all of the package repositories I use.

(setq package-archives
      '(("gnu" . "http://elpa.gnu.org/packages/")
        ("marmalade" . "http://marmalade-repo.org/packages/")
        ("melpa" . "http://melpa.milkbox.net/packages/")))

Occasionally, I must run M-x package-list-packages to update then pacakge listing. This creates a *Packages* buffer that shows the available and installed packages. In this buffer, type U to mark the upgradeable packages. (note, plan to restart emacs after this process...) Emacs will ask for confirmation on the installation of the new packages, then confirmation on the deletion of the old packages.

Modified Quicksort in D

In this post , we saw a simple textbook quicksort. Here, we add two improvements. First, we use insertion sort for sorting small subfiles instead of making as many recursive calls to quicksort. Second, we select the pivot using the "median of three" method instead of picking the middle element as the pivot.

import std.algorithm;
import std.stdio;

For the partioning, we first select the median of the first, last, and middle elements of the target array. The steps of doing this affect how the algorithm performs. First, we move the middle element to the next to last location of the array. Then, we swap the elements if needed so that the first is the smallest of the three, the last is the largest, and the next to last is the median.

Then, the partitioning mostly is the same as it was before, except we only partition the subarray between the first and last element, since they are already in valid positions.

We can reuse the old partition code, but we have to remember to offset the return value by one, since we are sending array[1 .. k] into partition, not the full array slice.

This is a place where the array slicing may be more confusing then using explicit indices for the range we want to partition. Trying that approach may be an interesting experiment to learn which is easier.

ulong medianOfThreePartition(T)(T[] array) {
  assert(array.length > 2);

  ulong i = 0;
  ulong j = array.length - 2;
  ulong k = array.length - 1;

  swap(array[array.length/2], array[j]);

  if(array[i] > array[j]) { swap(array[i], array[j]); }
  if(array[j] > array[k]) { swap(array[j], array[k]); }
  if(array[i] > array[j]) { swap(array[i], array[j]); }

  return 1 + partition(array[1 .. k]);

Also, we need an insertion sort for the small subfile sorting. This is pretty straightforward, but we are swapping the item to move it into place. If we knew we were sorting something cheap to copy, there may be a more efficient way to do that. Maybe.

void insertionsort(T)(T[] array) {
  foreach(j; 1 .. array.length) {
    // array[0..j] is already sorted, so insert array[j] into the
    // right place.
    ulong i = j;
    while(i > 0 && array[i-1] > array[i]) {
      swap(array[i], array[i-1]);

Here is the new quicksort function. We have a lower limit on the size of arrays that we will pass on to the medianOfThree function. For anything smaller, we call insertionsort.

T[] quicksort(T)(T[] array) {
  const ulong RECURSION_LIMIT = 30;
  if(array.length > RECURSION_LIMIT) {
    // Move the middle element to the end to act as the pivot.
    swap(array[$/2], array[$-1]);
    auto p = medianOfThreePartition(array);
    // Index p is in position, so now recursively sort the other two
    // halves of the array.
    quicksort(array[0 .. p]);
    quicksort(array[p + 1 .. $]);
  } else {
  return array;

The old test drivers and timing code work well with this code. The only change is renaming the partition function.

This method is slower than the overly simple method of always partitioning about the middle element of the array. I need to look into it more, but I have a few other things to do first. The biggest thing is to implement a fat partitioning function for handling equal items efficiently. Also, making the sort generic on the comparison function looks like it should be easy too.

Quicksort in D

I have finally gotten around to learning about the D Programming Language . Here is a toy quicksort, with some uninformed comments about D.

First, we need to import the std.algorithm library for the swap function, and the stdio libraray for printing stuff out.

import std.algorithm;
import std.stdio;

Next, we have the partition function. This shows several things. First, it is templatized on type, though not on comparison function. The (T) in partition(T)(...) marks the function as a template on T.

This code also shows is how D handles arrays. D is aware of array slices, so we can recurse on parts of the array, instead of passing in an array and begin / end indices. This may only be sugar, but it allows stricted error checking in non-release mode, as the D arrays are bounds checked.

The partition function uses the last element of the array as the pivot. The partition function walks two indices, i and j, from the beginning and the end of the array. The condition is that array[x] is less than or equal to the pivot when x is less or equal to i, and array[x] is greater than the pivot when x is greater or equal to j. At the end, we swap the pivot out to where it belongs.

ulong partition(T)(T[] array) {
  if(array.length < 2) { return 0; }

  auto pivot = array[$-1];

  // We are going to increment i, and decrement j in each iteration of
  // the while loop before using them.  So, they must start at the
  // location before where we really want them to be.
  // This seems inconvenient, but it cleans up the possible ending
  // conditions of the while loop.  The ulong cannot really hold a
  // negative number, so this -1 requires special care in the while
  // loop.
  ulong i = -1;
  ulong j = array.length-1;

  // If i == -1, it will wrap around to a really big postive number.
  // So, include a test for it here.  The only time it should be
  // tested is the first iteration, while i still is -1, and the last
  // iteration, when i >= j.
  while(i < j || i == -1) {
    // Find the next element from the start that is larger or equal to
    // the pivot element.
    // The pivot at the the end of the array is a sentinel, so we
    // needn't check for overrunning this array in direction.
    i += 1;
    while(array[i] < pivot) {
      i += 1;

    // Find the next element from the end that is less than or equal
    // to the pivot.  We must guard against running past the i'th
    // element, since we could have an array that was all greater than
    // the pivot.  If j was 0 before this, things would be bad.  That
    // is why the function asserts that the length be more than 1.
    j -= 1;
    while(i < j && array[j] > pivot) {
      j -= 1;

    if(i < j && j < array.length) {
      // Swap the values pointed at by i and j.  Now, array[x] for x
      // <= i is <= pivot, and array[x] for x >= j is >= pivot.
      swap(array[i], array[j]);

  // We know i >= j.
  // 1) If i was incremented to be equal to j, then
  //  i = j = array.length - 1, and array[i] is the only element >=
  //  pivot.
  // 2) Or, i was incremented to j, and array[j] >= pivot from the
  //    previous iteration.  In this case, array[j] >= pivot.
  // 3) Or, i was incremented so that array[i] >= pivot, then j was
  //    decremented till j == i.  So, array[i] >= pivot.
  /// So, array[i] is always greater than or equal to the pivot after
  // the while loop.  And, array[i-1] is less than or equal to pivot,
  // if i > 0.

  // Swap the pivot into the right place.  Everything to the left is
  // <= pivot.  Every thing to the right is >= pivot.
  swap(array[i], array[$-1]);
  // Return the pivot location
  return i;

D has some nice unittest capabilities. Code in a unittest section is only compiled with the -unittest compile flag, and it is run every time the code is executed. There is no way to forget to run it then. Unless, of course, it isn't compiled with the -unittest flag...

For example,

dmd quicksort.d -unittest
./quicksort # runs the tests

So, lets add some tests for the partition function. Imagine, if you wish, that the tests were written before the code.

unittest {
  /// Check that the partitioning is correct.
  auto check_partition = function(int[] array) {
    auto p = partition(array);
    foreach(i; 0 .. p) {
      assert(array[i] <= array[p]);
    foreach(i; p + 1 .. array.length) {
      assert(array[i] >= array[p]);

  auto data = [9,2,3,4,5,6,7,8,5];
  data = [1];
  data = [1, 2];
  data = [2, 1];
  data = [2, 1,1,1,1,1];
  data = [2, 1, 2, 1, 2, 1, 2];
  data[] = 10; // set all values to 10
  writeln("partition tests passed");


These lightweight tests are nice to have inline with the code. More involved tests might not be appropriate here, but tests that are so easy to write make it hard to justify skipping tests. Not that we should justify skipping tests anyways.

A nice statement in those tests was an array expression. data[] = 10 sets all of the entries of data to 10. Arrays in D are much more first class citizens then vectors are in C++.

After the partition function, the rest of quicksort is easy. We partition the data, and recursively sort each partition. At each step, we are removing the partition element from the array, so we know that the total length of the subarrays is always decreasing.

T[] quicksort(T)(T[] array) {
  if(array.length > 1) {
    // Move the middle element to the end to act as the pivot.
    swap(array[$/2], array[$-1]);
    auto p = partition(array);
    // Index p is in position, so now recursively sort the other two
    // halves of the array.
    quicksort(array[0 .. p]);
    quicksort(array[p + 1 .. $]);
  return array;

Again, we can write some unittests to make sure that things seem fine.

void checkSorted(T)(T[] data) {
  foreach(i; 1 .. data.length) {
     assert(data[i] >= data[i-1]);

unittest {
  auto check = function(int[] data) {

  auto data = [1,2,3,4,5];
  data = [5,4,3,2,1];
  data = [];
  data = [1];
  data = new int[1000];
  foreach(i; 0 .. data.length) {
    data[i] = cast(int)(data.length - i);

Finally, we would like some timings to see how we are doing. In the process of writing this code, it was wrong several times. The results were correct, but a bug made the code run in quadratic time. Timing code isn't always for optimization; unexpected timing results can point to subtle bugs.

Until I understand D better, I am commenting / uncommenting code to vary the tests. (Yuck)

Testing with reversed data is kind of a double-edged sword. Reversed data is an optimistic case for the simple "pick the middle element" pivoting, because the middle element is always close to the median, reducing the depth of the recursive calls. When implementing median of three, partitioning, though, the reversed order can find bugs causing quadratic performance.

The =StopWatch= class is convenient here for some coarse timing information. These tests are one shot timing tests, without any statistical validity. Another convenience here is the literal number notation. Underscores can be used in a number to make it read easier, without affecting its value.

import std.datetime;

void timeSort(T)(T[] array) {
  StopWatch sw;
  // sort(array);
  writeln(array.length, "  ", sw.peek().msecs);

  // Also, make sure its right!

int[] MakeOrderedData(int N) {
  auto data = new int[N];
  foreach(i; 0 .. data.length) {
    data[i] = cast(int)(i); // cast(int)(-i);
  return data;

int[] MakeReversedData(int N) {
  auto data = MakeOrderedData(N);
  return data;

void runTimings() {
  auto sizes = [1000, 10_000, 100_000,
                1_000_000, 10_000_000];
  foreach(s; sizes) {
    auto data = MakeReversedData(s);

Finally, a main function.

void main() {

And here are some timing results.













A* in Python

A* is a directed search algorithm. Given a good heuristic function, it can perform much better than a blind breadth-first search.

A* code

First, the main algorithm itself. I like using Python for lots of reasons. One that stands out, though, is that functions are first class. This allows us to write a generic A* function that works on many problems.

(Python is not unique in this. But it is still fun.)

Here is the main solving code.

import heapq
import math
import random
def solve(start, finish, heuristic):
    """Find the shortest path from START to FINISH."""
    heap = []

    link = {} # parent node link
    h = {} # heuristic function cache
    g = {} # shortest path to a node

    g[start] = 0
    h[start] = 0
    link[start] = None

    heapq.heappush(heap, (0, 0, start))
    # keep a count of the  number of steps, and avoid an infinite loop.
    for kk in xrange(1000000):
        f, junk, current = heapq.heappop(heap)
        if current == finish:
            print "distance:", g[current], "steps:", kk
            return g[current], kk, build_path(start, finish, link)

        moves = current.get_moves()
        distance = g[current]
        for mv in moves:
            if mv not in g or g[mv] > distance + 1:
                g[mv] = distance + 1
                if mv not in h:
                    h[mv] = heuristic(mv)
                link[mv] = current
                heapq.heappush(heap, (g[mv] + h[mv], -kk, mv))
        raise Exception("did not find a solution")

solve takes two positions, a start and finish, and a heuristic function. The heuristic function must always return a distance that is less or equal to the actual distance between two positions. The whole algorithm rests on that assumption.

g[mv] holds the length of the shortest known path to mv, and h[mv] holds the estimated distance from mv to finish according to the heuristic function. Then, g[mv] + h[mv] is the current best estimate of the distance from start to finish through mv.

We keep a the nodes to search next in a priority queue, ordered by the the current best estimates of the distance to finish.

solve is a generic function, and it realized on duck-typing to work. The position objects must provide a get_moves method that returns neighboring positions, a hash function __hash__, and an equality operator __eq__.

Whenever the goal position is at the top of the heap we are done. This is because the heap is ordered by the best possible score from a position to the goal. If there is a shorter path to the goal, it would have already been explored, since we explore by the best possible path first. Since it hasn't been seen before, this path is the best.

Note that this isn't true about positions other than the goal. That is because the heuristic provides an optimistic guess for the shortest path from a position to the goal, not the shortest path between any two positions. This is the reason that in the inner loop, we check if we are in the shortest path seen so far to a node. A node may be visited multiple times by A*, with shorter paths to the node each time.

The function build_path reconstructs the path to the solution for us. This is useful for debugging, and because we often want to know the path, not just the path length.

def build_path(start, finish, parent):
    Reconstruct the path from start to finish given
    a dict of parent links.

    x = finish
    xs = [x]
    while x != start:
        x = parent[x]
    return xs

Here is a heuristic that does nothing, just for testing. Using this heuristic will lead to Dijkstra's algorithm, which is a good, but uninformed search algorithm.

def no_heuristic(*args):
   """Dummy heuristic that doesn't do anything"""
   return 0

A grid example

Here is a simple test class where the program must find a path on a rectangular grid. We will assume the grid is infinite, but will also hope that our search doesn't go that far off course.

class GridPosition(object):
    """Represent a position on a grid."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __hash__(self):
        return hash((self.x, self.y))

    def __repr__(self):
        return "GridPosition(%d,%d)" % (self.x, self.y)

    def __eq__(self, other):
        return self.x == other.x and self.y == other.y

    def get_moves(self):
        # There are times when returning this in a shuffled order
        # would help avoid degenerate cases.  For learning, though,
        # life is easier if the algorithm behaves predictably.
        yield GridPosition(self.x + 1, self.y)
        yield GridPosition(self.x, self.y + 1)
        yield GridPosition(self.x - 1, self.y)
        yield GridPosition(self.x, self.y - 1)
grid_start = GridPosition(0,0)
grid_end = GridPosition(10,10)
def grid_test_no_heuristic():
   solve(grid_start, grid_end, no_heuristic)

This gives us:

distance: 20 steps: 840

Now, we can add a heuristic. An obvious one is to Euclidean distance, since the shortest path between two points is a straight line.

def euclidean_h(goal):
    def f(pos):
        dx, dy = pos.x - goal.x, pos.y - goal.y
        return math.hypot(dx, dy)
    return f
def grid_test_euclidean_heuristic():
    solve(grid_start, grid_end, euclidean_h(grid_end))
distance: 20 steps: 134

That result is significantly better.

We can do even better. Since our grid movements are restricted to left, right, up and down, we can use Manhattan distance as the heuristic. In this simple case, Manhattan distance is a perfect heuristic. Adding obstacles, or changing the cost of moving through grid points would keep Manhattan from being perfect.

def manhattan_h(goal):
   def f(pos):
       dx, dy = pos.x - goal.x, pos.y - goal.y
       return abs(dx) + abs(dy)
   return f
def grid_test_manhattan_heuristic():
    solve(grid_start, grid_end, manhattan_h(grid_end))
distance: 20 steps: 20

We found the path without exploring any unnecessary nodes.

Block Puzzle

In the Stanford AI class offered online, they discussed A* in the context of the classic fifteen puzzle, but simplified to just eight pieces.

Here is a class for the block puzzle. Usually the 15 puzzle is used, but the 8 puzzle is a lot faster to solve.

class BlockPuzzle(object):
    def __init__(self, n, xs=None):
        """Create an nxn block puzzle

        Use XS to initialize to a specific state.
        self.n = n
        self.n2 = n * n
        if xs is None:
            self.xs = [(x + 1) % self.n2 for x in xrange(self.n2)]
            self.xs = list(xs)
        self.hsh = None
        self.last_move = []

    def __hash__(self):
        if self.hsh is None:
            self.hsh = hash(tuple(self.xs))
        return self.hsh

    def __repr__(self):
        return "BlockPuzzle(%d, %s)" % (self.n, self.xs)

    def show(self):
        ys = ["%2d" % x for x in self.xs]
        xs = [" ".join(ys[kk:kk+self.n]) for kk in xrange(0,self.n2, self.n)]
        return "\n".join(xs)

    def __eq__(self, other):
        return self.xs == other.xs

    def copy(self):
        return BlockPuzzle(self.n, self.xs)

    def get_moves(self):
        # Find the 0 tile, and then generate any moves we
        # can by sliding another block into its place.
        tile0 = self.xs.index(0)
        def swap(i):
            j = tile0
            tmp = list(self.xs)
            last_move = tmp[i]
            tmp[i], tmp[j] = tmp[j], tmp[i]
            result = BlockPuzzle(self.n, tmp)
            result.last_move = last_move
            return result

        if tile0 - self.n >= 0:
            yield swap(tile0-self.n)
        if tile0 +self.n < self.n2:
            yield swap(tile0+self.n)
        if tile0 % self.n > 0:
            yield swap(tile0-1)
        if tile0 % self.n < self.n-1:
            yield swap(tile0+1)

We also need a way to create a shuffled puzzle. Here is a generic method for shuffling.

def shuffle(position, n):
    for kk in xrange(n):
        xs = list(position.get_moves())
        position = random.choice(xs)
    return position

Now, we need a heuristic. The empty heuristic approach will take too long here.

The first, and simplest heuristic is to count how many tiles are out of place.

def misplaced_h(position):
    """Returns the number of tiles out of place."""
    n2 = position.n2
    c = 0
    for kk in xrange(n2):
        if position.xs[kk] != kk+1:
            c += 1
    return c

Here is a sample run

def test_block_8_misplaced(num_tests):
    for kk in xrange(num_tests):
        p = shuffle(BlockPuzzle(3), 200)
        print p.show()
        solve(p, BlockPuzzle(3), misplaced_h)
 0  2  5
 8  3  7
 1  6  4
 distance: 19 steps: 872
 2  7  0
 6  8  3
 1  5  4
 distance: 19 steps: 958
 1  5  4
 7  2  8
 0  3  6
 distance: 25 steps: 13027
 6  8  7
 2  5  4
 0  1  3
 distance: 27 steps: 26762
 3  8  4
 7  0  6
 2  1  5
distance: 21 steps: 3008

Now, another heuristic is to measure the distance that each tile must move. This heuristic is ok, because we only move one tile at a time, and we know that each tile must move at least this many steps.

def distance_h(position):
    n = position.n
    def row(x): return x / n
    def col(x): return x % n
    score = 0
    for idx, x in enumerate(position.xs):
        if x == 0: continue
        ir,ic = row(idx), col(idx)
        xr,xc = row(x-1), col(x-1)
        score += abs(ir-xr) + abs(ic-xc)
    return score

And another sample run.

def test_block_8_distance(num_tests):
    for kk in xrange(num_tests):
        p = shuffle(BlockPuzzle(3), 200)
        print p.show()
        solve(p, BlockPuzzle(3), distance_h)
 4  7  2
 1  0  5
 6  8  3
distance: 22 steps: 941
 6  5  1
 4  0  3
 2  7  8
distance: 16 steps: 59
 1  3  4
 2  0  5
 7  8  6
distance: 16 steps: 235
 4  5  0
 3  8  2
 6  1  7
distance: 24 steps: 1038
 6  7  2
 5  8  3
 0  1  4
distance: 24 steps: 705

For similar sizes, the results from this heuristic are much better. This testing methodology was pretty poor, though. A more detailed analysis would be better, but this writeup is getting long as it is.

Here is a short sample of one to one comparisons.

def test_block(steps=100, count=5):
    for kk in xrange(count):
        p = shuffle(BlockPuzzle(3), steps)
        print p.show()
        print "misplaced",
        x = solve(p, BlockPuzzle(3), misplaced_h)
        print "distance",
        x = solve(p, BlockPuzzle(3), distance_h)
 2  6  1
 4  8  5
 0  3  7
misplaced distance: 24 steps: 14962
distance distance: 24 steps: 862

Sieve of Eratosthenes in Racket

To learn some Racket I ported this this Ocaml code to Scheme.

Racket has lots of nice features that I want to learn about. The pattern matching brings something to lisp code that I would miss from the ML / Haskell languages. There is a batteries included approach, which helps with the usual knock against scheme that there isn't enough library support. There are also list comprehensions, but they are even more than just list comprehensions.

The comprehensions can walk over lists, vectors, and strings. The type can be specified to make the performance faster. This is my favorite part so far -- they can generate vectors instead of lists, which we will see in this code.

Without further delay, here is a learning attempt with Racket.

#lang racket

(define (sieve n)
      ((limit (round (/ (+ 1 (inexact->exact (round (sqrt n))))
       (m (round (/ (+ 1 n) 2)))
       (ar (make-vector m 1)))

    ;; element 0 is really for number 1, so we do not
    ;; want to drop its multiples.
    (for ([i (in-range 1 limit)])
         (when (= (vector-ref ar i) 1)
               ((p (+ (* 2 i) 1)))
             (for ([j (in-range (* (+ p 1) i) m p)])
                  (vector-set! ar j 0)))))

    ;; Collect all of the non-zero elements
    (let ((result
            ([(x i) (in-indexed (in-vector ar))]
             #:when (> x 0))
            (+ 1 (* 2 i)))))
      ;; now, set the first element to 2, since it is
      ;; currently holding 1.
      (vector-set! result 0 2)

As it stands, this code runs about five times slower than the compiled OCaml, but I am not sure if I am running it in under the JIT compiler yet by doing everything in the REPL.

Edit Outside of the REPL this appears to only take twice as long as the compiled OCaml code.