## 2007-09-26

### On Quicksort

An array is a data-type complying with the following interface:

```module type ARRAY =
sig
type 'a t
val create : int -> 'a -> 'a t
val length : 'a t -> int
val get : 'a t -> int -> 'a
val set : 'a t -> int -> 'a -> unit
end```

together with a number of predicates that characterize the array as a data structure that, in fact, stores data. The OCaml built-in `Array` module satisfies the definition:

```module Arr : ARRAY with type 'a t = 'a array =
struct
type 'a t = 'a array
include Array
end```

Also, `Bigarray`s or dynamic vectors can be fitted easily to this interface.

It is well-known that to sort the elements a.(li < h) of an array `a` between indexes `l` and `h` we can use the in-place Quicksort:

```let rec quicksort a l h =
if h - l > 1 then
let m = partition a l h in begin
quicksort a l m;
quicksort a m h
end```

for a suitable definition of `partition` ("suitable", here, means that after the call, a.i ≤ a.m < a.j for li < mj < h). With this in mind, we would like to write a generic (on the actual data structure with array semantics), in-place Quicksort, with the following declaration:

```module Quicksort : functor (A : ARRAY) ->
sig
val qsort : 'a A.t -> unit
val quicksort : 'a A.t -> int -> int -> unit
end```

with the intention of using `qsort a` as a shortcut for `quicksort a 0 (A.length a)`.

There are many ways to find the pivot a.m that allows partitioning the array a in disjoint pieces. The ultimate goal is to make those pieces as close as "halves" of a as possible; for heavily skewed partitions, Quicksort devolves into a very poorly performing algorithm.

Aside: This is why it is a better a priori engineering decision to use Heapsort, and not Quicksort, as the standard library sort routine. Even if the best-case cost of the former is more than twice that of the latter, Heapsort has guaranteed cost Θ(n⋅log.n).

An often-recommended choice of pivot is the median of elements a.l, a.h and a.m, for some index l < m < h. Any choice of m does the trick equally well (or rather, there is no objective choice of one over any other); for symmetry, we settle for m = ⌊ (l + h) / 2 ⌋.

The first task is to actually select the median element. Since at all times `partition` must satisfy the ordering condition, there is no penalty to sort the elements used to find the pivot. The usual imperative algorithm is a sequence of three compare-and-swap operations among the three array elements:

```let sort3 a l h =
let m = (l + h) lsr 1 in
if A.get a l > A.get a m then swap a l m;
if A.get a m > A.get a h then swap a m h;
if A.get a l > A.get a m then swap a l m;
m```

where `swap` is:

```let swap a i j =
let t = A.get a i in
A.set a i (A.get a j);
A.set a j t```

In the best case, 6 reads to the array are needed. In the worst case, 12 reads and 6 writes are necessary to sort the three-element subarray. In any case, exactly 3 comparisons are performed every time. No more than three comparisons are necessary; this is because 22 < 3! < 23. However, we can sometimes do better than that; transitivity of the order relation allows us to save one comparison that could otherwise be deduced from the other two.

A comparison tree is an ordered (plane) binary tree that has comparisons between elements as (inner) nodes and permutations as leaves:

A path from the root to a leaf indicates the sequence of comparisons necessary to identify that particular permutation. Note that the comparisons are conceptually done in the same order as the previous code fragment (low with middle, then middle with high, and finally low with middle), but implicitly reflect the required swaps to put the permutation in order. Once determined, the permutation requires between 0 and 3 writes to be put in order. Storing the array elements in temporaries gives the following code:

```let sort3 a l h =
let m = (l + h) lsr 1 in
let al, am, ah = A.get a l, A.get a m, A.get a h in
if al > am then begin
if al > ah then begin
if am > ah then begin
A.set a l ah;
A.set a h al;
am
end else begin
A.set a l am;
A.set a m ah;
A.set a h al;
ah
end
end else begin
A.set a l am;
A.set a m al;
al
end
end else if am > ah then begin
if al > ah then begin
A.set a l ah;
A.set a m al;
A.set a h am;
al
end else begin
A.set a m ah;
A.set a h am;
ah
end
end else
am```

No path sets more than 3 variables, and there is a path that doesn't set any, as I claimed before. Now, the invariant for `partition` must be established by putting elements less than a.m to the left of m, and those greater than it to its right. We have a choice about what to do about those elements that are equal to a.m; Bentley moves them to the left partition using the traditional "pointer crossing" algorithm. A more equitable alternative would be to use Dijkstra's Dutch National Flag algorithm to move them to their own block:

```let swap3 a x l h =
let i = ref (h+1)  (* x < a.(i ≤ p ≤ h) *)
and j = ref (l-1)  (* x > a.(l ≤ p ≤ j) *)
and k = ref l in   (* x = a.(j < p < k) *)
while !k != !i do
let y = A.get a !k in
(* x : a.(k <= p < i) *)
if x > y then begin
incr j;
A.set a !k (A.get a !j);
A.set a !j y;
incr k
end else if y > x then begin
decr i;
A.set a !k (A.get a !i);
A.set a !i y
end else
incr k
done;
!i, !j```

The result of the algorithm, as the invariants show, are the indices delimiting each partition. The advantage of this slightly more complicated program is that, if the array contains repeated elements, they need not be considered for comparison purposes on the recursive invocations. So we see that `partition` is completely specified by `swap3 a (sort3 a l h) l h`. Everything is in place to recursively sort the partitioned elements. It is standard to recur on the smaller partition and iterate on the larger to guarantee a maximum Ω(log.n) stack depth. Hence, the following:

```let rec qsort a l h =
if h - l > 0 then begin
let i, j = swap3 a (sort3 a l h) l h in
if j - l <= h - i
then begin qsort a l j; qsort a i h end
else begin qsort a i h; qsort a l j end
end```

Putting all the functions together in a module `Quicksort` is left as an exercise for the reader.