Bitonic Sorting

The riffle shuffle I've written about is not a mere card trick. It is a basic building block in the art of devising parallel networks for efficient use of SIMD computers. A typical application is the so-called bitonic sort. An elementary introduction to the algorithm with animations might make its working clear, but I find Knuth's exposition transparent:

Let us say that a sequence ⟨z1, … zp⟩ of p numbers is bitonic if z1 ≥ … ≥ zk ≤ … ≤ zp for some k, 1 ≤ kp […] A bitonic sorter of order p is a comparator network capable of sorting any bitonic sequence of length p into non-decreasing order […] [W]e can construct a bitonic sorter of order p […] by first sorting the bitonic subsequences ⟨z1, z3, z5, …⟩ and ⟨z2, z4, z6, …⟩ independently, then comparing and interchanging z1:z2, z3:z4, …

(Knuth, Sorting and Searching, p 232)

In other words, Knuth describes the shape of a bitonic sequence as that of an inverted vee, but in most general terms it is the concatenation of two monotonic (that is, ascending or descending) sequences, any of which can be empty. This means that a sequence with the shape of a vee is also monotonic bitonic, and this is the definition I'll be using, for reasons that will be apparent later.

A comparator is a box with two inputs and two (ordered) outputs such that the upper, or first one is the lesser of the two inputs, and the lower, second one is the greater of said inputs. In other words, given a pair (x, y), a comparator outputs (xy, xy). Knuth denotes above the comparator by a colon; note that, strictly speaking, the inputs to it are unordered since x:y = y:x. A comparator can be straightforwardly defined as:

let swap (x, y) = if x <= y then x, y else y, x

The problem with sorting networks is that they are most naturally applied to inputs that are powers of two in length; any other than that, the definitions lose their pretty symmetry. Consider two lists l = ⟨l1, l2, …⟩ and r = ⟨r1, r2, …⟩ of possibly unequal length. Given a binary operation ⊕ its pointwise extension is ⟨l1r1, l2r2, …⟩. What of left-over elements? Well, in that case ⊕ better have a unit ε to pad the remainder. If it is a left and a right unit of ⊕, that is, ε ⊕ x = x = x ⊕ ε, then ⊕ must have type α×α→α. In practical terms, the pointwise extension of ⊕ must operate between lists of the same type and must result in a list of that very type, with left-over stragglers passing unchanged into the result. This is not as general as it could be.

Fortunately sort is regular enough. Unfortunately, it results in a pair, and while a list of pairs can be flattened easily enough, it wouldn't mix with the leftover singleton. A specially written function would do, but I'd prefer something more composable. Monads to the rescue! I'll use just the necessary machinery to work in the list monad, without all the generic scaffolding. First the unit:

let unit x = [x]

Then the join:

let join = List.concat

With this, pairing is straightforward:

let rec pairup = function
| [], l | l, [] -> List.map unit l
| a :: l, b :: r -> [a; b] :: pairup (l, r)

Note that the extra elements appear as singletons at the end of the resulting list; that is, they are left-aligned. Zipping two lists together is as easy now as:

let zip p = join % pairup $ p

With a little lifting:

let lift2 f = function [x; y] -> f (x, y) | l -> l

and some injections:

let inj2 (x, y) = [x; y]

(note that lift2 inj2id) exchange is a point-free one-liner:

let exchange p = join % List.map (lift2 $ inj2 % swap) % pairup $ p

Note how lift2 and inj2 take care of the argument and the return type of swap separately, so that either in isolation makes sense. Note also that, since the unpaired elements are last, the net result is as if the first sequence was padded with negative infinities (so that the maxima pass unscathed to the right), and the right with positive infinities (symmetrically, so that the minima remain on the left). The unzip function I wrote the last time isn't quite correct, as it will put the odd last element of a list in the even sublist! Yes, it is buggy. Here's the correct version:

let rec unzip = function
| []           -> [], []
| [x]          -> [x], []
| x :: y :: xs ->
  let l, r = unzip xs in
  x :: l, y :: r

To facilitate the recursive application of the procedure to both halves of a sequence, let me introduce a combinator:

let (>>>) f (x, y) = (f x, f y)

Now to the point of this post. By Knuth's definition, the bitonic sort of a bitonic sequence is two half-length bitonic sorts sandwiched between an even-odd shuffle and a sorting network. This translates straightforwardly into:

let rec bitonic = function
| []  -> []
| [x] -> [x]
| l   -> exchange (bitonic >>> unzip l)

It is clear that if exchange is to be feed the result of unzip (directly or indirectly), the right half will always be the shorter one, and the padding infinities will go there. This presents me with a problem, since Knuth's procedure to sort a bitonic sequence in the shape of an inverted vee will make the descending right-hand side padded with an infinity, and thus not bitonic at all. It is necessary to ensure that all bitonic sequences fed to the network are first descending and then ascending:

let merge (p, q) = bitonic (List.rev_append p q)

Assuming that p and q are sorted, List.rev_append p q does exactly that. Now, to sort an arbitrary list it only suffices to split it, sort both halves and merge them as a bitonic sequence:

let rec sort = function
| []  -> []
| [x] -> [x]
| l   -> merge (sort >>> unzip l)

This is not a very efficient serial sorting algorithm, as it makes more comparisons than are strictly necessary. However, the unzip is not a parallel operation but the connection topology of the network and so "free"; the same can be said about rev_append. The recursive calls to sort on both halves can be done in parallel, as can be the recursive calls to bitonic. Finally, exchange can be performed simultaneously on pairs of elements; this makes Batcher's bitonic sort extremely attractive for parallel sorting networks.

In closing, let me do a quick check to see if I didn't made a mistake. Knuth proves (in an exercise!) that a sorting procedure is correct if it sorts correctly every binary sequence. Recursively generating all binary sequences is easy:

let rec bin_sequences n =
  if n = 0 then [[]] else
  let s = bin_sequences (n - 1) in
  List.map (fun l -> 0 :: l) s @ List.map (fun l -> 1 :: l) s

As is checking if a sequence is ascending:

let rec is_ascending = function
| [] | [_] -> true
| x :: y :: l -> x <= y && is_ascending (y :: l)

With that, a check to see if, for instance, all binary sequences of length 10 or less are correctly sorted is a one liner:

# List.for_all is_ascending % List.map sort % bin_sequences $ 10 ;;
- : bool = true


The Riffle Shuffle

Take a deck of cards. Split it into halves having the same number of cards each, and put them together so that the first card of the first half goes first, the first card of the second half goes second, the second card of the first half goes third; and so on interleaving cards from each half until your deck is put together back again. This operation is called the riffle shuffle.

It is easy to even-odd split a list into two halves: just skip down the list by twos, putting the first on the first and the second on the second. Like this:

let rec unzip = function
| []           -> [], []
| [x]          -> [], [x]
| x :: y :: xs ->
  let l, r = unzip xs in
  x :: l, y :: r

Edit: This definition is not quite correct, as the odd element gets put into the even list! Better is:

let rec unzip = function
| []           -> [], []
| [x]          -> [x], []
| x :: y :: xs ->
  let l, r = unzip xs in
  x :: l, y :: r

Let me define, before I go further, some handy combinators: compose, apply and the identity function:

let (%) f g x = f (g x) ;;
let id x = x ;;
let ($) f x = f x ;;

(These are always seen together). Now, it's easy to see by inspection that the split is indeed even-odd:

# unzip % iota $ 11 ;;
- : int list * int list = ([0; 2; 4; 6; 8], [1; 3; 5; 7; 9; 10])

(iota is the usual suspect). Now, the name unzip points to the fact that there is an inverse operation, zip that puts back both halves again:

let rec zip = function
| l, []          -> l
| [], r          -> r
| a :: l, b :: r -> a :: b :: zip (l, r)

A proof that zip % unzipid is made tedious by the need to split (!) it into an even and an odd case analysis in the length of the list, but is otherwise straightforward. A quick check that I haven't made a blunder in assigning the odd element to the right list:

# zip % unzip % iota $ 11 ;;
- : int list = [0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10]

(Rest assured that a full Quickcheck would have passed). Now, how to split a list into halves such that the elements are kept in order; that is, the first sub-half has the first ⌊N/2⌋ elements, and the second has the remaining ⌈N/2⌉? The standard idea is to traverse the list once to count the elements, and a second time to select the first half. But this can be done, however, in one pass. The trick is to use two "pointers" to the list, one running at twice the speed of the other. When the fast pointer reaches the end, the slow pointer has reached the midpoint, and the list can be "split" there:

let split l =
  let rec walk l r = match r with
  | []  |  [_]  -> [], l
  | _ :: _ :: r -> match l with
    | []        -> assert false
    | a :: l    ->
      let l', r' = walk l r in
      a :: l', r'
  in walk l l

In walk, r is the fast pointer. Note that the case where l, the slow pointer, has nowhere else to go but r still does must be explicitly disclaimed, even though by construction that case never arises because the walk is over the same list. It does what it should, as we can see:

# split % iota $ 11 ;;
- : int list * int list = ([0; 1; 2; 3; 4], [5; 6; 7; 8; 9; 10])

And, of course, the inverse of split is append. To see that, we need the uncurrying combinator

let uncurry f (x, y) = f x y ;;

Quickcheck would validate any number of cases; I'm content with just one:

# uncurry (@) % split % iota $ 11 ;;
- : int list = [0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10]

As I've pointed before, since all applications are monomorphic, I can use point-free style without fear of the value restriction.

All this is a necessary prelude to the point of this post: the riffle shuffle is just the zip of the split:

let riffle l = zip % split $ l ;;

(Now the value restriction bytes and I'm forced to η-expand). A natural question is now how many shuffles returns the deck to its original position? Let's find out:

let find_fixpoint f e =
  let rec find i f x =
    if x = e then i else
    find (succ i) f (f x)
  in find 1 f (f e)

Of course, if there's no fix-point to be had, find_fixpoint would look into the infinite and beyond, but the properties of the riffle shuffle assures us that I don't need a full-fledged fix-point search algorithm (Pollard's Rho). Let's see how many riffles we need to sort stacks of one, two… twenty cards:

# map (find_fixpoint riffle) % map (iota % succ) $ iota 20 ;;
- : int list = [1; 1; 1; 2; 2; 4; 4; 3; 3; 6; 6; 10; 10; 12; 12; 4; 4; 8; 8; 18]

It's maybe not immediately clear, but the counts are repeated: the first 1 is for a deck of length 1, the second for one of length 2, and so on. Maybe it's a bit more conspicuous by pairing the results with the length of the deck giving rise to the count. Let me define yet some more combinators:

let ( *** ) f g x = f x, g x ;;
let ( &&& ) f g (x, y) = f x, g y ;;

(These are sometimes called the diag or Δ and the cross or ×). So, to pair stack lengths with riffle cycles:

# map (id &&& find_fixpoint riffle) % map (id *** iota % succ) $ iota 20 ;;
- : (int * int) list =
[(1, 1); (2, 1); (3, 1); (4, 2); (5, 2); (6, 4); (7, 4); (8, 3); (9, 3);
 (10, 6); (11, 6); (12, 10); (13, 10); (14, 12); (15, 12); (16, 4); (17, 4);
 (18, 8); (19, 8); (20, 18)]

And it's logical and to be expected: an odd stack is an even stack with a last card that stays in the second half deck after a split and doesn't get moved at all by the zip. So it's not really interesting to shuffle odd decks; let's count the even ones:

let even i = 2 * (i + 1) ;;

With that, it is a simple matter to modify the previous example:

# map (find_fixpoint riffle) % map (iota % even) $ iota 20 ;;
- : int list =
[1; 2; 4; 3; 6; 10; 12; 4; 8; 18; 6; 11; 20; 18; 28; 5; 10; 12; 36; 12]

Now, the standard reference for integer sequences is the eponymous Encyclopedia. I must transform a little the output so that looking it up is feasible:

let oeis l = String.concat "," % map string_of_int $ l ;;

Without it, the following line would have been longer and unwieldy:

# oeis % map (find_fixpoint riffle) % map (iota % even) $ iota 20 ;;
- : string = "1,2,4,3,6,10,12,4,8,18,6,11,20,18,28,5,10,12,36,12"

The sequence is indeed in the database, and the comments are interesting in their own right (as is to be expected of a Number Theoretic sequence). In closing, let me suggest that you might find it useful to add the one-line combinators to your .ocamlinit file. It makes using the OCaml interpreter a little bit more like what the point-free-wizards do in Haskell.


When there's nothing on the other side

Much has been written of how a Maybe monad translates cleanly into generic Java or C#. I have done so in this blog. The fact remains that, more often than not, it is not even necessary to wrap your code in a monad to avoid testing an object reference for null. There is a pattern language of object-oriented programming called null object. Whether it is well-known or not, I can't say; anecdotal evidence suggests it isn't as much as it should be.

The idea behind it is simple: since polymorphic dispatch amounts to a conditional on the run-time type of the receiver, translate a conditional into two subclasses with a method whose body is the respective block in both branches of the original if. This pattern, when specialized to the case of the reference != null case calls for the creation of a special, do-nothing subclass of the object in question with a method whose body is empty. This is entirely analogous to having a sentinel node in a complicated data structure that allows trading time complexity for space complexity. In modern object-oriented languages, tracing just-in-time compilations might or might not make the use of a null object faster than a conditional.

Some well-used data structures lend themselves to this usage without having to code especially around them. If you'd like an example transliterated into a slogan, I could do worse than saying: an empty list is never null. Let me illustrate with a post.

Over at his blog, Dare Obasanjo is raving about how cool is functional programming, C# style. Not that I disagree, far from it; moreover, I must disclose that it is easier to criticize code other people's code than it is writing (and exposing!) code that is above criticism. Some of my earlier posts would have benefited of such a review. The first snippet he presents is:

public void UpdateAllFeeds(bool force_download)
   List<SubscriptionRootNode> rootNodes = this.GetAllSubscriptionRootNodes();
   if (rootNodes != null)
      if (_timerRefreshFeeds.Enabled)
      _lastUnreadFeedItemCountBeforeRefresh = rootNodes.Sum(n => n.UnreadCount);
      FeedSources.Sources.ForEach(s => s.RefreshFeeds(force)); 

What I first noticed was not the intent behind this code, or how it illustrated his main point, or anything else really beyond the big honkin' if (rootNodes != null). On a list reference. Returned from this.GetAllSubscriptionRootNodes(), a method in the receiver! If you ever wondered what could be a poster-example, a paradigm for cargo cult programming, I can think of no better example.

Essentially, the cost of ensuring that this.GetAllSubscriptionRootNodes() never returns null is nil. This can be documented and enforced by an invariant in a language that supports them, by an assertion in the body of a client method, or by one or more tests in the relevant harness. It simply amounts to writing something akin to:

private List<SubscriptionRootNode> GetAllSubscriptionRootNodes()
   List<SubscriptionRootNode> result = new List<SubscriptionRootNode>();
   // fill the list according to some logic...
   return result;

Repeat with me: the result of new is guaranteed in C# and in Java never to be the null reference. Hence, the result of the method is, by trivial dataflow analysis, never null. Hence, the test is never necessary.

Now, you might counter with something along the lines of "yeah, but what if some condition makes frobbing the buzzwits inapplicable? Can't I return null to signal upstream code to not frob the buzzwits?" Well, yes, but we're in "you're better off rethinking that" territory. Back to the original code, maybe GetAllSubscriptionRootNodes is really written like this:

private List<SubscriptionRootNode> GetAllSubscriptionRootNodes()
   if (/* some hairy condition */)
      // we don't want to touch the subscriptions right now...
      return null;

   List<SubscriptionRootNode> result = new List<SubscriptionRootNode>();
   // fill the list according to some logic...
   return result;

Then the original client code would be justified in testing for null. There are at least three ways out of this: the simple one is to, instead of null, return the empty list:

private List<SubscriptionRootNode> GetAllSubscriptionRootNodes()
   List<SubscriptionRootNode> result = new List<SubscriptionRootNode>();

   if (/* some hairy condition */)
      // we don't want to touch the subscriptions right now...
      return result;

   // fill the list according to some logic...
   return result;

The client code would then be something like this:

public void UpdateAllFeeds(bool force_download)
   List<SubscriptionRootNode> rootNodes = this.GetAllSubscriptionRootNodes();

   int totalUnread = rootNodes.Sum(n => n.UnreadCount);
   if (totalUnread != 0)
      if (_timerRefreshFeeds.Enabled)
      _lastUnreadFeedItemCountBeforeRefresh = totalUnread;
      FeedSources.Sources.ForEach(s => s.RefreshFeeds(force)); 

You might now say, "yes, but I really need to stop the feeds before I can count the unread posts in the root feeds, or else all manner of nasty things might happen!" Like, for instance, having an approximate count of unread feeds (or what have you). This might not be bad in itself, but you're entitled to your precisions in your own code. This brings me to the second, not-so-simple solution: Instead of conflating two unrelated results into a single return value ("either a condition makes getting the root nodes inapplicable, or here they are those root nodes"), split the two functionalities into different methods:

private bool CanUpdateSubscriptions()
   return !/* some hairy condition */;

private List<SubscriptionRootNode> GetAllSubscriptionRootNodes()
   List<SubscriptionRootNode> result = new List<SubscriptionRootNode>();
   // fill the list according to some logic...
   return result;

Aside: I'm well aware that I could be embarking in a gigantic straw man, and the original code is simply wrong as opposed to subtly wrong. I mean, it might be that the original intent behind the null test was cargo cultishness. Bear with me, because I feel this point deserves being made.

Now, the client code is:

public void UpdateAllFeeds(bool force_download)
   if (!CanUpdateSubscriptions())
     return; // early, return often
   List<SubscriptionRootNode> rootNodes = this.GetAllSubscriptionRootNodes();
   if (_timerRefreshFeeds.Enabled)
   _lastUnreadFeedItemCountBeforeRefresh = rootNodes.Sum(n => n.UnreadCount);
   FeedSources.Sources.ForEach(s => s.RefreshFeeds(force)); 

As you can see, splitting the intent behind return-list-except-for-null-as-a-flag into a bona-fide predicate lets the client logic to be simpler. In fact, the code testing for applicability could be moved further upstream, to —for example— disable menu items or other UI elements.

The third approach, and the one I feel should be preferred whenever possible, would be to make sure that all methods are idempotent; in this case, if there aren't root subscriptions to update, make sure that the calls to stop synchronization and to refresh the sources succeed without unwanted side-effects. This would in effect make the following code:

public void UpdateAllFeeds(bool force_download)
   List<SubscriptionRootNode> rootNodes = this.GetAllSubscriptionRootNodes();
   if (_timerRefreshFeeds.Enabled)
   _lastUnreadFeedItemCountBeforeRefresh = rootNodes.Sum(n => n.UnreadCount);
   FeedSources.Sources.ForEach(s => s.RefreshFeeds(force)); 

always correct and applicable, no matter what state the object is in. Now I realize it might not always be possible, or desirable, to enforce such a strong invariant, especially in the early stages of development; however, the general heuristic principle that says that you should be lenient in what you accept and strict in what you emit leads to more robust, widely reusable code.

As a side note, the other heuristic principle that says that you should not break symmetry without good reason to do so makes me think that, after updating all feeds, the refresh timer should be re-enabled. Can it be that the code has a bug?


Random Experiments

I often find it is much more rewarding to tackle an algorithmic or mathematical problem from an experiential perspective rather than from a formal one. Trying to get some test data for an earlier program, I thought about a couple of ways of generating random increasing sequences.

Intent on generating n random points 0 = x0x1 ≤ … ≤ xn-1 = 1 on the unit interval, I thought the most efficient way to do it was to at each step randomly split the remaining interval. This suggested itself:

rnd0[n_Integer?Positive] := 
 Append[NestList[# + RandomReal[1 - #]&, 0., n - 2], 1.]

(NestList generates a list of n+1 elements.) The result was less than satisfying:

ListLinePlot[rnd0[10], PlotRange -> All]

Not good; in fact, more points show that it indeed wasn't what I wanted:

ListLinePlot[rnd0[100], PlotRange -> All]

It is apparent that the not after long a pretty large chunk of the interval is carved off, and the remaining points compete for less and less room, giving rise to a knee. Well, back to the drawing board. It then occurred to me that instead of dividing the remainder I could step away from the last point a random distance. The problem with this is that I couldn't know in advance how far I'd end; hence, I needed a way to normalize the resulting random walk:

scale[l_List] := With[{h = l[[-1]] - l[[1]]}, (l - l[[1]])/h]

The sequence could be generated simply with:

rnd1[n_Integer?Positive] := 
 scale[Accumulate[Array[RandomReal[1]&, n]]]

(Accumulate gives the list of partial running sums.) To see how I fared, I plotted some points:

ListLinePlot[rnd1[10], PlotRange -> All]

Not bad; however, 100 points showed that the result was a bit too smooth:

ListLinePlot[rnd1[100], PlotRange -> All]

What to do now? Sometimes the most obvious solution glares so intently to you that you can't help but notice it the last, if at all. Since the difference between a random sequence and a monotone random sequence is order, why not try to sort a plain old bunch of random points?

rnd2[n_Integer?Positive] := scale[Sort[Array[RandomReal[1] &, n]]]

(I used scale to ensure that the first and last points where 0 and 1 respectively. I could have added them to a sequence n-1 points long just as well.) Few points showed an encouraging picture:

ListLinePlot[rnd2[10], PlotRange -> All]

Indeed, a few big jumps nicely interspersed with smaller increments. More points still maintain the picture of satisfying roughness:

ListLinePlot[rnd2[100], PlotRange -> All]

Now I wasn't satisfied with this. Is there a way to generate a just-rough-enough sequence like this without having to do an O(n log n) sorting pass? Time to investigate a bit what exactly is the origin of the roughness. One thing I found that distracted me from seeing this quality was the monotonicity. Maybe by looking at the detrended sequences could shed some light into this. The detrending is simply achieved by subtracting the line between the first and last points:

norm[l_List] := 
 With[{h = l[[-1]] - l[[1]], n = Length[l] - 1}, 
  l - l[[1]] - Table[i*h/n, {i, 0, n}]]

(This is more general than I needed, as by construction the trend is the line y = i/n.) I didn't feel that investing analysis on rnd0 was necessary. This is what rnd1 looks like detrended:

ListLinePlot[norm[rnd1[100]], PlotRange -> All]

And this is rnd2:

ListLinePlot[norm[rnd2[100]], PlotRange -> All]

If there was a difference I couldn't see it. But maybe there is some other way to look at the data. Maybe this roughness is a quality of the differences, the jumps between values in the sequence. Now the difference themselves would also be random, albeit with a different underlying distribution. And that's exactly the right idea; since I'm after all accumulating differences, what should their underlying distribution be to achieve the right degree of roughness? I needed first a way to compute histograms of the differences:

hist[k_Integer?Positive, l_List] := With[{d = Differences[l]},
  With[{hi = Max[d], lo = Min[d]}, 
   With[{st = k/(hi - lo)}, 
    Module[{h = Array[0&, k + 1]}, 
     Do[h[[Floor[st (x - lo)] + 1]]++, {x, d}]; h]]]]

(The fact that With binds in parallel, like ML's let is not very practical, as Mathematica's syntax is a bit heavy.) The function hist counts frequencies in k bins. Now, after seeing this:

ListLinePlot[hist[50, rnd1[10000]], PlotRange -> All]

it struck me as obvious: by construction, rnd1 differences are uniformly distributed. The mean 200 is simply 10,000 points divided by 50 bins. The last bin is empty as, since it's evident from the code above, I used a last value as a sentinel for the maximum possible value. Here is rnd2:

ListLinePlot[hist[50, rnd2[10000]], PlotRange -> All]

Now I was getting somewhere. The histogram looked like an exponential distribution. I could have read the article and tried to determine λ to within some confidence interval, in effect testing the hypothesis that the differences were indeed exponentially distributed; or I could try fitting an exponential to the data to see if it would lead me to somewhere. This is what I finally tried:

j = hist[100, rnd2[10000]];

FindFit[j, a k Exp[-k x], {a, k}, x]
{a -> 10420.3, k -> 0.0990637}

A reasonable fit: the a⋅e is approximately 10,000, the number of points. To really see how good the fit was I needed, however, a picture:

Plot[Evaluate[ a k Exp[-k x] /. %], {x, 0, 50}, 
 Epilog -> MapIndexed[Point[{#2[[1]], #1}]&, j]]

Very tight fit, indeed. With this, I could directly generate a monotone sequence with the right distribution of jumps, in one pass without having to sort:

rnd3[n_Integer?Positive] := 
 scale[Accumulate[Array[-Log[1 - RandomReal[1]]/n&, n]]]

(I used the standard algorithm to generate an exponentially distributed variate.) Graphically, there was no appreciable qualitative difference between rnd2 and rnd3:

ListLinePlot[rnd3[100], PlotRange -> All]

And the histogram, as I expected it would be by construction, was likewise similar:

ListLinePlot[hist[50, rnd3[10000]], PlotRange -> All]

I was initially quite satisfied after this hands-on approach to discovering the right algorithm. This satisfaction would prove to be ephemeral, as I realized none of this explained the why. Maybe it's time to consult The Art of Computer Programming, but that would be for another post.


Monotone Cubic Interpolation

(20090728 omission corrected.) The Wikipedia article on monotone cubic interpolation is, regrettably, not of very good quality. The idea is to precondition the derivatives (control points) for the cubic Hermite splines interpolating a monotonic data set so that the resulting piecewise cubic is also monotonic. The algorithm given is extracted from a paper which is not publicly available online. Furthermore, the only example given is in the form of a graphic, without a data set or a resulting array of derivatives, which would have been great to validate an implementation. Even so, I managed to convince myself by visually inspecting a number of trials that the following should do the job.

The first step in the algorithm is to compute the tangents using three-point differences:

let hermite_tangents x y =
  let n = Array.length x in
  if Array.length y != n then invalid_arg "hermite_tangents" else
  let m = Array.make n 0. in
  let d = ref ((y.(1) -. y.(0)) /. (x.(1) -. x.(0))) in
  m.(0) <- !d;
  for i = 1 to n - 2 do
    let d' = (y.(i+1) -. y.(i)) /. (x.(i+1) -. x.(i)) in
    m.(i) <- 0.5 *. (!d +. d');
    d := d'
  m.(n - 1) <- !d;

Note that I've used a mutable reference to hold the value of the previous divided difference. Note also that this is the standard calculation for Hermite interpolation; this routine can be used as is. Now steps 2 to 5 precondition the tangents to achieve the desired monotonic interpolants:

let monotone_hermite_tangents x y =
  let n = Array.length x in
  let m = hermite_tangents x y in
  for i = 0 to n - 2 do
    let d = (y.(i+1) -. y.(i)) /. (x.(i+1) -. x.(i)) in
    if d = 0. then begin
      m.(i  ) <- 0.;
      m.(i+1) <- 0.
    end else
      let a = m.(i  ) /. d
      and b = m.(i+1) /. d in
      let s = a *. a +. b *. b in
      if s > 9. then begin
        let t = 3. *. d /. sqrt s in
        m.(i  ) <- t *. a;
        m.(i+1) <- t *. b

In the article's description, the Δk appear to be array elements; since no step uses previous values of the variable, it is effectively a loop local. Also, I took the liberty to eliminate some (obvious) common subexpressions. That done, the interpolation code is straightforward, and standard. First, given a point e, I need some way to determine which of the intervals [xixi+1] contains it. This naturally calls for a binary search:

let bin_search e x =
  let n = Array.length x in
  let l = ref 0
  and h = ref n in
  while !l + 1 != !h do
    let m = (!l + !h) / 2 in
    if e < x.(m) then h := m else l := m

For values outside the data range, I return the first or last value, as appropriate. Otherwise, as the page on cubic interpolation explains, the interval containing the point e is normalized to the unit interval [0…1], interpolated there using the Hermite basis, and rescaled to the original interval size via h:

let hermite_interpolation x y m e =
  let n = Array.length x in
  if e <= x.(0)   then y.(0) else
  if e >= x.(n-1) then y.(n-1) else
  let i = bin_search e x in
  let l = x.(i) in
  let h = x.(i+1) -. l in
  let t = (e -. l) /. h in
  let u = t *. t in
  let h00 =  ( 2. *. t -. 3.) *. u +. 1.
  and h10 = ((       t -. 2.) *. t +. 1.) *. t
  and h01 =  (-2. *. t +. 3.) *. u
  and h11 =  (       t -. 1.) *. u in
  y.(i) *. h00 +. y.(i+1) *. h01 +. h *. (m.(i) *. h10 +. m.(i+1) *. h11)


A Circular Triviality

(Minor corrections) This is my own take at the corollary to the Intermediate Value theorem on the unit circle. It is not substantially different to the proof linked to, so its value is zero everywhere (maybe except for a countable subset). In particular, this is a special case of the much more general Borsuk-Ulam theorem.

Consider f a continuous function with domain on the unit circle. Then there exists at least a pair of points x0, x1 ∈ [0…2π) such that f(x0) = f(x0 + π) and f(x1) = f(x1 + π).

This is a purely existential statement, so it would be somewhat pointless (!) to strive to identify (or construct) a value x0 witnessing to the theorem. The only obvious possibility is to work with f by itself; that is, by focusing on the range and not on the domain of f. I must head towards something to which I can apply the IVT, and this lets me fix a number of things.

First, f is real-valued, as required by the IVT. Second, f is periodic, since if there were an x for which f(x) ≠ f(x + 2π), the difference |f(x + 2π) - f(x + ε)| would be large for |x + 2π - x - ε| = |ε| arbitrarily small on the unit circle. Third, since I see that the hypothesis mentions f(x) and f(x + π) being equal, I'm forced to consider the difference g(x) = f(x) - f(x + π). Fourth, g is continuous as f is.

Now, I have a choice in stating the proof; or rather, I have two different ways to show the existence of an interval satisfying the IVT.

The first one is to consider an arbitrary x ∈ [0…2π), and to see that g(x + π) = f(x + π) - f(x + 2π) = f(x + π) - f(x) = -g(x). Since -|g(x)| ≤ 0 ≤ |g(x)| and g is continuous, there must be some x0 ∈ [-|g(x)| … |g(x)|] such that g(x0) = 0.

The other possibility is to note that there must be a pair of values x+ and x- such that g(x+) ≥ 0 and g(x-) ≤ 0, or else f would be monotonic, hence discontinuous. Since g(x-) ≤ 0 ≤ g(x+) and g is continuous, there exists an x0 ∈ [x-x+] such that g(x0) = 0.

Both things amount to the same; namely, there is a zero of g in the unit circle. As the domain is periodic, there must be another zero in the interval "across π" with respect to the original one (that is, the one having the endpoints listed in the opposite order, modulo 2π), call it x1. But this is precisely the posited pair of points that make f(x0) = f(x0 + π) and f(x1) = f(x1 + π).

Note that both choices are not equally potent: the first interval [-|g(x)| … |g(x)|] has the endearing property of being a function of an arbitrary x on the unit circle. This not only means that I can trade two existentials for one universal, making the theorem more constructive; it also permits a direct application of a root-finding method to hone in the zero.

The usual formulation of this is that there exists a pair of antipodal points where the function takes the same value. We see there is more than one, as the inventors of the Gömböc show. Also, as opposed to the proof by contradiction stated there, this is a direct proof.