## 2008-06-25

### The Riffle Shuffle

Take a deck of cards. Split it into halves having the same number of cards each, and put them together so that the first card of the first half goes first, the first card of the second half goes second, the second card of the first half goes third; and so on interleaving cards from each half until your deck is put together back again. This operation is called the riffle shuffle.

It is easy to even-odd split a list into two halves: just skip down the list by twos, putting the first on the first and the second on the second. Like this:

```let rec unzip = function
| []           -> [], []
| [x]          -> [], [x]
| x :: y :: xs ->
let l, r = unzip xs in
x :: l, y :: r
```

Edit: This definition is not quite correct, as the odd element gets put into the even list! Better is:

```let rec unzip = function
| []           -> [], []
| [x]          -> [x], []
| x :: y :: xs ->
let l, r = unzip xs in
x :: l, y :: r
```

Let me define, before I go further, some handy combinators: compose, apply and the identity function:

```let (%) f g x = f (g x) ;;
let id x = x ;;
let (\$) f x = f x ;;
```

(These are always seen together). Now, it's easy to see by inspection that the split is indeed even-odd:

```# unzip % iota \$ 11 ;;
- : int list * int list = ([0; 2; 4; 6; 8], [1; 3; 5; 7; 9; 10])
```

(`iota` is the usual suspect). Now, the name `unzip` points to the fact that there is an inverse operation, `zip` that puts back both halves again:

```let rec zip = function
| l, []          -> l
| [], r          -> r
| a :: l, b :: r -> a :: b :: zip (l, r)
```

A proof that `zip % unzip``id` is made tedious by the need to split (!) it into an even and an odd case analysis in the length of the list, but is otherwise straightforward. A quick check that I haven't made a blunder in assigning the odd element to the right list:

```# zip % unzip % iota \$ 11 ;;
- : int list = [0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10]
```

(Rest assured that a full Quickcheck would have passed). Now, how to split a list into halves such that the elements are kept in order; that is, the first sub-half has the first ⌊N/2⌋ elements, and the second has the remaining ⌈N/2⌉? The standard idea is to traverse the list once to count the elements, and a second time to select the first half. But this can be done, however, in one pass. The trick is to use two "pointers" to the list, one running at twice the speed of the other. When the fast pointer reaches the end, the slow pointer has reached the midpoint, and the list can be "split" there:

```let split l =
let rec walk l r = match r with
| []  |  [_]  -> [], l
| _ :: _ :: r -> match l with
| []        -> assert false
| a :: l    ->
let l', r' = walk l r in
a :: l', r'
in walk l l
```

In `walk`, r is the fast pointer. Note that the case where l, the slow pointer, has nowhere else to go but r still does must be explicitly disclaimed, even though by construction that case never arises because the `walk` is over the same list. It does what it should, as we can see:

```# split % iota \$ 11 ;;
- : int list * int list = ([0; 1; 2; 3; 4], [5; 6; 7; 8; 9; 10])
```

And, of course, the inverse of `split` is `append`. To see that, we need the uncurrying combinator

```let uncurry f (x, y) = f x y ;;
```

Quickcheck would validate any number of cases; I'm content with just one:

```# uncurry (@) % split % iota \$ 11 ;;
- : int list = [0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10]
```

As I've pointed before, since all applications are monomorphic, I can use point-free style without fear of the value restriction.

All this is a necessary prelude to the point of this post: the riffle shuffle is just the `zip` of the `split`:

```let riffle l = zip % split \$ l ;;
```

(Now the value restriction bytes and I'm forced to η-expand). A natural question is now how many shuffles returns the deck to its original position? Let's find out:

```let find_fixpoint f e =
let rec find i f x =
if x = e then i else
find (succ i) f (f x)
in find 1 f (f e)
```

Of course, if there's no fix-point to be had, `find_fixpoint` would look into the infinite and beyond, but the properties of the riffle shuffle assures us that I don't need a full-fledged fix-point search algorithm (Pollard's Rho). Let's see how many riffles we need to sort stacks of one, two… twenty cards:

```# map (find_fixpoint riffle) % map (iota % succ) \$ iota 20 ;;
- : int list = [1; 1; 1; 2; 2; 4; 4; 3; 3; 6; 6; 10; 10; 12; 12; 4; 4; 8; 8; 18]
```

It's maybe not immediately clear, but the counts are repeated: the first 1 is for a deck of length 1, the second for one of length 2, and so on. Maybe it's a bit more conspicuous by pairing the results with the length of the deck giving rise to the count. Let me define yet some more combinators:

```let ( *** ) f g x = f x, g x ;;
let ( &&& ) f g (x, y) = f x, g y ;;
```

(These are sometimes called the diag or Δ and the cross or ×). So, to pair stack lengths with riffle cycles:

```# map (id &&& find_fixpoint riffle) % map (id *** iota % succ) \$ iota 20 ;;
- : (int * int) list =
[(1, 1); (2, 1); (3, 1); (4, 2); (5, 2); (6, 4); (7, 4); (8, 3); (9, 3);
(10, 6); (11, 6); (12, 10); (13, 10); (14, 12); (15, 12); (16, 4); (17, 4);
(18, 8); (19, 8); (20, 18)]
```

And it's logical and to be expected: an odd stack is an even stack with a last card that stays in the second half deck after a `split` and doesn't get moved at all by the `zip`. So it's not really interesting to shuffle odd decks; let's count the even ones:

```let even i = 2 * (i + 1) ;;
```

With that, it is a simple matter to modify the previous example:

```# map (find_fixpoint riffle) % map (iota % even) \$ iota 20 ;;
- : int list =
[1; 2; 4; 3; 6; 10; 12; 4; 8; 18; 6; 11; 20; 18; 28; 5; 10; 12; 36; 12]
```

Now, the standard reference for integer sequences is the eponymous Encyclopedia. I must transform a little the output so that looking it up is feasible:

```let oeis l = String.concat "," % map string_of_int \$ l ;;
```

Without it, the following line would have been longer and unwieldy:

```# oeis % map (find_fixpoint riffle) % map (iota % even) \$ iota 20 ;;
- : string = "1,2,4,3,6,10,12,4,8,18,6,11,20,18,28,5,10,12,36,12"
```

The sequence is indeed in the database, and the comments are interesting in their own right (as is to be expected of a Number Theoretic sequence). In closing, let me suggest that you might find it useful to add the one-line combinators to your .ocamlinit file. It makes using the OCaml interpreter a little bit more like what the point-free-wizards do in Haskell.

Anonymous said...

Nice post.
As a general comment for many of your posts, I would appreciate if you added a line in the beginning to give a summary of hat will be discussed.
I find myself often discarding your posts in my aggregator, but a line describing the contents could catch my attention.

Unknown said...

François,

Thank you for taking the time to write. I'll take your suggestion into account.