## 2010-06-11

### Diagonalize, Now!

It is easy to append two lists, and the method is well-known:

let rec append xs ys = match xs with
| []      -> ys
| x :: xs -> x :: append xs ys


It is also easy to merge two lists, so that elements from the first are interleaved with those of the second:

let rec merge xs ys = match xs with
| []      -> ys
| x :: xs -> x :: merge ys xs


A simple matter of swapping two arguments! The similarities between both functions hide deep differences. In a strict setting, the equation append xs ys = xs has just a solution ys = [] if xs is finite; but in a lazy setting this equation has infinite solutions for ys if xs is infinite, because AωA* ≡ Aω. By contrast, merge xs ys = xs has as its only solution ys = [] independently of whether xs is finite or not (in a lazy language).

What this all means is that merge is fairer than append, in this sense: merge xs ys includes all the elements of both xs and ys. This is, in transfinite terms, one half of the constructive proof that ω + ω > ω but 2⋅ω = ω, the other being an effective procedure for un-merge-ing an infinite list (edit: thanks Andrej Bauer for your thorough comments!). In practice, it is the reason why you cannot enumerate all the integers by first counting the nonnegatives 0, 1, 2, 3… and then the negatives -1, -2, -3… but you can if you interleave them 0, -1, 1, -2, 2, -3, 3…

The downside to this is that append is associative but merge is not, so it doesn't form a monoid over lists (even though [] is its left and right identity) and thus neither endows the lists with monadic structure.

Can we generalize to lists of lists? Sure we can:

let rec concat = function
| []        -> []
| xs :: xss -> append xs (concat xss)


(a simple consequence of lists being a monoid), or:

let concat xss = List.fold_right (fun xs ys -> append xs ys) xss []


If we try to do the same with merge:

let rec diagonalize = function
| []        -> []
| xs :: xss -> merge xs (diagonalize xss)


again, a simple change (I omit the folded version for brevity). The problem is that, since merge is not associative, the result of diagonalize [xs0; xs1; xs2; …] is not very fair: it includes every other element of xs0 followed by every fourth element of xs1 followed by every eighth element of xs2…, so that you have to wait 2n + 1 elements before finding the next element in xsn! The result does include every element in every list even if there are an infinite number of infinite lists, because ½ + ¼ + ⅛ + … = 1, but you have to wait exponentially long for them. Georg Cantor devised a clever way out of this: read the (possibly doubly infinite) matrix:

$\begin{matrix} x_{00} & x_{10} & x_{20} & x_{30} & \cdots \\ x_{01} & x_{11} & x_{21} & x_{31} & \cdots \\ x_{02} & x_{12} & x_{22} & x_{32} & \cdots \\ x_{03} & x_{13} & x_{23} & x_{33} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{matrix}$

(thank you codecogs for the wonderful LATEX rendering service!) along ascending diagonals, as if it were rotated 45° to the right: x00, x01, x10, x02, x11, x20, x03, x12, x21, x30, … You get the idea. Now you only have to wait quadratically long before you see an element. However, how should we proceed in order to flatten a list of lists in this order? By induction, of course!

Suppose every column in the matrix is a list, and that we can "half-transpose" such a list of lists so that each list in the result corresponds to a diagonal in the original matrix: [[x00]; [x01; x10]; [x02; x11; x20]; [x03; x12; x21; x30]; …]. Call this operation rotate. Then, by elementwise consing the head list with each list in the rotation of the rest we diagonalize the entire list of lists. The code for pairing up these lists is:

let rec pairup xs yss = match xs, yss with
| [], yss -> yss
| xs, []  -> List.map (fun x -> [x]) xs
| x :: xs, ys :: yss -> (x :: ys) :: pairup xs yss


(being careful not to lose any element!). The code to perform the rotation is:

let rec rotate = function
| []        -> []
| xs :: xss -> pairup xs ([] :: rotate xss)


Note that the corner element gets to be by itself. Then the diagonalization is simply the flattening of this rotation:

let diagonalize xss = concat (diag xss)


The question now is, how to generalize this to higher dimensions (lists of lists of lists of …)? Since nowhere to be seen is the nice symmetry between appending and merging here I don't know what the guiding principle could be.

Andrej Bauer said...

I like your observations very much, but there seems to be some uneccesary confusion about mahtematics and mathematical terminology.

The part where you say "This is, in transfinite terms, the constructive proof that 2⋅ω = ω + ω = ω but ω⋅2 ≠ ω (even if both have cardinality ℵ0)" is confused because you're using exact mathematical words and notation to probably mean something informal. If by ω you mean the ordinal number ω, i.e., "an infinite list 0,1,2,3,4,...", then ω + ω = ω is false. In fact, we have ω⋅2 = ω + ω > ω = 2⋅ω, constructively and otherwise.

If by ω you mean the least infinite cardinal, why are you also using ℵ0 for it? Lastly, ω has cardinality ℵ0, so I don't understand the point of saying "if both ω and ω have cardinality ℵ0". How can they have different cardinality when they are the same thing? I think at the very least you should use different name for the two lists, perhaps α and β. Having ω refer to two different things is weird.

If you do want to have a computational proof that ω = 2⋅ω, you have to do more than what you did. You have to program two order-preserving functions, one from
ω to 2⋅ω, and one the other way, such that they are inverses of each other. I think your merge is one of them, but then you also need the inverse (some sort of split or unzip). If on the other hand you're talking about cardinalities, not order types, then to show ℵ0+ℵ0=ℵ0 you would have to find just computable bijections which need not preserve the order.

Matías Giovannini said...

@Andrej: First of all, thank you for taking the time to reply so thoroughly. You're right, I made a mess of my misremembered Cantor! I'll amend the post with a reference to your explanation, that is far better than any patch I could make to the text.