## 2007-10-09

### Iterative Tree Traversals

The point of this post is to illustrate with a concrete, step-by-step derivation, the thesis in Sobel and Friedman's "Recycling Continuations":

Link-inversion algorithms [...] are a standard means of traversing data structures

For that, I must begin at the beginning, with a concrete case. Consider the type of rose trees, those whose inner nodes have a list of children:

```type 'a tree = L of 'a | N of 'a tree list
```

A breadth-first visit of a tree visits first all the children of the root before visiting their corresponding grandchildren. In order to do that, it maintains a queue of nodes waiting to be visited:

```let fold_bfs f e t =
let rec fold e q = match Queue.get q with
| None          -> e
| Some (L x, q) -> fold (f e x) q
| Some (N t, q) -> fold e (List.fold_left Queue.put q t)
in fold e (Queue.put Queue.empty t)
```

for a hypothetical ADT `Queue`, not the OCaml one. Two things to note are:

1. the traversal is tail recursive: every call to `fold` is in tail position. This is analogous to the usual imperative BFS traversal, which is usually written while the queue is not empty do…
2. the list of children is being destructured with a `fold_left`, which is tail-recursive

Aside: the fact that this is a rose tree makes the traversal easier, not harder. A binary tree with node constructor `N of 'a tree * 'a tree` leads to the temptation of forgoing tail recursion by visiting the right child with the result of visiting the left. The use of a queue would be slightly awkward, since the pattern of nested recursion would be replaced by a pattern of nested enqueueing. To me, this is a good example of simplification by generalization.

Hence this traversal can be done in constant stack space, provided that the `Queue` implementation is tail-recursive. Okasaki's functional queues are:

```module Queue = struct
type 'a t = 'a list * 'a list
let empty = [], []
let is_empty = function ([], _) -> true | _ -> false
let check = function ([], r) -> (List.rev r, []) | q -> q
let put (f, r) x = check (f, x :: r)
let get = function ([], _) -> None | (x :: f, r) -> Some (x, check (f, r))
end
```

(`rev` is trivially tail-recursive). What happens here is that, by using the appropriate intermediate data structure, in this case a queue, we trade stack space by heap space.

Can we do better than this? That is, can we avoid the intermediate memory? Before I answer that, let me step back a little.

There is a forgetful morphism between queues and lists; namely, that which maps the empty queue to the empty list, and a queue with head x and tail q to the list with precisely head x and tail the map of q, in that order. This is "forgetful" because a queue has more structure (i.e., properties or laws it must obey) than a list, which is in fact "free" (all this in a precise technical sense).

Consequently, under this mapping, appending a queue q to a queue p is equivalent (by induction) to enqueuing into p the list of elements of q one at a time:

```Queue.append p q ≡ List.fold_left Queue.put p (Queue.to_list q)
```

for suitably defined `Queue.append` and `Queue.to_list`. This means that, in the original traversal, `List.fold_left Queue.put q t` appends to q the queue whose representation as a list is t. This can be written instead as `Queue.append q (Queue.of_list t)`, for suitably defined `Queue.of_list`, inverse to `Queue.to_list`.

Now, by the definition of `List.append` (denoted as usual by `@`), we have

```Queue.to_list (Queue.append p q) ≡ Queue.to_list p @ Queue.to_list q.
```

This is, no more and no less, the condition for `Queue.to_list` to be a morphism from queues to lists; a different one to the one we started with. This morphism maps whole queues under concatenation to whole lists also under concatenation. So, leaving aside for the time being the issue of tail-recursiveness, we can apply this morphism and use lists directly in the BFS traversal:

```let fold_bfs f e t =
let rec fold e q = match q with
| []       -> e
| L x :: q -> fold (f e x) q
| N t :: q -> fold e (q @ t)
in fold e [t]
```

For instance, `fold_bfs (fun () -> print_int) () (N [N [L 1; N [L 2; L 3]; N []]; L 4; N [L 5]])` outputs 41523, as expected.

This function is no longer tail-recursive, since `append` isn't. This might seem a step backwards, but in fact the change from a queue to a list makes considerably simpler to attempt further transformations. I'll show next how to deal with `append`.