The point of this post is to illustrate with a concrete, step-by-step derivation, the thesis in Sobel and Friedman's "Recycling Continuations":

Link-inversion algorithms [...] are a standard means of traversing data structures

For that, I must begin at the beginning, with a concrete case. Consider the type of rose trees, those whose inner nodes have a *list* of children:

type 'a tree = L of 'a | N of 'a tree list

A breadth-first visit of a tree visits first all the children of the root before visiting their corresponding grandchildren. In order to do that, it maintains a queue of nodes waiting to be visited:

let fold_bfs f e t = let rec fold e q = match Queue.get q with | None -> e | Some (L x, q) -> fold (f e x) q | Some (N t, q) -> fold e (List.fold_left Queue.put q t) in fold e (Queue.put Queue.empty t)

for a hypothetical ADT `Queue`

, *not* the OCaml one. Two things to note are:

- the traversal is tail recursive: every call to
`fold`

is in tail position. This is analogous to the usual imperative BFS traversal, which is usually written**while the queue is not empty do…** - the list of children is being destructured with a
`fold_left`

, which*is*tail-recursive

Aside:the fact that this is arosetree makes the traversaleasier, not harder. A binary tree with node constructor`N of 'a tree * 'a tree`

leads to the temptation of forgoing tail recursion by visiting the right child with the result of visiting the left. The use of a queue would be slightly awkward, since the pattern of nested recursion would be replaced by a pattern of nested enqueueing. To me, this is a good example of simplification by generalization.

Hence this traversal can be done in constant *stack* space, provided that the `Queue`

implementation is tail-recursive. Okasaki's functional queues are:

module Queue = struct type 'a t = 'a list * 'a list let empty = [], [] let is_empty = function ([], _) -> true | _ -> false let check = function ([], r) -> (List.rev r, []) | q -> q let put (f, r) x = check (f, x :: r) let get = function ([], _) -> None | (x :: f, r) -> Some (x, check (f, r)) end

(`rev`

is trivially tail-recursive). What happens here is that, by using the appropriate intermediate data structure, in this case a queue, we trade stack space by heap space.

Can we do better than this? That is, can we avoid the intermediate memory? Before I answer that, let me step back a little.

There is a forgetful morphism between queues and lists; namely, that which maps the empty queue to the empty list, and a queue with head `x` and tail `q` to the list with precisely head `x` and tail the map of `q`, in that order. This is "forgetful" because a queue has *more* structure (i.e., properties or laws it must obey) than a list, which is in fact "free" (all this in a precise technical sense).

Consequently, under this mapping, appending a queue `q` to a queue `p` is equivalent (by induction) to enqueuing into `p` the *list* of elements of `q` one at a time:

Queue.append p q ≡ List.fold_left Queue.put p (Queue.to_list q)

for suitably defined `Queue.append`

and `Queue.to_list`

. This means that, in the original traversal, `List.fold_left Queue.put q t`

appends to `q` the queue *whose representation as a list* is `t`. This can be written instead as `Queue.append q (Queue.of_list t)`

, for suitably defined `Queue.of_list`

, inverse to `Queue.to_list`

.

Now, by the definition of `List.append`

(denoted as usual by `@`

), we have

Queue.to_list (Queue.append p q) ≡ Queue.to_list p @ Queue.to_list q.

This is, no more and no less, the condition for `Queue.to_list`

to be a morphism from queues to lists; a different one to the one we started with. This morphism maps *whole* queues under concatenation to *whole* lists also under concatenation. So, leaving aside for the time being the issue of tail-recursiveness, we can apply this morphism and use lists directly in the BFS traversal:

let fold_bfs f e t = let rec fold e q = match q with | [] -> e | L x :: q -> fold (f e x) q | N t :: q -> fold e (q @ t) in fold e [t]

For instance, `fold_bfs (fun () -> print_int) () (N [N [L 1; N [L 2; L 3]; N []]; L 4; N [L 5]])`

outputs `41523`, as expected.

This function is no longer tail-recursive, since `append`

isn't. This might seem a step backwards, but in fact the change from a queue to a list makes considerably simpler to attempt further transformations. I'll show next how to deal with `append`

.

## No comments:

Post a Comment