2008-12-17

Proof-directed Debugging: A Real Example

I was reading this excellent introduction to the K programming language (go ahead, get immersed into it. I'll wait for you) when I got piqued by the strange function where:

  &1 2 3 4     / where: returns the number of units specified
0 1 1 2 2 2 3 3 3 3

  &0 1 1 0 0 1 / where: an important use to get the indices of the 1s
1 2 5

Beyond the need (or not) for such a function, my immediate thought was, How would I write that in OCaml? Sadly, it seems I've become monolingual enough that I have to translate most of what I "hear" into my own internal language. The complaint is neither here nor there, the point is that I reasoned pretty operationally: "I have 1 zeroes, 2 ones, 3 twos and 4 threes", hence:

let where l =
  let rec go n l = function
  | []      -> l
  | x :: xs -> go (succ n) (rep_append x n l) xs

where n is the index of the current element x, and rep_append appends a list of x repeated n times onto l:

  and rep_append e n l =
    if n == 0 then l else
    rep_append e (pred n) (e :: l)
  in
  List.rev (go 0 [] l)

with an accumulating parameter tacked in by tail-recursive reflex. I smelled something wrong when the compiler told me that:

val where : 'a list -> 'a list = <fun>

"Polymorphic?" I thought, and I tried:

# where [1;2;3;4] ;;
- : int list = [2; 3; 3; 4; 4; 4]

Yes, wrong, but…

Of course, I didn't debug the function: the type was too wrong, and it was too obvious a clue that I needed look no further than go. Type-checking by hand, I saw that, at the end, the returned list has type α list; that must be because rep_append returns an α list. Type-checking that, its result type is α list because it appends e to an α list, so that e must have type α instead of int. That variable takes the value from x, the head of the argument list, but that is the number of times to repeat n, that very element's index.

The arguments are swapped: I wrote above rep_append appends a list of x repeated n times, when I should've written rep_append appends a list of n repeated x times. The correct function is:

let where l =
  let rec go n l = function
  | []      -> l
  | x :: xs -> go (succ n) (rep_append n x l) xs
  and rep_append e n l =
    if n == 0 then l else
    rep_append e (pred n) (e :: l)
  in
  List.rev (go 0 [] l)

(the difference is in go's call to rep_append) whose type:

val where : int list -> int list = <fun>

is satisfactory. The function works correctly, too, but that was to be expected.

This is a typical example of proof-directed debugging. That is, a concrete answer to the question, What good is a strong statically-typed language? It's not only that the types prevent you from connecting the pieces together when they don't fit, even though that's 90% of their usefulness. It is also, and especially, the fact that the type of a function is a proof that it does what you think it does. If the type is wrong, the function must be wrong too, automatically, and you don't need to run it to know that.

Much as with a real proof, the trick is to work backwards to see where did an error creep in. A classical example are the trick "proofs" that purport to demonstrate that 1 = -1 or some such, "proofs" that crucially depend on dividing by zero. Working mathematicians everywhere are confronted daily with this, and therein lies the rub.

A crucial objection to the discipline of programming with a strong, statically-typed language is the view that "you need a degree in Mathematics to program in it". It is often repeated that it is "more productive" (easier, in non-obfuscated terms) to program in a dynamic language, and to debug in run-time, or using test-driven development.

The pragmatics of using a statically-typed language are not as onerous as that might suggest. Unfortunately you really have to try it for yourself to see that (but the same is true of the extreme and agile methodologies.) It looks difficult, and it is difficult to program with a strict type system if you're not used to it. But, and this is important, the best languages (i.e., not Java) lend you a very helpful hand: you don't have to prove any theorems, because the compiler does it for you. I checked a proof, I didn't have to write one. That's the compiler's job. And this particular check, that every term has a given type, is not very difficult once you have the final term; you just have to acquire the knack of working back the types of everything until you get to the source.

So, don't just take my word for it, give it a spin. You'll just have to trust me in that strong statically-typed languages are an effective alternative to TDD and agile methodologies.

By the way, that function is ridiculously operational. What would SPJ do? Easy: just be lazy!:

let where l = concat % map (fun (i, x) -> take x $ all i) % index $ l

That is (from right to left): index the list's elements; convert each pair of an index and a repetition count into the corresponding list, and flatten the result. Of course, this requires a couple of definitions:

let (%) f g x = f (g x)

let ($) f x = f x

let rec iota =
  let rec go l i =
    if i == 0 then l else
    let i = pred i in
    go (i :: l) i
  in go []

let all e = let rec l = e :: l in l

let rec take n l =
  if n == 0 then [] else
  match l with
  | []      -> []
  | x :: xs -> x :: take (pred n) xs

let index l = combine (iota (length l)) l

all but the last of which are entirely general. This goes to show that the Haskeller's smug remark that OCaml's standard prelude is rather poor is not really smug at all but painfully true.

Edit: Aaaaargh! I made the same mistake twice. It's obvious that I can't think and code at the same time. The second version of the function where is now correct.

2 comments:

Anonymous said...

Hey Matias,
Taos here, from The Argentine Post. I was just thinking that since you're so into programming, you might like Malcolm Gladwell's new book - Outliers. It's a fascinating tale of what factors contribute to making someone unusually successful. Among other things, Gladwell looks at super programmer Bill Joy and how he got to be the influential programmer that is today. He does the same with Gates. Anyway, just thought I'd pass that along. I'm reading the book now and find it very interesting.
Taos

Matías Giovannini said...

Taos,

thanks for the recommendation. I'll keep an eye open for it.