2010-09-01

Rewriting the Rules

The last article on pa_do sparked an interesting conversation with its author, Christophe Troestler, and reader bluestorm. It's true that pa_infix can establish optional rewrite rules for operators as a side effect (or as an "added bonus", in the creator's words) of installing new symbols with a given arity, precedence and associativity. Given that rewrite rules are purely syntactic artifacts, there is no function associated with an operator unless one is explicitly defined. Of course, once defined they can be redefined, and it is the responsibility of authors and users to keep both semantics (reduction and rewrite) in sync.

On the other hand, rewrite rules open the door to interesting optimizations before and beyond what the compiler sees and can perform. In the case of an extension for functional application and composition, the optimization is obvious: the identity function can be stripped when encountered in an applicative expression. Here is the syntax extension pa_compose.ml with these built-in optimizations (Edit: I've expanded the check for the identity a bit to also catch references qualified with the module name):

open Camlp4.PreCast
open Pa_do
open Pa_infix
module L = Level

let is_id = function
| <:ident< $uid:"Compose"$.$lid:"id"$ >>
| <:ident<                 $lid:"id"$ >> -> true
| _                                      -> false

let () =
  let expr x f _loc = match f with
  | <:expr< $id:f$ >> when is_id f -> x
  | f -> <:expr< $f$ $x$ >>
  in infix "|>" ~expr (L.binary (L.Higher L.assignment) ~assoc:L.LeftA)

let () =
  let expr f x _loc = match f with
  | <:expr< $id:f$ >> when is_id f -> x
  | f -> <:expr< $f$ $x$ >>
  in infix "%%" ~expr (L.binary (L.Higher L.assignment) ~assoc:L.RightA)

let () =
  let expr f g _loc = match f, g with
  | <:expr< $id:f$ >>, g when is_id f -> g
  | f, <:expr< $id:g$ >> when is_id g -> f
  | f, g ->
    let x = Delimited_overloading.new_lid () in
    <:expr< fun $lid:x$ -> $f$ ($g$ $lid:x$) >>
  in infix "%" ~expr (L.binary (L.Higher L.disjunction) ~assoc:L.RightA)
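
To see what the three rules do without involving camlp4 at all, here is a toy model in plain OCaml. The expr type, apply, compose and fresh are hypothetical stand-ins of mine for the real AST, the two application rules, the composition rule and Delimited_overloading.new_lid; this is a sketch of the idea, not what pa_do does internally. The sample terms assume that % binds tighter than %%, as the levels chosen above suggest.

```ocaml
(* Toy stand-in for the camlp4 AST (hypothetical, for illustration only). *)
type expr =
  | Int of int                  (* integer literal *)
  | Var of string               (* bare identifier, e.g. succ or id *)
  | Qual of string * string     (* qualified identifier, e.g. Compose.id *)
  | App of expr * expr          (* application: f x *)
  | Fun of string * expr        (* abstraction: fun v -> body *)

(* Mirrors is_id: accept both id and Compose.id. *)
let is_id = function
  | Var "id" | Qual ("Compose", "id") -> true
  | _ -> false

(* x |> f and f %% x both rewrite to f x, or to just x when f is the identity. *)
let apply f x = if is_id f then x else App (f, x)

(* Fresh variable names, playing the role of new_lid. *)
let fresh =
  let n = ref 0 in
  fun () -> incr n; "x" ^ string_of_int !n

(* f % g rewrites to fun v -> f (g v), unless either side is the identity. *)
let compose f g =
  if is_id f then g
  else if is_id g then f
  else
    let v = fresh () in
    Fun (v, App (f, App (g, Var v)))

(* 10 |> id *)
let test13 = apply (Var "id") (Int 10)
(* succ % Compose.id %% 10 *)
let test10 = apply (compose (Var "succ") (Qual ("Compose", "id"))) (Int 10)
(* id % Compose.id % id %% 10 *)
let test15 =
  apply (compose (Var "id") (compose (Qual ("Compose", "id")) (Var "id"))) (Int 10)

let () =
  assert (test13 = Int 10);
  assert (test10 = App (Var "succ", Int 10));
  assert (test15 = Int 10)
```

Every identity disappears before any code is generated, which is exactly the point of doing the optimization at the syntactic level.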

The great thing about CamlP4 is that it extends the syntax of OCaml with quotations that allow pattern matching on the AST directly. The rules match explicitly on the lowercase identifier "id", either bare or qualified as Compose.id, and optimize away the application or the composition. In order to give executable semantics to the rewritten syntax, it is necessary to provide an implementation for the operators. Here is compose.ml:

external id : 'a -> 'a = "%identity"
let ( |> ) x f   = f x
let ( %% ) f x   = f x
let ( %  ) f g x = f (g x)

and here is its interface, compose.mli:

external id : 'a -> 'a = "%identity"
val ( |> )  : 'a -> ('a -> 'b) -> 'b
val ( %% )  : ('a -> 'b) -> 'a -> 'b
val ( %  )  : ('b -> 'c) -> ('a -> 'b) -> ('a -> 'c)

Both are utterly trivial, but they are necessary so that the following:

let test9 () =
  [succ; succ; pred; id; succ] |> List.fold_left (%) id

works as intended, since there id and ( % ) appear as first-class values that no rewrite rule touches. The following program, use.ml, is a non-exhaustive test of the syntax extension:

(* Test of composition operators *)
open Compose

let test1 () =
  [1; 2; 3] |> List.map succ

let test2 () =
  List.map succ %% [1; 2; 3]

let test3 () =
  let x = ref 7 in
  x := !x |> succ

let test4 () =
  let x = ref 7 in
  x := succ %% !x

let test5 () =
  [1; 2; 3] |> List.map succ |> List.fold_left (+) 0 |> succ

let test6 () =
  succ %% List.fold_left (+) 0 %% List.map succ %% [1; 2; 3]

let test7 () =
  succ % List.fold_left (+) 0 % List.map succ %% [1; 2; 3]

let test8 () =
  [1; 2; 3] |> succ % List.fold_left (+) 0 % List.map succ

let test9 () =
  [succ; succ; pred; id; succ] |> List.fold_left (%) id

let test10 () =
  succ % Compose.id %% 10

let test11 () =
  id % succ %% 10

let test12 () =
  id % succ % id % pred %% 10

let test13 () =
  10 |> id

let test14 () =
  id %% 10

let test15 () =
  id % Compose.id % id %% 10

let () = Printf.printf "%d (should be 9)\n" (test9 () 7)
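
As a sanity check on the expected value, test9 can also be run in plain OCaml with no preprocessing, using only the functions from compose.ml. This is a minimal sketch: the definitions are repeated so that the snippet stands alone, and ( %% ) is left out because it isn't needed here.

```ocaml
(* The definitions from compose.ml, repeated so the snippet is self-contained. *)
external id : 'a -> 'a = "%identity"
let ( |> ) x f   = f x
let ( %  ) f g x = f (g x)

(* Here id and ( % ) are ordinary values, so the real functions do the work:
   the fold builds id % succ % succ % pred % id % succ. *)
let test9 () =
  [succ; succ; pred; id; succ] |> List.fold_left (%) id

(* Applied to 7: succ, id, pred, succ, succ, id give 8, 8, 7, 8, 9, 9. *)
let () = Printf.printf "%d (should be 9)\n" (test9 () 7)
```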

A quick-and-dirty Makefile to build the extension, its library, and the test program (provided you have findlib and pa_do installed, of course):

OCAMLC=    ocamlfind c   -w A
OCAMLOPT=  ocamlfind opt -w A -inline 10
PA_DO=     -package pa_do -syntax camlp4o

EXE=       .exe
PROG=      use$(EXE) use.opt$(EXE)
PACKAGE=   META compose.cmi compose.cmo compose.cmx pa_compose.cmo

ALL: $(PROG)

use$(EXE): compose.cmo use.cmo
    $(OCAMLC) -o $@ $^
use.opt$(EXE): compose.cmx use.cmx
    $(OCAMLOPT) -o $@ $^

pa_compose.cmo: pa_compose.ml
    $(OCAMLC) -c $(PA_DO) -ppopt q_MLast.cmo $<

pa_compose.cmx: pa_compose.ml
    $(OCAMLOPT) -c $(PA_DO) -ppopt q_MLast.cmo $<

use.cmo: use.ml
    $(OCAMLC) -c $(PA_DO) -ppopt pa_infix.cmo -ppopt pa_compose.cmo $<

use.cmx: use.ml
    $(OCAMLOPT) -c $(PA_DO) -ppopt pa_infix.cmo -ppopt pa_compose.cmo $<

use.pre.ml: pa_compose.cmo use.ml
    camlp4 -o $@ -parser o -parser op -printer o -I `ocamlfind -query pa_do` \
        pa_do.cmo pa_infix.cmo pa_compose.cmo use.ml

# Dependencies

compose.cmo: compose.cmi
compose.cmx: compose.cmi
pa_compose.cmo:
pa_compose.cmx:
use.cmo: pa_compose.cmo compose.cmi
use.cmx: pa_compose.cmo compose.cmi

# Phony targets

clean:
    rm -f *.cm* *.o *.obj *~

distclean: clean
    rm -f use.pre.ml $(PROG)

install: $(PACKAGE)
    ocamlfind install pa_compose $(PACKAGE)

uninstall:
    ocamlfind remove pa_compose

# Rules

.SUFFIXES: .cmo .cmi .cmx .ml .mli

.ml.cmo:
    $(OCAMLC) -c $<

.mli.cmi:
    $(OCAMLC) -c $<

.ml.cmx:
    $(OCAMLOPT) -c $<

(Replace the spaces with tabs!) And here is a META file to package it as an ocamlfind extension:

name = "pa_compose"
version = "1.0"
description = "Rewrite rules for functional composition"
requires = "camlp4 pa_do pa_do.infix"
archive(syntax) = "@pa_do/pa_infix.cmo pa_compose.cmo"
archive(byte,toploop) = "@pa_do/pa_infix.cmo compose.cmo pa_compose.cmo"
archive(byte) = "compose.cmo"
archive(native) = "compose.cmx"

If you build the pre-processed example with make use.pre.ml, you'll see how the ids get rewritten away. Once built and installed, it can be directly used from the top-level:

$ ocaml
        Objective Caml version 3.12.0

Findlib has been successfully loaded. Additional directives:
  #require "package";;      to load a package
  #list;;                   to list the available packages
  #camlp4o;;                to load camlp4 (standard syntax)
  #camlp4r;;                to load camlp4 (revised syntax)
  #predicates "p,q,...";;   to set these predicates
  Topfind.reset();;         to force that packages will be reloaded
  #thread;;                 to enable threads

/usr/local/lib/ocaml/str.cma: loaded
/usr/local/lib/ocaml/dynlink.cma: loaded
# #require "pa_compose";;
C:/cygwin/usr/local/lib/ocaml/camlp4: added to search path
C:/cygwin/usr/local/lib/ocaml/site-lib/pa_do: added to search path
/usr/local/lib/ocaml/camlp4/camlp4o.cma: loaded
C:/cygwin/usr/local/lib/ocaml/site-lib/pa_do/pa_do.cmo: loaded
C:/cygwin/usr/local/lib/ocaml/site-lib/pa_do/pa_do_top.cma: loaded
C:/cygwin/usr/local/lib/ocaml/site-lib/pa_compose: added to search path
C:/cygwin/usr/local/lib/ocaml/site-lib/pa_do/pa_infix.cmo: loaded
C:/cygwin/usr/local/lib/ocaml/site-lib/pa_compose/compose.cmo: loaded
C:/cygwin/usr/local/lib/ocaml/site-lib/pa_compose/pa_compose.cmo: loaded
        Camlp4 Parsing version 3.12.0


# 3 |> succ |> succ |> id |> pred |> id |> succ ;;
- : int = 5
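
Going by the |> rule in pa_compose.ml, each id stage should vanish and the remaining stages should turn into nested applications. The expansion below is worked out by hand, not copied from camlp4 output, so take it as a sketch of what the rewritten expression ought to look like; it can be checked against the unrewritten pipeline in plain OCaml:

```ocaml
(* Hand-expanded form of 3 |> succ |> succ |> id |> pred |> id |> succ,
   assuming the |> rule drops every id and turns x |> f into f x. *)
let rewritten = succ (pred (succ (succ 3)))

(* The same pipeline evaluated with ordinary functions and no rewriting. *)
external id : 'a -> 'a = "%identity"
let ( |> ) x f = f x
let plain = 3 |> succ |> succ |> id |> pred |> id |> succ

let () = assert (rewritten = plain && plain = 5)
```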

Of course, the functions themselves exist:

# open Compose ;;
# id ;;
- : 'a -> 'a = <fun>
# ( % ) ;;
- : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b = <fun>
# ( |> ) ;;
- : 'a -> ('a -> 'b) -> 'b = <fun>

Happy rewriting!

4 comments:

bluestorm said...

This technique has been used, for example, in the "Catch me if you can" paper by David Teller, Arnaud Spiwack and Till Varoquaux. They implement monadic exceptions in OCaml, and use a camlp4 filter to inline and optimize the monad operations, which has a significant impact on performance (see the detailed discussion in the paper).


I still don't like the idea very much. While Camlp4 is very useful for providing syntactic sugar and generally manipulating OCaml syntax, I think a line is crossed when it is used to change the *semantics* of the program. It may be crossed for good reason (really performance-sensitive code, etc.), but it shouldn't become a habit.

Performance-wise, I think that the small critical sections of the program should rather step down one abstraction level and use a good old style that plays nice with OCaml's limited optimizations. Or, at least, use the camlp4 transformations only locally, instead of also using them to process all the rest of the code. pa_do is very good for this (local transformations).

Matías Giovannini said...

@bluestorm: I still think that CamlP4 is an under-appreciated and -documented part of the infrastructure that deserves more use cases and better exposition, beyond adding yet more imperative syntactic sugar. I still find making it all work together rather fragile: I spent half an hour peering at OMake and giving up, and another full hour trying to understand why and how findlib was loading what and in what order (I'm convinced it has a bug). Maybe this is useful to someone else, maybe someone will step up and make a tutorial of how to tie it all together properly, robustly and scalably.

I don't claim this example is in the right direction or necessarily relevant, but I do believe there is much work to be done on, and much benefit to be reaped from, using a controlled form of staged execution. Perhaps there are tasteful ways to enable manual optimization in a modular fashion by using DSLs and local syntax extensions, in a way that the benefits outweigh your reservations.

ChriS said...

What I think is interesting, as you and bluestorm observe, is that these techniques enable domain-specific optimizations. In its full generality, this is more a task for pa_do (rewrite rules associated with a single operator are too limited). For example, for complex numbers or for computing with intervals, it allows using the usual mathematical notation while not paying any price (actually even gaining in speed!).

[I can send the code for interval arithmetic if you want to study it — but it is research code ATM]

Matías Giovannini said...

@ChriS, not only is what you say true, it is already being put into practice. Beyond the paper bluestorm cites, Camlp4's sources show a number of examples of this (for instance, the list comprehensions parser has an explicit rewriting to erase an identity map fun x -> x). What's cool is that I didn't know that loading a simple extension lets you write [x, y | x <- 1 -- 10; y <- 1 -- x; x mod y <> 0].

I'd love to see the code, no matter what state it's in, and also would love to read any paper related to it.