How to Write a Simple Web Application Using Ocamlnet

This post might seem to be in apparent contradiction: Ocamlnet is a large, very opinionated framework for network programming that solves most, if not all, those infrastructure issues that need to be taken care of when writing a networked, distributed, fault-tolerant server, from process management to string decoding and including protocol parsing and building. The fact is that Ocamlnet is not particularly complicated for all that it does, and it does quite a bit; but there are (for me at least) three hurdles to overcome in order to start using it effectively:

  • It is big. The project includes Posix wrappers, string handling, process management, CGI programming, various protocols, RPC registration, interprocess communication…
  • It is strange. Most of the API is object oriented rather than functional, using inheritance extensively
  • It is underdocumented. While the API documents are complete, and the project page includes some cookbook examples, for most non-trivial needs you have to go deep into the source code

In this instance I'll follow a tutorial, top-down style and I won't necessarily show complete compile-and-run code, which means that you'll have to reconstruct the source code to a compilable state, but I hope it will still be useful as a starting point and guide to writing HTTP services with Ocamlnet. This tutorial assumes you have installed OCaml 4.0 with Findlib, Ocamlnet 3.6, Yojson and its dependencies and PCRE-Ocaml.

Also, I won't attempt to compare it with Ocsigen since I've never used it, not least because I'm not really fond of convention-over-configuration-style frameworks I don't know it at all, and maybe I'm irrationally prejudiced :-). If you want full control of your code and feel comfortable with a large number of tweakable knobs, then Ocamlnet is for you. If you have work that needs to be done quickly and you're used to RoR or Django, Ocamlnet will definitely set your hair on fire.

Setting up a Netplex server: http.ml

The architecture I use is a Web application written in HTML that consumes JSON services published in the back-end. Ocamlnet lets me serve both the static content (HTML, CSS, JavaScript and graphic assets) and route a number of dynamic services. I can use just that for development, and have a production Apache server that serves the static content and proxies the dynamic requests to the Ocamlnet-published services. Another simplification I make is that I manually route the method URLs instead of letting Ocamlnet do it itself, as it is otherwise perfectly capable to. This makes it simpler to configure the services, at the cost of having to explicitely handle routing in code.

Let's begin by taking a look at the main function:

let main () =
  let database f =
    db_file := Io.get_absolute_path f;
    if not (Sys.file_exists !db_file) then raise (Arg.Bad ("File not found: " ^ f))
  let usage = "usage: " ^ (Filename.basename Sys.executable_name) ^ " [options]" in
  let opt_list, cmdline_cfg = Netplex_main.args () in
  let opt_list = Arg.align ([
"-db"        , Arg.String database,                   "<file>  Database file";
  ] @ opt_list) in
  Arg.parse opt_list (fun s -> raise (Arg.Bad ("Invalid argument: " ^ s))) usage;
  Netsys_signal.init ();
    ~late_initializer:(fun _ _container ->
      Netlog.logf `Notice "Starting up")
    (Netplex_mp.mp ())
      service_factory ();

let () = if not !Sys.interactive then main ()

Netplex is the part of Ocamlnet that orchestrates the management and intercommunication between the processes that make up a network service. It has a number of command-line options for configuration, most notably -fg to launch the service in the foreground instead of as a detached dæmon. Netplex_main.args gives back a list of needed options upon which to add program-specific ones. In this case the only option is to pass a database file. Every filesystem resource must be accessed by absolute path, since Netplex changes the working directory to / upon startup. This file is stored in a global reference:

let db_file = ref (Io.get_absolute_path "myfile.db")

Once the command line is parsed, the service is created. First, Ocamlnet has to take over signal handling to give the service an orderly lifecycle (Netsys is the collection of modules providing cross-platform POSIX functionality). Netplex is then started with the multi-process parallelizer, the standard log and workload factories that set up themselves out of the configuration file, and a single custom service factory that will create the HTTP services themselves:

let service_factory = Nethttpd_plex.nethttpd_factory
  ~hooks:(new environment_hooks)
    default_config with
    default_exn_handler = false;
    permitted_input_content_types = [
    ] @ default_config.permitted_input_content_types
    "json_service", Net.json_service ~dyn_uri:"/cgi" [
      "daterange", with_db Daterange.json;
      "calculate", with_db Calculate.json;
      "calendar" , with_db Calendar.json;

The service name must correspond with that defined in the config file. In order to arrange for the workers to have access to the database file I intercept the service creation with hooks to their lifecycle process to open it and close it as needed. Netcgi sets up the environment that each HTTP service requires to function; in this case I use a minimal configuration that augments valid POST MIME types with a type for JSON requests (not otherwise used in this case) and opt out of the standard exception handler in exchange for my own. I configure a single "json_service" handler that will dispatch to the relevant methods of type cgi_activation → Yojson.Basic.json. The Netplex services for this service are the default Nethttpd_plex ones required by the infrastructure in order to manage the lifecycle of the process group: creation, tear-down and IPC. Note well that the factory is a thunk, not a data structure, the resulting type is unit → Netplex_types.processor_factory.

The lifecycle hooks are specified as a subclass of Netplex_kit.empty_processor_hooks. It uses the Netplex environment plug-in to store a shared reference to the database in a way that both thread- and process-based services can access in an uniform manner:

class environment_hooks = object
  inherit Netplex_kit.empty_processor_hooks ()
  method! post_add_hook _ container =
    container#add_plugin Netplex_mutex.plugin;
    container#add_plugin Netplex_sharedvar.plugin
  method! post_start_hook container =
    Netlog.logf `Info "(%s) opening database \"%s\"" (Net.cur_sys_id ()) !db_file;
    try  set_db (DB.open_database !db_file)
    with DE42x.Error (msg) -> container#log `Crit msg
  method! pre_finish_hook _container =
    Netlog.logf `Info "(%s) closing database" (Net.cur_sys_id ());
    match get_db () with
    | Some db -> DB.close_db db
    | None    -> ()

In this case I open and close database connections (represented here as an open file descriptor) which are stored in a per-process environment:

let get_db, set_db =
  let env_id = "MyService.localenv" in
  let module E = Netplex_cenv.Make_var_type (struct type t = DB.t end) in
  (fun () -> try Some (E.get env_id) with Netplex_cenv.Container_variable_not_found _ -> None),
  E.set env_id

Neplex_cenv makes a strongly-typed accessor for shared variables; in this case I have just one keyed by env_id. As a utility I arrange for my service methods to be closed over a reference to the database (cf the handler setup above):

let with_db proc arg = match get_db () with
| None    -> Net.failf "no database!"
| Some db -> proc db arg

A Nethttp JSON framework: net.ml

Every Nethttpd_plex-based service follows the same structure, while the specifics will make up for the bulk of the application. In this example these details have to do with utilities that make consuming and producing JSON data easier. I have a Net module with a number of helpers, of which I've used two already, cur_sys_id and failf:

let failf fmt = Printf.ksprintf failwith fmt
and argf  fmt = Printf.ksprintf invalid_arg fmt

let cur_sys_id () = match Netplex_cenv.current_sys_id () with
| `Process pid -> Printf.sprintf "PID %d" pid
| `Thread  tid -> Printf.sprintf "TID %d" tid

Another useful function is an encoding-safe string wrapper:

let text = Netencoding.Html.encode_from_latin1

Normally, Nethttp sends HTML 4.01-formatted error messages. In a JSON-based application it is preferable to have standardized JSON errors:

let error_json (env : Netcgi.cgi_environment) status fields cause message =
  let json_of_header hdr =
    try `String (env#input_header_field hdr)
    with Not_found -> `Null
    let script_name = env#cgi_script_name in
    let code = Nethttp.int_of_http_status status in
    env#log_error (Printf.sprintf "%s: %s (Status %i)" script_name message code);
    env#set_output_header_fields []; (* reset *)
    env#set_output_header_field "Content-type" "application/json; charset=utf-8";
    env#set_status status;
    env#set_multiple_output_header_field "Cache-control"
      [ "max-age=0"; "must-revalidate" ];
    let secs = Netdate.mk_mail_date (Unix.time ()) in
    env#set_output_header_field "Expires" secs;
    List.iter (fun (n,v) -> env#set_multiple_output_header_field n v) fields;
    if env#cgi_request_method <> "HEAD" then
      Yojson.Basic.to_output env#out_channel (`Assoc [
        "status"       , `Int    code;
        "statusText"   , `String (Nethttp.string_of_http_status status);
        "cause"        , `String cause;
        "message"      , `String message;
        "scriptName"   , `String script_name;
        "requestMethod", `String env#cgi_request_method;
        "queryString"  , `String env#cgi_query_string;
        "referrer"     ,  json_of_header "referer"
    env#out_channel#flush ();
    env#out_channel#close_out ()
  with e ->
    Netlog.logf `Crit "Unexpected exception %s" (Printexc.to_string e)

This is a good example of how to use the cgi_environment to query the CGI execution and to exert maximum control over the HTTP response. I raise standard Ocaml exceptions from the method handlers and translate them into the relevant HTTP status codes by wrapping them in a higher-order protective function:

let protect handler (cgi : Netcgi.cgi_activation) =
  try handler cgi
  | Netcgi.Argument.Oversized ->
    error_json cgi#environment `Request_entity_too_large []
      "Oversized" "A POST parameter exceeds maximum allowed size"
  | Invalid_argument msg ->
    error_json cgi#environment `Bad_request []
      "Bad request" (text msg)
  | Failure msg ->
    error_json cgi#environment `Internal_server_error []
      "Method failure" (text msg)
  | Not_found ->
    error_json cgi#environment `Not_implemented []
      "Not implemented" "The requested operation is not implemented"
  | exn ->
    let msg = Printexc.to_string exn in
    error_json cgi#environment `Internal_server_error []
      "Internal server error" ("Unexpected exception: " ^ text msg)

It is not normally necessary to manipulate the cgi_environment in such a detailed, low-level manner; the cgi_activation does pretty much the same thing in a easier-to-use way:

let send_json json (cgi : Netcgi.cgi_activation) =
  cgi#set_header ~content_type:"application/json; charset=utf-8" ~cache:`No_cache ();
  Yojson.Basic.to_output cgi#out_channel json;
  cgi#output#commit_work ()

Note well that Yojson doesn't provide a streaming interface: you must build the entire JSON value, which gets serialized in bulk; this makes necessary to configure the HTTP service so that the cgi_activations it creates are at least buffered:

let json_service ?dyn_uri handlers =
  let dyn_translator path =
      let len  = String.length path in
      let path = if len != 0 && path.[0] = '/'
        then String.sub path 1 (len - 1)
        else path
      Pcre.replace ~pat:"/" ~templ:"_" path
  and dyn_handler env cgi =
    protect (fun cgi ->
      let h = List.assoc env#cgi_path_translated handlers in
      send_json (h cgi) cgi) cgi
  in Nethttpd_services.({
    dyn_activation = std_activation `Std_activation_buffered;
    dyn_accept_all_conditionals = true;

The dynamic path translator removes the leading slash and turns subsequent slashes into underscores, so that a method in the namespace /cgi, say can serve as a look-up key in the list of handlers: a call to /cgi/my/method/name will turn into a key my_method_name. This is, of course, a completely arbitrary decision. The dynamic handler in turn looks up the method handler (recall, of type cgi_activation → Yojson.Basic.json) by this key, calls it with the cgi_activtion expecting a JSON response and sends it out. Since the handling is protected against exceptions, any missing method, parameter validation error or exceptional condition is sent out as the corresponding HTTP error response.

Speaking of parameter extraction, I don't use anything fancy like parsing combinators, just plain old higher-order functions and regular expressions validating the result of CGI accessor functions:

let with_arg arg f = Io.unwind ~protect:(fun arg -> arg#finalize ()) f arg

let get_arg cgi name =
  try  Some (with_arg (cgi#argument name) (fun arg -> arg#value))
  with Not_found -> None

let parse ?default ~validate ~parse cgi name =
  match default, get_arg cgi name with
  | None  , None    -> argf "Missing parameter \"%s\"" name
  | Some v, None    -> v
  | _     , Some p  ->
    try  parse (Pcre.extract ~rex:validate ~full_match:false p)
    with Not_found -> argf "Invalid parameter \"%s\"" name

Since CGI arguments can be, if large, buffered into a temporary file, Netcgi requires explicit finalization. Every error is signaled with an Invalid_argument exception which protect catches and translates into a HTTP 400 (Bad Request) via error_json. Parsing specific argument types is straightforward:

let re_char     = Pcre.regexp "^(.)$"
let re_bool     = Pcre.regexp "^(true|false)$"
let re_int      = Pcre.regexp "^([-+]?\\d+)$"
let re_float    = Pcre.regexp "^([-+]?\\d+(?:.\\d*))$"
let re_date     = Pcre.regexp "^(\\d{4})-(\\d{2})-(\\d{2})$"
let re_datetime = Pcre.regexp "^(\\d{4})-(\\d{2})-(\\d{2})[ T](\\d{2}):(\\d{2}):(\\d{2})(Z|[-+]\\d{4})$"

let parse_char ?default cgi = parse ?default ~validate:re_char
  ~parse:(fun res -> res.(0).[0]) cgi

let parse_bool cgi = parse ~default:false ~validate:re_bool
  ~parse:(fun res -> bool_of_string res.(0)) cgi

let parse_int ?default cgi = parse ?default ~validate:re_int
  ~parse:(fun res -> int_of_string res.(0)) cgi

let parse_float ?default cgi = parse ?default ~validate:re_float
  ~parse:(fun res -> float_of_string res.(0)) cgi

let parse_date ?default cgi = parse ?default ~validate:re_date ~parse:(fun res ->
  let year   = int_of_string res.(0)
  and month  = int_of_string res.(1)
  and day    = int_of_string res.(2)
  in let dummy = Netdate.create 0. in
  Netdate.({ dummy with year; month; day; })) cgi

let parse_datetime ?default cgi = parse ?default ~validate:re_datetime ~parse:(fun res ->
  let year   = int_of_string res.(0)
  and month  = int_of_string res.(1)
  and day    = int_of_string res.(2)
  and hour   = int_of_string res.(3)
  and minute = int_of_string res.(4)
  and second = int_of_string res.(5)
  and zone   = match res.(6) with
  |"Z" -> 0
  | dt ->
    let hrs = int_of_string (String.sub dt 1 2)
    and mns = int_of_string (String.sub dt 3 2) in
    let off = 60 * hrs + mns in
    if dt.[0] == '+' then off else -off
  in let dummy = Netdate.create 0. in
  Netdate.({ dummy with year; month; day; hour; minute; second; zone; })) cgi

Writing the JSON methods: myservice.ml

That is the infrastructure, in broad strokes. I put each method in its own module, following a simple convention:

  • I define a type t of parsed method arguments:

    type t = {
      lon : float;
      lat : float;
      dt  : Netdate.t;
      jd  : Jd.t;
      tz  : int;
      lim : Jd.t option;

    (in this instance, Jd.t is the type of dates represented as Julian dates)

  • I define a validate function to parse the CGI arguments into a value of type t:

    let validate db cgi =
      let open Net in
      let lon = parse_float    cgi "lon"
      and lat = parse_float    cgi "lat"
      and dt  = parse_datetime cgi "dt" in
      let jd  = jd_of_netdate   dt in
      let tz  = parse_int      cgi "tz"  ~default:dt.Netdate.zone in
      let lim = if parse_bool  cgi "tab"
        then Some (Jd.min db.DB.max_date (jd_of_netdate (Net.parse_date cgi "lim")))
        else None
      if not (-180. <= lon && lon <= 180.) then
        Net.argf "Longitude out of range"
      if not (-66.56 <= lat && lat <= 66.56) then
        Net.argf "Latitude out of range"
      if Jd.compare jd db.DB.min_date < 0 then
        Net.argf "Date too early"
      if Jd.compare jd db.DB.max_date > 0 then
        Net.argf "Date too late"
      { lon = lon /. Trig.radian;
        lat = lat /. Trig.radian;
        lim; }
  • I define and export a json function to generate the actual output:

    let json db cgi =
      let req = validate db cgi in
      let tz  = req.dt.Netdate.zone / 60 in
      let tdt = Jd.dynamic_time req.jd in
      (* … *)
      `Assoc [
        "jd"           , `Float  t;
        "dt"           ,  Net.time   Jd.(tdt <-> req.jd);
        "lst"          ,  Net.time  (lst /. Jd.secs_day);
        "lon"          ,  Net.angle  ~reduce:false req.lon;
        "lat"          ,  Net.angle  ~reduce:false req.lat;
        (* … *)

    (functions Net.time and Net.angle return appropriate JSON values). This exported function goes into the dynamic method map, as seen in the service_factory above.

Configuring the Netplex server: myservice.conf

That is mostly it, code-wise. It remains the detail of configuring Netplex. I use a simple myservice.conf file:

netplex {
  controller {
    max_level = "info";
    logging {
      type = "file";
      file = "/var/log/myservice.log";
      component = "*";
      subchannel = "*";
      max_level = "info";

  service {
    name = "myservice";
    protocol {
      name = "http";
      tcp_nodelay = true;
      address {
        type = "internet";
        bind = "";
    processor {
      type = "nethttpd";
      timeout = 60.0;
      timeout_next_request = 6.0;
      access_log = "enabled";
      suppress_broken_pipe = true;
      host {
        pref_name = "localhost";
        pref_port = 8080;
        names = "";
        uri {
          path = "/";
          service {
            type = "file";
            docroot = "/path/to/static/";
            media_types_file = "/etc/mime.types";
            default_media_type = "application/xhtml+xml";
            enable_gzip = true;
            enable_listings = false;
            index_files = "index.html";
        uri {
          path = "/cgi";
          method {
            allow = "GET";
            service {
              type = "dynamic";
              handler = "json_service";
    workload_manager {
      type = "dynamic";
      max_jobs_per_thread = 1;
      min_free_jobs_capacity = 2;
      max_free_jobs_capacity = 5;
      max_threads = 50;

Note that the Nethttpd_plex section declares two URIs: the root path maps to a file service that will serve the static content, defaulting to XHTML, while the /cgi prefix will map to the dynamic JSON handler. This is useful for development, since it only requires launching myservice -fg and trying it with a Web browser on In production I set up Apache with mod_proxy like this:

Alias /myservice /path/to/static

<Directory /path/to/static>
  Options FollowSymLinks
  AllowOverride All
  Order allow,deny
  Allow from all

<Location /myservice/>
  AuthType Digest
  AuthName "SERVICE"
  AuthDigestDomain /myservice/
  AuthUserFile /etc/httpd/passwd
  Require valid-user

ProxyPass /myservice/cgi

(where /path/to/static and /cgi must match what is configured in myservice.conf). Of course you can map your application to the HTTP root, in this case I have a single Apache instance serving various sub-paths.

Compiling: Makefile

It is not necessary to complicate the build process with anything more than a properly written Makefile. In this case I have one interface and one implementation for each JSON method (which you will note don't quite correspond to the dynamic service set-up I've shown first). Note well the list of PACKAGES required for building:

OCAMLFLAGS        = -thread -w @a -unsafe
OCAMLOPTFLAGS     = $(OCAMLFLAGS) -inline 10000
OCAMLLIBS         = unix.cma
CFLAGS            = -I/opt/ocaml/lib/ocaml -arch x86_64 -O3 -Wall -Wextra
PACKAGES          = -package threads,pcre,yojson,netplex,netcgi2,nethttpd

SRC  =            \
    net.ml        \
    myservice.ml  \

PROGS = myservice

all: $(PROGS)

myservice: $(SRC:%.ml=%.cmx)
    ocamlfind ocamlopt $(OCAMLOPTFLAGS) $(PACKAGES) -linkpkg $+ -o $@

%.cmi: %.mli
    ocamlfind ocamlc $(OCAMLFLAGS) $(PACKAGES) -c $<

%.cmo: %.ml
    ocamlfind ocamlc $(OCAMLFLAGS) $(PACKAGES) -c $<

%.cmx: %.ml
    ocamlfind ocamlopt $(OCAMLOPTFLAGS) $(PACKAGES) -c $<

%.o: %.c
    ocamlfind ocamlc -ccopt "$(CFLAGS)" -c $<

    /bin/rm -rf *.o *.a *.so *.cmi *.cmo *.cmx *~ *.log

distclean: clean
    /bin/rm -rf $(PROGS) depend

    $(OCAMLDEP) -one-line $(OCAMLLIBS) *.ml *.mli > depend

include depend

.PHONY: clean distclean all


There are a number of more advanced issues I'd like to address in the future. As it is, this framework can handle simple GET and POST requests but won't parse multipart attachments nor handle file transfers. Another thing it doesn't handle is HTTP Authorization; for simple requirements a simple filter can work, while for more complicated set-ups the best way to go is, in my opinion, to leverage Apache as I did here.

For those of you that have to interoperate with SOAP Web Services, the same architectural strategy is perfectly applicable with the aid of PXP and perhaps OC-SOAP.

A big field for further exploration is how to structure a complex Web application into independent services; Netplex makes that possible, if not necessarily easy. There is a hot architectural trend making some noise now called Command-Query Separation (CQS); this pattern can be profitably implemented with a single Netplex RPC service that handles all commands to which the Nethttpd_plex workers delegate. The advantages of this separation are enforced separation of concerns and automatic, transparent fault tolerance and distribution, both of which are the guiding principles behind Ocamlnet's design.

A closing remark that I don't want to miss on making is that the payoff of using Ocamlnet's process model is that it is really fast. My "production" server is an ancient 400 MHz PPC with 384 MB RAM which is perfectly capable of producing and serving really computationally-intensive content with minimal environmental requirements. This is something that I simply couldn't hope to pull off with PHP or Java. I encourage you to try Ocamlnet and see if you find, like I do, that Ocaml is the language of choice of discriminating Web developers.


Merge Right

The so-called master-transaction update is one of the, if not the defining algorithms of the discipline formerly known as "data processing". Given two sorted files of records with increasing keys, the process applies each record in the transaction file to each record of the the master file and outputs the result, if any, to the updated master file in one pass over each input. The same algorithm can compute the union, intersection or difference of sorted sequences. For instance, the union of two sets represented as sorted lists of unique elements is:

let union       =
  let rec go l r = match l, r with
  | [], xs | xs, []  -> xs
  | x :: xs, y :: ys ->
    match compare x y with
    | -1 -> x :: go       xs (y :: ys)
    |  0 -> x :: go       xs       ys
    |  1 -> y :: go (x :: xs)      ys
    |  _ -> assert false
  in go

Intersection is:

let inter       =
  let rec go l r = match l, r with
  | [], _  | _, []   -> []
  | x :: xs, y :: ys ->
    match compare x y with
    | -1 ->      go       xs (y :: ys)
    |  0 -> x :: go       xs       ys
    |  1 ->      go (x :: xs)      ys
    |  _ -> assert false
  in go

while difference is:

let diff       =
  let rec go l r = match l, r with
  | [], _  | _, []   -> l
  | x :: xs, y :: ys ->
    match compare x y with
    | -1 -> x :: go       xs (y :: ys)
    |  0 ->      go       xs       ys
    |  1 ->      go (x :: xs)      ys
    |  _ -> assert false
  in go

And so on. The three functions use the same underlying merge schemata; what varies is the operation to perform in each of the five possible cases:

  1. The left sequence is nil
  2. The right sequence is nil
  3. The next element in the left sequence is less than the next element in the right sequence
  4. The next element in the left sequence is equal to the next element in the right sequence
  5. The next element in the left sequence is greater than the next element in the right sequence

The question is, then, how many set operations can the merge algorithm implement? These five cases partition both input sequences in disjoint sets such that the output sequence is the natural merge of zero or more of them. For example, processing the sets { 1, 3, 4, 6, 7, 8 } ⋈ { 2, 3, 5, 6, 8, 9 } results in the following steps:

1 { 1, 3, 4, 6, 7, 8 } ⋈ { 2, 3, 5, 6, 8, 9, 10 }
2{ 3, 4, 6, 7, 8 } ⋈ { 2, 3, 5, 6, 8, 9, 10 }
3 { 3, 4, 6, 7, 8 } ⋈ { 3, 5, 6, 8, 9, 10 }
4 { 4, 6, 7, 8 } ⋈ { 5, 6, 8, 9, 10 }
5{ 6, 7, 8 } ⋈ { 5, 6, 8, 9, 10 }
6 { 6, 7, 8 } ⋈ { 6, 8, 9, 10 }
7 { 7, 8 } ⋈ { 8, 9, 10 }
8 { 8 } ⋈ { 9, 10 }
9,10∅ ⋈ { 9, 10 }

Abstracting away the operations to perform in each of these five cases we have the following schema:

let merge ln rn lt eq gt : 'a list -> 'a list -> 'a list =
  let rec go l r = match l, r with
  | [], ys -> ln ys
  | xs, [] -> rn xs
  | x :: xs, y :: ys ->
    match compare x y with
    | -1 -> lt x (go       xs (y :: ys))
    |  0 -> eq x (go       xs       ys )
    |  1 -> gt y (go (x :: xs)      ys )
    |  _ -> assert false
  in go

Both ln and rn must decide what to do with the remaining list and so have type α list → α list, while lt, eq and gt must decide what to do with the element in consideration and so have type αα list → α list; thus the type of merge is (α list → α list) → (α list → α list) → (αα list → α list) → (αα list → α list) → (αα list → α list) → α list → α list → α list. The operations on the remainder either pass it unchanged or return nil, while the operations on the next element either add it to the output sequence or not:

let self   xs =      xs
and null   _  =      []
and cons x xs = x :: xs
and tail _ xs =      xs

(some of these have well-known names in functional programming, but here I choose to use these neat, four-letter ones.) With the proviso that the output sequence must be increasing these four functions exhaust the possibilities by parametricity; otherwise, duplications and rearrangements would satisfy the parametric signature. Now union, intersection and difference are simply:

let union l r = merge self self  cons cons cons l r
and inter l r = merge null null  tail cons tail l r
and diff  l r = merge null self  cons tail tail l r

It is obvious that the question I posed above is answered as 25 = 32 possible set operations obtainable by varying each of the five operations. The next question is, then, what are these 32 operations? Let me characterize each of the five sets ln, rn, lt, eq and gt. The easiest one is eq, as it obviously is the intersection of both sets:

eq(A, B) = A ∩ B

By substitution in merge it is possible to show that ln(A, B) = rn(B, A) and vice versa; hence just one set expression suffices. The merge ends with rn for every element in A that is greater than every element in B, as the latter were included in the comparison sets; and conversely for ln. Hence:

ln(A, B) = B / A ≡ ⟨S y : B : ⟨∀ x : A : y > x⟩⟩
rn(A, B) = A / B ≡ ⟨S x : A : ⟨∀ y : B : x > y⟩⟩

(read A / B as "A over B"). All three sets are pairwise disjoint, since A / B ⊆ A and A / B ∩ B = ∅, and conversely, by construction.

The two remaining sets are also symmetric in the sense that lt(A, B) = gt(B, A) but are more difficult to characterize. My first attempt was to think of each element in A as being processed in turn and put into lt(A, B) just when strictly less than all the elements in B against which it could be matched, namely lt(A, B) = ⟨S x : A : x < ⟨min y : B : x ≤ y⟩⟩. The condition can be simplified with a bit of equational reasoning:

   x∈A ∧ x < ⟨min y : B : x ≤ y⟩
≡ { GLB }
   x∈A ∧ ⟨∀ y : y∈B ∧ x ≤ y : x < y⟩⟩
≡ { Trading }
   x∈A ∧ ⟨∀ y : y∈B : x > y ∨ x < y⟩⟩
≡ { Trichotomy }
   x∈A ∧ ⟨∀ y : y∈B : x ≠ y⟩⟩
   x∈A ∧ x∉B

In other words, A − B. The problem is that, since the quantification over an empty set is trivially true, this set is too big as it includes the respective remainder; that is to say A / B ⊆ A − B as I showed above. To preserve disjointness I define:

lt(A, B) = A − B − A / B
gt(A, B) = B − A − B / A

In a Venn diagram, these five sets are:

Venn diagram of the five component sets

So by including or excluding one of the five components depending on the function passed to each of the five operations, the 32 set operations achievable by merge are:

Venn diagrams for all 32 set operations

Or in tabular form:

1selfselfconsconstailA B/A
4selfselftailconsconsB A/B
7selfselftailtailtail B/AA/B
9selfnullconsconstailA B/AA/B
15selfnulltailtailtail B/A
20nullselftailconsconsB B/AA/B
23nullselftailtailtail A/B
25nullnullconsconstailA A/B
28nullnulltailconsconsB B/A

Arguably, besides the traditional five set operations A ∪ B, A ∩ B, A − B, B − A and A ∆ B, only the remainders A / B, B / A and perhaps A / B ∪ B / A = A ⊔ B, the join of A and B (not to be confused with the relational operation), are independently useful. These three are obscure, and as far as I know have no name, although I'd love to hear if there is literature about them. This might mean that this exhaustive catalog of set merges is rather pointless, but at least now I know for sure.


A Helping Phantom Hand

You don't have to be writing an interpreter or some other kind of abstract code to profit from some phantom types. Suppose you have two or more functions that work by "cooking" a simple value (a float, say) with a lengthy computation before proceeding:

let sun_geometric_longitude j =
  let je = to_jcen (to_jde j) in
  (* … computation with je … *)

let sun_apparent_longitude j =
  let je = to_jcen (to_jde j) in
  let q  = sun_geometric_longitude j in
  (* … computation with je … *)

In this case j is a date expressed in Julian Days as a float, and to_jde computes the Ephemeris Time as a 63-term trigonometric polynomial correction on it. sun_apparent_longitude calls sun_geometric_longitude and both call to_jde. Obviously this unnecessary duplication can be factored out:

let sun_geometric_longitude je =
  let t  = to_jcen je in
  (* … computation with je … *)

let sun_apparent_longitude je =
  let q  = sun_geometric_longitude je in
  let t  = to_jcen je in
  (* … computation with je … *)

let foo j =
  let je = to_jde j in
  let l  = sun_apparent_longitude je in
  (* … *)

(to_jcen is cheap and not worth factoring out.) But now a naked float represent two different things, Universal Time and Ephemeris Time, and we have a valid concern of mixing them up. We can wrap the time in an ADT:

type dt = UT of float | TD of float

let to_jcen j = match j with
| UT j ->
  (* … lengthy computation … *)
  TD j
| TD _ -> invalid_arg "to_jcen"

let sun_geometric_longitude je = match je with
| TD je ->
  let t  = to_jcen je in
  (* … computation with je … *)
| UT _  -> invalid_arg "sun_geometric_longitude"

let sun_apparent_longitude je = match je with
| TD je ->
  let q  = sun_geometric_longitude je in
  let t  = to_jcen je in
  (* … computation with je … *)
| UT _  -> invalid_arg "sun_apparent_longitude"

let foo j =
  let je = to_jde j in
  (* … computation with sun_apparent_longitude je … *)

but this forces us to check at run-time whether we mixed times up in our code. A better technique is to use a phantom type. First fix an abstract signature for the module implementing these functions:

module Test : sig
  type 'a dt

  val datetime : yy:int -> mm:int -> dd:int -> hh:int -> nn:int -> ss:float -> [ `JD ] dt
  val to_jde   : [ `JD ] dt -> [ `JDE ] dt
  val to_jcen  : 'a dt -> float

  val sun_geometric_longitude : [ `JDE ] dt -> float
  val sun_apparent_longitude  : [` JDE ] dt -> float
end = struct
  (* … *)

We have a way to construct our type α dt from a Universal datetime, a way to convert it to Dynamical Time with to_jde and operations that respect the kind of measure. The implementation is as before:

  (* … *)
  type 'a dt = float (* phantom type! *)

  let datetime ~yy ~mm ~dd ~hh ~nn ~ss = (* … *)

  let to_jde  j = (* … *)
  let to_jcen j = (* … *)

  let sun_geometric_longitude je =
    (* … computation with a statically checked je … *)

  let sun_apparent_longitude je =
    let q  = sun_geometric_longitude je in
    (* … computation with a statically checked je … *)

Now the compiler checks for us that we don't mix up measures. The only inconvenient of this approach is that the type α dt is fully abstract, and you must provide coercions, string_ofs and pretty printers for it if you need to show them or debug your code. There is a way out, though; just make it a private type abbreviation:

module Test : sig
  type 'a dt = private float
  (* … signature exactly as before … *)
end = struct
  type 'a dt = float (* phantom type! *)
  (* … implementation exactly as before … *)

Now α dt will be shown in the top-level, can be printed with a coercion (je :> float), etc.

For another simple example, suppose you want to represent sets as lists. The best way to do that is to keep them sorted so that set operations run in linear time. If you want to provide some operations that temporarily destroy the ordering, a phantom type can keep track of the invariant "this list is sorted":

module Set : sig
  type ('a, 'b) t = private 'a list

  val of_list : 'a list -> ('a, [ `S ]) t
  val as_set  : ('a, [ `U ]) t -> ('a, [ `S ]) t
  val empty   : ('a, [ `S ]) t
  val mem     : 'a -> ('a, [ `S ]) t -> bool
  val add     : 'a -> ('a, [ `S ]) t -> ('a, [ `S ]) t
  val union   : ('a, [ `S ]) t -> ('a, [ `S ]) t -> ('a, [ `S ]) t
  val inter   : ('a, [ `S ]) t -> ('a, [ `S ]) t -> ('a, [ `S ]) t
  val append  : ('a, 'b) t -> ('a, 'c) t -> ('a, [ `U ]) t
end = struct
  type ('a, 'b) t = 'a list

  let of_list l   = List.sort compare l
  and as_set  l   = List.sort compare l
  and empty       = []
  let union   l r = (* … linear merge … *)
  and inter   l r = (* … linear merge … *)
  let mem     e   = List.mem e
  and add     e   = union [e]
  and append      = List.append

The phantom type [ `S | `U ] tracks the sortedness of the list. Note that in the case of append the argument lists can have any ordering but the result is known to be unsorted. Note also how the fact that the empty list is by definition sorted is directly reflected in the type.


Theorems for Free: The Monad Edition

This is for the record, since the derivations took me a while and I'd rather not lose them.

A functor is the signature:

module type FUNCTOR = sig
  type 'a t
  val fmap : ('a -> 'b) -> ('a t -> 'b t)

satisfying the following laws:

Identity:    fmap id      ≡ id
Composition: fmap (f ∘ g) ≡ fmap f ∘ fmap g

An applicative structure or idiom is the signature:

module type APPLICATIVE = sig
  type 'a t
  val pure : 'a -> 'a t
  val ap : ('a -> 'b) t -> ('a t -> 'b t)

satisfying the following laws:

Identity:     ap (pure id)                  ≡ id
Composition:  ap (ap (ap (pure (∘)) u) v) w ≡ ap u (ap v w)
Homomorphism: ap (pure f) ∘ pure            ≡ pure ∘ f
Interchange:  ap u (pure x)                 ≡ ap (pure (λf.f x)) u

An applicative functor is the structure:

module type APPLICATIVE_FUNCTOR = sig
  type 'a t
  include FUNCTOR     with type 'a t := 'a t
  include APPLICATIVE with type 'a t := 'a t

that is simultaneously a functor and an applicative structure, satisfying the additional law:

Fmap: fmap ≡ ap ∘ pure

A monad is the structure:

module type MONAD = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : ('a -> 'b t) -> ('a t -> 'b t)

satisfying the following laws:

Right unit:    bind return     ≡ id
Left unit:     bind f ∘ return ≡ f
Associativity: bind f ∘ bind g ≡ bind (bind f ∘ g)

Every monad is an applicative functor:

module ApplicativeFunctor (M : MONAD)
: APPLICATIVE_FUNCTOR with type 'a t = 'a M.t
= struct
  type 'a t = 'a M.t
  open M
  let fmap f = bind (fun x -> return (f x))
  let pure   = return
  let ap u v = bind (fun f -> fmap f v) u

This can be proved by easy (but tedious) equational reasoning. That the proof is rigorous is Wadler's celebrated result.

Proof of the Functor Identity:

  fmap id
≡ { definition }
  bind (return ∘ id)
≡ { composition }
  bind return
≡ { Monad Right unit }

Proof of the Functor Composition:

  fmap f ∘ fmap g
≡ { definition }
  bind (return ∘ f) ∘ bind (return ∘ g)
≡ { Monad Associativity }
  bind (bind (return ∘ f) ∘ return ∘ g)
≡ { Monad Left unit }
  bind (return ∘ f ∘ g)
≡ { definition }
  fmap (f ∘ g)

A number of naturality conditions are simple equations between λ-terms. I'll need these later:

Lemma 1 (Yoneda):

  fmap f ∘ (λh. fmap h x)
≡ { defn. ∘, β-reduction }
  λg. fmap f (fmap g x)
≡ { defn. ∘ }
  λg. (fmap f ∘ fmap g) x
≡ { Functor Composition }
  λg. fmap (f ∘ g) x
≡ { abstract }
  λg. (λh. fmap h x) (f ∘ g)
≡ { defn. ∘, η-contraction }
  (λh. fmap h x) ∘ (f ∘)

Lemma 2:

  fmap f ∘ return
≡ { definition }
  bind (return ∘ f) ∘ return
≡ { Monad Left unit }
  return ∘ f

Lemma 3:

  bind f ∘ fmap g
≡ { definition fmap }
  bind f ∘ bind (return ∘ g)
≡ { Monad Associativity }
  bind (bind f ∘ return ∘ g)
≡ { Monad Left unit }
  bind (f ∘ g)

Lemma 4:

  bind (fmap f ∘ g)
≡ { definition fmap }
  bind (bind (return ∘ f) ∘ g)
≡ { Monad Associativity }
  bind (return ∘ f) ∘ bind g
≡ { definition fmap }
  fmap f ∘ bind g

The Applicative Functor condition is easy to prove and comes in as a handy lemma:

  ap ∘ pure
≡ { definition }
  λv. bind (λf. fmap f v) ∘ return
≡ { Monad Left unit }
  λv. λf. fmap f v
≡ { η-contraction }

Proof of the Applicative Identity:

  ap (pure id)
≡ { Applicative Functor }
  fmap id
≡ { Functor Identity }

Proof of the Applicative Homomorphism:

  ap (pure f) ∘ pure
≡ { Applicative Functor }
  fmap f ∘ pure
≡ { Lemma 2, defn. pure }
  pure ∘ f

Proof of the Applicative Interchange:

  ap (pure (λf.f x)) u
≡ { Applicative Functor }
  fmap (λf.f x) u
≡ { definition }
  bind (return ∘ (λf.f x)) u
≡ { defn. ∘, β-reduction }
  bind (λf. return (f x)) u
≡ { Lemma 2 }
  bind (λf. fmap f (return x)) u
≡ { definition }
  ap u (pure x)

The proof of the Applicative Composition condition is the least straightforward of the lot, as it requires ingenuity to choose the reduction to apply at each step. I started with a long, tedious derivation that required forward and backward reasoning; at the end I refactored it in byte-sized lemmas in order to simplify it as much as I could. As a heuristic, I always try to start from the most complicated expression to avoid having to guess where and what to abstract (that is, applying elimination rules requires neatness, while applying introduction rules requires backtracking):

  ap (ap (ap (pure (∘)) u) v) w
≡ { Applicative Functor }
  ap (ap (fmap (∘) u) v) w
≡ { definition }
  bind (λf. fmap f w) (bind (λf. fmap f v) (fmap (∘) u))
≡ { Lemma 3 with f := λf. fmap f v, g := (∘) }
  bind (λf. fmap f w) (bind ((λf. fmap f v) ∘ (∘)) u)
≡ { Monad Associativity with f := λf. fmap f w, g := (λf. fmap f v) ∘ (∘) }
  bind (bind (λf. fmap f w) ∘ (λf. fmap f v) ∘ (∘)) u
≡ { defn. ∘ (at right) }
  bind (λf. (bind (λf. fmap f w) ∘ (λf. fmap f v)) (f ∘)) u
≡ { defn. ∘, β-reduction }
  bind (λf. bind (λf. fmap f w) (fmap (f ∘) v)) u
≡ { Lemma 3 with f := λf. fmap f w, g := (f ∘) }
  bind (λf. bind ((λf. fmap f w) ∘ (f ∘)) v) u
≡ { Lemma 1 }
  bind (λf. bind (fmap f ∘ (λf. fmap f w)) v) u
≡ { Lemma 4 with g := λf. fmap f w }
  bind (λf. fmap f (bind (λf. fmap f w) v)) u
≡ { definition }
  ap u (ap v w)

What is this good for? I don't really know; Haskellers can't seem to be able to live without them while I can't seem to justify their application. I suspect laziness has a lot to do with it; in any case, the machinery is more palatable with the appropriate combinators:

module Functor (F : FUNCTOR) = struct
  include F
  let ( <$> ) = fmap

module Applicative (A : APPLICATIVE) = struct
  include A
  let ( <*> ) = ap

module Monad (M : MONAD) = struct
  include M
  include (ApplicativeFunctor (M)
         : APPLICATIVE_FUNCTOR with type 'a t := 'a t)
  let ( <$> )     = fmap
  let ( <*> )     = ap
  let ( >>= ) m f = bind f m


An Odd Lemma

While proving that every monad is an applicative functor, I extracted the following derivation as a lemma:

  fmap f ∘ (λh. fmap h x)
≡ { defn. ∘, β-reduction }
  λg. fmap f (fmap g x)
≡ { defn. ∘ }
  λg. (fmap f ∘ fmap g) x
≡ { Functor }
  λg. fmap (f ∘ g) x
≡ { abstract f ∘ g }
  λg. (λh. fmap h x) (f ∘ g)
≡ { defn. ∘, η-contraction }
  (λh. fmap h x) ∘ (f ∘)

for all f, x. This is the Yoneda Lemma in a special form. The term λh. fmap h x is the natural transformation between an arbitrary functor F and the functor Hom(X, —), where X is fixed by the type of x. Dan Piponi calls it the hardest trivial thing in mathematics. I didn't recognize it immediately (my category-fu is nonexistent), but the striking symmetry made me curious and I started investigating.