You don't have to be writing an interpreter or some other kind of abstract code to profit from some phantom types. Suppose you have two or more functions that work by "cooking" a simple value (a `float`

, say) with a lengthy computation before proceeding:

let sun_geometric_longitude j = let je = to_jcen (to_jde j) in (* … computation withje… *) let sun_apparent_longitude j = let je = to_jcen (to_jde j) in let q = sun_geometric_longitude j in (* … computation withje… *)

In this case `j` is a date expressed in Julian Days as a `float`

, and `to_jde`

computes the Ephemeris Time as a 63-term trigonometric polynomial correction on it. `sun_apparent_longitude`

calls `sun_geometric_longitude`

and both call `to_jde`

. Obviously this unnecessary duplication can be factored out:

let sun_geometric_longitude je = let t = to_jcen je in (* … computation withje… *) let sun_apparent_longitude je = let q = sun_geometric_longitude je in let t = to_jcen je in (* … computation withje… *) let foo j = let je = to_jde j in let l = sun_apparent_longitude je in (* … *)

(`to_jcen`

is cheap and not worth factoring out.) But now a naked `float`

represent two different things, Universal Time and Ephemeris Time, and we have a valid concern of mixing them up. We can wrap the time in an ADT:

type dt = UT of float | TD of float let to_jcen j = match j with | UT j -> (* … lengthy computation … *) TD j | TD _ -> invalid_arg "to_jcen" let sun_geometric_longitude je = match je with | TD je -> let t = to_jcen je in (* … computation withje… *) | UT _ -> invalid_arg "sun_geometric_longitude" let sun_apparent_longitude je = match je with | TD je -> let q = sun_geometric_longitude je in let t = to_jcen je in (* … computation withje… *) | UT _ -> invalid_arg "sun_apparent_longitude" let foo j = let je = to_jde j in (* … computation with sun_apparent_longitudeje… *)

but this forces us to check *at run-time* whether we mixed times up in our code. A better technique is to use a phantom type. First fix an abstract signature for the module implementing these functions:

module Test : sig type 'a dt val datetime : yy:int -> mm:int -> dd:int -> hh:int -> nn:int -> ss:float -> [ `JD ] dt val to_jde : [ `JD ] dt -> [ `JDE ] dt val to_jcen : 'a dt -> float val sun_geometric_longitude : [ `JDE ] dt -> float val sun_apparent_longitude : [` JDE ] dt -> float end = struct (* … *)

We have a way to construct our type

from a Universal `α` dt`datetime`

, a way to convert it to Dynamical Time with `to_jde`

and operations that respect the kind of measure. The implementation is as before:

(* … *) type 'a dt = float (* phantom type! *) let datetime ~yy ~mm ~dd ~hh ~nn ~ss = (* … *) let to_jde j = (* … *) let to_jcen j = (* … *) let sun_geometric_longitude je = (* … computation with a statically checkedje… *) let sun_apparent_longitude je = let q = sun_geometric_longitude je in (* … computation with a statically checkedje… *) end

Now the *compiler* checks for us that we don't mix up measures. The only inconvenient of this approach is that the type

is fully abstract, and you must provide coercions, `α` dt`string_of`

s and pretty printers for it if you need to show them or debug your code. There is a way out, though; just make it a `private`

type abbreviation:

module Test : sig type 'a dt = private float (* … signature exactly as before … *) end = struct type 'a dt = float (* phantom type! *) (* … implementation exactly as before … *) end

Now

will be shown in the top-level, can be printed with a coercion `α` dt`(`

, etc.`je` :> float)

For another simple example, suppose you want to represent sets as `list`

s. The best way to do that is to keep them sorted so that set operations run in linear time. If you want to provide some operations that temporarily destroy the ordering, a phantom type can keep track of the invariant "this `list`

is sorted":

module Set : sig type ('a, 'b) t = private 'a list val of_list : 'a list -> ('a, [ `S ]) t val as_set : ('a, [ `U ]) t -> ('a, [ `S ]) t val empty : ('a, [ `S ]) t val mem : 'a -> ('a, [ `S ]) t -> bool val add : 'a -> ('a, [ `S ]) t -> ('a, [ `S ]) t val union : ('a, [ `S ]) t -> ('a, [ `S ]) t -> ('a, [ `S ]) t val inter : ('a, [ `S ]) t -> ('a, [ `S ]) t -> ('a, [ `S ]) t val append : ('a, 'b) t -> ('a, 'c) t -> ('a, [ `U ]) t end = struct type ('a, 'b) t = 'a list let of_list l = List.sort compare l and as_set l = List.sort compare l and empty = [] let union l r = (* … linear merge … *) and inter l r = (* … linear merge … *) let mem e = List.mem e and add e = union [e] and append = List.append end

The phantom type `[ `S | `U ]`

tracks the sortedness of the `list`

. Note that in the case of `append`

the argument lists can have any ordering but the result is known to be unsorted. Note also how the fact that the empty `list`

is by definition sorted is directly reflected in the type.

## 2 comments:

I'm not alone !

I'm writing a 10k sloc commercial application on top on OcamlNet, which get and send JSon data, using Atdgen to serialize/deserialize.

OcamlNet isn't simple, it's tree, but it'as great framework which scale transparently. By using a Memcache for global memory, my app can scale from 100 request to 1 million easily.

Pierre-Alexandre, methinks you commented in the wrong entry… maybe you're commenting to this one?

Post a Comment