You don't have to be writing an interpreter or some other kind of abstract code to profit from some phantom types. Suppose you have two or more functions that work by "cooking" a simple value (a float
, say) with a lengthy computation before proceeding:
let sun_geometric_longitude j = let je = to_jcen (to_jde j) in (* … computation with je … *) let sun_apparent_longitude j = let je = to_jcen (to_jde j) in let q = sun_geometric_longitude j in (* … computation with je … *)
In this case j is a date expressed in Julian Days as a float
, and to_jde
computes the Ephemeris Time as a 63-term trigonometric polynomial correction on it. sun_apparent_longitude
calls sun_geometric_longitude
and both call to_jde
. Obviously this unnecessary duplication can be factored out:
let sun_geometric_longitude je = let t = to_jcen je in (* … computation with je … *) let sun_apparent_longitude je = let q = sun_geometric_longitude je in let t = to_jcen je in (* … computation with je … *) let foo j = let je = to_jde j in let l = sun_apparent_longitude je in (* … *)
(to_jcen
is cheap and not worth factoring out.) But now a naked float
represent two different things, Universal Time and Ephemeris Time, and we have a valid concern of mixing them up. We can wrap the time in an ADT:
type dt = UT of float | TD of float let to_jcen j = match j with | UT j -> (* … lengthy computation … *) TD j | TD _ -> invalid_arg "to_jcen" let sun_geometric_longitude je = match je with | TD je -> let t = to_jcen je in (* … computation with je … *) | UT _ -> invalid_arg "sun_geometric_longitude" let sun_apparent_longitude je = match je with | TD je -> let q = sun_geometric_longitude je in let t = to_jcen je in (* … computation with je … *) | UT _ -> invalid_arg "sun_apparent_longitude" let foo j = let je = to_jde j in (* … computation with sun_apparent_longitude je … *)
but this forces us to check at run-time whether we mixed times up in our code. A better technique is to use a phantom type. First fix an abstract signature for the module implementing these functions:
module Test : sig type 'a dt val datetime : yy:int -> mm:int -> dd:int -> hh:int -> nn:int -> ss:float -> [ `JD ] dt val to_jde : [ `JD ] dt -> [ `JDE ] dt val to_jcen : 'a dt -> float val sun_geometric_longitude : [ `JDE ] dt -> float val sun_apparent_longitude : [` JDE ] dt -> float end = struct (* … *)
We have a way to construct our type α dt
from a Universal datetime
, a way to convert it to Dynamical Time with to_jde
and operations that respect the kind of measure. The implementation is as before:
(* … *) type 'a dt = float (* phantom type! *) let datetime ~yy ~mm ~dd ~hh ~nn ~ss = (* … *) let to_jde j = (* … *) let to_jcen j = (* … *) let sun_geometric_longitude je = (* … computation with a statically checked je … *) let sun_apparent_longitude je = let q = sun_geometric_longitude je in (* … computation with a statically checked je … *) end
Now the compiler checks for us that we don't mix up measures. The only inconvenient of this approach is that the type α dt
is fully abstract, and you must provide coercions, string_of
s and pretty printers for it if you need to show them or debug your code. There is a way out, though; just make it a private
type abbreviation:
module Test : sig type 'a dt = private float (* … signature exactly as before … *) end = struct type 'a dt = float (* phantom type! *) (* … implementation exactly as before … *) end
Now α dt
will be shown in the top-level, can be printed with a coercion (je :> float)
, etc.
For another simple example, suppose you want to represent sets as list
s. The best way to do that is to keep them sorted so that set operations run in linear time. If you want to provide some operations that temporarily destroy the ordering, a phantom type can keep track of the invariant "this list
is sorted":
module Set : sig type ('a, 'b) t = private 'a list val of_list : 'a list -> ('a, [ `S ]) t val as_set : ('a, [ `U ]) t -> ('a, [ `S ]) t val empty : ('a, [ `S ]) t val mem : 'a -> ('a, [ `S ]) t -> bool val add : 'a -> ('a, [ `S ]) t -> ('a, [ `S ]) t val union : ('a, [ `S ]) t -> ('a, [ `S ]) t -> ('a, [ `S ]) t val inter : ('a, [ `S ]) t -> ('a, [ `S ]) t -> ('a, [ `S ]) t val append : ('a, 'b) t -> ('a, 'c) t -> ('a, [ `U ]) t end = struct type ('a, 'b) t = 'a list let of_list l = List.sort compare l and as_set l = List.sort compare l and empty = [] let union l r = (* … linear merge … *) and inter l r = (* … linear merge … *) let mem e = List.mem e and add e = union [e] and append = List.append end
The phantom type [ `S | `U ]
tracks the sortedness of the list
. Note that in the case of append
the argument lists can have any ordering but the result is known to be unsorted. Note also how the fact that the empty list
is by definition sorted is directly reflected in the type.
2 comments:
I'm not alone !
I'm writing a 10k sloc commercial application on top on OcamlNet, which get and send JSon data, using Atdgen to serialize/deserialize.
OcamlNet isn't simple, it's tree, but it'as great framework which scale transparently. By using a Memcache for global memory, my app can scale from 100 request to 1 million easily.
Pierre-Alexandre, methinks you commented in the wrong entry… maybe you're commenting to this one?
Post a Comment