2008-08-15

Monads are Plug-ins

As Haskell gains popularity, a dozen articles or more a month about them make the front page at the Programming reddit. Most deal with the necessity of using and understanding monads to control side-effects in Haskell. This is only natural and to be expected since, in Haskell, they're a language design decision, and as such as inescapable as laziness and as structural as the predefined operator precedence.

My background is in OCaml (gratuitous biographical aside: I found a notebook with OCaml code I wrote in '96), and so I view the IO and attendant monads with some disdain. Let me explain myself: my view is that using them to control side-effects is a rather pedestrian application of monads.

There's another, much higher-level view of them, and that is monads as categorical constructions. Category theory, by its very nature as a mathematical framework, is a language in which to express relations and operations between things. It's the mathematical equivalent of the GoF's patterns. So, reading an article like this one left me with the slightly uneasy feeling of having simultaneously not understood much of it (because I don't know Haskell, and because the article has no context to speak of) and knowing that an isomorphism between the IO monad and anything else cannot allow you to "weave in and out of using IO": freeness demands that the "other" side of the isomorphism contain, encode or fake enough of IO so as to be undistinguishable from it (the type of IO' as presented in the article lent me evidence of this). This is where category theory comes handy: understanding by pattern-matching.

The practical upshot of this is that it enables a third view of monads, Moggi's, as capsules of meaning that are, if not composable, certainly pluggable. First in Notions of computation and monads, then in Monads and Effects (whence IO and friends), Moggi shows that monads are the right tool for the (hard) job of making your code independent of, or at least parameterizable by, semantics. Monads are plug-ins.

In what way are monads plug-ins? Well, a plug-in must have two things: specific behavior that you can invoke to augment your program in unforeseen ways even after you've written it, and a standard interface by which the program can call it. Monads have a rather minimalist interface:

module type MONAD = sig
  type 'a m
  val return : 'a -> 'a m
  val bind   : 'a m -> ('a -> 'b m) -> 'b m
end

First comes a type, α m: this is the "protocol" by which your program and the plug-in communicate without one having to "know too much" (or more than the standardized interface allows) about the other. It lets the monad say, in effect: "for me to be arbitrary but compatible, you'll have to accept that I do things my way". The monad gets to have its "private space" in which to do its job, without fear that the program will mess up its stuff.

Then comes the method return, that the program uses to pass values down to the monad. It lets the code say, "here, have this value for your own use". Finally, there's the operation bind that lets the program and the monad maintain a conversation. With it, the program can say to the monad "oh, when you are done with anything, here's another thing I need you to do", to which the monad responds "OK, I'll take that and give you back the result". This "wiring" aspect to the bind is emphasized in it being usually called >>=, an infix operator.

This is all nice, operational and anthropomorphizing, but hardly illuminating. The fact that the monad functions as a protocol is formalized in the monad laws which every concrete, "plug-in" monad must abide to:

  1. bind (return a) k  ≡  k a
  2. bind m return  ≡  m
  3. bind (bind m k) h  ≡  bind m (fun x -> bind (k x) h)

The first law states that return on the left of bind does nothing. The second law states that return on the right of bind does nothing. The third law means that a bind of binds associates freely. But that's exactly what a monoid is: a "thing" with an associative operation that has a neutral, "do-nothing" element on the left and right of it.

Consider by way of example the concatenation of strings, denoted by ^. There is a special string, "" or the empty string, that "doesn't change" a string when catenated to it: for all strings s, "" ^ s is the same as s, and so is s ^ "". Also, it doesn't matter which string you concatenate before and which after; it's the order in which you concatenate that matters. In symbols, s ^ (t ^ u) is the same as (s ^ t) ^ u.

You might think, what have monoids to do with monads? The answer is that monads "behave" algebraically as monoids. You might think next, aren't monads monoids, then? Well, no. The crucial difference between a monoid and a monad is that, packed inside the return and the bind operations of a monad there is meaning tucked in. It could send output to a console. It could update a memory location. It could try simultaneously different possible computations and return any, or all of them. It could record transactional data. The key is that the monad laws are a contract that allow you, no matter how complicated the "stuff" the monad does underneath, operate with it in a structured, predictable way. Arbitrary behavior under a fixed interface. A plug-in, in short.

If you followed me so far you might have a nagging doubt by now. How do you "escape" the monad, once you use it? That is, return serves to put plain old values in a "wrapper" the monad can use but, how can you get back what the monad did with it? Where's the function with type α m → α? The bad news is that there isn't any… in general. That is, there is no guaranteed way to get back a value from an arbitrary monad. However, many useful monads do provide an escape to "retrieve" the useful work the monad did.

At this point I should show a concrete monad. A classical example (one that I'll eventually need to use) is the state monad. It allows to "plug-in" statefulness to an otherwise "pure" or "stateless" function:

module State (S : sig type t end) = struct
  type 'a m = St of (S.t -> 'a * S.t)
  let return a = St (fun s -> (a, s))
  let bind (St m) f = St (fun s ->
    let (x, s') = m s in
    let (St m') = f x in
    m' s')

  let get = St (fun s -> (s, s))
  let put = fun s -> St (fun _ -> ((), s))

  let eval (St m) = fun s -> fst (m s)
  let run  (St m) = fun s -> snd (m s)
end

The innards are perhaps not very illuminating on a first reading, so let me show its type:

module State : functor (S : sig type t end) -> sig
  type 'a m = St of (S.t -> 'a * S.t)
  val return : 'a -> 'a m
  val bind   : 'a m -> ('a -> 'b m) -> 'b m
  val get    : S.t m
  val put    : S.t -> unit m
  val eval   : 'a m -> S.t -> 'a
  val run    : 'a m -> S.t -> S.t
end

This monad is multi-parametric, in the sense that not only the values your function computes are arbitrary (the α in the monad's type), but the states you might want to track are too. Suppose you want to tally the arithmetic operations a function uses to calculate a value. The state can be as simple as a single integer, or as complex as a tuple recording separate counts for additions and subtractions, multiplications, divisions, square roots and so on. Whichever the case, the key to this monad is that an impure function can be split into a pure part and a state transformer part that takes an input state and returns an output state together with the value of the function. With that, it's not difficult to see that if return is to be a neutral element, it must pass its input state unchanged. In other words, the effect of a value is nil. In yet other words, a value is always pure. Also, bind must keep track of intermediate states between effects, and thread them appropriately.

Immutable state is no state at all but just another value, so this monad comes with a pair of operations get and put that allow your program to retrieve the state at some point of your function, and modify it. Also, after your computation terminates, you should be able to retrieve not only its final value with eval but also to access the final state with run. This variant assumes you'll want one or the other; it is possible to get both at once without having to re-compute the function.

Note that get is a monadic value, not a function. This might not make sense unless you look into the monad and see how it is constructed. For a less operational view, think of get as representing, by itself, whatever the state is at that point in the program. Since it's a monadic value, the only useful thing to do with it is to bind it to a function:

module St = State(struct type t = int end)

let incr = St.bind St.get (fun s -> St.put (succ s))

With that, it's simple to write a monadic addition operator that counts how many times it is invoked:

let ( +! ) mx my =
  St.bind mx   (fun x ->
  St.bind my   (fun y ->
  St.bind incr (fun _ -> St.return (x + y))))

And using it is equally straightforward:

let test () =
  let msum l = List.fold_right (fun x m -> St.return x +! m) l (St.return 0)
  in St.run (msum [1;2;3;4]) 0

This is all quite tedious to repeat each time I need a monad. Fortunately, most of the operations you can build out of monadic bind and monadic return are structural, and so generic on the particular monad. Enter the Monad functor:

module Monad (M : MONAD) = struct
  include M

  let seq m f = bind m (fun _ -> f)

  let join m = bind m (fun m -> m)

  let fmap f m = bind m (fun x -> return (f x))

  let liftm = fmap

  let liftm2 f m m' =
    bind m  (fun x ->
    bind m' (fun y ->
    return (f x y)))

  let liftm3 f m m' m'' =
    bind m   (fun x ->
    bind m'  (fun y ->
    bind m'' (fun z ->
    return (f x y z))))

  let mapm f l =
    List.fold_right (liftm2 (fun x xs -> f x :: xs)) l (return [])

  let sequence l = mapm (fun x -> x) l

  let mapm_ f l =
    List.fold_right (fun x -> seq (f x)) l (return ())

  let sequence_ l = mapm_ (fun x -> x) l

  module Op = struct
    let ( >>= ) = bind
    let ( >>  ) = seq
  end
end

Again, looking at the types is more illuminating than squinting at the implementation:

module Monad : functor (M : MONAD) -> sig
  type 'a m = 'a M.m
  val return    : 'a -> 'a m
  val bind      : 'a m -> ('a -> 'b m) -> 'b m
  val seq       : 'a m -> 'b m -> 'b m
  val join      : 'a m m -> 'a m
  val fmap      : ('a -> 'b) -> 'a m -> 'b m
  val liftm     : ('a -> 'b) -> 'a m -> 'b m
  val liftm2    : ('a -> 'b -> 'c) -> 'a m -> 'b m -> 'c m
  val liftm3    : ('a -> 'b -> 'c -> 'd) -> 'a m -> 'b m -> 'c m -> 'd m
  val mapm      : ('a -> 'b) -> 'a m list -> 'b list m
  val sequence  : 'a m list -> 'a list m
  val mapm_     : ('a -> 'b m) -> 'a list -> unit m
  val sequence_ : 'a m list -> unit m
  module Op : sig
    val ( >>= ) : 'a m -> ('a -> 'b m) -> 'b m
    val ( >> )  : 'a m -> 'b m -> 'b m
  end
end

seq allows you to invoke the monad on the left not for its value but for its effect. join lets you "squash down" a monadic monad to the underlying monad; the fact that it is generic and independent of the underlying semantics means that the monad itself, viewed as a categorical functor, is idempotent, that is, it's useless to monadize a monad. fmap "maps" a function inside a monad, which effectively makes a functor out of a monad. liftmn is an indexed family of mappers for functions of one, two… arguments, I've stopped at three but you could go on as you need.

A list of monads can be composed together to turn them "inside out" into a monad of lists. That is what mapm and sequence do, commuting the monad and the list, with or without transforming the list's elements first. Also, a list of monads can be evaluated in sequence purely by their side effects, that is where the mapm_ and sequence_ come in handy (the names of these functions are exactly the same as that of their Haskell counterparts).

Lastly, the two very common operations return and seq have very convenient operators associated with them. With all this, I can write the following:

module StM = Monad(St)
open StM.Op

let count2 op x y = incr >> StM.liftm2 op x y

let ( +! ) = count2 ( + )
and ( -! ) = count2 ( - )
(* and so on *)

This is the big payoff monads offer: functions can be generic on the structure of the monad, and so independent of its semantics. Monads are plug-ins.

Let me try with another very different monad, a very simple Out:

module Out = struct
  type 'a m = Out of (out_channel -> 'a)
  let return a = Out (fun _ -> a)
  let bind (Out m) f = Out (fun out ->
      let x = m out in
      let (Out m') = f x in
      m' out)

  let fmt fmt =
    let ship s = Out (fun out ->
      output_string out s; output_char out '\n')
    in Printf.kprintf ship fmt

  let write name (Out m) =
    let out = open_out_bin name in
    try let x = m out in close_out out; x
    with e -> close_out out; raise e
end

This monad interleaves formatted writing operations, fmt, as a side effect of whatever the code using it does. What is special of this monad is that the "escape" function, write ships out all the accumulated shows to a named output file, safely closing it even in the event of an exception.

With this, I'm ready to show you how to actually consume plug-ins from your program. Suppose you want to have a simple graphics DSL. I have a program with exactly such a need, and as a MacOS X user, the easiest way I found to satisfy it was to generate EPS and to use /usr/bin/open to handle the data for me. But I didn't want to encapsulate (!) my drawing code inside a PostScript interface, but needed it to be as "abstract" as possible (my original back-end was a Mathematica one). I settled on a simple hybrid:

type point = float * float

type rgbcolor = float * float * float

module type GRAP = sig
  module M : MONAD
  type t = unit M.m
  val weight : float -> t
  val gray   : float -> t
  val color  : rgbcolor -> t
  val save   : t -> t
  val dot    : point -> t
  val line   : point -> point -> t
  val rect   : ?fill:bool -> point -> point -> t
  val poly   : ?fill: bool -> point list -> t
end

The module is not parameterized by a monad, but it does include a monad that will provide the "glue" to whatever back-end I want to write. As you can see, it is rather primitive but includes operations for drawing in color. It is worth mentioning that this is not a so-called retained interface, but an immediate one, and so save runs the drawing operations on a "sub-context" that is restored when all the commands are shipped out. Also, the type that carries the drawing operations is an alias for the unit monadic value, as drawing doesn't really have a value. My first implementation was the EPS back-end:

let mm = ( *. ) (72. /. 25.4)

module Eps = struct
  module M = Monad(Out)
  type t = unit M.m
  open Out
  open M.Op

In this case, all drawing is to be output to a file, so the monad that implements the plug-in driver is Out. The code uses fmt extensively to format the PostScript commands. For instance, an EPS file must specify a bounding box against which all drawing will be clipped to:

  let iso_time () =
    let tm = Unix.gmtime (Unix.time ()) in
    Printf.sprintf "%04d%02d%02dT%02d%02d%02d"
      (tm.Unix.tm_year + 1900) (tm.Unix.tm_mon + 1) tm.Unix.tm_mday
      tm.Unix.tm_hour tm.Unix.tm_min tm.Unix.tm_sec

  let eps margin width height drawing =
    let app_name = Filename.basename Sys.executable_name in
       fmt "%%!PS-Adobe-3.0 EPSF-3.0"
    >> fmt "%%%%BoundingBox: 0 0 %d %d"
      (truncate (ceil (width  +. 2. *. margin)))
      (truncate (ceil (height +. 2. *. margin)))
    >> fmt "%%%%Creator: %s" app_name
    >> fmt "%%%%CreationDate: %s" (iso_time ())
    >> fmt "%%%%DocumentData: Clean7Bit"
    >> fmt "%%%%EndComments"
    >> drawing
    >> fmt "showpage"
    >> fmt "%%%%EOF"

The monad is used purely for its side-effects, namely writing to a file, so I don't bind, I always seq. After the drawing is complete, I actually want to see it on screen; for that, I write a temporary EPS file that I pass to /usr/bin/open, which in turn opens Preview, which converts the file to PDF and shows it:

  let show ?(margin=0.) width height f =
    let name = Filename.temp_file "grap" ".eps" in
    write name (eps margin width height f);
    ignore (Unix.system (Printf.sprintf "/usr/bin/open %s" name))

Indirect, but a showcase of the power of UNIX at work. The drawing operations themselves are pretty straightforward:

  let weight w        = fmt "%g setlinewidth" w
  and gray   w        = fmt "%g setgray" w
  and color (r, g, b) = fmt "%g %g %g setrgbcolor" r g b

  let save f = fmt "gsave" >> f >> fmt "grestore"

  let moveto (x, y) = fmt "%g %g moveto" x y
  let lineto (x, y) = fmt "%g %g lineto" x y
  let path f = fmt "newpath" >> f >> fmt "closepath"
  let draw   = fmt "stroke"
  let paint  = fmt "fill"

As is save, path is a higher-order function that constructs a complete path with the drawing commands. This comes handy:

  let dot (x, y) =
    path (fmt "%g %g currentlinewidth 1.5 mul 0 360 arc" x y) >> paint

  let line p q =
    fmt "newpath" >> moveto p >> lineto q >> draw

  let rect ?(fill=false) (x, y) (x', y') =
    let w, h = x' -. x, y' -. y in
    fmt "%g %g %g %g rect%s"
      x y w h (if fill then "fill" else "stroke")

Polygons are built out of a sequence of points. The first one is the starting point and must be passed to moveto, the rest are to be arguments of linetos. A mapping would do, except for the fact that the code is in monadic style, and must use mapm_:

  let polyline closing l = match l with
  | []      -> return ()
  | p :: ps -> path (moveto p >> M.mapm_ lineto ps) >> closing

  let poly ?(fill=false) = polyline (if fill then paint else draw)
end

Note also that I avoid invoking graphics primitives if the polygon is empty. With this, a simple example works as expected:

let () =
  let module M = struct
    open Eps
    open Eps.M.Op
    let draw p = show 100. 100. (poly p >> (M.mapm_ dot (List.tl p)))
  end in M.draw (List.map (fun i ->
    let a = float i *. 2. *. pi /. 10. in
    let c, s = cos a, sin a in
    (50. +. 20. *. c, 50. +. 20. *. s)
  ) (iota 11))

Success! A decagon with fat vertices. The only problem with an EPS back-end is that it is difficult to get the optimal width and height for an image. In other words, it would be ideal to have some auto-scaling. For that, I have to know the size of a drawing before turning it into PostScript code. Enter the Bounds back-end:

module Bounds = struct
  type bounds = { top : float; left : float; bottom : float; right : float; }

  let empty =
    { top = infinity; left = infinity;
      bottom = neg_infinity; right = neg_infinity; }

I keep track of the bounding box of the commands seen by the interface, as a rectangle given by two pairs of coordinates. The initial, empty bounds are such that I can accumulate a point simply by a series of minimum and maximum operations:

  let add b (x, y) =
    { top    = min y b.top;
      left   = min x b.left;
      bottom = max y b.bottom;
      right  = max x b.right; }

Now the commands will have to preserve the bounding box seen so far. This is where the State monad comes in:

  module St = State(struct type t = bounds end)

  module M = Monad(St)
  type t = unit M.m
  open M.Op

A command that doesn't paint to the output area will be a nop. A command that adds a point p updates the bounds with that point and changes the state accordingly (cf with incr above):

  let nop      = M.return ()
  and draw   p = St.get >>= fun b -> St.put (add b p)

Again, I'm not interested in the final value of the drawing, only in the final bounding box, that is, the state after a series of commands are applied to the empty bounding box:

  let bounds f = St.run f empty

The commands themselves are trivial: those that merely alter the graphics state are nops, those that draw accumulate points on the bounding box:

  let weight       _ = nop
  and gray         _ = nop
  and color        _ = nop
  and save         f = f
  and dot          p = draw p
  and line       p q = draw p >> draw q
  and rect ?fill p q = draw p >> draw q
  and poly ?fill   l = M.mapm_ draw l
end

Now the previous code runs almost unmodified, except that the call to Eps.draw is replaced by a call to Bounds.bounds:

let b =
  let module M = struct
    open Bounds
    open Bounds.M.Op
    let draw p = bounds (poly p >> (M.mapm_ dot (List.tl p)))
  end in M.draw (List.map (fun i ->
    let a = float i *. 2. *. pi /. 10. in
    let c, s = cos a, sin a in
    (50. +. 20. *. c, 50. +. 20. *. s)
  ) (iota 11))

Which results in:

val b : Bounds.bounds =
  {Bounds.top = 30.9788696740969272; Bounds.left = 30.;
   Bounds.bottom = 69.0211303259030728; Bounds.right = 70.}

This allows me to scale up the drawing to fit the boundaries of the EPS file, or to adjust the size of the EPS file so that no space is wasted. The important thing is that the code that contains the drawing commands is left unchanged. Monads are plug-ins.

Next, I'll show you pretty pictures:

4 comments:

Anonymous said...

I read this bit...

Then comes the method return, that the program uses to pass values down to the monad. It lets the code say, "here, have this value for your own use". Finally, there's the operation bind that lets the program and the monad maintain a conversation. With it, the program can say to the monad "oh, when you are done with anything, here's another thing I need you to do", to which the monad responds "OK, I'll take that and give you back the result". This "wiring" aspect to the bind is emphasized in it being usually called >>=, an infix operator.

...and had a bit of an "aha" moment. It seems like a monad (by this description - I'm no expert) is like a fork or a pipe. You open up a pipe to another process and you can communicate with it via it's handle. Is that a good analogy?

Anonymous said...

> Is that a good analogy?

Yes

Paolo said...

Thanks for this excellent post!

Jon Harrop said...

This is a really wonderful article! Thank you.

Cheers,
Jon Harrop.