2012-07-19

Theorems for Free: The Monad Edition

This is for the record, since the derivations took me a while and I'd rather not lose them.

A functor is a structure with the signature:

module type FUNCTOR = sig
  type 'a t
  val fmap : ('a -> 'b) -> ('a t -> 'b t)
end

satisfying the following laws:

Identity:    fmap id      ≡ id
Composition: fmap (f ∘ g) ≡ fmap f ∘ fmap g
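
To fix ideas, a minimal instance (a throwaway example of mine, any container would do): the option functor, which maps over a present value and preserves absence:

module OptionFunctor : FUNCTOR with type 'a t = 'a option = struct
  type 'a t = 'a option
  (* map under Some, preserve None *)
  let fmap f = function None -> None | Some x -> Some (f x)
end

Both laws follow by case analysis on the option.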

An applicative structure or idiom is a structure with the signature:

module type APPLICATIVE = sig
  type 'a t
  val pure : 'a -> 'a t
  val ap : ('a -> 'b) t -> ('a t -> 'b t)
end

satisfying the following laws:

Identity:     ap (pure id)                  ≡ id
Composition:  ap (ap (ap (pure (∘)) u) v) w ≡ ap u (ap v w)
Homomorphism: ap (pure f) ∘ pure            ≡ pure ∘ f
Interchange:  ap u (pure x)                 ≡ ap (pure (λf.f x)) u
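
The corresponding instance for options (again, just an example of mine to test the laws against): application succeeds only when both the function and its argument are present:

module OptionApplicative : APPLICATIVE with type 'a t = 'a option = struct
  type 'a t = 'a option
  let pure x = Some x
  (* a present function applied to a present argument *)
  let ap u v = match u, v with
    | Some f, Some x -> Some (f x)
    | _              -> None
end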

An applicative functor is the structure:

module type APPLICATIVE_FUNCTOR = sig
  type 'a t
  include FUNCTOR     with type 'a t := 'a t
  include APPLICATIVE with type 'a t := 'a t
end

that is simultaneously a functor and an applicative structure, satisfying the additional law:

Fmap: fmap ≡ ap ∘ pure

A monad is the structure:

module type MONAD = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : ('a -> 'b t) -> ('a t -> 'b t)
end

satisfying the following laws:

Right unit:    bind return     ≡ id
Left unit:     bind f ∘ return ≡ f
Associativity: bind f ∘ bind g ≡ bind (bind f ∘ g)
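
The running example, once more: the option monad, whose bind short-circuits on an absent value:

module OptionMonad : MONAD with type 'a t = 'a option = struct
  type 'a t = 'a option
  let return x = Some x
  (* an absent value aborts the rest of the computation *)
  let bind f = function None -> None | Some x -> f x
end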

Every monad is an applicative functor:

module ApplicativeFunctor (M : MONAD)
: APPLICATIVE_FUNCTOR with type 'a t = 'a M.t
= struct
  type 'a t = 'a M.t
  open M
  let fmap f = bind (fun x -> return (f x))
  let pure   = return
  let ap u v = bind (fun f -> fmap f v) u
end
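
Instantiated at the option monad, the derived operations agree with the instances sketched above; a quick sanity check (evidence, not proof):

module OA = ApplicativeFunctor (OptionMonad)

let () = assert (OA.fmap succ (Some 41) = Some 42)
let () = assert (OA.ap (Some succ) (Some 41) = Some 42)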

This can be proved by easy (but tedious) equational reasoning; that such reasoning is rigorous is Wadler's celebrated result.

Proof of the Functor Identity:

  fmap id
≡ { definition }
  bind (return ∘ id)
≡ { composition }
  bind return
≡ { Monad Right unit }
  id

Proof of the Functor Composition:

  fmap f ∘ fmap g
≡ { definition }
  bind (return ∘ f) ∘ bind (return ∘ g)
≡ { Monad Associativity }
  bind (bind (return ∘ f) ∘ return ∘ g)
≡ { Monad Left unit }
  bind (return ∘ f ∘ g)
≡ { definition }
  fmap (f ∘ g)

A number of naturality conditions are simple equations between λ-terms. I'll need these later:

Lemma 1 (Yoneda):

  fmap f ∘ (λh. fmap h x)
≡ { defn. ∘, β-reduction }
  λg. fmap f (fmap g x)
≡ { defn. ∘ }
  λg. (fmap f ∘ fmap g) x
≡ { Functor Composition }
  λg. fmap (f ∘ g) x
≡ { abstract }
  λg. (λh. fmap h x) (f ∘ g)
≡ { defn. ∘, η-contraction }
  (λh. fmap h x) ∘ (f ∘)

Lemma 2:

  fmap f ∘ return
≡ { definition }
  bind (return ∘ f) ∘ return
≡ { Monad Left unit }
  return ∘ f

Lemma 3:

  bind f ∘ fmap g
≡ { definition fmap }
  bind f ∘ bind (return ∘ g)
≡ { Monad Associativity }
  bind (bind f ∘ return ∘ g)
≡ { Monad Left unit }
  bind (f ∘ g)

Lemma 4:

  bind (fmap f ∘ g)
≡ { definition fmap }
  bind (bind (return ∘ f) ∘ g)
≡ { Monad Associativity }
  bind (return ∘ f) ∘ bind g
≡ { definition fmap }
  fmap f ∘ bind g

The Applicative Functor condition is easy to prove and comes in as a handy lemma:

  ap (pure f)
≡ { definition }
  λv. bind (λg. fmap g v) (return f)
≡ { Monad Left unit }
  λv. (λg. fmap g v) f
≡ { β-reduction }
  λv. fmap f v
≡ { η-contraction }
  fmap f

so that, abstracting over f, fmap ≡ ap ∘ pure as required.
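
A spot check of the lemma at the option instance:

let () = assert (OA.ap (OA.pure succ) (Some 1) = OA.fmap succ (Some 1))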

Proof of the Applicative Identity:

  ap (pure id)
≡ { Applicative Functor }
  fmap id
≡ { Functor Identity }
  id

Proof of the Applicative Homomorphism:

  ap (pure f) ∘ pure
≡ { Applicative Functor }
  fmap f ∘ pure
≡ { Lemma 2, defn. pure }
  pure ∘ f

Proof of the Applicative Interchange:

  ap (pure (λf.f x)) u
≡ { Applicative Functor }
  fmap (λf.f x) u
≡ { definition }
  bind (return ∘ (λf.f x)) u
≡ { defn. ∘, β-reduction }
  bind (λf. return (f x)) u
≡ { Lemma 2 }
  bind (λf. fmap f (return x)) u
≡ { definition }
  ap u (pure x)

The proof of the Applicative Composition condition is the least straightforward of the lot, as it requires ingenuity to choose the reduction to apply at each step. I started with a long, tedious derivation that required both forward and backward reasoning; at the end I refactored it into bite-sized lemmas to simplify it as much as I could. As a heuristic, I always try to start from the most complicated expression, so as to avoid having to guess where and what to abstract (that is, applying elimination rules requires only neatness, while applying introduction rules requires backtracking):

  ap (ap (ap (pure (∘)) u) v) w
≡ { Applicative Functor }
  ap (ap (fmap (∘) u) v) w
≡ { definition }
  bind (λf. fmap f w) (bind (λf. fmap f v) (fmap (∘) u))
≡ { Lemma 3 with f := λf. fmap f v, g := (∘) }
  bind (λf. fmap f w) (bind ((λf. fmap f v) ∘ (∘)) u)
≡ { Monad Associativity with f := λf. fmap f w, g := (λf. fmap f v) ∘ (∘) }
  bind (bind (λf. fmap f w) ∘ (λf. fmap f v) ∘ (∘)) u
≡ { defn. ∘ (at right) }
  bind (λf. (bind (λf. fmap f w) ∘ (λf. fmap f v)) (f ∘)) u
≡ { defn. ∘, β-reduction }
  bind (λf. bind (λf. fmap f w) (fmap (f ∘) v)) u
≡ { Lemma 3 with f := λf. fmap f w, g := (f ∘) }
  bind (λf. bind ((λf. fmap f w) ∘ (f ∘)) v) u
≡ { Lemma 1 }
  bind (λf. bind (fmap f ∘ (λf. fmap f w)) v) u
≡ { Lemma 4 with g := λf. fmap f w }
  bind (λf. fmap f (bind (λf. fmap f w) v)) u
≡ { definition }
  ap u (ap v w)
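
Spot-checking the law at the option instance (evidence, not proof):

let compose f g x = f (g x)

let () =
  let u = Some succ and v = Some (fun x -> 2 * x) and w = Some 3 in
  (* both sides evaluate to Some 7 *)
  assert (OA.ap (OA.ap (OA.ap (OA.pure compose) u) v) w = OA.ap u (OA.ap v w))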

What is this good for? I don't really know: Haskellers seem unable to live without these structures, while I can't seem to justify their application. I suspect laziness has a lot to do with it; in any case, the machinery is more palatable with the appropriate combinators:

module Functor (F : FUNCTOR) = struct
  include F
  let ( <$> ) = fmap
end

module Applicative (A : APPLICATIVE) = struct
  include A
  let ( <*> ) = ap
end

module Monad (M : MONAD) = struct
  include M
  include (ApplicativeFunctor (M)
         : APPLICATIVE_FUNCTOR with type 'a t := 'a t)
  let ( <$> )     = fmap
  let ( <*> )     = ap
  let ( >>= ) m f = bind f m
end
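
For instance, with the option monad of the running example:

module OM = Monad (OptionMonad)

let () = assert (OM.((+) <$> Some 1 <*> Some 2) = Some 3)
let () = assert (OM.(Some 21 >>= fun x -> return (2 * x)) = Some 42)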

3 comments:

Mauricio Scheffer said...

About your question "What is this good for?": if you mean "what are applicative functors good for", they're generally useful for modeling effectful computations, though you probably already read that elsewhere. More concretely, they're great for accumulating effects: for example, when validating inputs, the effect can be a monoid accumulating error messages.
They also favor a more "functional" (yes, "applicative") interface compared to monads.
If you're wondering about the usefulness of applicative composition, the best example I know is Formlets, a web form abstraction built by composing several primitive applicative functors.
Laziness is not related: Scalaz makes extensive use of them and I've used them in C# and F#.
I have examples of all of this in my blog :)

Matías Giovannini said...

You could have linked to your blog ;-) I'm not convinced that there's a use for Applicative Functors for which a Monad isn't appropriate. If Maybe x is a monad, so too is (Monoid y) => Either x y; again, all I can see that the operators give you is DSL convenience.

The bit about laziness is that in a strict language you do have to think about evaluation; to give an example, I've found that you can't in general naively use >> (unless you're working in the State monad, where every value is already a closure).

Plus, the fact that OCaml is already impure and procedural makes me really wonder what you gain by using a purely functional construct in the first place. In Haskell you don't have a choice; in OCaml you'll find that you either go pure all the way or you have to think very hard about evaluation order and side effects (try to use a State monad with a Stream parser and you'll see what I mean).

Mauricio Scheffer said...

The difference is that Monoid x => Either x y, when used monadically, will abort the computation at the first failure (i.e. the first Left) so it won't accumulate through the monoid. More generally, when doing validations, you'll want to use this Either with monoid through the applicative interface so that you can actually accumulate errors. Sometimes you'll want the monadic short-circuit evaluation, and sometimes you'll want the applicative accumulative evaluation. Note that the validation applicative is not the applicative derived from the Either monad.
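
A sketch in OCaml (the names are mine, with lists of messages as the error monoid):

type ('e, 'a) validation = Fail of 'e list | Succ of 'a

let pure x = Succ x

(* the applicative can combine two failures through the monoid (list append) *)
let ap u v = match u, v with
  | Succ f, Succ x  -> Succ (f x)
  | Fail e, Succ _  -> Fail e
  | Succ _, Fail e  -> Fail e
  | Fail e, Fail e' -> Fail (e @ e')

(* the monadic bind must stop at the first failure: the second error is
   never reached, so an ap derived from bind cannot accumulate both *)
let bind f = function Fail e -> Fail e | Succ x -> f x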

Some examples about this here, here and here

Then there's the age-old question: "why restrict myself to purity in an impure language?" I'm sure you already know this, but I'll list the reasons anyway: composability, compositionality, and the ability to reason about code. It's not entirely true that if you want to write pure code in an impure language you have to go pure all the way: you can use pure code locally and delay side-effects to the "outer" part of your program, where you perform them. This is even true of Haskell, if you take IO as impure (some people do, some don't). At any rate, separating pure from impure code has benefits in any language.