Post by Erann GatMy question is: is it possible to render and/or explain the concept
of a monad in Scheme? And if not, why not? Is it because Scheme
has no type system, and types are part and parcel of what monads
are? Or can the essence of monads be divorced from type theory?
It's possible, but not the most fun thing in the world. Monads arise
from category theory, which is an inherently typeful branch of
mathematics, and there are very close links between category theory
and typed lambda calculi. Of course, you can do it in Scheme, but it
will be somewhat harder to understand, since you have to keep more of
the program in your head and less in your source code.
But I'll give it a shot -- I've been meaning to write down what I know
in order to solidify it, and you give me a good excuse. :)
o Q: What is a monad?
A: Monads are a concept from category theory, which functional
programmers have adapted to create a useful design pattern for
combinator libraries.
o Q: So all they are is a design pattern for higher order functions?
A: Yeah, basically. It's one of the key unifying ideas in
programming language semantics, but for hackers it's basically
a design pattern.
o Q: So what's the frequency, Kenneth?
A: Um, right. Every monad consists of four things:
1. A type constructor M, which sends a type 'a' to a
types 'M[a]'.
These types should not be thought of as the Scheme types like
string? or procedure?; instead think of them more like as the
types Scheme programmers use to specificy the arguments to
functions (eg, "a list of integers").[1] You can read 'a' as
"the type a", and 'M[a]" as "the type of computations of type
a".
2. A function 'unit', which has type 'a -> M[a]'. That is, it
takes a value of type a, and injects it into the monadic
type.
3. A function 'bind', of type '(M[a], a -> M[b]) -> M[b]'.
bind is a function taking two arguments -- the first is a
monadic value of type M[a], and the second is a function that,
given an a, returns a monadic value of type M[b]. It then
returns a value of type M[b].
Intuitively, you take the computation M[a] and perform it,
getting a value of type a (plus whatever computational stuff
performing the computation did). Then, you pass that value of
type a into the second argument, and get back a new
computation of type M[b].
4. There are the three algebraic laws that bind and unit have to obey.
In Scheme, these equivalences are:
(bind (unit x) f) == (f x)
(bind M unit) == M
(bind (bind M f) g) == (bind M (lambda (x) (bind (f x) g)))
o Q: Give me an example?
A: Sure! Here's the simplest possible monad -- the identity monad.
(define (unit x) x)
(define (bind m f) (f m))
I'll leave checking that bind and unit obey the monad laws as
an exercise.
o Q: That was too trivial to be helpful. How about a less useless
example?
A: Picky, picky. Here is the option or maybe monad, which represents
computations that can either return a value, or fail with an
error.
(define fail (vector 'fail)) ; fail is not eq? to anything else
(define (unit x) x)
(define (bind m f)
(if (eq? m fail)
fail
(f m)))
Again, I'll leave checking the monad laws to you, the reader.
o Q: Can you explain why I'd want to use this?
A: Okay. Suppose you have three functions foo, bar, and baz, each
of which takes a value (with fail not in the domain), and either
return or fail by returning fail. Now, suppose you want to compose
these three functions, with any failure causing the composition
to fail. So you write:
(define (foo-bar-baz x)
(let ((foo-val (foo x)))
(if (eq? foo-val fail)
fail
(let ((bar-val (bar foo-val)))
(if (eq? bar-val fail)
fail
(baz bar-val))))))
Then you look at this code, and sigh, because it is ugly, and
doesn't match the simple concept you had in your head:
(define (concept-of-foo-bar-baz x) (baz (bar (foo x))))
But, since you're a functional programmer, you know you can use
the power of higher order functions:
(define (compose* f x)
(if (eq? x fail)
fail
(f x)))
(define (foo-bar-baz x) (compose* baz (compose* bar (foo x))))
That's much better. But wait! The function compose* is precisely
the bind operation of the maybe monad!
o Q: Pretty nice. But you added fail to this API in a pretty ad-hoc
way. Do you have to do that with all monads?
A: Of course, all monads need to have some operations specific
to whatever thing you are using the monad for. But in the case
of the option monad, fail is an instance of a very common
pattern called "monads with zeros". A monad with a zero has
two extra elements to its API:
1. A value 'zero', of type 'M[a]'.
2. A function 'plus', of type '(M[a], M[a]) -> M[a]'
3. plus and zero must obey the following algebraic laws:
(plus m zero) == (plus zero m) == m
(bind m (lambda (x) zero)) == zero
(bind zero f) == zero
So for the maybe monad, we can write
(define zero (vector 'Fail)) ; just a rename from 'fail' to 'zero'
(define (plus a b)
(if (eq? a zero)
b
a)
The maybe monad's plus should be a very familiar pattern to a
Scheme programmer; it's exactly how the or special form works,
except that its zero value is #f. This is why it's so pleasant to
use APIs that return #f in case of failure, and a useful return
value otherwise.
o Q: Are there any other monads with zero? It's not safe to generalize
from one example.
A: Sure. Lists form a monad with a zero, too. Look:
(define (unit x) (list x))
(define (bind m f) (apply concatenate (map f m))
(define zero '())
(define (plus a b) (append a b))
Verify the monad and zero laws!
o Q: Options and lists are nice, sure. But what's all this I hear
about using monads to treat state in a pure way?
A: Let's start with an example. Suppose that you want to write
a function that will take a binary tree, and then number the
leaves. To do this with imperative state, you can do a tree
walk and increment a counter at each leaf:
(define state 0)
(define (number-tree tree)
(if (pair? tree)
(let* ((left-subtree (number-tree (car tree)))
(right-subtree (number-tree (cdr tree))))
(cons left-subtree right-subtree))
(let ((n state))
(set! state (+ state 1))
n)))
This works, but doesn't let us renumber two trees starting from
the same number. So if we try to do this purely, we need to write
your loop in "state-passing style", and thread a counter through
the program.
(define (number-tree tree state)
(if (pair? tree)
(let-values ((lft state-1) (number-tree (car tree) state))
(let-values ((rgt state-2) (number-tree (cdr tree) state-1))
(values (cons lft rgt) state-2)))
(values (+ state 1) (+ state 1))))
That's rather nasty looking, and it would suck to accidentally use
the wrong updated state. However, we can use a monad to automate the
plumbing, and get the benefit of both approaches.
; The type constructor M[a] = state -> (a, state)
(define (unit x) (lambda (state) (values x state)))
; all bind does is thread the state through some calls.
(define (bind m-a f)
(lambda (state)
(let-values ((a state*) (m-a state))
((f a) state*))))
; get takes a state and returns it as a value
(define (get state) (values state state))
; set takes a state, and returns a monadic value that replaces
; the old state with the new state
(define (set new-state) (lambda (state) (values #f new-state)))
; run will take a monadic state value, and then run it with an
; initial state to get an actual value.
(define (run m state) (m state))
Now, I'll define the number-tree program using a state monad. To
make it readable, I'll use a very eccentric indentation:
(define (number-tree tree)
(if (pair? tree)
(begin (bind (number-tree (car tree)) (lambda (left-subtree)
(bind (number-tree (cdr tree)) (lambda (right-subtree)
(unit (car left-subtree right-subtree)))))))
(begin (bind get (lambda (n)
(bind (set (+ n 1)) (lambda (dont-care)
(unit n))))))))
Compare this to the imperative definition:
(define (number-tree tree)
(if (pair? tree)
(let* ((left-subtree (number-tree (car tree)))
(right-subtree (number-tree (cdr tree))))
(cons left-subtree right-subtree))
(let ((n state))
(set! state (+ state 1))
n)))
We are using bind + lambda to simulate using let*. There is a
1-to-1 correspondence between the monadic and imperative code!
At this point, the eccentric indentation of the monadic code
suggests that it would be worthwhile to define a macro to
simplify things. (This macro is one I stole from an Oleg
article.)
(define-syntax do
(syntax-rules (def)
((do (v <- expr) rest) (bind expr (lambda (v) rest)))
((do expr) expr)
((do expr expr) (bind (set expr) (lambda (dummy) expr)))
((do expr expr* ...) (do expr (do expr* ...)))))
This emulates Haskell's do-notation, which we can use as
follows:
(define (number-tree tree)
(if (pair? tree)
(do (left-subtree <- (number-tree (car tree)))
(right-subtree <- (number-tree (cdr tree)))
(unit (cons left-subtree right-subtree)))
(do (n <- get)
(set (+ n 1))
(unit n))))
Cool, huh?
o Q: Very cool. So a Haskell compiler takes programs using the purely-
functional state monad and optimizes it internally into a program
using mutation?
A: Exactly.
o Q: What else can monads do?
A: You can also turn programs in continuation passing style into
monadic form. In fact, it's a significant result (due to Andrezj
Filinski) that all imperative programs using call/cc and state
can be translated into a monadic style, and vice versa. So
exceptions, nondeterminism, coroutines, and so on all have a
monadic expression.
o Q: Why do Haskell descriptions of monads focus so heavily on this
weird "typeclass" stuff?
A: Typeclasses are just overloading on steroids, and each of the monads
in this article had its own bind and unit operation. So
Haskellers overload bind and unit (which they call >>= and
return) for every monad they write, so that their monadic code
commits as little as possible to a specific monad. It's just
good, solid, software engineering practice.
This would be hard to do in Lisp (eg, with CLOS), because of the
unit operation -- you would need to select the appropriate unit
operation based on the expected *return* type, and that's
something that inherently needs a static type system to work.
--
Neel Krishnaswami
***@cs.cmu.edu