Two simple pattern matchers for Common Lisp
I’ve written two pattern matchers for Common Lisp:
destructuring-match
, ordsm
, is acase
-style construct which can matchdestructuring-bind
-style lambda lists with a couple of extensions;spam
, the simple pattern matcher, does not bind variables but lets you match based on assertions about, for instance, the contents of lists.
Both dsm
and spam
strive to be simple and correct.
Simplicity
Both dsm
and spam
are simple: they do exactly one thing, and try to do that one thing well.
You could think of dsm
as being to some other CL pattern matchers as Unix once was to Multics: dsm
is the result of me looking at those other systems and thinking ‘please, not that’.
Those systems are vast, have several levels, and are extensible: some subset of them might do what I wanted to be able to do — make writing macros less unpleasant — but I’m not sure1. They are obsessed with performance.
dsm
does one thing, and exports a single macro. If you know how to use destructuring-bind
and case
you already know almost all there is to know about dsm
: it’s a case
construct whose cases are destructuring-bind
lambda lists. dsm
doesn’t care about performance at all, because macroexpansion performance never matters.
At least one of those matchers has almost as many commits in its repo as dsm has lines of code.
Like Multics was, those hairy pattern matchers are fine systems. But there was a good reason that Thompson and Ritchie wrote something very different2.
destructuring-match
/ dsm
in CL destructuring-bind
and, mostly equivalently, macro argument lists are both a blessing and a curse. They’re a blessing because they support destructuring, so you can write, for instance
(defmacro with-foo ((var &optional init) &body forms)
...)
They’re a curse because they are so fragile: with-foo
can only support that syntax and will fail with an ugly error message from the implementation when it is fed anything else.
Writing robust macros in CL, especially macros which expect various different argument patterns, then turns into a great saga of manually checking argument patterns before using destructuring-bind
to actually bind things. The result of that, of course, is that very many CL macros are not robust and have terrible error reporting.
destructuring-match
does away with all this unpleasentness. It supports a slightly extended version of the lambda lists that destructuring-bind
supports, has ‘guard’ clauses which allow additional checks, and will match a form against any number of lambda lists until one matches, with a fallback case.
As an example here is a version of with-foo
which allows two patterns:
(defmacro with-foo (&body forms)
(destructuring-match forms
(((var &optional init) &body body)
(:when (symbolp var))
...)
((((var &optional type) &optional init) &body body)
(:when (symbolp var))
...)
(otherwise
(error ...))))
The guard clauses check that var
is a symbol before the match succeeds, and will therefore ensure that the second match is the one chosen for (with-foo ((x y) 1) ...)
.
destructuring-match
also supports ‘blank’ variables: any variable whose name is _
(in any package) is ignored, and all such variables are distinct. So for instance
(destructuring-match l
((_ _ _) ...))
will match if l
is a proper list with exactly three elements.
Using destructuring-match
it’s easy to write this macro3:
(defmacro define-matching-macro (name &body clauses)
(let ((<whole> (make-symbol "WHOLE"))
(<junk> (make-symbol "JUNK")))
(destructuring-match clauses
((doc . the-clauses)
(:when (stringp doc))
`(defmacro ,name (&whole ,<whole> &rest ,<junk>)
,doc
(destructuring-match ,<whole> ,@the-clauses)))
(the-clauses
`(defmacro ,name (&whole ,<whole> &rest ,<junk>)
(destructuring-match ,<whole> ,@the-clauses))))))
And this then allows the above with-foo
macro to be written like this:
(define-matching-macro with-foo
((_ (var &optional init) &body forms)
(:when (symbolp var))
...)
((_ ((var &optional type) &optional init) &body forms)
(:when (symbolp var))
...)
(form
(error "~S is bad syntax for with-foo" form)))
dsm
was not written with performance in mind but it seems to be, typically, around a tenth to a half the speed of destructuring-bind
while being far more powerful of course.
dsm
can be found here. It will probably end up in Quicklisp in due course but currently it isn’t there, and some of its dependencies are also not up to date there.
spam
, the simple pattern matcher
dsm
has a lot of cases where it needs to check what the lambda list it is parsing and compiling looks like. To do this I wrote a bunch of predicate constructors and combinators, which return predicates which will check things. So for example:
(is 'foo)
returns a function which checks its argument iseql
tofoo
;(some-of p1 ... pn)
returns a function of one argument which will succeed if one of the predicates which are its arguments succeeds, so(some-of (is 'foo) (is 'bar))
;(head-matches p1 ... pn)
will succeed if the predicates which are its arguments succeed on the first elements of a list.
There are several other predicate constructrors and predicate combinators, but spam
can use any predicate.
There is then a matches
macro which uses these to match things, and a matchp
function which simply invokes a predicate.
As an example, here’s part of a matcher for &rest
specifications in lambda lists.
(matching ll
((head-matches (some-of (is '&rest) (is '&body))
(var)
(is '&key))
;; &rest x &key ...
...)
((head-matches (some-of (is '&rest) (is '&body))
(var)
(any))
;; &rest x with something else
...)
((list-matches (some-of (is '&rest) (is '&body))
(var))
;; &rest x and no more
...)
(otherwise
(error "oops")))
spam
is pretty useful, and code written using it is much easier to read than doing the equivalent checks manually. It is used extensively in the implementation of dsm
.
spam
is now one of my CL hax.
-
At the time of writing Trivia supports lambda lists I think, but not destructuring-lambda lists:
(match '(1 (1)) ((lambda-list a (b)) (values a b)))
will fail, for instance. I don’t know whether is it meant to support destructuring lambda lists — comments in the sources imply it is, but it clearly does not in fact. ↩ -
I am aware of Gabriel’s ‘worse is better’ paper and its various afterthoughts.
dsm
is not like that: it is smaller and simpler, but is not intended to be worse.dsm
is to these other systems perhaps as Scheme was to CL. Gabriel also talks about these two options, of course. ↩ -
Note this macro is 12 lines, half of which are handling the possible docstring. ↩