Macroexpansion in Common Lisp

2022-07-05 :: lisp, programming

Yet another description of macroexpansion in Common Lisp. There is nothing particularly new here and it partly duplicates some previous articles: I just wanted to rescue the text.

The following description is of how macroexpansion works in Common Lisp¹. It is slightly simplified and I have not always mentioned when it is². It is at least a partial duplicate of this previous article.

What macros are

Macros in CL are functions, written in ordinary CL, whose argument is source code, and whose value is other source code.

Source code is represented as s-expressions: symbols, conses, and so on. Macros don’t do string-rewriting.
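To make that concrete, here is a small sketch of source code being handled as ordinary list structure. The variable *sample-form* is a hypothetical name used only for this illustration, and the form it holds is never evaluated.

```lisp
;; A piece of source code, held as data.  *SAMPLE-FORM* is a
;; hypothetical name; the form itself is never evaluated here.
(defparameter *sample-form* '(if (zerop n) 1 (* n 2)))

(first *sample-form*)            ; => IF, a symbol
(consp (second *sample-form*))   ; => T, the test is itself a list
(list 'progn *sample-form*)      ; => (PROGN (IF (ZEROP N) 1 (* N 2)))
```

Everything a macro does to its argument is built from operations like these.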

The way to think slightly more abstractly about macros is that they are functions between languages: a macro is a function which takes as an argument fragments of a language which includes that macro, and returns as a value either a fragment of a language which doesn’t include the macro, or a fragment of a language which includes it in some weaker way.

The aim of macros is to build, on top of the language you are given, another language which is closer to the language in which you want to express your programs. CL itself is one such language, built up using a number of standard macros on top of a substrate language.

People often think of macros as ‘functions which do not evaluate their arguments’: that’s really not right. They are functions — perfectly ordinary functions, written in CL — but their argument is source code, and their value is source code.
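One way to convince yourself of this is to fetch the function behind a standard macro and call it like any other function. The sketch below assumes that passing nil as the environment argument is acceptable (it denotes the null lexical environment). The helper expand-once-by-hand is a hypothetical name, and ready and launch are made-up symbols which are never evaluated.

```lisp
;; EXPAND-ONCE-BY-HAND is a hypothetical helper: it looks up the macro
;; function for the operator of FORM and calls it as an ordinary function.
;; NIL stands in for the environment argument.
(defun expand-once-by-hand (form)
  (funcall (macro-function (first form)) form nil))

;; READY and LAUNCH are made-up names: the form is only data here.
(expand-once-by-hand '(when ready (launch)))
```

The value is some other piece of source code: exactly which code depends on how your implementation defines when, so no output is shown.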

How macroexpansion happens

[This is simplified.]

Given some initial compound form (m ...), macroexpansion proceeds like this.

Start. Given a form, it should be one of

a compound form (m ...),
or a non-compound form.

Compound form. The form is (m ...)

Look at m: if it has an associated macro function (found using macro-function) then simply call that function on the whole form (m ...): its result is a new form³. Recurse on this form from Start.
If m is not a macro, then it may be a special operator, such as setq or if. Consider the appropriate forms in the body of this form for expansion: which forms those are is determined by the rules of the special operator. For instance all the forms in (if ...) are considered for expansion, while in (setq <x> <y>) only <y> is, and so on.
If it is not a macro and not a special operator, then (m ...) is assumed to be a function call, with m denoting a function. All the forms in the body are now considered for macro expansion. Once that is done the expansion process is complete.
As a special case of the last case, m may be (lambda (...) ...), so the whole form will be ((lambda (...) ...) ...). In this case the forms in the body of the lambda are considered for macroexpansion; otherwise this is the same as the last case⁴.
There are no other cases.

Non-compound form. There is nothing to do here.
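The steps above can be sketched as a toy walker. This is only an illustration under loud assumptions: expand-all and unless-zero are hypothetical names, only quote, if, progn, setq and function are understood (any other special operator signals an error), and environments, local macros, symbol macros and declarations are all ignored. It is not how any real implementation does it.

```lisp
;;; EXPAND-ALL is a hypothetical name: a toy, not a real code walker.
(defun expand-all (form)
  (flet ((expand-each (forms) (mapcar #'expand-all forms)))
    (cond
      ;; Non-compound form: nothing to do.
      ((atom form) form)
      ;; ((lambda (...) ...) ...): walk the lambda body and the arguments.
      ((consp (first form))
       (destructuring-bind ((operator lambda-list &rest body) &rest arguments)
           form
         (assert (eq operator 'lambda) () "Not a lambda form: ~S" form)
         `((lambda ,lambda-list ,@(expand-each body))
           ,@(expand-each arguments))))
      ((not (symbolp (first form)))
       (error "Not a valid compound form: ~S" form))
      ;; Macro: call its function on the whole form, then start again.
      ((macro-function (first form))
       (expand-all (funcall (macro-function (first form)) form nil)))
      ;; Special operator: its own rules say which subforms to walk.
      ((special-operator-p (first form))
       (ecase (first form)
         ((quote) form)
         ((if progn) `(,(first form) ,@(expand-each (rest form))))
         ((setq) `(setq ,@(loop for (variable value) on (rest form) by #'cddr
                                collect variable
                                collect (expand-all value))))
         ((function)
          (let ((name (second form)))
            (if (and (consp name) (eq (first name) 'lambda))
                `(function (lambda ,(second name)
                             ,@(expand-each (cddr name))))
                form)))))
      ;; Otherwise a function call: walk every argument.
      (t `(,(first form) ,@(expand-each (rest form)))))))

;; UNLESS-ZERO is a hypothetical macro used to exercise the walker.
(defmacro unless-zero (n &body body)
  `(if (zerop ,n) nil (progn ,@body)))

(expand-all '(list (unless-zero a (unless-zero b 1))))
```

In an implementation where if and progn are plain special operators, the last form should return (list (if (zerop a) nil (progn (if (zerop b) nil (progn 1))))). The inner unless-zero is only reached after the outer one has been expanded and the walker has descended into the resulting if.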

As I said, this is simplified: there are local macros for instance, and various other things. However one critical thing is that when expanding some macro form (m ...), the expansion carries on until it gets something which is not a macro form before looking at whatever is in the body of the form. That’s critical: although it’s tempting to think that expansion should happen inside-out, it can’t work that way, because until the outer macro has done its work you can’t know if the things in its body even should be candidates for macro expansion. There’s an example of this below.
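The standard functions macroexpand-1 and macroexpand show the ‘carry on until it is not a macro form’ part directly. In this sketch outer-step and inner-step are hypothetical macros, defined with defmacro only for brevity.

```lisp
;; OUTER-STEP and INNER-STEP are hypothetical macros, for illustration only.
(defmacro inner-step (x) `(list ,x))
(defmacro outer-step (x) `(inner-step ,x))

;; One step: only the outermost macro form is expanded.
(macroexpand-1 '(outer-step (inner-step 1)))
;; => (INNER-STEP (INNER-STEP 1))

;; Repeat until the form itself is no longer a macro form.  The
;; argument is still an unexpanded (INNER-STEP 1): neither function
;; looks inside the body.
(macroexpand '(outer-step (inner-step 1)))
;; => (LIST (INNER-STEP 1))
```

Only once the outer form has settled down into a function call to list does it make sense to look at its argument.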

Macros the hard way

OK, I said that macros were just functions, and I meant that. Let’s write a macro with-debugging which is like progn but it will perhaps print what it is doing.

So let’s write the macro function:

(defvar *debugging* t)

(defun expand-with-debugging (form environment)
  (declare (ignore environment))        ;I'm not mentioning environments
  `(progn
     ,@(loop for thing in (rest form)
             collect `(when *debugging*
                        (format *debug-io* "~&~S~%" ',thing))
             collect thing)))

And we can test it:

> (expand-with-debugging '(with-debugging (cons 1 2) 4) nil)
(progn
  (when *debugging* (format *debug-io* "~&~S~%" '(cons 1 2)))
  (cons 1 2)
  (when *debugging* (format *debug-io* "~&~S~%" '4))
  4)

And now we can install it as the macro function for with-debugging:

(setf (macro-function 'with-debugging) #'expand-with-debugging)

And now

> (with-debugging
   (cons 1 2)
   4)
(cons 1 2)
4
4

> (setf *debugging* nil)
nil

> (with-debugging
   (cons 1 2)
   4)
4

OK, here’s another macro done this way, and the purpose of this one is to show you why macroexpansion has to happen outside in. Let’s say we want to be able to denote functions by (fun (arg ...) form ...), but we’d like to be able to debug the body with with-debugging. We can do that:

(defun expand-fun (form environment)
  (declare (ignore environment))        ;still not mentioning environments
  `(function (lambda ,(second form)
               ;; Not dealing with declarations
               (with-debugging ,@(cddr form)))))

(setf (macro-function 'fun) #'expand-fun)

And now

> (let ((*debugging* t))
    (funcall (fun (a) (+ a a)) 1))
(+ a a)
2

Now you can see why the macro expander has to work the way it does: the first form in the body of fun should not be macroexpanded at all, and the remaining forms are going to get wrapped in a macro which isn’t there in the source at all. So macroexpansion has to go outside in, as described above.
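The same point can be made with nothing but standard operators. In this sketch twice and bump-twice are hypothetical names: the lambda list of bump-twice looks exactly like a (malformed) use of the twice macro, but it is never a candidate for expansion, because defun is expanded first and puts it in a position where it is not a form at all.

```lisp
;; TWICE and BUMP-TWICE are hypothetical names, for illustration only.
(defmacro twice (form)
  `(progn ,form ,form))

(defun bump-twice (twice)     ; (twice) here is a lambda list, not a macro form
  (let ((n twice))            ; TWICE here is a variable
    (twice (incf n))          ; and this one really is a macro form
    n))

(bump-twice 1)                ; => 3
```

An expander which worked inside out would try to expand (twice) in the lambda list and fail, since the macro requires an argument.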

A better way

Well, you could write macros like that. Probably once they were written like that. But it’s a pain, because you almost never care about the first element of the form — the macro’s own name — and you have to manually take the rest of the form apart yourself. And also you need to deal with questions about making sure macros are defined at compile time and so on.

That’s what defmacro does. It is itself a macro, and its expansion will involve setting the macro-function of the macro to some appropriate thing. So using defmacro I can write the fun macro:

(defmacro fun ((&rest args) &body forms)
  ;; still not dealing with declarations
  `(function (lambda (,@args) (with-debugging ,@forms))))

This is easier to understand of course. But all it is is a (fairly elaborate!) wrapper around what I did above.
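To see the correspondence side by side, here is a sketch of one small macro written both ways. The names my-when-1, my-when-2 and expand-my-when-1 are hypothetical, and ready, launch and cheer are made-up symbols which are never evaluated. The hand-written version uses destructuring-bind to do the job that defmacro’s lambda list does for you.

```lisp
;; By hand: take the form apart ourselves.
(defun expand-my-when-1 (form environment)
  (declare (ignore environment))
  (destructuring-bind (test &body body) (rest form)
    `(if ,test (progn ,@body))))

(setf (macro-function 'my-when-1) #'expand-my-when-1)

;; With DEFMACRO: the lambda list does the destructuring.
(defmacro my-when-2 (test &body body)
  `(if ,test (progn ,@body)))

(macroexpand-1 '(my-when-1 ready (launch) (cheer)))
;; => (IF READY (PROGN (LAUNCH) (CHEER)))
(macroexpand-1 '(my-when-2 ready (launch) (cheer)))
;; => (IF READY (PROGN (LAUNCH) (CHEER)))
```

What this sketch leaves out is the compile-time side of things, which defmacro also takes care of.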

Watching the detectives

Using trace-macroexpand you can watch macroexpansion happen.

> (needs (:org.tfeb.hax.trace-macroexpand :compile t :use t))
; Loading [...]
((:org.tfeb.hax.trace-macroexpand t))

> (trace-macroexpand t)
nil

> (setf *trace-macroexpand-print-length* nil
        *trace-macroexpand-print-level* nil)
nil

> (trace-macro fun with-debugging)
(fun with-debugging)

> (setf *debugging* nil)                
nil

> (funcall (fun (a) a) 1)
(fun (a) a)
 -> #'(lambda (a) (with-debugging a))
(with-debugging a)
 -> (progn (when *debugging* (format *debug-io* "~&~S~%" 'a)) a)
(with-debugging a)
 -> (progn (when *debugging* (format *debug-io* "~&~S~%" 'a)) a)
1

Note that with-debugging is expanded twice. This is an artifact of the implementation: there’s no promise that macros get expanded only once in interpreted code.
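A practical consequence is that a macro function should not rely on side effects which happen during expansion. The sketch below uses a hypothetical counter, *expansion-count*, and a hypothetical macro, counted, to make the number of expansions visible.

```lisp
(defvar *expansion-count* 0)   ; hypothetical counter

(defmacro counted (form)       ; hypothetical macro
  (incf *expansion-count*)     ; side effect at expansion time
  form)

(eval '(counted (+ 1 2)))      ; => 3
*expansion-count*              ; at least 1: the exact value may vary
```

The value of the form is always 3, but how many times the counter was bumped on the way is up to the implementation.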

¹ This was once going to be a Stack Overflow answer, and I didn’t want to throw it away.
² And of course I might just be wrong about some details.
³ I am not talking about the environment objects which get passed to macro functions.
⁴ Another way of thinking about ((lambda (...) ...) ...) is that it is the same as (funcall (function (lambda (...) ...)) ...) and, since function is a special operator, its rules apply, and include expanding the forms in the body of the (lambda (...) ...) form (and of course lambda is itself a macro, so (lambda (...) ...) expands to (function (lambda (...) ...))) and then the rules for function apply again). I am old enough to remember adding the macro for lambda to various antique CLs.