Dynamic scope and macros

:: computer, lisp

I’ve recently been writing some Emacs Lisp code to do some massaging of files. Quite apart from having forgotten how primitive elisp is, I hadn’t realised before how hostile dynamic scope was for macros in particular.

A very common pattern for macros is call-with-* / with-*, in which there is a functional level which is wrapped by a more syntacticlly-friendly macro level. For instance, in Common Lisp you can map over lists with mapcar:

 (lambda (e)

but you might want to map over them with a syntax like

(mapping (e ...)

Well, it’s easy to implement this:

(defmacro mapping ((e l) &body forms)
  `(mapcar (lambda (,e) ,@forms) ,l))

Even with CL’s unhygienic macro system & without a mass of gensymmery such a macro is safe.

A good example where CL exposes one side of a pattern like this is with-open-file: you can easily see how to implement this in terms of a function:

(defun call/open-file (fn filespec &rest keys
                          &key &allow-other-keys)
  (let ((s nil))
          (setf s (apply #'open filespec keys))
          (funcall fn s))
      (when s (close s)))))

(defmacro with-open-file* ((sn filespecn &rest keysn 
                               &key &allow-other-keys)
                           &body forms)
  `(call/open-file (lambda (,sn) ,@forms)
                   ,filespecn ,@keysn))

(This is probably not completely robust code: it’s just meant to get the idea across.)

Scheme exposes the other side of this pattern with call/cc:

(define-syntax-rule (with-cc (c) form ...)
  (call/cc (λ (c) form ...)))

(define-syntax-rule may be specific to Racket but, again, this is just meant to get the idea across.)

Well, now think about something like the above call/open-file / with-open-file* in a Lisp dialect with dynamic scope. In particular, what does this do:

(let ((s t))
  (with-open-file* (h ...)
    (when s ...)))

This expands to

(let ((s t))
  (call/open-file (lambda (h) (when s ...))))

But call/open-file binds s: so the binding of s in the called function is different than the outer binding, and nothing works.

Well, of course, this is something that happens pervasively with dynamically-scoped languages: every binding above you (or below you, depending on your viewpoint) matters, and can infect your namespace. But it’s particularly toxic for macros, because macros very often interpose bits of code into your code, and that code can include bindings which are dynamically, but not lexically, visible, even in the expansion of the macro. Dynamic scope enormously increases the hygiene problems of a macro system.

Dynamic scope is really useful as an option, and systems written in languages which don’t have it generally have to reinvent it, usually badly. But it’s just toxic and horrible as the only option. I can’t understand any more how I managed to use lisps with dynamic scope at all: perhaps I never wrote macros or just expected things to behave in a mysterious and strange way occasionally. Fortunately, even elisp now has the option of being lexically scoped.

No future

:: politics, doomed

We’ve been fooling ourselves for thirty years. We believed that the awful toxins that defined society in our youths were, while not yet dead or even nearly dead, clearly dying.

The end of summer

:: politics, economics, doomed

On midsummer’s eve 2016 old people in the UK demonstrated that, by a significant majority, they are xenophobic leeches who are happy to suck the life out of their children and grandchildren, and have now found a way of continuing to do so even after they are dead.

Python instead of Lisp

:: computer, lisp

Lots of people, even famous Lisp hackers, like to claim that ‘Python can be seen as a dialect of Lisp with “traditional” syntax’.

Being famous does not make them right.

Python is nothing like Lisp

Expression language. Lisp is an expression language: everything in the language is an expression and has a value, and there is no distinction between expressions and statements, because there are no statements. Python is not: it has expressions, such as 2+3, lambda x: x*2 and statements such as x = 3. If expressions and statements are different things then writing macros and any kind of general-purpose lambda becomes very difficult.

Conses. Lisp has conses, Python does not. Conses are not everything1, but unless you have them you can’t implement them reasonably, and they are extremely useful data structures for many purposes. In particular for conses to be useful you need two things:

  • a good syntax for them and for lists built from them;
  • good performance — conses should be extremely cheap, so you can’t implement them as a special case of some heavyweight data structure such as a Python list, because there is an enormous header.

This means that conses need to be wired into the language: you can’t take a language without conses and add them, because even if you can get the first (you can’t in Python) you can’t get the second.

Symbols. Lisp has symbols, Python does not. You can use strings, and this works sometimes.

Lambda. Lisp has lambda, Python has an extremely limited version. Not being an expression language (see above) and the lack of scoping and block constructs in Python cripples its lambda.

Source code available as a low-commitment data structure. Lisp has this, Python does not. ‘Low-commitment’ means that it is available before it has been decided what it means, but after it has been turned from a stream of characters into something more interesting. This matters because it makes macros possible: macros which work by transforming streams of characters are doomed to the sort of unspeakable horror of which Jinja2 is a good example, while macros which work after it has been decided what the code means then can’t make their own decision about what it means, which is half the point of macros.

Scoping. Lisp has a multiplicity of scoping constructs and all modern Lisps have lexical scope, with some (Scheme) extending this to control constructs. Binding and assignment are irreparably confused in Python: scope does not work properly and this can never be fixed. A language which requires a global declaration is not going to be fixed by adding nonlocal.

Macros. Lisp has them, Python doesn’t. Since macros are the point of Lisp, it is really hard to see how the above quote makes any kind of sense.

There is a terrible truth about the percieved arrogance of Lisp hackers that it has taken me a long time to understand. The arrogance is justified: Lisp is, in fact, a better programming language.

  1. In particular conses are not a useful universal data structure in the way that, perhaps, early Lisp people thought they were. 

Macros in Racket, part three: checking boolean operators

:: computer, lisp

I wanted to see if I could write a mildly complicated macro in Racket without becoming too confused. I can, although I am not sure it is terribly idiomatic.

This is the third part of a series on writing macros in Racket for someone used to Common Lisp, although it is mostly independent of the previous parts. The previous parts are part one & part two.

One of the nice things about Lisp-family languages is that you can write your own control constructs, and it’s essentially easy to do so: if when did not exist then you could write it:

(define-syntax-rule (when test form ...)
  (and test
       (begin form ...)))

This kind of extensibility is one of the wonders of Lisp and Scheme: it’s tempting to say that it makes them better than programming languages which can’t do this but that’s not correct: it makes them incomparable to such languages: Lisp1 programs can reason about themselves and often do2. Everything about Lisp really leads to this ability.

When I taught (Common) Lisp to people one of the things I would try to get across was this ability of macros to extend the control constructs in the language: people often thought of macros as a way of essentially inlining code3, but that’s not what they’re actually good for. If you can add control constructs to your language, then you can make a new language, and that’s what Lisp macros are about, and therefore what Lisp is about.

A good way to get this across to people is to pretend that Lisp doesn’t have some control construct, and write it as a macro. This is easier than inventing new control constructs both because it doesn’t require thinking of a domain where they might be useful and because the existing control constructs have clear semantics. Reimplementing existing control constructs also demonstrates how the language is already built up from a more primitive language by macros and how the approach to solving problems in Lisp is to design and implement a language in which to talk about the problem, where that language is seamlessly built on the underlying Lisp, and can inherit all of its power and flexibiliy, including the ability to extend the language.

An advantage of reimplementing existing control constructs for teaching Lisp is that you can compare the new construct to the existing one, and with some small constraints you can do this exhaustively, so you can know whether you have actually implemented it right. This is, obviously, not possible in general, but if the operator has trivial syntax (so not cond) and if you limit the arguments of the operator to booleans then you can enumerate all the possible arguments in the obvious way, and so long as it returns a result for all combinations of arguments (does not fail to halt in other words) and is deterministic then there are only two things you need to check:

  1. does the operator produce the same result for all combinations of arguments (\(2^n\) possibilities for \(n\) arguments) as the existing one?
  2. does the operator evaluate its arguments the same number of times as the existing one for all these combinations?

So, for instance, if takes three arguments (in Racket) and should evaluate the first exactly once, and the others at most once, as well as returning the correct value.

Obviously such a check is not a full check of the operator — it does not tell you what it does with non-boolean arguments for instance. But I was interested in writing the check largely because it’s clearly a reasonably hairy macro which I know how to write in CL and wanted to see if I could write in Racket (I’m not very likely to teach people Lisp again).

What the macro needs to do

The idea is that to compare two boolean operators o1 and o2 which take n arguments you need to generate code which looks like this:

(for/and ([c (expt 2 n)])
  (let ([a1 (bitwise-bit-set? c 0)] ...)
    (let ([o1c1 0] ...)
      (let ([o2c1 0] ...)
        (and (eq? (o1 (begin (set! o1c1 (+ o1c1 1)) a1) ...)
                  (o2 (begin (set! o2c1 (+ o2c1 1)) a1) ...))
             (= o1c1 o2c1) ...)))))

So a1 is the first argument, o1c1 counts how many times o1 evaluates it, and o2c1 counts how many times o2 evaluates it, and so on. I decided to compare the operators with eq? rather than eqv? for no very good reason except that it works for operators whose results are booleans, which is what I was interested in. I should almost certainly use eqv? I think — certainly the -equivalent in the name would imply that — but I’m not.

It’s clear that a loop like that checks all of the \(2^n\) possibilities for the arguments, where each argument can be either #f or #t only. So this does an exhaustive check of all the possibilities, and provided o1 and o2 are deterministic and halt on all their arguments it will tell you whether they are equivalent.

And finally, this must be written as a macro, because the operators it is testing are themselves not generally functions: in particular things like if and or are obviously themselves not functions.

Things I did not know how to do

The big thing I didn’t know how to do here was to make up new identifiers: all the counters need to be created, and possibly also the argument names. In CL you’d do this with make-symbol or gensym or something like that. Assuming I want to use syntax-case rather than writing a CL-style construct-the-form-with-backquote-and-use-datum->syntax macro (which I very much do want to do) then there are two problems:

  1. constructing the names of the counters;
  2. making them available as pattern variables.

Well, (2) is easy: you can use nested syntax-cases, or equivalently but much more prettily, with-syntax to bind the pattern variables. And it turns out that with-syntax is willing to do a lot of work on your behalf: if you give it something which is not a syntax object it will massage it into one for you. So, in particular, this works:

(with-syntax ([(o1c ...) (list ...)])

It takes the list it is given, turns it into a syntax object (with datum->syntax I suppose) and then does the matching. So you can be really lazy here: all you need to invent is a list of identifier syntax objects, and with-syntax will do the rest, making the program a lot less noisy. This is a really neat feature, although it might lead you to get confused about what is, and what is not, a syntax object I suppose. Anyway, I used it ruthlessly.

So this leaves (1). You could obviously do this with something like (datum->syntax ctx (string->symbol (format ...))), but Racket provides a nice shorthand for that in the form of format-id: (format-id ctx "~a-count" v) will construct an identifier syntax object from v using ctx as lexical context. And it will do the appropriate magic if v is an identifier syntax object: extract the symbol from it and use it as the argument to format in the appropriate way.

So it looks pretty straightforward to construct lists of identifiers and bind them to pattern variables. The final thing that confuses me is what lexical context to use for the identifiers. The macro should be hygenic, which means they can’t have the context of the syntax object it is working on, but I think can have more-or-less any other context where they have no existing meaning: I just invented an object for them, which I think is safe, although I am a bit confused about this.

What users see

I spent a really long time stuck on what the syntax of the macro should be: this is entirely stupid because it just does not matter that much. The reason I got stuck is that it would matter if this was a real library and I am constitutionally incapable of writing things without worrying about that kind of thing. Eventually I decided that it would be best if the user provided the argument names as a list, because they generally make sense to users and because I didn’t want to get into something which looked as if you could pass it an integer when in fact what it needs is a literal integer. So I decided on a syntax like this:

(boolean-operators-equivalent? o1 o2 (a1 ...))

So, for instance:

(boolean-operators-equivalent? if my-if (test then else))

I still don’t really like this; but I’m just playing so, well, it will do.

Additional cleverness

I wanted to report syntax errors in a reasonable way: apparently the proper way to do this is using syntax-parse but I am not ready to understand that yet, so I used wrong-syntax and the current-syntax-context parameter to get reasonable-looking errors.

I thought it would be nice to be able to report failures of equivalence, so there is a parameter which controls that and the expansion of the macro includes a check for the parameter and prints the failed cases if it’s true. All this happens at run time (phase 0) of course.

The macro itself

So, finally, here it is.

(require (for-syntax (only-in racket/syntax format-id
                              current-syntax-context wrong-syntax)))

(define boe-report-failure? (make-parameter #f))

(define-syntax (boolean-operators-equivalent? stx)
  ;; Given the names of two boolean operators and a list of argument
  ;; names, expand to a form which tests that they are equivalent, by
  ;; evaluating the with arguments bound to all the combinations of #t
  ;; and #f, and also checking that they evaluate the same arguments
  ;; in each case.
  (parameterize ([current-syntax-context stx])
    (syntax-case stx ()
      [(_ o1 o2 (v ...))
       (let* ([vars (syntax->list #'(v ...))]
              [nvars (length vars)])
         ;; This check could be a guard, but we need the bindings
         ;; anyway, so.
         (for ([var vars])
           (unless (identifier? var)
             (wrong-syntax var "not an identifier")))
         ;; vars is now a list of identifiers, and nvars is how many
         ;; there are.  We need to construct syntax for check
         ;; variables for each var and and operator, as well as
         ;; construct 2^n and a list of bit numbers.]  This is being
         ;; fairly fast and loose: it turns out that various things
         ;; get automagically converted into syntax objects, and I
         ;; have not cared about the context for numbers (what is
         ;; it?).  In general I am a bit confused about what the
         ;; context should be here, but it clearly should *not* be
         ;; stx.
         (with-syntax ([(o1c ...) (for/list ([v vars])
                                    (format-id #'boe "~a-1-eval-count" v))]
                       [(o2c ...) (for/list ([v vars])
                                    (format-id #'boe "~a-2-eval-count" v))]
                       [2^n (expt 2 nvars)]
                       [(b ...) (for/list ([i nvars]) i)])
           ;; And now just write the pattern we want.  '...' is pretty
           ;; clever, it turns out
           #'(for/and ([c 2^n])
               (let ([v (bitwise-bit-set? c b)] ...)
                 (let ([o1c 0] ...)
                   (let ([o2c 0] ...)
                     (or (and (eq? (o1 (begin (set! o1c (+ o1c 1)) v) ...)
                                   (o2 (begin (set! o2c (+ o2c 1)) v) ...))
                              (= o1c o2c) ...)
                           (when (boe-report-failure?)
                             (eprintf "Not equivalent:~% ~a~% ~a~%"
                                      (list 'o1 `(,v ,o1c) ...)
                                      (list 'o2 `(,v ,o2c) ...)))
       (wrong-syntax #'else "expecting o1 o2 (a1 ...)")])))

To my astonishment, this worked pretty much first time (it did not initially have the wrong-syntax stuff, but this was easy compared to the rest of it):

> (define-syntax-rule (if/broken test then else)
    (or (and test then) else))
> (boe-report-failure? #t)
> (boolean-operators-equivalent? if if/broken (test then else))
Not equivalent:
 (if (#t 1) (#f 1) (#f 0))
 (if/broken (#t 1) (#f 1) (#f 1))

The macro, complete with some tests and other infrastructure can be found here4.

Notes and queries

I still don’t know whether this is really idiomatic Racket, although I am reasonably happy that I understand what is going on. There are a couple of things I am not sure about:

  • is the context for the count variables right? I think it is, but I am not sure;
  • the macro relies heavily on Racket’s extremely smart behaviour with ... — I am still unclear just how smart this is and whether I am relying on things which are not actually specified to happen;
  • similarly it relies on with-syntax being willing to convert things to syntax objects for you, which I am not sure is safe.

However, even with these worries, I think it’s pretty clear that Racket macros are significantly nicer than CL macros, if also significantly more opaque.

  1. I am going to use ‘Lisp’ to mean ‘Lisp-family’ from now on. This is not meant to denigrate Scheme — this post is about Racket, after all — I just need a term which is not too clumsy. 

  2. Of course, programs in other languages often do end up reasoning about themselves: people end up writing little languages all the time. But you only have to look at most examples of this sort of thing to realise how far ahead Lisp is: I’m currently having to deal with a system whose configuration files are in a mutant version of Windows ini file syntax, with a preprocessor which is entirely unaware of that syntax, and an entire other language which lives in strings in the base language. The preprocessor does not know about the string syntax so it pokes down into this inner language as well. I’d like to say that Greenspun’s tenth law applies, but that would imply a level of sophistication entirely missing in this horrible thing: all I want to do is leave this job and never think about it again. 

  3. Macros were often used to inline code in the days of primitive compilers of course, but that’s a long time ago now. 

  4. I may move it somewhere more permanent in due course, so bookmark this at your peril.