Fragments: Posts tagged 'lisp'

<h1>Symbol nicknames: a broken toy (2023-10-12, Tim Bradshaw)</h1>
<p><a href="https://github.com/tfeb/symbol-nicknames">Symbol nicknames</a> allows multiple names to refer to the same symbol in supported implementations of Common Lisp. That may or may not be useful.</p>
<p>People often say the Common Lisp package system is deficient. But a lot of the same people write code which is absolutely full of explicit package prefixes in what I can only suppose is an attempt to make programs harder to read. Somehow this is meant to be made better by using package-local nicknames for packages. And let’s not mention the unspeakable idiocy that is thinking that a package name like, say, <code>XML</code> is suitable for any kind of general use at all. So forgive me if I don’t take their concerns too seriously.</p>
<p>The CL package system can’t do all the things something like the Racket module system can do. But it’s not clear that, given its job of collecting symbols into, well, packages, it could do that much more than it currently does. Probably some kind of ‘package universe’ notion such as Symbolics Genera had would be useful. But the namespace has to be anchored <em>somewhere</em>, and if you’re willing to give packages domain-structured names in the obvious way <em>and</em> spend time actually constructing a namespace for the language you want to use, it’s perfectly pleasant in my experience.</p>
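<p>For example (the names here are purely illustrative), a domain-structured package name, and a package constructing the namespace for the language you actually want to write in, might look like:</p>
<pre class="brush: lisp"><code>(defpackage :org.tfeb.example.pattern-matcher
  (:use :cl)
  (:export #:match))

(defpackage :org.tfeb.example.my-language
  ;; construct the namespace you want to program in
  (:use :cl :org.tfeb.example.pattern-matcher))</code></pre>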
<p>One thing that <em>might</em> be useful is to allow multiple names to refer to the same symbol. So for instance you might want to have <code>eq?</code> be the same symbol as <code>eq</code>:</p>
<pre class="brush: lisp"><code>> (setf (nickname-symbol "EQ?") 'eq)
eq
> (eq 'eq? 'eq)
t
> (eq? 'eq 'eq?)
t</code></pre>
<p>This allows you to construct languages which have different names for things, but where the names are translated to the underlying name efficiently. As another example, let’s say you wanted to call <code>eql</code> <code>equivalent-p</code>:</p>
<pre class="brush: lisp"><code>> (setf (nickname-symbol "EQUIVALENT-P") 'eql)
eql
> (eql 'eql 'equivalent-p)
t</code></pre>
<p>Well, now you can use <code>equivalent-p</code> as a synonym for <code>eql</code> <em>wherever</em> it occurs:</p>
<pre class="brush: lisp"><code>> (defmethod foo ((x (equivalent-p 1)))
"x is 1")
#<standard-method foo nil ((eql 1)) 801005BD23>
> (foo 1)
"x is 1"</code></pre>
<p>Symbol nicknames is not completely portable as it requires hooking string-to-symbol lookup. It is supported in LispWorks and SBCL currently: it will load in other Lisps but will complain that it can’t infect them.</p>
<p>Symbol nicknames is also not completely compatible with CL. In CL you can assume that <code>(find-symbol "FOO")</code> either returns a symbol whose name is <code>"FOO"</code>, or returns <code>nil</code> and <code>nil</code>: with symbol nicknames you can’t. In the case where a nickname link has been followed, the second value of <code>find-symbol</code> will be <code>:nickname</code>.</p>
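<p>For instance, with the <code>EQ?</code> nickname established as above, a session might look like this (a sketch, not a transcript from a real implementation):</p>
<pre class="brush: lisp"><code>> (find-symbol "EQ")
eq
:inherited
> (find-symbol "EQ?")
eq
:nickname</code></pre>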
<p>Symbol nicknames is a toy. I am not convinced that the idea is even useful, and if it is it probably needs to be thought about more than I have.</p>
<p>But it exists.</p>

<h1>A horrible solution (2023-05-04, Tim Bradshaw)</h1>
<p><a href="https://www.tfeb.org/fragments/2023/05/03/two-sides-to-hygiene/">Yesterday</a> I wrote an article describing one of the ways traditional Lisp macros can be unhygienic even when they appear to be hygienic. Here’s a horrible solution to that.</p>
<p>The problem I described is that the expansion of a macro can refer to the values (usually the function values) of names, which the <em>user</em> of the macro can bind, causing the macro to fail. So, given a function</p>
<pre class="brush: lisp"><code>(defun call-with-foo (thunk)
...
(funcall thunk))</code></pre>
<p>Then the macro layer on top of it</p>
<pre class="brush: lisp"><code>(defmacro with-foo (&body forms)
`(call-with-foo (lambda () ,@forms)))</code></pre>
<p>is not hygienic so long as local functions named <code>call-with-foo</code> are allowed:</p>
<pre class="brush: lisp"><code>(flet ((call-with-foo (...) ...))
(with-foo ...))</code></pre>
<p>The <em>sensible</em> solution to this is to say, just as the standard does about symbols in the <code>CL</code> package, that you are not allowed to do that.</p>
<p>Here’s another solution:</p>
<pre class="brush: lisp"><code>(defmacro with-foo (&body forms)
`(funcall (symbol-function 'call-with-foo) (lambda () ,@forms)))</code></pre>
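<p>With this version, the local binding from the earlier example no longer captures the call. Assuming the global <code>call-with-foo</code> ultimately returns the value of its thunk, a session might go like this (a sketch):</p>
<pre class="brush: lisp"><code>> (flet ((call-with-foo (thunk)
           (declare (ignore thunk))
           86))
    (with-foo 1))
1</code></pre>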
<p>This is robust against anything short of top-level redefinition of <code>call-with-foo</code>. And you can be mostly robust even against that:</p>
<pre class="brush: lisp"><code>(defmacro with-foo (&body forms)
`(funcall (load-time-value (symbol-function 'call-with-foo))
(lambda () ,@forms)))</code></pre>
<p>This still isn’t safe against really malignant users, since the load time of the macro’s definition and its uses are not generally the same. But it’s probably fairly good.</p>
<p>I hope I never feel I have to use techniques like this.</p>

<h1>Two sides to hygiene (2023-05-03, Tim Bradshaw)</h1>
<p>It’s tempting to think that by being sufficiently careful about names bound by traditional Lisp macros you can write macros which are hygienic. This is not true: it’s much harder than that.</p>
<h2 id="hygienic-macros">Hygienic macros</h2>
<p>I do not fully understand all the problems which <a href="https://en.wikipedia.org/wiki/Hygienic_macro">Scheme-style hygienic macros</a> try to solve, and the implementation of the solutions is usually sufficiently difficult to understand that I have always been put off doing so, especially as the details of the implementation in <a href="https://racket-lang.org/">Racket</a>, the Scheme-related language I use most, seems to <a href="https://users.cs.utah.edu/plt/scope-sets/">change every few years</a>. I’m happy enough that I am mostly competent to <em>write</em> the macros I need in Racket, without understanding the details of the implementation.</p>
<p>Traditional Lisp macros are, to me, far more appealing because they work in such an explicit and simple way: you could pretty easily write a macroexpander which did most of what the Common Lisp macroexpander does, for instance. I have written several toy versions of such a thing: I’m sure most Lisp people have. Traditional Lisp macros are just functions between bits of language expressed explicitly as s-expressions: what could be simpler?</p>
<p>In fact I am reasonably confident that, if I had to choose one, I’d choose CL’s macros over Racket’s: writing macros in raw CL is a bit annoying because you need explicit gensyms and you need to do pattern matching yourself. But you can write, and I <a href="https://tfeb.org/fragments/2022/09/26/metatronic-macros/">have</a> <a href="https://tfeb.org/fragments/2022/07/21/two-simple-pattern-matchers-for-common-lisp/">written</a> tools to make most of this go away. With these, writing macros in CL can often be very pleasant. And it’s easy to understand what is going on.</p>
<p>What is far harder though, is to make it completely hygienic. Here’s one reason why.</p>
<h2 id="several-versions-of-a-macro-in-common-lisp">Several versions of a macro in Common Lisp</h2>
<p>Let’s imagine I want a macro which allows you to select actions based on the interval a real number is in. It might look like this:</p>
<pre class="brush: lisp"><code>(interval-case x
((0 1) ...)
(((1) 2) ...)
(otherwise ...))</code></pre>
<p>Here intervals are specified the way they are in type specifiers for reals:</p>
<ul>
<li><code>(a b)</code> where <code>a</code> and <code>b</code> are reals means \([a,b]\);</li>
<li><code>((a) b)</code> where <code>a</code> and <code>b</code> are reals means \((a,b]\);</li>
<li>and so on.</li></ul>
<p>There can be only one interval per clause, for simplicity.</p>
<p>I will write several versions of this macro. For all of them I will use <a href="https://tfeb.github.io/#destructuring-match-for-common-lisp">dsm</a> and, later, <a href="https://tfeb.github.io/tfeb-lisp-hax/#metatronic-macros">metatronic macros</a> to make things better.</p>
<p>First of all here’s a function<sup><a href="#2023-05-03-two-sides-to-hygiene-footnote-1-definition" name="2023-05-03-two-sides-to-hygiene-footnote-1-return">1</a></sup> which, given an interval specification, returns a form which will match numbers in that interval:</p>
<pre class="brush: lisp"><code>(defun compute-interval-form (v iv)
(destructuring-match iv
(((l) (h))
(:when (and (realp l) (realp h)))
`(< ,l ,v ,h))
((l (h))
(:when (and (realp l) (realp h)))
`(and (<= ,l ,v) (< ,v ,h)))
(((l) h)
(:when (and (realp l) (realp h)))
`(and (< ,l ,v) (<= ,v ,h)))
((l h)
(:when (and (realp l) (realp h)))
`(<= ,l ,v ,h))
(default
(:when (member default '(t otherwise)))
t)
(otherwise
(error "~S is not an interval designator" iv))))</code></pre>
<h3 id="a-hopeless-version">A hopeless version</h3>
<p>Here is a version of this macro which is entirely hopeless:</p>
<pre class="brush: lisp"><code>(defmacro interval-case (n &body clauses)
;; Hopeless
`(cond
,@(mapcar (lambda (clause)
(destructuring-bind (iv &body forms) clause
`(,(compute-interval-form n iv) ,@forms)))
clauses)))</code></pre>
<p>It’s hopeless because of this:</p>
<pre class="brush: lisp"><code>> (let ((x 1))
(interval-case (incf x)
((1 (2)) '(1 (2)))
((2 (3)) '(2 (3)))))
nil</code></pre>
<p>So <code>(incf x)</code> where <code>x</code> is initially <code>1</code> is apparently neither in \([1,2)\) nor \([2,3)\) which is strange. This is happening, of course, because the macro is multiply-evaluating its argument, which it should not do.</p>
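<p>You can see the multiple evaluation directly in the expansion (a sketch: the exact expansion will vary in detail):</p>
<pre class="brush: lisp"><code>> (macroexpand-1 '(interval-case (incf x)
                    ((1 (2)) '(1 (2)))
                    ((2 (3)) '(2 (3)))))
(cond ((and (<= 1 (incf x)) (< (incf x) 2)) '(1 (2)))
      ((and (<= 2 (incf x)) (< (incf x) 3)) '(2 (3))))</code></pre>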
<h3 id="an-obviously-unhygienic-repair">An obviously unhygienic repair</h3>
<p>So let’s try to fix that:</p>
<pre class="brush: lisp"><code>(defmacro interval-case (n &body clauses)
;; Unhygienic
`(let ((v ,n))
(cond
,@(mapcar (lambda (clause)
(destructuring-bind (iv &body forms) clause
`(,(compute-interval-form 'v iv) ,@forms)))
clauses))))</code></pre>
<p>Well, this is better:</p>
<pre class="brush: lisp"><code>> (let ((x 1))
(interval-case (incf x)
((1 (2)) '(1 (2)))
((2 (3)) '(2 (3)))))
(2 (3))</code></pre>
<p>but … not much better:</p>
<pre class="brush: lisp"><code>> (let ((x 1) (v 10))
(interval-case (incf x)
((1 (2)) nil)
((2 (3)) v)))
2</code></pre>
<p>The macro binds <code>v</code>, which shadows the outer binding of <code>v</code> and breaks everything.</p>
<h3 id="a-repair-which-might-be-hygienic">A repair which might be hygienic</h3>
<p>Here is the normal way to fix that:</p>
<pre class="brush: lisp"><code>(defmacro interval-case (n &body clauses)
;; OK
(let ((vn (make-symbol "V")))
`(let ((,vn ,n))
(cond
,@(mapcar (lambda (clause)
(destructuring-bind (iv &body forms) clause
`(,(compute-interval-form vn iv) ,@forms)))
clauses)))))</code></pre>
<p>And now</p>
<pre class="brush: lisp"><code>> (let ((x 1) (v 10))
(interval-case (incf x)
((1 (2)) nil)
((2 (3)) v)))
10</code></pre>
<p>Good. I think it is possible to argue that this version of the macro is hygienic, at least in terms of names.</p>
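<p>The expansion now binds an uninterned symbol which user code has no way of referring to (again a sketch: the printed form of the gensym will vary by implementation):</p>
<pre class="brush: lisp"><code>> (macroexpand-1 '(interval-case (incf x)
                    ((1 (2)) nil)
                    ((2 (3)) v)))
(let ((#:v (incf x)))
  (cond ((and (<= 1 #:v) (< #:v 2)) nil)
        ((and (<= 2 #:v) (< #:v 3)) v)))</code></pre>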
<h3 id="a-simpler-repair-using-metatronic-macros">A simpler repair using metatronic macros</h3>
<p>Here is the previous macro written using metatronic macros:</p>
<pre class="brush: lisp"><code>(defmacro/m interval-case (n &body clauses)
;; OK, easier
`(let ((<v> ,n))
(cond
,@(mapcar (lambda (clause)
(destructuring-bind (iv &body forms) clause
`(,(compute-interval-form '<v> iv) ,@forms)))
clauses))))</code></pre>
<p>This is simpler to read and should be as good.</p>
<h3 id="an-alternative-approach-">An alternative approach …</h3>
<p>Although it is not entirely natural in the case of this macro, many macros can be written by having the macro expand into a call to a function, passing another function whose body is the body of the macro as an argument. These things often exist as pairs of <code>with-</code>* (the macro) and <code>call-with-</code>* (the function).</p>
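<p>A minimal instance of this pattern (the names are made up for illustration) is:</p>
<pre class="brush: lisp"><code>(defun call-with-logging (thunk)
  ;; the function layer: all the real work lives here
  (format *trace-output* "~&entering~%")
  (unwind-protect (funcall thunk)
    (format *trace-output* "~&leaving~%")))

(defmacro with-logging (&body forms)
  ;; the macro layer: purely syntactic sugar over the function
  `(call-with-logging (lambda () ,@forms)))</code></pre>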
<p>We can persuade <code>interval-case</code> to work like that: it’s not a natural macro to write that way, and writing it that way will end up with something almost certainly less efficient, since (at least the way I’ve written it) it needs to interpret the interval specifications at runtime rather than compiling them<sup><a href="#2023-05-03-two-sides-to-hygiene-footnote-2-definition" name="2023-05-03-two-sides-to-hygiene-footnote-2-return">2</a></sup>. But I wanted to have just one example.</p>
<p>Here is <code>call/intervals</code>, the function layer:</p>
<pre class="brush: lisp"><code>(defun call/intervals (n ivs/thunks)
;; Given a real n and a list of (interval-spec thunk ...), find the
;; first spec that n matches and call its thunk.
(if (null ivs/thunks)
nil
(destructuring-bind (iv thunk . more) ivs/thunks
(if (destructuring-match iv
(((l) (h))
(:when (and (realp l) (realp h)))
(< l n h))
((l (h))
(:when (and (realp l) (realp h)))
(and (<= l n) (< n h)))
(((l) h)
(:when (and (realp l) (realp h)))
(and (< l n) (<= n h)))
((l h)
(:when (and (realp l) (realp h)))
(<= l n h))
(default
(:when (member default '(t otherwise)))
t)
(otherwise
(error "~S is not an interval designator" iv)))
(funcall thunk)
(call/intervals n more)))))</code></pre>
<p>As well, here is a ‘nospread’ variation on <code>call/intervals</code> which serves as an impedance matcher:</p>
<pre class="brush: lisp"><code>(defun call/intervals* (n &rest ivs/thunks)
;; Impedance matcher
(declare (dynamic-extent ivs/thunks))
(call/intervals n ivs/thunks))</code></pre>
<p>Now here’s the macro layer:</p>
<pre class="brush: lisp"><code>(defmacro interval-case (n &body clauses)
;; Purports to be hygienic
`(call/intervals*
,n
,@(mapcan (lambda (clause)
`(',(car clause)
(lambda () ,@(cdr clause))))
clauses)))</code></pre>
<p>So we can test this:</p>
<pre class="brush: lisp"><code>> (let ((x 1) (v 10))
(interval-case (incf x)
((1 (2)) nil)
((2 (3)) v)))
10</code></pre>
<p>So, OK, that’s good, right? This is another hygienic macro. Not so fast.</p>
<h3 id="which-is-not-hygienic">… which is not hygienic</h3>
<pre class="brush: lisp"><code>> (flet ((call/intervals* (&rest junk)
(declare (ignore junk))
86))
(interval-case 2
((1 2) 'two)))
86</code></pre>
<p>Not so hygienic, then.</p>
<h3 id="the-alternative-approach-in-racket">The alternative approach in Racket</h3>
<p>Here is a similar alternative approach implemented in Racket:</p>
<pre class="brush: racket"><code>(define (call/intervals n ivs/thunks)
;; Here ivs/thunks is a list of (iv thunk) pairs, which is not the same
;; as the CL version: that's because I can't work out how to do the
;; syntax rule otherwise.
(match ivs/thunks
['() #f]
[(list (list iv thunk) more ...)
(if
(match iv
[(list (list (? real? l))
(list (? real? h)))
(< l n h)]
[(list (? real? l)
(list (? real? h)))
(and (<= l n) (< n h))]
[(list (list (? real? l))
(? real? h))
(and (< l n) (<= n h))]
[(list (? real? l) (? real? h))
(<= l n h)]
[(or 'otherwise #t)
#t]
[_
(error 'call/intervals "~S is not an interval designator" iv)])
(thunk)
(call/intervals n more))]))
(define (call/intervals* n . ivs/thunks)
;; impedance matcher (not so useful here)
(call/intervals n ivs/thunks))
(define-syntax-rule (interval-case n (key body ...) ...)
(call/intervals* n (list 'key (thunk body ...)) ...))</code></pre>
<p>And now:</p>
<pre class="brush: racket"><code>> (call/intervals* 1 (list '(0 1) (thunk 3)))
3
> (interval-case 2
((1 2) 'two))
'two
> (let ([call/intervals* (thunk* 86)])
(interval-case 2
((1 2) 'two)))
'two
> (let ([call/intervals* (thunk* 86)])
(call/intervals* 2))
86</code></pre>
<p>In Racket this macro is hygienic.</p>
<h2 id="two-sides-to-hygiene">Two sides to hygiene</h2>
<p>So the problem here is that there are at least <em>two sides to hygiene</em> for macros:</p>
<ul>
<li>names they use, usually by binding variables but also in other ways, must not interfere with names used in the program where the macro is used;</li>
<li>the program where the macro is used must not be able to alter what names the macro <em>refers to</em> mean.</li></ul>
<p>In both cases, of course, there need to be exceptions which are part of the macro’s contract with its users: <code>with-standard-io-syntax</code> is allowed (and indeed required) to bind <code>*print-case*</code> and many other variables.</p>
<p>I think almost everyone understands the first of these problems, but the second is much less often thought about.</p>
<h2 id="dealing-with-this-problem-in-common-lisp">Dealing with this problem in Common Lisp</h2>
<p>I think a full solution to this problem in CL would be very difficult: macros would have to refer to the names they rely on by names which were somehow unutterable by the programs that used them. Short of actually writing a fully-fledged hygienic macro system for CL this sounds impractical.</p>
<p>In practice the solution is to essentially extend what CL already does. For symbols (so, names) in the CL package there are <a href="http://www.lispworks.com/documentation/HyperSpec/Body/11_aba.htm">strong restrictions</a> on what conforming programs may do. This program is not legal CL<sup><a href="#2023-05-03-two-sides-to-hygiene-footnote-3-definition" name="2023-05-03-two-sides-to-hygiene-footnote-3-return">3</a></sup> for instance:</p>
<pre class="brush: lisp"><code>(flet ((car (x) x))
... (car ...))</code></pre>
<p>So the best answer is then, I think, to:</p>
<ul>
<li>use packages with well-defined interfaces in the form of exported symbols;</li>
<li>disallow or strongly discourage the use of internal symbols of packages by programs which are not part of the implementation of the package;</li>
<li>and finally place restrictions similar to those placed on the CL package on <em>exported</em> symbols of your packages.</li></ul>
<p>Note that package <em>locks</em> don’t answer this problem: they usually forbid the modification of various attributes of symbols and the creation or deletion of symbols, but what is needed is considerably stronger than that: it needs to be the case that you can’t establish any kind of binding, even a lexical one, for symbols in the package.</p>
<p>Is this a problem in practice? Probably not often. Do I still prefer traditional Lisp macros? Yes, I think so.</p>
<hr />
<div class="footnotes">
<ol>
<li id="2023-05-03-two-sides-to-hygiene-footnote-1-definition" class="footnote-definition">
<p>This function is what you would want to make more complicated to allow multiple intervals per clause. <a href="#2023-05-03-two-sides-to-hygiene-footnote-1-return">↩</a></p></li>
<li id="2023-05-03-two-sides-to-hygiene-footnote-2-definition" class="footnote-definition">
<p>This interpretation could be avoided by having the compiler turn the interval specifications into one-argument functions. I think it’s still not a natural way to write this macro. <a href="#2023-05-03-two-sides-to-hygiene-footnote-2-return">↩</a></p></li>
<li id="2023-05-03-two-sides-to-hygiene-footnote-3-definition" class="footnote-definition">
<p>Assuming that <code>car</code> means ‘the symbol whose name is <code>"CAR"</code> in the <code>"COMMON-LISP"</code> package’. <a href="#2023-05-03-two-sides-to-hygiene-footnote-3-return">↩</a></p></li></ol></div>

<h1>Nirvana (2023-05-02, Tim Bradshaw)</h1>
<p>An article constructed from several emails from my friend Zyni, reproduced with her permission. Note that Zyni’s first language is not English.</p>
<p>Many people have tried to answer what is so special about Lisp by talking about many things.</p>
<p>Such as interactive development, a thing common now to many languages of course, and if you use Racket with DrRacket not in fact how development usually works there at all. Are we to cast Racket into the outer darkness?<sup><a href="#2023-05-02-nirvana-footnote-1-definition" name="2023-05-02-nirvana-footnote-1-return">1</a></sup></p>
<p>Such as CLOS, a thing specific to Common Lisp: can you not achieve Lisp enlightenment unless you program in Common Lisp? Was Lisp enlightenment impossible before CLOS existed? What stupid ideas. Could you implement CLOS in a language which was not Lisp? Certainly you could.</p>
<p>Such as the CL condition system: a thing also specific to Common Lisp. Something also which could be implemented in any sufficiently dynamic language. Something almost nobody who writes in Common Lisp understands I think.</p>
<p>And so it goes on.</p>
<p>None of this is the answer. None of this is close to the answer. To find the answer ask <em>why</em> did these things arise in Lisp first? What is the property of Lisp which is in fact unique to Lisp and which <em>defines</em> Lisp in strict sense that if any other language had this property <em>it would be a Lisp</em>? To see answer to this you must understand <a href="https://www.tfeb.org/fragments/2022/10/03/bradshaw-s-laws/" title="Bradshaw's law">Bradshaw’s law</a> and my corollary to it:</p>
<p><strong>Bradshaw’s law.</strong> <em>All sufficiently large software systems end up being programming languages.</em></p>
<p><strong>Zyni’s corollary.</strong> <em>At whatever size you think Bradshaw’s law applies, it applies sooner than that.</em></p>
<p>This means that <em>all programming is language construction</em>.<sup><a href="#2023-05-02-nirvana-footnote-2-definition" name="2023-05-02-nirvana-footnote-2-return">2</a></sup> When you write a program you are writing a language in which to express the problem you wish to solve.</p>
<p>Now you can begin understand what is so interesting about Lisp. In almost all programming languages when you solve a problem you define a lot of new words for the language you have, and perhaps you define elaborate classifications of the nouns of the language you will allow. But you can do nothing with the structure of the language you must use because the language will not allow that: it has a fixed grammar handed down by the great and good who designed it who are sometimes not fools. And indeed you are fiercely discouraged from even understanding what it is you are doing: discouraged from understanding that you are building a new language.</p>
<p>And quite soon (sooner than you think and in fact immediately) you find you must actually have new structure, new <em>grammar</em>. But you cannot do this easily both because the language you use does not allow it and also because you do not know what it is you are doing – you do not realise that you are making a language. So probably you use a templating system or something and build an awful horror. Often this horror will have nested languages where inner languages appear in strings in outer languages. Often it will have evaluation rules so obscure and inconsistent that it is impossible for humans to write safe large programs in this language (Unix shells: I look at you). We have all seen these things.</p>
<p>And so you live out your life crawling in the dirt, never understanding what thing it is of which you are making a very bad, very unsafe, very ugly version. Because you have been taught there is only mud so all you do is pile up structures out of mud, to be washed away by the next rain. A little way over is a tribe who knows only straw and they build structures from straw which blow away in the first wind. You hate them; they hate you. Sometimes you have little wars.</p>
<p>What, on the other hand, do you do in Lisp? Well, few days ago I needed a way to express the idea of searching some (very) large structure and being able to fail in a structured way. So after ten minutes work, my program now says things like this:</p>
<pre class="brush: lisp"><code>(defun big-serch-thing (thing)
(attempting
(quick-and-dirty thing)
(try-harder thing)))
(defun try-harder (thing)
(walking-thing (node thing :level 0)
(attempting
(first-pass thing)
(desperate-fallback thing))))
(defun first-pass (thing)
...
(when doom (fail))
...)</code></pre>
<p>Well it does not matter what this does and this is not what my program is actually like, but what is clear just by looking is that <em>this language is not Common Lisp</em>. Instead it is Common Lisp extended with at least two new grammatical constructs: <code>attempting</code> with its friend <code>fail</code> which looks like a verb but in fact is a control construct really, and <code>walking-thing</code> which is some kind of new iteration construct perhaps.</p>
<p>And there is more: when you look at <code>attempting</code> you will find it is implemented (by a function which) uses a construct called <code>looping</code> which is <em>another</em> extension to Common Lisp. And similarly for <code>walking-thing</code> (which is not really called that) which uses I think four separate new grammatical constructs I do not remember.</p>
<p>And there is more: when I started this essay these constructs were mostly as I showed above, but we have decided this was wrong, so the new language is now somewhat different and somewhat richer. A few more tens of minutes of work, most of it altering the existing programs in the old language to use the new language. The new language is even defined using a language-extending construct which itself is an extension to CL’s provided ones.</p>
<p>And this is how you program in Lisp. <em>In Lisp, writing programs is building languages</em>: in Lisp to solve a problem is to first build a language in which the problem may be solved. And because doing this is so easy in Lisp, this is what you do even for very small problems: you incrementally extend the grammar of the language — not just its lexicon — to create a language in which to describe the problem.</p>
<p>Well, this is not surprising, is it? This is what the laws imply: programming <em>is</em> constructing languages, and this applies even for very small programs. What is surprising is that so few languages encourage this. And because they do not we end up with the horror we all know. Perhaps even this is not surprising: any language which supports this well will have all the characteristics of Lisp, will in fact <em>be</em> a Lisp. So no other languages do this because to do it requires being Lisp. So why is Lisp not more popular? Well, answer is fairly easy but this is discussion for another day, I think.</p>
<p>And now we see why Lisp got features first: because it could. Let us say you wish to explore an object system in Lisp. Well, perhaps you will want a class-defining construct, so you write a macro, <code>define-class</code> or something. And you wish to be able to send messages, so you write a <code>send</code> function and then you modify the readtable so <code>[o message ...]</code> is <code>(send o message ...)</code>. And perhaps you wish some new binding construct for fields so you write <code>with-fields</code> and so, and so.</p>
<p>And now you have a new language. If you were careful you may even have constructed that new language inside a single running Lisp image. And this took, perhaps, some hours. And later, you decide that no, you wish your new language to be different, so you change it. Another few hours. Eventually, in a different world, you call this part of the language ZLOS and there is a standard.</p>
<p>And this is why these linguistic innovations happen in Lisp: because Lisp is a machine for linguistic innovation. It is <em>that</em> feature of Lisp which makes it interesting, and it is <em>only</em> that feature: both because all other features derive from that one and because to have that feature is to be Lisp.</p>
<p>That is all.</p>
<hr />
<div class="footnotes">
<ol>
<li id="2023-05-02-nirvana-footnote-1-definition" class="footnote-definition">
<p>Do not answer this or I will kill you with a stale loaf of bread. <a href="#2023-05-02-nirvana-footnote-1-return">↩</a></p></li>
<li id="2023-05-02-nirvana-footnote-2-definition" class="footnote-definition">
<p>This is exaggeration: if you define <em>no</em> names in your program you are, perhaps, not constructing a language. <a href="#2023-05-02-nirvana-footnote-2-return">↩</a></p></li></ol></div>

<h1>Something unclear in the Common Lisp standard (2023-04-18, Tim Bradshaw)</h1>
<p>There is what I think is a confusion as to bound declarations in the Common Lisp standard. I may be wrong about this, but I think I’m correct.</p>
<h2 id="bound-and-free-declarations">Bound and free declarations</h2>
<p><a href="http://www.lispworks.com/documentation/HyperSpec/Body/03_c.htm">Declarations</a> in Common Lisp can be either <a href="http://www.lispworks.com/documentation/HyperSpec/Body/03_cd.htm">bound or free</a>:</p>
<ul>
<li>a <strong>bound</strong> declaration appears at the head of a binding form and applies to a variable or function binding made by that form;</li>
<li>a <strong>free</strong> declaration is any declaration which is not bound.</li></ul>
<p>There are declarations which do not apply to bindings, such as <code>optimize</code>: these are always free.</p>
<h2 id="examples-of-bound-and-free-declarations">Examples of bound and free declarations</h2>
<p>In the form</p>
<pre class="brush: lisp"><code>(let ((x 1))
(declare (type integer x))
...)</code></pre>
<p>the declaration is bound and applies to the binding of <code>x</code>. In the form</p>
<pre class="brush: lisp"><code>(let ((/x/ 1))
(declare (special /x/)
(optimize (speed 3)))
...)</code></pre>
<p>the <code>special</code> declaration is bound and applies to the binding of <code>/x/</code>, while the <code>optimize</code> declaration is free.</p>
<p>In the form</p>
<pre class="brush: lisp"><code>(let ((x 1))
(locally
(declare (type integer x)
(optimize speed))
...)
...)</code></pre>
<p>Both declarations are free and apply only to the body of the <code>locally</code> form.</p>
<h2 id="declarations-which-may-not-be-ignored">Declarations which may not be ignored</h2>
<p>Most declarations may be ignored by the implementation: this is the case for all type declarations, for instance. Two may not be:</p>
<ul>
<li><code>notinline</code> forbids inline compilation of the functions it names;</li>
<li><code>special</code> requires dynamic bindings to be made when it is bound, and requires references to be to dynamic, not lexical, bindings when it is free.</li></ul>
<p>I’m going to exploit the non-ignorability of <code>special</code> declarations to show a case where the confusion arises.</p>
<h2 id="the-confusion">The confusion</h2>
<p>Forms like <a href="http://www.lispworks.com/documentation/HyperSpec/Body/s_let_l.htm"><code>let*</code></a> bind <em>sequentially</em>:</p>
<pre class="brush: lisp"><code>(let* ((x 1) (y x))
...)</code></pre>
<p>first binds <code>x</code> and then binds <code>y</code> to the value of <code>x</code>. Now, I am not sure that the standard ever says this, but all implementations I have tried take this to mean that <em>the same name can be bound several times by <code>let*</code></em>:</p>
<pre class="brush: lisp"><code>(let* ((x 1) (x x))
...)</code></pre>
<p>is legal, if stylistically awful. That’s because the obvious transformation of <code>let*</code> into nested <code>let</code>s turns this into:</p>
<pre class="brush: lisp"><code>(let ((x 1))
(let ((x x))
...))</code></pre>
<p>which is clearly fine.</p>
<p>So now we come to the problem: what should this mean?</p>
<pre class="brush: lisp"><code>(let* ((x 1) (x x))
(declare (type fixnum x))
...)</code></pre>
<p>Which binding of <code>x</code> does the declaration apply to? The standard does not say. In this case it might not matter, because this declaration can be ignored, but here is a case where it <em>does</em> matter:</p>
<pre class="brush: lisp"><code>(let (c)
  (let* ((/x/ 1)
         (/x/ (progn
                (setf c (lambda () /x/))
                2)))
    (declare (special /x/))
    (values c (lambda () /x/))))</code></pre>
<p>This expression returns two values, both of which are functions:</p>
<ul>
<li>if the first <code>/x/</code> is special then calling the first function will result in an error;</li>
<li>if the second <code>/x/</code> is special then calling the second function will result in an error.</li></ul>
<p>So using this trick you can tell whether the first binding, the second binding, or both bindings are affected by the <code>special</code> declaration.</p>
<p>And, again, the standard does not say which binding is affected, or whether both should be. And implementations differ. Given the following file</p>
<pre class="brush: lisp"><code>(in-package :cl-user)

(defun call-ok-p (f)
  (multiple-value-bind (v c)
      (ignore-errors
        (funcall f)
        t)
    (declare (ignore c))
    v))

(defun ts ()
  (multiple-value-bind (one two)
      (let (c)
        (let* ((/x/ 1)
               (/x/ (progn
                      (setf c (lambda () /x/))
                      2)))
          (declare (special /x/))
          (values c (lambda () /x/))))
    (values (call-ok-p one)
            (call-ok-p two))))

(multiple-value-bind (first-lexical second-lexical) (ts)
  (format t "~&first ~:[special~;lexical~]~%~
             second ~:[special~;lexical~]~%"
          first-lexical second-lexical))</code></pre>
<p><strong>SBCL</strong></p>
<pre><code>first lexical
second special</code></pre>
<p><strong>CCL</strong></p>
<pre><code>first special
second special</code></pre>
<p><strong>LispWorks</strong></p>
<pre><code>first special
second special</code></pre>
<h2 id="what-should-the-answer-be">What should the answer be?</h2>
<p>I think that the interpretation taken by CCL and LispWorks is better: in forms like this declarations should apply to <em>all</em> the bindings made by the form. An alternative answer is that the declarations should apply to the <em>visible</em> bindings at the point of the declaration, which is the approach taken by SBCL.</p>
<p>It’s tempting to say that the obvious rewrite of <code>let*</code> as nested <code>let</code>s gives you the SBCL answer, but it does not. In a form like</p>
<pre class="brush: lisp"><code>(let* ((x 3) (y x))
  (declare (type integer x)
           (type (integer 0) y))
  ...)</code></pre>
<p>this must be rewritten as</p>
<pre class="brush: lisp"><code>(let ((x 3))
  (declare (type integer x))
  (let ((y x))
    (declare (type (integer 0) y))
    ...))</code></pre>
<p>So the declaration for <code>x</code> must be raised out of the inner <code>let</code> so that it remains a bound declaration of <code>x</code>: the implementation already has to do work to get declarations in the right place and can’t just naïvely rewrite the form.</p>
<p>I prefer the first interpretation both because I think it represents what people are likely to want more closely, and because I think the standard could be interpreted as meaning that without being rewritten.</p>
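<p>If the ambiguity matters in your own code, a simple way to sidestep it entirely is not to rebind the same name in a single <code>let*</code>: with explicitly nested <code>let</code>s each declaration has exactly one binding it can apply to. A sketch (this is my illustration, not code from any implementation):</p>
<pre class="brush: lisp"><code>;; Instead of (let* ((x 1) (x ...)) (declare (special x)) ...),
;; nest the LETs yourself so the SPECIAL declaration unambiguously
;; applies to the outer binding only:
(let ((x 1))
  (declare (special x))   ; this binding is definitely special
  (let ((x x))            ; a fresh, purely lexical binding of X
    (lambda () x)))       ; closes over the lexical X: no ambiguity</code></pre>
<p>This works because a local <code>special</code> declaration, unlike a <code>special</code> proclamation, is not pervasive: the inner binding of <code>x</code> is lexical unless declared otherwise.</p>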
<h2 id="does-this-matter">Does this matter?</h2>
<p>Probably only in very obscure cases! I just thought it was interesting.</p>
<hr />
<p>Thanks to various people on the Lisp-HUG mailing list for coming up with this.</p>
<h1>Measuring some tree-traversing functions</h1>
<p>2023-03-26 · Tim Bradshaw</p>
<p>In a <a href="https://www.tfeb.org/fragments/2023/03/13/variations-on-a-theme/" title="Variations on a theme">previous article</a> my friend Zyni wrote some variations on a list-flattening function, some of which were ‘recursive’ and some of which ‘iterative’, managing the stack explicitly. We thought it would be interesting to see what the performance differences were, both for this function and a more useful variant which searches a tree rather than flattening it.</p>
<!-- more-->
<h2 id="what-we-measured">What we measured</h2>
<p>The code we used is <a href="https://github.com/tfeb/zyni-flatten" title="sample code">here</a><sup><a href="#2023-03-26-measuring-some-tree-traversing-functions-footnote-1-definition" name="2023-03-26-measuring-some-tree-traversing-functions-footnote-1-return">1</a></sup>. We measured four variations of each of two functions.</p>
<h3 id="list-flattening">List flattening</h3>
<p>All these functions use <a href="https://tfeb.github.io/tfeb-lisp-hax/#collecting-lists-forwards-and-accumulating-collecting" title="collecting"><code>collecting</code></a> to build their results forwards. They live in <a href="https://github.com/tfeb/zyni-flatten/blob/main/flatten-variants.lisp" title="flatten-variants.lisp"><code>flatten-variants.lisp</code></a>.</p>
<ul>
<li><code>flatten/implicit-stack</code> works in the obvious recursive way, with an implicit stack. This uses <a href="https://tfeb.github.io/tfeb-lisp-hax/#applicative-iteration-iterate" title="iterate"><code>iterate</code></a> to express the local recursive function.</li>
<li><code>flatten/explicit-stack</code> uses an explicit stack (called <code>agenda</code> in the code) represented as a vector, and uses <a href="https://tfeb.github.io/tfeb-lisp-hax/#decomposing-iteration-simple-loops" title="looping"><code>looping</code></a> to express iteration.</li>
<li><code>flatten/explicit-stack/adja</code> is like the previous function but it is willing to extend the explicit stack, which it does by using <code>adjust-array</code> and assignment.</li>
<li><code>flatten/explicit-stack/adjb</code> is like <code>flatten/explicit-stack/adja</code> but uses a local tail-recursive function to <em>bind</em> the extended stack rather than assignment.</li>
<li>Finally <code>flatten/consy-stack</code> is very close to Zyni’s original iterative solution: it represents the stack as a list. This version necessarily conses fairly copiously.</li></ul>
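<p>For concreteness, here is a hedged sketch of the consy-stack shape (the repository’s functions differ in detail and use <code>collecting</code> and <code>looping</code>; the name here is invented):</p>
<pre class="brush: lisp"><code>;; Sketch only: the agenda is an ordinary list, so every push conses.
(defun flatten/consy-stack-sketch (o)
  (let ((agenda (list o))
        (results '()))
    (loop until (null agenda)
          do (let ((this (pop agenda)))
               (typecase this
                 (null)                          ; discard empty tails
                 (cons (push (cdr this) agenda)  ; cdr below car, so the
                       (push (car this) agenda)) ; car is processed first
                 (t (push this results)))))
    (nreverse results)))                         ; built backwards</code></pre>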
<h3 id="searching-cons-trees">Searching cons trees</h3>
<p>These functions, in <a href="https://github.com/tfeb/zyni-flatten/blob/main/treesearch-variants.lisp" title="treesearch-variants.lisp"><code>treesearch-variants.lisp</code></a>, correspond to the flattening variants, except they are searching for some atomic value in the tree of conses:</p>
<ul>
<li><code>search/implicit-stack</code> uses an implicit stack;</li>
<li><code>search/explicit-stack</code> uses a vector;</li>
<li><code>search/explicit-stack/adja</code> uses a vector and adjusts by assignment;</li>
<li><code>search/explicit-stack/adjb</code> uses a vector and adjusts by binding;</li>
<li><code>search/consy-stack</code> uses a consy stack.</li></ul>
<h3 id="notes-on-the-code">Notes on the code</h3>
<p>The functions all have <code>(declare (optimize (speed 3)))</code> but specifically <em>don’t</em> turn off safety or use implementation-specific settings: we wanted to test code we felt we’d be happy running, and that means code compiled with reasonable settings for safety: if you turn safety off you’re brave, foolish, or both.</p>
<p>We did not compare <code>looping</code> with <code>do</code> or <code>loop</code>: we probably should. However the expansion of <code>looping</code> is pretty straightforward:</p>
<pre class="brush: lisp"><code>(looping ((this o) (depth 0))
  (declare ...)
  ...)</code></pre>
<p>Turns into</p>
<pre class="brush: lisp"><code>(let ((this o) (depth 0))
  (declare ...)
  (block nil
    (tagbody
      #:start
      (multiple-value-setq (this depth) ...)
      (go #:start))))</code></pre>
<p>The only real question here, we think, is whether <code>multiple-value-setq</code> is compiled well: brief inspection implies it is. We should probably still compare the current version with more ‘native CL’ variants.</p>
<p>The variants which use a vector as a stack maintain the current element themselves: that’s because we tested using a fill pointer and <code>vector-push</code> / <code>vector-pop</code> and it was really significantly slower in both implementations.</p>
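<p>As a rough illustration of what maintaining the index yourself looks like, here is a sketch under the assumption that the agenda never overflows (the real code in <code>treesearch-variants.lisp</code> is more careful, and the adjustable variants grow the vector):</p>
<pre class="brush: lisp"><code>(defun treesearch/vector-stack-sketch (tree value &key (size 1000))
  ;; N is maintained by hand rather than via a fill pointer, which the
  ;; benchmarks found to be significantly faster.
  (let ((agenda (make-array size))
        (n 0))
    (setf (aref agenda n) tree)
    (incf n)
    (loop until (zerop n)
          do (decf n)
             (let ((this (aref agenda n)))
               (typecase this
                 (null)
                 (cons
                  (setf (aref agenda n) (cdr this)
                        (aref agenda (1+ n)) (car this))
                  (incf n 2))
                 (t (when (eql this value)
                      (return-from treesearch/vector-stack-sketch t))))))
    nil))</code></pre>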
<h2 id="what-we-did">What we did</h2>
<h3 id="the-lisp-implementations-we-used">The Lisp implementations we used</h3>
<p>We used LispWorks 8.0 and very recent SBCL builds, compiled from the <code>master</code> branch no more than a few days before we ran the tests in mid March 2023.</p>
<p>In the case of SBCL we paid attention to notes and warnings during compilation. The significant one we did <em>not</em> address was that it complained vociferously about not being able to optimize calls to <code>eql</code>: that’s because we don’t know the type of the thing we are searching for: it <em>needs</em> to do the work it is trying to avoid. Apart from this the only warnings were about the computation of the new length of the agenda, which never actually happens in the tests we ran.</p>
<h3 id="the-machines-we-benchmarked-on">The machines we benchmarked on</h3>
<p>We both have M1-based Macbook Airs so this is what we used. In particular we have not run any benchmarks on x64.</p>
<h3 id="what-we-ran">What we ran</h3>
<p><code>make-car-cdr</code>, in <a href="https://github.com/tfeb/zyni-flatten/blob/main/common.lisp" title="common.lisp"><code>common.lisp</code></a>, makes a list where each element is a chain linked by cars, finally terminating in a specified element. Controlling the length of the list and the depth of the chains gives the functions more iterative or more recursive work to do respectively. The benchmarking code then made a series of suitable structures of increasing size and timed many iterations of each function on the same structure, computing the time per call. We then wrote a program in Racket to plot the results on axes of ‘breadth’ (length of the list) and ‘depth’ (depth of the car-linked chain). For the search functions the element being searched for was not in the tree so they had to do as much work as possible.</p>
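<p>The real <code>make-car-cdr</code> is in the repository; as a hedged sketch of the shape of structure it builds (the name and details here are assumed, not copied):</p>
<pre class="brush: lisp"><code>;; Sketch: a list of BREADTH elements, each a chain of DEPTH conses
;; linked through their cars, bottoming out in ELEMENT.
(defun make-car-cdr-sketch (breadth depth element)
  (loop repeat breadth
        collect (let ((chain element))
                  (loop repeat depth
                        do (setf chain (cons chain nil)))
                  chain)))</code></pre>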
<p>Life was usually arranged so that the initial agenda was big enough for the functions which used a vector as the agenda, so none of that aspect of them was tested, except for one case below. Apart from that case, the ‘vector stack’ timings refer to <code>flatten/explicit-stack</code> and <code>treesearch/explicit-stack</code>, not the adjustable-stack variants.</p>
<h2 id="some-results">Some results</h2>
<p>We timed 1,000 iterations of each call, for list lengths (breadth in the plots and below) from 30 to 1,000 in steps of 10 and depths (depth in the plots and below) from 10 to 300 in steps of 10, computing times in μs per iteration. Neither of us knows anything about how data like this should be best presented but simply plotting the performance surfaces seemed reasonable. We used bilinear interpolation to make the surface from the points<sup><a href="#2023-03-26-measuring-some-tree-traversing-functions-footnote-2-definition" name="2023-03-26-measuring-some-tree-traversing-functions-footnote-2-return">2</a></sup>.</p>
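<p>For reference, bilinear interpolation here is the standard construction: within a grid cell with corners \((x_0, y_0)\), \((x_1, y_0)\), \((x_0, y_1)\), \((x_1, y_1)\) and measured times \(f_{ij}\) at those corners, the interpolated surface is</p>
<p>\[
f(x, y) \approx (1-u)(1-v)\,f_{00} + u(1-v)\,f_{10} + (1-u)v\,f_{01} + uv\,f_{11},
\qquad u = \frac{x - x_0}{x_1 - x_0},\quad v = \frac{y - y_0}{y_1 - y_0}.
\]</p>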
<h3 id="lispworks">LispWorks</h3>
<div class="figure"><img src="/fragments/img/2023/zyni-flatten/lw-treesearch-implicit-vector.svg" alt="Treesearch: implicit compared with vector stack" />
<p class="caption">Treesearch: implicit compared with vector stack</p></div>
<p>This is nicely linear in both breadth and depth, and so quadratic in breadth \(\times\) depth. And it’s easy to see that for LW using the implicit stack is faster than the manually-managed stack.</p>
<div class="figure"><img src="/fragments/img/2023/zyni-flatten/lw-treesearch-vector-consy.svg" alt="Treesearch: vector stack compared with consy stack" />
<p class="caption">Treesearch: vector stack compared with consy stack</p></div>
<p>This compares the vector stack with the consy stack, for treesearch. The consy stack is slightly faster which surprised us. This conses a list as long as the depth of the tree for each ‘leftward’ branch, and then immediately unwinds that and throws the whole list away. So it creates significant garbage, but the allocation and garbage collection overhead together is still faster than using a vector. Consing really is (almost) free.</p>
<div class="figure"><img src="/fragments/img/2023/zyni-flatten/lw-treesearch-flatten.svg" alt="Treesearch compared with flatten, both with implicit stacks" />
<p class="caption">Treesearch compared with flatten, both with implicit stacks</p></div>
<p>Here is more evidence that consing is very cheap: the difference between treesearch (which does not cons) and flatten (which does) is tiny.</p>
<h3 id="sbcl">SBCL</h3>
<div class="figure"><img src="/fragments/img/2023/zyni-flatten/sbcl-treesearch-implicit-vector.svg" alt="Treesearch: implicit compared with vector stack" />
<p class="caption">Treesearch: implicit compared with vector stack</p></div>
<p>So here is SBCL. For SBCL explicitly managing the stack as a vector is significantly faster than the implicit stack. Something that is also apparent here is how variable SBCL’s timings are compared with LW’s: we don’t know why that is although we suspect it might be because SBCL’s garbage collector is more intrusive than LW’s. We also don’t know whether this variation is repeatable, or whether it’s due to a single very slow run or something like that.</p>
<div class="figure"><img src="/fragments/img/2023/zyni-flatten/sbcl-treesearch-vector-consy.svg" alt="Treesearch: vector stack compared with consy stack" />
<p class="caption">Treesearch: vector stack compared with consy stack</p></div>
<p>For SBCL the consy stack is significantly slower than the vector stack, so for SBCL the vector stack is the fastest.</p>
<div class="figure"><img src="/fragments/img/2023/zyni-flatten/sbcl-treesearch-flatten.svg" alt="Treesearch compared with flatten, both with implicit stacks" />
<p class="caption">Treesearch compared with flatten, both with implicit stacks</p></div>
<p>SBCL has a slightly larger difference between treesearch and flatten, with flatten being slower. There are also curious ‘waves’ in the plot as depth increases.</p>
<h3 id="lispworks-compared-with-sbcl">LispWorks compared with SBCL</h3>
<div class="figure"><img src="/fragments/img/2023/zyni-flatten/lw-sbcl-treesearch-implicit.svg" alt="Treesearch: SBCL compared with Lispworks, implicit stacks" />
<p class="caption">Treesearch: SBCL compared with Lispworks, implicit stacks</p></div>
<p>LW is significantly faster than SBCL for implicit stacks except for very small depths.</p>
<div class="figure"><img src="/fragments/img/2023/zyni-flatten/lw-sbcl-treesearch-best.svg" alt="Treesearch: SBCL compared with Lispworks, best stacks" />
<p class="caption">Treesearch: SBCL compared with Lispworks, best stacks</p></div>
<p>This compares LW using an implicit stack with SBCL using an explicit vector stack. The difference is pretty small now.</p>
<div class="figure"><img src="/fragments/img/2023/zyni-flatten/lw-sbcl-flatten-consy.svg" alt="Flatten: SBCL compared with Lispworks, consy stacks" />
<p class="caption">Flatten: SBCL compared with Lispworks, consy stacks</p></div>
<p>This was meant to be the worst-case for both: flattening and a consy stack. But it’s not particularly informative, I think.</p>
<h3 id="the-outer-reaches-lispworks-with-a-deep-tree">The outer reaches: LispWorks with a deep tree</h3>
<p>We did one run with the maximum depth set to 10,000 with a step of 500, and maximum breadth set to 1,000 with a step of 100, averaged over 100 iterations instead of 1,000. This is too deep for LW’s stack, but LW allows stack extension, and we wrote what later became <a href="https://github.com/tfeb/tfeb-lisp-implementation-hax/blob/main/lw/modules/allowing-stack-extensions.lisp">this</a> to extend the stack as required. Note that this happens only during the first recursion into the left-hand branch of the tree so has minimal effect on performance. This also used <code>search/explicit-stack/adjb</code> for the vector stack.</p>
<div class="figure"><img src="/fragments/img/2023/zyni-flatten/lw-treesearch-implicit-vector-deep.svg" alt="Treesearch: implicit compared with consy stack, deep tree" />
<p class="caption">Treesearch: implicit compared with consy stack, deep tree</p></div>
<p>As before the implicit stack is much better for LW. This is much more bumpy than LW was for smaller depths: this might have been because the machine did other things while it was running, but we don’t think so.</p>
<h2 id="some-conclusions">Some conclusions</h2>
<p>None of the differences were really large. In particular there’s no enormous advantage from managing the stack yourself.</p>
<p>Consing and the resulting garbage-collection does really seem to be very cheap, especially in LispWorks: the days of long GC pauses are long gone.</p>
<p>We were surprised that LispWorks was fairly reliably faster than SBCL: surprised enough that we ran everything several times to be sure. It’s also interesting how much smoother LW’s performance surface is in most cases.</p>
<p>It is possible that our implementations just suck, of course.</p>
<p>Mostly it’s just some pretty pictures.</p>
<hr />
<div class="footnotes">
<ol>
<li id="2023-03-26-measuring-some-tree-traversing-functions-footnote-1-definition" class="footnote-definition">
<p>All of the functions should be portable CL. Some of the mechanism for expressing dependencies and loading things is not. However it should be easy for anyone to run this if they wish to. <a href="#2023-03-26-measuring-some-tree-traversing-functions-footnote-1-return">↩</a></p></li>
<li id="2023-03-26-measuring-some-tree-traversing-functions-footnote-2-definition" class="footnote-definition">
<p>Getting the bilinear interpolation right took longer than anything else, and perhaps longer than everything else put together. <a href="#2023-03-26-measuring-some-tree-traversing-functions-footnote-2-return">↩</a></p></li></ol></div>
<h1>The absurdity of stacks</h1>
<p>2023-03-25 · Tim Bradshaw</p>
<p>Very often people regard the stack as a scarce, expensive resource, while the heap is plentiful and very cheap. This is absurd: the stack is memory, the heap is also memory. Deforming programs so they are ‘iterative’ in order that they do not run out of the stack we imagine to be so costly is ridiculous: if you have a program which is inherently recursive, let it be recursive.</p>
<!-- more-->
<p>In a <a href="https://www.tfeb.org/fragments/2023/03/13/variations-on-a-theme/" title="Variations on a theme">previous article</a> my friend Zyni wrote some variations on a list-flattening function<sup><a href="#2023-03-25-the-absurdity-of-stacks-footnote-1-definition" name="2023-03-25-the-absurdity-of-stacks-footnote-1-return">1</a></sup>, some of which were ‘recursive’ and some of which ‘iterative’. Of course, the ones which claim to be iterative are, in fact, recursive: any procedure which traverses a recursively-defined data structure such as a tree of conses is necessarily recursive. The ‘iterative’ versions just use an explicitly-maintained stack rather than the implicit stack provided by the language. That makes sense only if stack space is very small compared to the heap and must therefore be conserved. And, well, for many systems that’s true. But it is small only because we have administratively decided it should be small: the stack is just memory. If there is plenty of memory for the heap, there is plenty for the stack.</p>
<p>There are, or may be, arguments for why stacks needed to be small on ancient machines. The history is fascinating, but it is not relevant to today’s systems, other than tiny embedded ones. The persistent view of modern machines as giant PDP–11s has been a blight for well over two decades now: it needs to stop.</p>
<p>The argument that the stack should be small often seems to be that, if it’s not, people will write programs which run away. That’s spurious: if such a program is, in fact, iterative, then good compilers will eliminate the tail calls and it will not use stack: a small limit on the stack will not help. If it’s really recursive then why should it run out of storage before its conversion to a program which manages the stack explicitly does? Of course <em>that’s exactly what compilers which do <a href="https://en.wikipedia.org/wiki/Continuation-passing_style?wprov=sfti1" title="continuation-passing style">CPS conversion</a> already do</em>: programs written using compilers which do that won’t have these weird stack limits in the first place. But it should not be necessary to rely on a CPS-converting compiler, or to write in continuation-passing style manually to avoid stack usage: it should be used for other reasons, because the stack is not, in fact, expensive.</p>
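<p>To make the CPS point concrete, here is a hand-converted tree walk (a sketch of mine, not from any of the code discussed): every call becomes a tail call, and what was the control stack becomes a chain of closures in the heap. In an implementation which does not eliminate tail calls this buys nothing, which is exactly the point.</p>
<pre class="brush: lisp"><code>(defun flatten/cps (o)
  ;; Each recursion passes an explicit continuation K instead of
  ;; returning, so all the calls are tail calls.
  (labels ((ftn (x k)
             (typecase x
               (null (funcall k '()))
               (cons (ftn (car x)
                          (lambda (head)
                            (ftn (cdr x)
                                 (lambda (tail)
                                   (funcall k (append head tail)))))))
               (t (funcall k (list x))))))
    (ftn o #'identity)))</code></pre>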
<p>Still less should people feel the need to write programs which explicitly manage a stack except in extraordinary cases.</p>
<p>There need to be <em>some</em> limits on stack size, just as there need to be <em>some</em> limits on heap size, but making the limit on stack size far smaller than the limit on heap size simply encourages people to believe things which aren’t true, and to live in fear of recursive programs.</p>
<hr />
<div class="footnotes">
<ol>
<li id="2023-03-25-the-absurdity-of-stacks-footnote-1-definition" class="footnote-definition">
<p>I still want to know how often functions like this are used in real life. <a href="#2023-03-25-the-absurdity-of-stacks-footnote-1-return">↩</a></p></li></ol></div>
<h1>Variations on a theme</h1>
<p>2023-03-13 · Tim Bradshaw</p>
<p>My friend Zyni wrote a comment to a thread on reddit with some variations on a list-flattening function. We’ve since spent some time thinking about things related to this, which is written up in a following article. Here is her comment so the following article can refer to it. Other than notes at the end the following text is Zyni’s, not mine.</p>
<!-- more-->
<h2 id="httpswwwredditcomrcommonlispcomments11o1wvmcommentjbt9n54utmsourceshareutmmediumweb2xcontext3the-reddit-comment-by-zyni"><a href="https://www.reddit.com/r/Common_Lisp/comments/11o1wvm/comment/jbt9n54/?utm_source=share&utm_medium=web2x&context=3">The reddit comment by Zyni</a></h2>
<p>First of all we all know that CL does not promise to optimize tail recursion: means that tail recursive program may generate recursive not iterative process. So recursive program in CL <em>even if tail recursive</em> is not safe on data of unknown size, assuming stack is limited.</p>
<p>But let us assume as good implementations do that tail recursion is optimized in implementation (no need for general tail calls here but is obvious nice thing if implementations do this). Certainly if we are deploying code in space we know what implementation we use and can check this.</p>
<p>So we look at this supposed wonder of code, which I rewrite slightly to use <a href="https://tfeb.github.io/tfeb-lisp-hax/#applicative-iteration-iterate" title="iterate"><code>iterate</code> macro</a> which is simply Scheme’s named-<code>let</code> to be compatible with later examples:</p>
<pre class="brush: lisp"><code>(defun flatten (o)
  ;; original terrible one
  (iterate ftn ((x o) (accumulator '()))
    (typecase x
      (null accumulator)
      (cons (ftn (car x) (ftn (cdr x) accumulator)))
      (t (cons x accumulator)))))</code></pre>
<p>This … is really bad program. It makes an essential mistake that it wishes to build result forwards but lists wish to be built backwards, so it must therefore recurse (not tail) on cdr of structure first. But most list-based structures have little weight in car but much in cdr, so this will fail <em>even on list which is already flat</em>: <code>(flatten (make-list 100000 :initial-element 1))</code> will fail if your example fails.</p>
<p>Any person presenting this code as good example should be ashamed of self.</p>
<p>So first change: we accept that we must build lists backwards but we change program so that tail call is on cdr not car, and reverse result:</p>
<pre class="brush: lisp"><code>(defun flatten (o)
  ;; not TR but better on usual assumptions
  (nreverse
   (iterate ftn ((x o) (accumulator '()))
     (typecase x
       (null accumulator)
       (cons (ftn (cdr x) (ftn (car x) accumulator)))
       (t (cons x accumulator))))))</code></pre>
<p>This function will be fine on assumption of structures which have most weight in their cdrs, which often is true.</p>
<p>Well, you say, ugly <code>reverse</code>. OK this is easy: we simply add in a <a href="https://tfeb.github.io/tfeb-lisp-hax/#collecting-lists-forwards-and-accumulating-collecting" title="collecting"><code>collecting</code> macro</a> which allows construction of list forwards, implementation is obvious (tail pointer). Now we have done this we can also reorder calls to be more obvious (car call, not TR, is now first):</p>
<pre class="brush: lisp"><code>(defun flatten (o)
  ;; not TR, better on usual assumptions, no reverse
  (collecting
    (iterate ftn ((x o))
      (typecase x
        (cons
         (ftn (car x))
         (ftn (cdr x)))
        (null)
        (t (collect x))))))</code></pre>
<p>This is still not fully TR, so will fail on structures which have much weight in car.</p>
<p>Well, of course, we can deal with this as well: we use explicit agenda to move stack onto heap and turn into pure tail recursive version. First one which builds list backwards in obvious way, therefore needs <code>reverse</code> again:</p>
<pre class="brush: lisp"><code>(defun flatten (o)
  ;; pure TR
  (iterate ftn ((agenda (list o))
                (accumulator '()))
    (if (null agenda)
        ;; can write own reverse as tail recursive of course if wish
        ;; to be pure of heart
        (nreverse accumulator)
      (destructuring-bind (this . more) agenda
        (typecase this
          (null
           (ftn more accumulator))
          (cons
           (ftn (list* (car this) (cdr this) more) accumulator))
          (t
           (ftn more (cons this accumulator))))))))</code></pre>
<p>Assuming implementation optimizes tail recursion this will flatten completely arbitrary structure limited only by memory.</p>
<p>We can avoid this reversery of course:</p>
<pre class="brush: lisp"><code>(defun flatten (o)
  ;; pure TR, no reverse
  (collecting
    (iterate ftn ((agenda (list o)))
      (when (not (null agenda))
        (destructuring-bind (this . more) agenda
          (typecase this
            (null
             (ftn more))
            (cons
             (ftn (list* (car this) (cdr this) more)))
            (t
             (collect this)
             (ftn more))))))))</code></pre>
<p>As before this is limited only by memory assuming implementation optimizes tail calls.</p>
<hr />
<p>Well, I have written Lisp for only couple of years really (but have maths background). But even I can see that this idea of having to put scary label on recursive function is very bad. Instead people using such code should perhaps <em>read it and understand it</em> to see what its problems and advantages are. Radical idea, I know.</p>
<p>Finally idea that stack space is scarce may or may not be true. Example, if we rewrite original version in Racket (first Lisp I used before being lured to dark side):</p>
<pre class="brush: lisp"><code>(define (flatten o)
  (let ftn ([x o] [accumulator '()])
    (cond
      [(null? x) accumulator]
      [(cons? x) (ftn (car x) (ftn (cdr x) accumulator))]
      [else (cons x accumulator)])))</code></pre>
<p>This will happily ‘flatten’ 100,000 element list and is only limited by memory available because Racket does not treat stack same way.</p>
<hr />
<p>Finally here is variant of final version using <a href="https://tfeb.github.io/tfeb-lisp-hax/#decomposing-iteration-simple-loops" title="simple loops"><code>looping</code> macro</a> which does applicative iteration: this is iterative, on any implementation:</p>
<pre class="brush: lisp"><code>(defun flatten (o)
  ;; Iterative
  (collecting
    (looping ((agenda (list o)))
      (when (null agenda)
        (return))
      (destructuring-bind (this . more) agenda
        (typecase this
          (null more)
          (cons (list* (car this) (cdr this) more))
          (t (collect this) more))))))</code></pre>
<p><code>looping</code> part of this turns into:</p>
<pre class="brush: lisp"><code>(let ((agenda (list o)))
  (block nil
    (tagbody
      #:start (setq agenda
                    (progn
                      (when (null agenda) (return))
                      (destructuring-bind (this . more) agenda
                        (typecase this
                          (null more)
                          (cons (list* (car this) (cdr this) more))
                          (t (collect this) more)))))
      (go #:start))))</code></pre>
<p>which is iterative.</p>
<p>I think <code>iterate</code> one is nicer.</p>
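<p>(As an aside: the ‘obvious’ tail-pointer implementation of <code>collecting</code> alluded to above can be sketched as follows; the real macro, linked above, is more elaborate.)</p>
<pre class="brush: lisp"><code>(defmacro collecting-sketch (&body body)
  ;; HEAD is a dummy cons; TAIL always points at the last cons, so
  ;; COLLECT appends in O(1) and the list is built forwards.
  (let ((head (gensym "HEAD")) (tail (gensym "TAIL")))
    `(let* ((,head (cons nil nil))
            (,tail ,head))
       (flet ((collect (it)
                (setf (cdr ,tail) (cons it nil)
                      ,tail (cdr ,tail))
                it))
         ,@body
         (cdr ,head)))))</code></pre>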
<hr />
<h2 id="notes-from-tim">Notes from Tim</h2>
<p>English is Zyni’s third language: she wanted me to fix up the above but I refused as I find the way she writes so charming.</p>
<p>Both of us would like to know how often <code>flatten</code> is actually used: everyone seems to be very keen on it, but we can’t think of any cases where we’ve ever wanted it or anything very much like it.</p>
<p>All of the macros referenced are ‘mine’ in a somewhat loose sense: They’re all published by me, and some of them are mine, some of them were mine but have been made much better by Zyni, some of them are really hers. There are generally comments in the code. Zyni refuses to have anything but a very minimal internet presence for reasons I used to think were absurd but no longer do: you can’t be too careful when your parents and by extension you might be on the wrong side of Putin.</p>
<p>Zyni is not her real name, obviously.</p>
<h1>Two tiny Lisp evaluators</h1>
<p>2023-02-27 · Tim Bradshaw</p>
<p>Everyone who has written Lisp has written tiny Lisp evaluators in Lisp: here are two more.</p>
<!-- more-->
<p>Following two <a href="https://tfeb.org/fragments/2023/02/22/how-to-understand-closures-in-common-lisp/">recent</a> <a href="https://tfeb.org/fragments/2023/02/27/dynamic-binding-without-special-in-common-lisp/">articles</a> I wrote on scope and extent in Common Lisp, I thought I would finish with two very tiny evaluators for dynamically and lexically bound variants on a tiny Lisp.</p>
<h2 id="the-language">The language</h2>
<p>The tiny Lisp these evaluators interpret is not minimal: it has constructs other than <code>lambda</code>, and even has assignment. But it is pretty small. Other than the binding rules the languages are identical.</p>
<ul>
<li><strong><code>λ</code></strong> & <strong><code>lambda</code></strong> are synonyms and construct procedures, which can take any number of arguments;</li>
<li><strong><code>quote</code></strong> quotes its argument;</li>
<li><strong><code>if</code></strong> is a conditional expression (the else part is optional);</li>
<li><strong><code>set!</code></strong> is assignment and mutates a binding.</li></ul>
<p>That is all that exists.</p>
<p>Both evaluators understand primitives, which are usually just functions in the underlying Lisp: since the languages are Lisp–1s, you could also expose other sorts of things of course (for instance true and false values). You can provide a list of initial bindings to them to define useful primitives.</p>
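<p>For instance (a sketch, assuming the alist representation of bindings used in the code below), the dynamic evaluator can be called like this, which should yield <code>3</code>:</p>
<pre class="brush: lisp"><code>;; + is exposed as a primitive under the name +: EVALUATE will look it
;; up in the bindings alist and APPLY it to the evaluated arguments.
(evaluate '((λ (x y) (+ x y)) 1 2)
          (list (cons '+ #'+)))</code></pre>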
<h2 id="requirements">Requirements</h2>
<p>Both evaluators rely on my <a href="https://tfeb.github.io/tfeb-lisp-hax/#applicative-iteration-iterate">iterate</a> and <a href="https://tfeb.github.io/tfeb-lisp-hax/#simple-pattern-matching-spam">spam</a> hacks: they could easily be rewritten not to do so.</p>
<h2 id="the-dynamic-evaluator">The dynamic evaluator</h2>
<p>A procedure is represented by a structure which has a list of formals and a body of one or more forms.</p>
<pre class="brush: lisp"><code>(defstruct (procedure
            (:print-function
             (lambda (p s d)
               (declare (ignore d))
               (print-unreadable-object (p s)
                 (format s "λ ~S" (procedure-formals p))))))
  (formals '())
  (body '()))</code></pre>
<p>The evaluator simply dispatches on the type of thing and then on the operator for compound forms.</p>
<pre class="brush: lisp"><code>(defun evaluate (thing bindings)
  (typecase thing
    (symbol
     (let ((found (assoc thing bindings)))
       (unless found
         (error "~S unbound" thing))
       (cdr found)))
    (list
     (destructuring-bind (op . arguments) thing
       (case op
         ((lambda λ)
          (matching arguments
            ((head-matches (list-of #'symbolp))
             (make-procedure :formals (first arguments)
                             :body (rest arguments)))
            (otherwise
             (error "bad lambda form ~S" thing))))
         ((quote)
          (matching arguments
            ((list-matches (any))
             (first arguments))
            (otherwise
             (error "bad quote form ~S" thing))))
         ((if)
          (matching arguments
            ((list-matches (any) (any))
             (if (evaluate (first arguments) bindings)
                 (evaluate (second arguments) bindings)))
            ((list-matches (any) (any) (any))
             (if (evaluate (first arguments) bindings)
                 (evaluate (second arguments) bindings)
               (evaluate (third arguments) bindings)))
            (otherwise
             (error "bad if form ~S" thing))))
         ((set!)
          (matching arguments
            ((list-matches #'symbolp (any))
             (let ((found (assoc (first arguments) bindings)))
               (unless found
                 (error "~S unbound" (first arguments)))
               (setf (cdr found) (evaluate (second arguments) bindings))))
            (otherwise
             (error "bad set! form ~S" thing))))
         (t
          (applicate (evaluate (first thing) bindings)
                     (mapcar (lambda (form)
                               (evaluate form bindings))
                             (rest thing))
                     bindings)))))
    (t thing)))</code></pre>
<p>The interesting thing here is that <code>applicate</code> needs to know the current set of bindings so it can extend them dynamically.</p>
<p>Here is <code>applicate</code>, which has a case for primitives and one for procedures:</p>
<pre class="brush: lisp"><code>(defun applicate (thing arguments bindings)
(etypecase thing
(function
;; a primitive
(apply thing arguments))
(procedure
(iterate bind ((vtail (procedure-formals thing))
(atail arguments)
(extended-bindings bindings))
(cond
((and (null vtail) (null atail))
(iterate eval-body ((btail (procedure-body thing)))
(if (null (rest btail))
(evaluate (first btail) extended-bindings)
(progn
(evaluate (first btail) extended-bindings)
(eval-body (rest btail))))))
((null vtail)
(error "too many arguments"))
((null atail)
(error "not enough arguments"))
(t
(bind (rest vtail)
(rest atail)
(acons (first vtail) (first atail)
extended-bindings))))))))</code></pre>
<p>The thing that makes this evaluator dynamic is that the bindings that <code>applicate</code> extends are those it was given: procedures do not remember bindings.</p>
<h2 id="the-lexical-evaluator">The lexical evaluator</h2>
<p>A procedure is represented by a structure as before, but this time it has a set of bindings associated with it: the bindings in place when it was created.</p>
<pre class="brush: lisp"><code>(defstruct (procedure
(:print-function
(lambda (p s d)
(declare (ignore d))
(print-unreadable-object (p s)
(format s "λ ~S" (procedure-formals p))))))
(formals '())
(body '())
(bindings '()))</code></pre>
<p>The evaluator is almost identical:</p>
<pre class="brush: lisp"><code>(defun evaluate (thing bindings)
(typecase thing
(symbol
(let ((found (assoc thing bindings)))
(unless found
(error "~S unbound" thing))
(cdr found)))
(list
(destructuring-bind (op . arguments) thing
(case op
((lambda λ)
(matching arguments
((head-matches (list-of #'symbolp))
(make-procedure :formals (first arguments)
:body (rest arguments)
:bindings bindings))
(otherwise
(error "bad lambda form ~S" thing))))
((quote)
(matching arguments
((list-matches (any))
(first arguments))
(otherwise
(error "bad quote form ~S" thing))))
((if)
(matching arguments
((list-matches (any) (any))
(if (evaluate (first arguments) bindings)
(evaluate (second arguments) bindings)))
((list-matches (any) (any) (any))
(if (evaluate (first arguments) bindings)
(evaluate (second arguments) bindings)
(evaluate (third arguments) bindings)))
(otherwise
(error "bad if form ~S" thing))))
((set!)
(matching arguments
((list-matches #'symbolp (any))
(let ((found (assoc (first arguments) bindings)))
(unless found
(error "~S unbound" (first arguments)))
(setf (cdr found) (evaluate (second arguments) bindings))))
(otherwise
(error "bad set! form ~S" thing))))
(t
(applicate (evaluate (first thing) bindings)
(mapcar (lambda (form)
(evaluate form bindings))
(rest thing)))))))
(t thing)))</code></pre>
<p>The differences are that when constructing a procedure the current bindings are recorded in the procedure, and it is no longer necessary to pass bindings to <code>applicate</code>.</p>
<p><code>applicate</code> is also almost identical:</p>
<pre class="brush: lisp"><code>(defun applicate (thing arguments)
(etypecase thing
(function
;; a primitive
(apply thing arguments))
(procedure
(iterate bind ((vtail (procedure-formals thing))
(atail arguments)
(extended-bindings (procedure-bindings thing)))
(cond
((and (null vtail) (null atail))
(iterate eval-body ((btail (procedure-body thing)))
(if (null (rest btail))
(evaluate (first btail) extended-bindings)
(progn
(evaluate (first btail) extended-bindings)
(eval-body (rest btail))))))
((null vtail)
(error "too many arguments"))
((null atail)
(error "not enough arguments"))
(t
(bind (rest vtail)
(rest atail)
(acons (first vtail) (first atail)
extended-bindings))))))))</code></pre>
<p>The difference is that the bindings it extends when binding arguments are the bindings which the procedure remembered, not the dynamically-current bindings, which it does not even know about.</p>
<h2 id="the-difference-between-them">The difference between them</h2>
<p>Here is the example that shows how these two evaluators differ.</p>
<p>With the dynamic evaluator:</p>
<pre class="brush: lisp"><code>? ((λ (f)
((λ (x)
;; bind x to 1 around the call to f
(f))
1))
((λ (x)
;; bind x to 2 when the function that will be f is created
(λ () x))
2))
1</code></pre>
<p>The binding in effect is the dynamically current one, not the one that was in effect when the procedure was created.</p>
<p>With the lexical evaluator:</p>
<pre class="brush: lisp"><code>? ((λ (f)
((λ (x)
;; bind x to 1 around the call to f
(f))
1))
((λ (x)
;; bind x to 2 when the function that will be f is created
(λ () x))
2))
2</code></pre>
<p>Now the binding in effect is the one that existed when the procedure was created.</p>
<p>Something more interesting is how you create recursive procedures in the lexical evaluator. With suitable bindings for primitives, it’s easy to see that this can’t work:</p>
<pre class="brush: lisp"><code>((λ (length)
(length '(1 2 3)))
(λ (l)
(if (null? l)
0
(+ (length (cdr l)) 1))))</code></pre>
<p>It can’t work because <code>length</code> is not in scope in the body of <code>length</code>. It <em>will</em> work in the dynamic evaluator.</p>
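<p>For instance, assuming <code>null?</code>, <code>cdr</code> and <code>+</code> are provided as primitives, a sketch of a session with the dynamic evaluator:</p>
<pre class="brush: lisp"><code>? ((λ (length)
     (length '(1 2 3)))
   (λ (l)
     (if (null? l)
         0
         (+ (length (cdr l)) 1))))
3</code></pre>
<p>It works because the binding of <code>length</code> established by the outer application is still dynamically in effect during the recursive calls.</p>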
<p>The first fix, which is similar to what Scheme does with <code>letrec</code>, is to use assignment to mutate the binding so it is correct:</p>
<pre class="brush: lisp"><code>((λ (length)
(set! length (λ (l)
(if (null? l)
0
(+ (length (cdr l)) 1))))
(length '(1 2 3)))
0)</code></pre>
<p>Note that the initial value of <code>length</code> (here <code>0</code>) is never used: by the time <code>length</code> is called, <code>set!</code> has already replaced it with the procedure, and the whole form evaluates to <code>3</code>.</p>
<p>The second fix is to use something like <a href="https://tfeb.org/fragments/2020/03/09/the-u-combinator/">the U combinator</a> (you could use Y of course: I think U is simpler to understand):</p>
<pre class="brush: lisp"><code>((λ (length)
(length '(1 2 3)))
(λ (l)
((λ (c)
(c c l 0))
(λ (c t s)
(if (null? t)
s
(c c (cdr t) (+ s 1)))))))</code></pre>
<h2 id="source-code">Source code</h2>
<p>These two evaluators, together with a rudimentary REPL which can use either of them, can be found <a href="https://github.com/tfeb/tiny-eval">here</a>.</p>
<h1 id="dynamic-binding-without-special-in-common-lisp">Dynamic binding without special in Common Lisp</h1>
<p>Tim Bradshaw, 2023-02-27</p>
<p>In Common Lisp, dynamic bindings and lexical bindings live in the same namespace. They don’t have to.</p>
<!-- more-->
<p>Common Lisp has <a href="https://www.tfeb.org/fragments/2023/02/22/how-to-understand-closures-in-common-lisp/" title="How to understand closures in Common Lisp">two sorts of bindings for variables</a>: lexical binding and dynamic binding. Lexical binding has lexical scope — the binding is available where it is visible in source code — and indefinite extent — the binding is available as long as any code might reference it. Dynamic binding has indefinite scope — the binding is available to any code which runs between when the binding is established and when control leaves the form which established it — and dynamic extent — the binding ceases to exist when control leaves the binding form.</p>
<p>These are really two very different things. However CL places both of these kinds of bindings into the same namespace, relying on <code>special</code> declarations and proclamations to tell the system which sort of binding to create and reference for a given name.</p>
<p>That doesn’t have to be the case: it’s possible in CL to completely isolate these two namespaces from each other. This means you could write code where all variable references were to lexical bindings and where dynamic bindings were created and referenced by a completely different set of operators. Here is an example of that. Following practice in some old Lisps I will call this ‘fluid’ binding. I will also use <code>/</code> to delimit the names of fluid variables simply to distinguish them from normal variables.</p>
<pre class="brush: lisp"><code>(defun inner (varname value)
(setf (fluid-value varname) value))
(defun outer (varname value)
(call/fluid-bindings
(lambda ()
(values
(fluid-value varname)
(progn
(inner varname (1+ value))
(fluid-value varname))))
(list varname)
(list value)))</code></pre>
<p>And now</p>
<pre class="brush: lisp"><code>> (outer '/v/ 1)
1
2</code></pre>
<p>Here is a set of operators for dealing with these fluid variables:</p>
<p><strong><code>fluid-value</code></strong> accesses the value of a fluid variable.</p>
<p><strong><code>fluid-boundp</code></strong> tells you if a name is bound as a fluid variable.</p>
<p><strong><code>call/fluid-bindings</code></strong> calls a function with one or more fluid variables bound.</p>
<p><strong><code>define-fluid</code></strong> (not used above) defines a global value for a fluid variable.</p>
<p>Well, of course you can do something like this using an explicit binding stack and a single special variable to hang it from. But that’s not how this works: these ‘fluid variables’ are just CL’s dynamic variables:</p>
<pre class="brush: lisp"><code>(defun call/print-base (f base)
(call/fluid-bindings f '(*print-base*) (list base)))</code></pre>
<pre class="brush: lisp"><code>> (call/print-base
(lambda ()
*print-base*)
2)
2</code></pre>
<p>So how does this work? Well <code>fluid-value</code> and <code>fluid-boundp</code> are obvious:</p>
<pre class="brush: lisp"><code>(defun fluid-value (s)
(symbol-value s))
(defun (setf fluid-value) (n s)
(setf (symbol-value s) n))
(defun fluid-boundp (s)
(boundp s))</code></pre>
<p>And the trick now is that <em>CL gives you enough mechanism to bind named dynamic variables yourself</em>, that mechanism being <a href="http://www.lispworks.com/documentation/HyperSpec/Body/s_progv.htm" title="progv">progv</a>, which</p>
<blockquote>
<p>[…] allows binding one or more dynamic variables whose names may be determined at run time […]</p></blockquote>
<p>So now <code>call/fluid-bindings</code> just uses <code>progv</code>:</p>
<pre class="brush: lisp"><code>(defun call/fluid-bindings (f fluids values)
(progv fluids values (funcall f)))</code></pre>
<p>And finally <code>define-fluid</code> looks like this:</p>
<pre class="brush: lisp"><code>(defmacro define-fluid (var &optional (value nil)
(doc nil docp))
`(progn
(setf (fluid-value ',var) ,value)
,@(if docp
`((setf (documentation ',var 'variable) ',doc))
'())
',var))</code></pre>
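<p>A sketch of how these pieces fit together (<code>/base/</code> is just a made-up fluid variable name):</p>
<pre class="brush: lisp"><code>> (define-fluid /base/ 10)
/base/
> (call/fluid-bindings
   (lambda ()
     (fluid-value '/base/))
   '(/base/) '(16))
16
> (fluid-value '/base/)
10</code></pre>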
<p>The interesting thing here is that there are no <code>special</code> declarations or proclamations: you can create and bind new fluid variables without any recourse to <code>special</code> at all, in a way which is completely compatible with the existing dynamic variables, because fluid variables <em>are</em> dynamic variables.</p>
<p>So one way of thinking about <code>special</code> is that it is a declaration that says ‘for this variable name, access the namespace of dynamic bindings rather than lexical bindings’. This is not really what <code>special</code> was of course in Lisps before CL — it was historically closer to an instruction to use the interpreter’s variable binding mechanism in compiled code — but you can think of it this way in CL, where the interpreter and compiler do not have separate binding rules.</p>
<p>And, of course, using something like the above, you could write code in CL where all variable bindings were lexical and dynamic variables lived entirely in their own namespace. For instance this works fine:</p>
<pre class="brush: lisp"><code>(defun f ()
(let ((x 2))
(call/fluid-bindings
(lambda ()
(values x (fluid-value 'x)))
'(x) '(3))))</code></pre>
<pre class="brush: lisp"><code>> (f)
2
3</code></pre>
<p>The reference to <code>x</code> as a variable refers to its lexical binding, while <code>(fluid-value 'x)</code> refers to its dynamic binding.</p>
<p>Whether writing code like that would be useful I am not sure: I think that the <code>*</code>-convention for dynamic variables is perfectly fine in fact. But it is perhaps interesting to see that you can think of dynamic bindings in CL this way.</p>
<h1 id="how-to-understand-closures-in-common-lisp">How to understand closures in Common Lisp</h1>
<p>Tim Bradshaw, 2023-02-22</p>
<p>The first rule of understanding closures is that you do not talk about closures. The second rule of understanding closures in Common Lisp is that <em>you do not talk about closures</em>. These are all the rules.</p>
<!-- more-->
<p>There is a lot of elaborate bowing and scraping about closures in the Lisp community. But despite that <em>a closure isn’t actually a thing</em>: the thing people call a closure is just a function which obeys the language’s rules about the scope and extent of bindings. <em>Implementors</em> need to care about closures: users just need to understand the rules for bindings. So rather than obsessing about this magic invisible thing which doesn’t actually exist in the language, I suggest that it is far better simply to think about the rules which cover <em>bindings</em>.</p>
<h2 id="angels-and-pinheads">Angels and pinheads</h2>
<p>It’s easy to see why this has happened: <a href="http://www.lispworks.com/documentation/HyperSpec/Front/index.htm" title="HyperSpec">the CL standard</a> has a lot of discussion of <a href="http://www.lispworks.com/documentation/HyperSpec/Body/26_glo_l.htm#lexical_closure" title="lexical closure">lexical closures</a>, <a href="http://www.lispworks.com/documentation/HyperSpec/Body/26_glo_l.htm#lexical_environment" title="lexical environment">lexical</a> and <a href="http://www.lispworks.com/documentation/HyperSpec/Body/26_glo_d.htm#dynamic_environment" title="dynamic environment">dynamic</a> environments and so on. So it’s tempting to think that this way of thinking about things is ‘the one true way’ because it has been blessed by those who went before us. And indeed CL does have <a href="http://www.lispworks.com/documentation/HyperSpec/Body/03_aad.htm" title="environment objects">objects representing part of the lexical environment</a> which are given to macro functions. Occasionally these are even useful. But there are <em>no</em> objects which represent closures as distinct from functions, and <em>no</em> predicates which tell you if a function is a closure or not in the standard language: closures simply do not exist as objects distinct from functions at all. They were useful, perhaps, as part of the text which <em>defined</em> the language, but they are nowhere to be found in the language itself.</p>
<p>So, with the exception of the environment objects passed to macros, <em>none</em> of these objects exist in the language. They may exist in implementations, and might even be exposed by some implementations, but from the point of view of the language they simply do not exist: if I give you a function object you cannot know if it is a closure or not.</p>
<p>So it is strange that people spend so much time worrying about these objects which, if they even exist in the implementation, can’t be detected by anyone using the standard language. This is worrying about angels and pinheads: wouldn’t it be simpler just to understand what the rules of the language actually say should observably happen? I think it would.</p>
<p>I am not arguing that the terminology used by the standard is wrong! All I am arguing is that, if you think you want to understand closures, you might instead be better off understanding the rules that give rise to them. And when you have done that you may suddenly find that closures have simply vanished into the mist: all you need is the rules.</p>
<h2 id="history">History</h2>
<p>Common Lisp is steeped in history: it is full of traces of the Lisps which went before it. This is intentional: one goal of CL was to enable programs written in those earlier Lisps — which were <em>all</em> Lisps at that time of course — to run without extensive modification.</p>
<p>But one place where CL <em>didn’t</em> steep itself in history is in exactly the areas that you need to understand to understand closures. Before Common Lisp (really, before Scheme), people spent a lot of time writing papers about <a href="https://en.wikipedia.org/wiki/Funarg_problem" title="the funarg problem">the funarg problem</a> and describing and implementing more-or-less complicated ways of resolving it. Then Scheme came along and decided that this was all nonsense and that it could just be made to go away by implementing the language properly. And the Common Lisp designers, who knew about Scheme, said that, well, if Scheme can do this, then we can do this as well, and so they also made the problem vanish, although not in quite such an extreme way as Scheme did.</p>
<p>And this is now ancient history: these predecessor Lisps to CL are all at least 40 years old now. I am, just, old enough to have used some of them when they were current, but for most CL programmers these questions were resolved before they were born. The history is very interesting, but you do not need to steep yourself in it to understand closures.</p>
<h2 id="bindings">Bindings</h2>
<p>So the notion of a closure is part of the history behind CL: a hangover from the time when people worried about the funarg problem; a time before they understood that the whole problem could simply be made to go away. So, again, if you think you want to understand closures, the best approach is to understand something else: to understand <em>bindings</em>. Just as with closures, bindings do not exist as objects in the language, although you <em>can</em> make some enquiries about some kinds of bindings in CL. They are also a concept which exists in many programming languages, not just CL.</p>
<p>A <strong>binding</strong> is an association between a name — a symbol — and something. The most common binding is a variable binding, which is an association between a name and a value. There are other kinds of bindings however: the most obvious kind in CL is a function binding: an association between a name and a function object. And for example within a (possibly implicit) <code>block</code> there is a binding between the name of the block and a point to which you can jump. And there are other kinds of bindings in CL as well, and the set is extensible. <a href="http://www.lispworks.com/documentation/HyperSpec/Body/26_glo_b.htm#binding" title="binding">The CL standard</a> only calls variable bindings ‘bindings’, but I am going to use the term more generally.</p>
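<p>For instance, a single form can establish several of these kinds of binding at once:</p>
<pre class="brush: lisp"><code>(block done                       ; block binding: done names an exit point
  (let ((x 1))                    ; variable binding: x names a value
    (flet ((f (y) (+ x y)))       ; function binding: f names a function
      (return-from done (f 3))))) ; uses all three; evaluates to 4</code></pre>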
<p>Bindings are established by some binding construct and are usually not first-class objects in CL: they are just as vaporous as closures and environments. Nevertheless they are a powerful and useful idea.</p>
<h2 id="what-can-be-bound">What can be bound?</h2>
<p>By far the most common kind of binding is a <strong>variable binding</strong>: an association between a name and a value. However there are other kinds of bindings: associations between names and other things. I’ll mention those briefly at the end, but in everything else that follows it’s safe to assume that ‘binding’ means ‘variable binding’ unless I say otherwise.</p>
<h2 id="scope-and-extent">Scope and extent</h2>
<p>For both variable bindings and other kinds of bindings there are two interesting questions you can ask:</p>
<ul>
<li><em>where</em> is the binding available?</li>
<li><em>when</em> is the binding visible?</li></ul>
<p>The first question is about the <a href="http://www.lispworks.com/documentation/HyperSpec/Body/26_glo_s.htm#scope" title="scope"><strong>scope</strong></a> of the binding. The second is about the <a href="http://www.lispworks.com/documentation/HyperSpec/Body/26_glo_e.htm#extent" title="extent"><strong>extent</strong></a> of the binding.</p>
<p>Each of these questions has (at least) two possible answers giving (at least) four possibilities. CL uses three of these possibilities, and the fourth only in a restricted case: variable bindings use two of them and a restricted version of a third, while some other kinds of bindings use the remaining one.</p>
<p><strong>Scope.</strong> The two options are:</p>
<ul>
<li>the binding may be available only in code where the binding construct is visible;</li>
<li>or the binding may be available during all code which runs between where the binding is established and where it ends, regardless of whether the binding construct is visible.</li></ul>
<p>What does ‘visible’ mean? Well, given some binding form, it means that the bindings it establishes are visible to all the code that is inside that form in the source. So, in a form like <code>(let ((x 1)) ...)</code> the binding of <code>x</code> is visible to the code that replaces the ellipsis, including any code introduced by macroexpansion, and only to that code.</p>
<p><strong>Extent.</strong> The two options are:</p>
<ul>
<li>the binding may exist only during the time that the binding construct is active, and goes away when control leaves it;</li>
<li>or the binding may exist as long as there is any possibility of reference.</li></ul>
<p>Unfortunately the CL standard is, I think, slightly inconsistent in its naming for these options. However I’m going to use the standard’s terms with one exception. Here they are.</p>
<p><strong>Scope</strong>:</p>
<ul>
<li>when a binding is available only when visible this is called <a href="http://www.lispworks.com/documentation/HyperSpec/Body/26_glo_l.htm#lexical_scope" title="lexical scope"><strong>lexical scope</strong></a>;</li>
<li>when a binding is available to all code within the binding construct this is called <a href="http://www.lispworks.com/documentation/HyperSpec/Body/26_glo_i.htm#indefinite_scope" title="indefinite scope"><strong>indefinite scope</strong></a><sup><a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-1-definition" name="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-1-return">1</a></sup>;</li></ul>
<p><strong>Extent</strong>:</p>
<ul>
<li>when a binding ends at the end of the binding form this is called <a href="http://www.lispworks.com/documentation/HyperSpec/Body/26_glo_d.htm#dynamic_extent" title="dynamic extent"><strong>dynamic extent</strong></a><sup><a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-2-definition" name="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-2-return">2</a></sup>;</li>
<li>when a binding is available indefinitely this is called <a href="http://www.lispworks.com/documentation/HyperSpec/Body/26_glo_i.htm#indefinite_extent" title="indefinite extent"><strong>indefinite extent</strong></a>.</li></ul>
<p>The term from the standard I am <em>not</em> going to use is <a href="http://www.lispworks.com/documentation/HyperSpec/Body/26_glo_d.htm#dynamic_scope" title="dynamic scope"><strong>dynamic scope</strong></a>, which it defines to mean the combination of indefinite scope and dynamic extent. I am not going to use this term because I think it is confusing: although it has ‘scope’ in its name it concerns both scope and extent. Instead I will introduce better, commonly used, terms below for the interesting combinations of scope and extent.</p>
<p>The four possibilities for bindings are then:</p>
<ul>
<li>lexical scope and dynamic extent;</li>
<li>lexical scope and indefinite extent;</li>
<li>indefinite scope and dynamic extent;</li>
<li>indefinite scope and indefinite extent.</li></ul>
<h2 id="the-simplest-kind-of-binding">The simplest kind of binding</h2>
<p>So then let’s ask: what is the simplest kind of binding to understand? If you are reading some code and you see a reference to a binding then what choice from the above options will make it easiest for you to understand whether that reference is valid or not?</p>
<p>Well, the first thing is that you’d like to be able to know <em>by looking at the code</em> whether a reference is valid or not. That means that the binding construct should be <em>visible</em> to you, or that the binding should have lexical scope. Compare the following two fragments of code:</p>
<pre class="brush: lisp"><code>(defun simple (x)
...
(+ x 1)
...)</code></pre>
<p>and</p>
<pre class="brush: lisp"><code>(defun confusing ()
...
(+ *x* 1)
...)</code></pre>
<p>Well, in the first one you can tell, just by looking at the code, that the reference to <code>x</code> is valid: the function, when called, establishes a binding of <code>x</code> and you can see that when reading the code. In the second one you just have to assume that the reference to <code>*x*</code> is valid: you can’t tell by reading the code whether it is or not.</p>
<p><strong>Lexical scope</strong> makes it easiest for people reading the code to understand it, and in particular it is easier to understand than indefinite scope. It is the simplest kind of scoping to understand for people reading the code.</p>
<p>So that leaves extent. Well, in the two examples above dynamic or indefinite extent makes no difference to how simple the code is to understand: once the functions return there’s no possibility of reference to the bindings anyway. To expose the difference we need somehow to construct some object which can refer to a binding <em>after the function has returned</em>. We need something like this:</p>
<pre class="brush: lisp"><code>(defun maker (x)
...
<construct object which refers to binding of x>)
(let ((o (maker 1)))
<use o somehow to cause it to reference the binding of x>)</code></pre>
<p>Well, what is this object going to be? What sort of things reference bindings? <em>Code</em> references bindings, and the objects which contain code are <em>functions</em><sup><a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-3-definition" name="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-3-return">3</a></sup>. What we need to do is construct and return a function:</p>
<pre class="brush: lisp"><code>(defun maker (x)
(lambda (y)
(+ x y)))</code></pre>
<p>and then cause this function to reference the binding by calling it:</p>
<pre class="brush: lisp"><code>(let ((f (maker 1)))
(funcall f 2))</code></pre>
<p>So now we can, finally, ask: what is the choice for the <em>extent</em> of the binding of <code>x</code> which makes this code simplest to understand? Well, the answer is that unless the binding of <code>x</code> remains visible to the function that is created in <code>maker</code>, this code <em>can’t work at all</em>. It would have to be the case that it was simply not legal to return functions like this from other functions. Functions, in other words, would not be first-class objects.</p>
<p>Well, OK, that’s a possibility, and it makes the above code simple to understand: it’s not legal and it’s easy to see that it is not. Except consider this small variant on the above:</p>
<pre class="brush: lisp"><code>(defun maybe-maker (x return-identity-p)
(if return-identity-p
#'identity
(lambda (y)
(+ x y))))</code></pre>
<p>There is <em>no way to know</em> from reading this code whether <code>maybe-maker</code> will return the nasty anonymous function or the innocuous <code>identity</code> function. If it is not allowed to return anonymous functions in this way then there is <em>no way of knowing</em> whether</p>
<pre class="brush: lisp"><code>(funcall (maybe-maker 1 (zerop (random 2)))
2)</code></pre>
<p>is correct or not. This is certainly not simple: in fact it is a horrible nightmare. Another way of saying this is that you’d be in a situation where</p>
<pre class="brush: lisp"><code>(let ((a 1))
(funcall (lambda ()
a)))</code></pre>
<p>would work, but</p>
<pre class="brush: lisp"><code>(funcall (let ((a 1))
(lambda ()
a)))</code></pre>
<p>would not. There are languages which work that way: those languages suck.</p>
<p>So what <em>would</em> be simple? What would be simple is to say that if a binding is visible, it is visible, and that’s the end of the story. In a function like <code>maker</code> above the binding of <code>x</code> established by <code>maker</code> is visible to the function that it returns. Therefore <em>it’s visible to the function that <code>maker</code> returns</em>: without any complicated rules or weird special cases. That means the binding must have indefinite extent.</p>
<p><strong>Indefinite extent</strong> makes it easiest for people reading the code to understand it when that code may construct and return functions, and in particular it is easier to understand than dynamic extent, which makes it essentially impossible to tell in many cases whether such code is correct or not.</p>
<p>And that’s it: lexical scope and indefinite extent, which I will call <strong>lexical binding</strong>, is the simplest binding scheme to understand for a language which has first-class functions<sup><a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-4-definition" name="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-4-return">4</a></sup>.</p>
<p>And really <em>that’s it</em>: that’s all you need to understand. Lexical scope and indefinite extent make reading code simple, and entirely explain the things people call ‘closures’ which are, in fact, simply functions which obey these simple rules.</p>
<h2 id="examples-of-the-simple-binding-rules">Examples of the simple binding rules</h2>
<p>One thing I have not mentioned before is that, in CL, bindings are <strong>mutable</strong>, which is another way of saying that CL supports assignment: assignment to variables is mutation of variable bindings. So, as a trivial example:</p>
<pre class="brush: lisp"><code>(defun maximum (list)
(let ((max (first list)))
(dolist (e (rest list) max)
(when (> e max)
(setf max e)))))</code></pre>
<p>This is very easy to understand and does not depend on the binding rules in detail.</p>
<p>But, well, bindings are mutable, so the rules which say they exist as long as they can be referred to also imply they can be mutated as long as they can be referred to: anything else would certainly not be simple. So here’s a classic example of this:</p>
<pre class="brush: lisp"><code>(defun make-incrementor (&optional (value 0))
(lambda (&optional (increment 1))
(prog1 value
(incf value increment))))</code></pre>
<p>And now:</p>
<pre class="brush: lisp"><code>> (let ((i (make-incrementor)))
(print (funcall i))
(print (funcall i))
(print (funcall i -2))
(print (funcall i))
(print (funcall i))
(values))
0
1
2
0
1</code></pre>
<p>As you can see, the function returned by <code>make-incrementor</code> is mutating the binding that it can still see.</p>
<p>What happens when two functions can see the same binding?</p>
<pre class="brush: lisp"><code>(defun make-inc-dec (&optional (value 0))
(values
(lambda ()
(prog1 value
(incf value)))
(lambda ()
(prog1 value
(decf value)))))</code></pre>
<p>And now</p>
<pre class="brush: lisp"><code>> (multiple-value-bind (inc dec) (make-inc-dec)
(print (funcall inc))
(print (funcall inc))
(print (funcall dec))
(print (funcall dec))
(print (funcall inc))
(values))
0
1
2
1
0</code></pre>
<p>Again, what happens is the simplest thing: you can see simply from reading the code that both functions can see the <em>same</em> binding of <code>value</code> and they are therefore both mutating this common binding.</p>
<p>Here is an example which demonstrates all these features: an implementation of a simple queue as a pair of functions which can see two shared bindings:</p>
<pre class="brush: lisp"><code>(defun make-queue ()
  (let ((head '())
        (tail nil))
    (values
     (lambda (thing)
       ;; Push thing onto the queue
       (if (null head)
           ;; It's empty currently so set it up
           (setf head (list thing)
                 tail head)
         ;; not empty: just adjust the tail
         (setf (cdr tail) (list thing)
               tail (cdr tail)))
       thing)
     (lambda ()
       (cond
        ((null head)
         ;; empty
         (values nil nil))
        ((null (cdr head))
         ;; will be empty: don't actually need this case but it is
         ;; cleaner
         (values (prog1 (car head)
                   (setf head '()
                         tail nil))
                 t))
        (t
         ;; will still have content
         (values (pop head) t)))))))</code></pre>
<p><code>make-queue</code> will return two functions:</p>
<ul>
<li>the first takes one argument which it appends to the queue;</li>
<li>the second takes no argument and returns either the next element of the queue and <code>t</code>, or <code>nil</code> and <code>nil</code> if the queue is empty.</li></ul>
<p>So, with this little function to drain the queue</p>
<pre class="brush: lisp"><code>(defun drain-and-print (popper)
  (multiple-value-bind (value fullp) (funcall popper)
    (when fullp
      (print value)
      (drain-and-print popper))
    (values)))</code></pre>
<p>we can see this in action</p>
<pre class="brush: lisp"><code>> (multiple-value-bind (pusher popper) (make-queue)
    (funcall pusher 1)
    (funcall pusher 2)
    (funcall pusher 3)
    (drain-and-print popper))
1
2
3</code></pre>
<h2 id="a-less-simple-kind-of-binding-which-is-sometimes-very-useful">A less-simple kind of binding which is sometimes very useful</h2>
<p>Requiring bindings to be simple usually makes programs easy to read and understand. But it also makes it hard to do some things. One of those things is to control the ‘ambient state’ of a program. A simple example would be the base for printing numbers. It’s quite natural to say that ‘in this region of the program I want numbers printed in hex’.</p>
<p>If all we had was lexical binding then this becomes a nightmare: every function you call in the region you want to cause printing to happen in hex needs to take some extra argument which says ‘print in hex’. And if you then decide that, well, you’d also like some other ambient parameter, you need to provide more arguments to every function<sup><a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-5-definition" name="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-5-return">5</a></sup>. This is just horrible.</p>
<p>You might think you can do this with global variables which you temporarily set: that is both fiddly (better make sure you set it back) and problematic in the presence of multiple threads<sup><a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-6-definition" name="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-6-return">6</a></sup>.</p>
<p>A better approach is to allow <strong>dynamic bindings</strong>: bindings with indefinite scope & dynamic extent. CL has these, and at this point history becomes unavoidable: rather than have some separate construct for dynamic bindings, CL simply says that some variable bindings, and some references to variable bindings, are to be treated as having indefinite scope and dynamic extent, and you tell the system which bindings this applies to with <code>special</code> declarations / proclamations. CL does this because that’s very close to how various predecessor Lisps worked, and so makes porting programs from them to CL much easier. To make this less painful there is a convention that dynamically-bound variable names have <code>*</code>stars<code>*</code> around them, of course.</p>
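<p>As a sketch of what this looks like in practice (the <code>report</code> functions here are invented for illustration): <code>*print-base*</code> is a standard special variable which controls the base in which numbers are printed, and dynamically rebinding it affects everything within the extent of the binding.</p>
<pre class="brush: lisp"><code>(defun report (n)
  ;; this reference to *print-base* is dynamic: it sees whatever
  ;; binding is current when REPORT is called
  (format t "~W~%" n))

(defun report-in-hex (n)
  ;; within the dynamic extent of this binding, any code which refers
  ;; to *print-base* -- including REPORT -- sees the new binding
  (let ((*print-base* 16))
    (report n)))</code></pre>
<p>So <code>(report 255)</code> prints <code>255</code> while <code>(report-in-hex 255)</code> prints <code>FF</code>, without <code>report</code> needing any extra argument.</p>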
<p>Dynamic bindings are so useful that if you don’t have them you really need to invent them: I have on at least two occasions implemented a dynamic binding system in Python, for instance.</p>
<p>However this is not an article on dynamic bindings so I will not write more about them here: perhaps I will write another article later.</p>
<h2 id="what-else-can-be-bound">What else can be bound?</h2>
<p>Variable bindings are by far the most common kind. But not the only kind. Other things can be bound. Here is a partial list<sup><a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-7-definition" name="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-7-return">7</a></sup>:</p>
<ul>
<li><strong>local functions</strong> have lexical scope and indefinite extent;</li>
<li><strong>block names</strong> have lexical scope and definite extent (see below);</li>
<li><strong>tag names</strong> have lexical scope and definite extent (see below);</li>
<li><strong>catch tags</strong> have indefinite scope and definite extent;</li>
<li><strong>condition handlers</strong> have indefinite scope and definite extent;</li>
<li><strong>restarts</strong> have indefinite scope and definite extent.</li></ul>
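<p>To illustrate the first of these: a function defined with <code>labels</code> can refer to its own binding even after it has escaped from the form which established it, just as variable bindings can outlive their establishing form (a made-up example):</p>
<pre class="brush: lisp"><code>(defun make-stepper (step)
  (labels ((stepper (n)
             (if (plusp n)
                 ;; this reference to the binding of STEPPER remains
                 ;; valid for as long as the function can be called
                 (stepper (- n step))
               n)))
    #'stepper))</code></pre>
<p>The function returned by <code>(make-stepper 3)</code> can be called at any later time, and both the recursive reference to <code>stepper</code> and the binding of <code>step</code> are still there.</p>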
<p>The two interesting cases here are block names and tag names. Both of these have lexical scope but only definite extent. As I argued above this makes it hard to know whether references to them are valid or not. Look at this, for example:</p>
<pre class="brush: lisp"><code>(defun outer (x)
  (inner (lambda (r)
           (return-from outer r))
         x))

(defun inner (r rp)
  (if rp
      r
    (funcall r #'identity)))</code></pre>
<p>So then <code>(funcall (outer nil) 1)</code> will: call <code>inner</code> with a function which wants to return from <code>outer</code> and <code>nil</code>, which will cause <code>inner</code> to call that function, returning the <code>identity</code> function, which is then called by <code>funcall</code> with argument <code>1</code>: the result is 1.</p>
<p>But <code>(funcall (outer t) 1)</code> will instead return the function which wants to return from <code>outer</code>; this function is then called by <code>funcall</code>, which is an error since the call happens outside the dynamic extent of the call to <code>outer</code>.</p>
<p>And there is no way that either a human reading the code <em>or the compiler</em> can detect that this is going to happen: a very smart compiler might perhaps be able to deduce that the internal function <em>might</em> be returned from <code>outer</code>, but probably only because this is a rather simple case: for instance in</p>
<pre class="brush: lisp"><code>(defun nasty (f)
  (funcall f (lambda ()
               (return-from nasty t))))</code></pre>
<p>the situation is just hopeless. So this is a case where the binding rules are not as simple as you might like.</p>
<h2 id="what-is-simple">What is simple?</h2>
<p>For variable bindings I think it’s easy to see that the simplest rule for a person reading the code is lexical binding. The other question is whether that is simpler <em>for the implementation</em>. And the answer is that probably it is not: probably lexical scope and definite extent is the simplest implementationally. That certainly approximates what many old Lisps did<sup><a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-8-definition" name="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-8-return">8</a></sup>. It’s fairly easy to write a <em>bad</em> implementation of lexical binding, simply by having all functions retain all the bindings, regardless of whether they might refer to them. A <em>good</em> implementation requires more work. But CL’s approach here is that doing the right thing <em>for people</em> is more important than making the implementor’s job easier. And I think that approach has worked well.</p>
<p>On the other hand CL hasn’t done the right thing for blocks and tags: there are at least three reasons for this.</p>
<p><strong>Implementational complexity.</strong> If the bindings had lexical scope and <em>indefinite</em> extent then you would need to be able to return from a block which had already been returned from, and go to a tag from outside the extent of the form that established it. That opens an enormous can of worms both in making such an implementation work at all but also handling things like dynamic bindings, open files and so on. That’s not something the CL designers were willing to impose on implementors.</p>
<p><strong>Complexity in the specification.</strong> If CL had lexical bindings for blocks and tags then the specification of the language would need to describe what happens in all the many edge cases that arise, including cases where it is genuinely unclear what the correct thing to do is at all such as dealing with open files and so on. Nobody wanted to deal with that, I’m sure: the language specification was already seen as far too big and the effort involved would have made it bigger, later and more expensive.</p>
<p><strong>Conceptual difficulty.</strong> It might seem that making block bindings work like lexical variable bindings would make things simpler to understand. Well, that’s exactly what Scheme did with <code>call/cc</code> and <code>call/cc</code> can give rise to some of the most opaque code I have ever seen. It is often very <em>pretty</em> code, but it’s not easy to understand.</p>
<p>I think the bargain that CL has struck here is at least reasonable: to make the common case of variable bindings simple for people, and to avoid the cases where doing the right thing results in a language which is harder to understand in many cases and far harder to implement and specify.</p>
<p>Finally, once again I think that the best way to understand closures in CL is not to understand them: instead understand the binding rules for variables, why they are simple and what they imply.</p>
<hr />
<div class="footnotes">
<ol>
<li id="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-1-definition" class="footnote-definition">
<p>indefinite scope is often called ‘dynamic scope’ although I will avoid this term as it is used by the standard to mean the combination of indefinite scope and dynamic extent. <a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-1-return">↩</a></p></li>
<li id="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-2-definition" class="footnote-definition">
<p>Dynamic extent could perhaps be called ‘definite extent’, but this is not the term that the standard uses so I will avoid it. <a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-2-return">↩</a></p></li>
<li id="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-3-definition" class="footnote-definition">
<p>Here and below I am using the term ‘function’ in the very loose sense that CL usually uses it: almost none of the ‘functions’ I will talk about are actually mathematical functions: they’re what Scheme would call ‘procedures’. <a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-3-return">↩</a></p></li>
<li id="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-4-definition" class="footnote-definition">
<p>For languages which <em>don’t</em> have first-class functions or equivalent constructs, lexical scope and definite extent is the same as lexical scope and indefinite extent, because it is not possible to return objects which can refer to bindings from the place those bindings were created. <a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-4-return">↩</a></p></li>
<li id="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-5-definition" class="footnote-definition">
<p>More likely, you would end up making every function have, for instance an <code>ambient</code> keyword argument whose value would be an alist or plist which mapped between properties of the ambient environment and values for them. All functions which might call other functions would need this extra argument, and would need to be sure to pass it down suitably. <a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-5-return">↩</a></p></li>
<li id="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-6-definition" class="footnote-definition">
<p>This can be worked around, but it’s not simple to do so. <a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-6-return">↩</a></p></li>
<li id="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-7-definition" class="footnote-definition">
<p>In other words ‘this is all I can think of right now, but there are probably others’. <a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-7-return">↩</a></p></li>
<li id="2023-02-22-how-to-understand-closures-in-common-lisp-footnote-8-definition" class="footnote-definition">
<p>Very often old Lisps had indefinite scope and definite extent in interpreted code but lexical scope and definite extent in compiled code: yes, compiled code behaved differently to interpreted code, and yes, that sucked. <a href="#2023-02-22-how-to-understand-closures-in-common-lisp-footnote-8-return">↩</a></p></li></ol></div>A case-like macro for regular expressionsurn:https-www-tfeb-org:-fragments-2023-01-11-a-case-like-macro-for-regular-expressions2023-01-11T18:17:29Z2023-01-11T18:17:29ZTim Bradshaw
<p>I often find myself wanting a simple <code>case</code>-like macro where the keys are regular expressions. <code>regex-case</code> is an attempt at this.</p>
<!-- more-->
<p>I use <a href="https://edicl.github.io/cl-ppcre/">CL-PPCRE</a> for the usual things regular expressions are useful for, and probably for some of the things they should not really be used for as well. I often find myself wanting a <code>case</code> like macro, where the keys are regular expressions. There is a contributed package for <a href="https://github.com/guicho271828/trivia">Trivia</a> which will do this, but Trivia is pretty overwhelming. So I gave in and wrote <code>regex-case</code> which does what I want.</p>
<p><code>regex-case</code> is a <code>case</code>-like macro. It looks like</p>
<pre class="brush: lisp"><code>(regex-case &lt;thing&gt;
  (&lt;pattern&gt; (...)
   &lt;form&gt; ...)
  ...
  (otherwise ()
   &lt;form&gt; ...))</code></pre>
<p>Here <code>&lt;pattern&gt;</code> is a literal regular expression, either a string or in CL-PPCRE’s s-expression parse-tree syntax. Unlike <code>case</code> there can only be a single pattern per clause: allowing the parse-tree syntax makes it hard to do anything else. <code>otherwise</code> (which can also be <code>t</code>) is optional but must be last.</p>
<p>The second form in a clause specifies what, if any, variables to bind on a match. As an example</p>
<pre class="brush: lisp"><code>(regex-case line
  ("fog\\s+(.*)\\s$" (:match m :registers (v))
   ...)
  ...)</code></pre>
<p>will bind <code>m</code> to the whole match and <code>v</code> to the substring corresponding to the first register. You can also bind match and register positions. A nice (perhaps) thing is that you can <em>not</em> bind some register variables:</p>
<pre class="brush: lisp"><code>(regex-case line
  (... (:registers (_ _ v))
   ...)
  ...)</code></pre>
<p>will bind <code>v</code> to the substring corresponding to the third register. You can use <code>nil</code> instead of <code>_</code>.</p>
<p>The current state of <code>regex-case</code> is a bit preliminary: in particular I don’t like the syntax for binding things very much, although I can’t think of a better one. Currently therefore it’s in my collection of toys: it will probably migrate from there at some point.</p>
<p>Currently documentation is <a href="https://tfeb.github.io/tfeb-lisp-toys/#case-for-regular-expressions-regex-case">here</a> and source code is <a href="https://github.com/tfeb/tfeb-lisp-toys">here</a>.</p>The empty listurn:https-www-tfeb-org:-fragments-2022-12-16-the-empty-list2022-12-16T17:14:32Z2022-12-16T17:14:32ZTim Bradshaw
<p>My friend Zyni pointed out that someone has been getting really impressively confused and cross on reddit about empty lists, booleans and so on in Common Lisp, which led us to a discussion about what the differences between CL and Scheme really are here. Here’s a summary which we think is correct.</p>
<!-- more-->
<h2 id="a-peculiar-object-in-common-lisp2022-12-16-the-empty-list-footnote-1-definition2022-12-16-the-empty-list-footnote-1-return1">A peculiar object in Common Lisp<sup><a href="#2022-12-16-the-empty-list-footnote-1-definition" name="2022-12-16-the-empty-list-footnote-1-return">1</a></sup></h2>
<p>In Common Lisp there is a single special object, <code>nil</code>.</p>
<ul>
<li>This represents both the empty list, and the special false value, all other objects being true.</li>
<li>This object is a list and is the only list object which is not a cons.</li>
<li>As such this object is an atom, and again it is the only list object which is an atom.</li>
<li>You can take the <code>car</code> and <code>cdr</code> of this object: both of these operations return the object itself.</li>
<li>This object is also a symbol, and it is the only object which is both a list and a symbol.</li>
<li>The empty list when written as an empty list, <code>()</code>, is self-evaluating.</li></ul>
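<p>The consequences of this can be seen directly at the REPL:</p>
<pre class="brush: lisp"><code>> (eq 'nil '())
t
> (symbolp '())
t
> (listp nil)
t
> (consp nil)
nil
> (atom nil)
t
> (car nil)
nil
> (cdr nil)
nil</code></pre>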
<p>Some comments.</p>
<ul>
<li>It is <em>necessary</em> that there be a special empty-list object which is a list but not a cons: the things which are not necessary are that it be a symbol, and that it represent falsity.</li>
<li>Combining the empty list and the special false object can lead to particularly good implementations perhaps.</li>
<li>The implementation of this object is always going to be a bit weird.</li>
<li>It is clear that the empty list cannot be any kind of compound form so requiring it to be quoted — requiring you to write <code>'()</code> really — serves no useful purpose. Nevertheless I (Tim) would probably rather CL did that.</li>
<li>Not having to quote <code>nil</code> on the other hand is not at all strange: any symbol can be made self-evaluating simply by <code>(defconstant s 's)</code>, for instance.</li>
<li>The graph of types in CL is a DAG, not a tree: it is not at all strange that there is an object whose type is both <code>list</code> and <code>symbol</code>.</li></ul>
<h2 id="some-entirely-mundane-things-in-common-lisp">Some entirely mundane things in Common Lisp</h2>
<ul>
<li>There is a symbol, <code>t</code> which represents the canonical true value. Nothing is magic about this symbol in any way: it could be defined by <code>(defconstant t 't)</code>.</li>
<li>There is a type, <code>boolean</code> which could be defined by <code>(deftype boolean () '(member nil t))</code>, except that it is required that <code>boolean</code> be a recognisable subtype of <code>symbol</code>. All implementations we have tried recognise <code>(member nil t)</code> as a subtype of <code>symbol</code>, but the standard does not require them to do so. Additionally <code>(type-of 't)</code> must return <code>boolean</code> we think.</li>
<li>There is a type, <code>null</code>, which could be defined by <code>(deftype null () '(member nil))</code> or <code>(deftype null () '(eql nil))</code>, with the same caveats as above, and <code>(type-of nil)</code> should return <code>null</code>.</li>
<li>There are types named <code>t</code> (top of the type graph) and <code>nil</code> (bottom of type graph).</li></ul>
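<p>Concretely, in the implementations we have tried (<code>subtypep</code> returns two values, both printed here):</p>
<pre class="brush: lisp"><code>> (type-of t)
boolean
> (type-of nil)
null
> (subtypep 'boolean 'symbol)
t
t
> (subtypep 'null 'symbol)
t
t</code></pre>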
<p>These mundane things are just that: they don’t require implementational magic at all.</p>
<h2 id="three-peculiar-objects-in-scheme">Three peculiar objects in Scheme</h2>
<p>In Scheme there is an object, <code>()</code>.</p>
<ul>
<li><code>()</code> is the special object that represents the empty list.</li>
<li>It does not represent false.</li>
<li>It is not a symbol.</li>
<li>It is the only list object which is not a pair (cons): <code>list?</code> is true of it but <code>pair?</code> is false.</li>
<li>You can’t take the <code>car</code> or <code>cdr</code> of it.</li>
<li>It is not self-evaluating.</li></ul>
<p>There is another object, <code>#f</code>.</p>
<ul>
<li><code>#f</code> is the distinguished false value and is the only false value in Scheme, all other objects being true.</li>
<li>It is not a symbol or a list but satisfies the <code>boolean?</code> predicate.</li>
<li>It is self-evaluating.</li></ul>
<p>There is another object, <code>#t</code>.</p>
<ul>
<li><code>#t</code> represents the canonical true value, but all objects other than <code>#f</code> are true.</li>
<li>It is not a symbol or a list but satisfies the <code>boolean?</code> predicate.</li>
<li>It is self-evaluating.</li></ul>
<p>Some comments.</p>
<ul>
<li>Scheme does not have such an elaborate type system as CL and, apart from numbers, doesn’t really have subtype relations the way CL does.</li></ul>
<h2 id="a-summary">A summary</h2>
<p>CL’s treatment of <code>nil</code> clearly makes some people very unhappy indeed. In particular they seem to think CL is somehow inconsistent, which it clearly is not. Generally this is either because they don’t understand how it works, because it doesn’t work the way they want it to work, or (usually) both. Scheme’s treatment is often cited by these people as being better. But CL requires <em>precisely one</em> implementationally-weird object, while Scheme requires two, or three if you count <code>#t</code> which you probably should. Both languages have idiosyncratic evaluation rules around these objects. Additionally it’s worth understanding that things like CL’s <code>boolean</code> type mean essentially nothing implementationally: <code>boolean</code> is just a name for a set of symbols. The only thing preventing you from defining a type like this yourself is the requirement for <code>type-of</code> to return the type.</p>
<p>Is one better than the other? No: they’re just not the same. Certainly the CL approach carries more historical baggage. Equally certainly it is perfectly consistent, and changing it would break essentially all CL programs that exist.</p>
<hr />
<p>Thanks to Zyni for most of this: I’m really writing it up just so we can remember it. We’re pretty confident about the CL part, less so about the Scheme bit.</p>
<hr />
<div class="footnotes">
<ol>
<li id="2022-12-16-the-empty-list-footnote-1-definition" class="footnote-definition">
<p><strong>peculiar</strong>, <em>adjective</em>: having eccentric or individual variations in relation to the general or predicted pattern, as in peculiar motion or velocity. <em>noun</em>: a parish or church exempt from the jurisdiction of the ordinary or bishop in whose diocese it is placed; anything exempt from ordinary jurisdiction. <a href="#2022-12-16-the-empty-list-footnote-1-return">↩</a></p></li></ol></div>Closed as duplicate considered harmfulurn:https-www-tfeb-org:-fragments-2022-12-05-closed-as-duplicate-considered-harmful2022-12-05T16:10:07Z2022-12-05T16:10:07ZTim Bradshaw
<p>The various <a href="https://stackexchange.com/">Stack Exchange</a> sites, and specifically <a href="https://stackoverflow.com/questions/tagged/lisp">Stack Overflow</a>, seem to be some of the best places for getting reasonable answers to questions on a wide range of topics from competent people. They would be a lot better if they were not so obsessed about closing duplicates.</p>
<!-- more-->
<p>Closing duplicates seems like a good idea: having a single, canonical, question on a given topic with a single, canonical, answer seems like a good thing. It’s not.</p>
<p>The reason it’s not is that it makes two false assumptions:</p>
<ul>
<li>that a given question has a single best answer;</li>
<li>that this answer does not change over time.</li></ul>
<p>Neither of these assumptions is true for a large number of interesting questions.</p>
<p>Questions can have several good answers. I have at least three introductory books on <a href="https://en.m.wikipedia.org/wiki/Mathematical_analysis" title="analysis">analysis</a>, and not because I didn’t find the good one on the first try: I have several because they give different perspectives — different answers, in the sense of Stack Exchange — to various aspects of the subject. I have several books on introductory quantum mechanics, several books on introductory general relativity, and so it goes on. It is, simply, a delusion that there exists a single most helpful answer to many questions: pretending that there is stupidly limiting.</p>
<p>And what constitutes a good answer can change over time. If you asked, for instance, what a macro was in Lisp and what macros are good for, you would have got very different answers in 1982 than in 2022<sup><a href="#2022-12-05-closed-as-duplicate-considered-harmful-footnote-1-definition" name="2022-12-05-closed-as-duplicate-considered-harmful-footnote-1-return">1</a></sup>. The same is true for many other subjects: human knowledge is not static.</p>
<p>All of this is made worse as only the person asking a question can accept an answer: they may not do so at all or, worse, they may be asking in bad faith and accept wrong or misleading answers (yes, this happens in various Stack Exchanges).</p>
<p>The true Stack Exchange believer will now explain in great detail<sup><a href="#2022-12-05-closed-as-duplicate-considered-harmful-footnote-2-definition" name="2022-12-05-closed-as-duplicate-considered-harmful-footnote-2-return">2</a></sup> why none of this matters: people should just spend their time adding improved answers to questions which already have accepted answers rather than to new questions which will be closed as duplicates. Because, of course, the accepted answer will not be the one almost everyone looks at, and even if they don’t care about increasing their karma on Stack Exchange, they will be very happy to write answers that, in the real world, almost nobody will ever look at.</p>
<p>Yeah, right.</p>
<p>This is such a shame: Stack Exchange is a good thing, but it’s seriously damaged by this unnecessary problem. The answer is not simply to allow unrestricted duplicates, but to wait for a bit and see if a question which is, or is nearly, a duplicate has attracted new and interesting answers, and to not close it as a duplicate in that case. This would not be hard to do.</p>
<hr />
<div class="footnotes">
<ol>
<li id="2022-12-05-closed-as-duplicate-considered-harmful-footnote-1-definition" class="footnote-definition">
<p>And even in 2022 you will get answers from people who seem not to have learned anything since 1982. <a href="#2022-12-05-closed-as-duplicate-considered-harmful-footnote-1-return">↩</a></p></li>
<li id="2022-12-05-closed-as-duplicate-considered-harmful-footnote-2-definition" class="footnote-definition">
<p>Please, don’t: I don’t have a Stack Exchange account any more and, even if I did, I would not be interested. <a href="#2022-12-05-closed-as-duplicate-considered-harmful-footnote-2-return">↩</a></p></li></ol></div>Package-local nicknamesurn:https-www-tfeb-org:-fragments-2022-10-14-package-local-nicknames2022-10-14T09:26:31Z2022-10-14T09:26:31ZTim Bradshaw
<p>What follows is an opinion. Do not under any circumstances read it. Other opinions are available (but wrong).</p>
<!-- more-->
<p>Package-local nicknames are an abomination. They should be burned with nuclear fire, and their ashes launched into space on a trajectory which will leave the Solar System.</p>
<p>The only reason why package-local nicknames matter is if you are writing a lot of code with lots of package-qualified names in it. If you are doing that then <em>you are writing code which is hard to read</em>: the names in your code are longer than they need to be and the first several characters of them are package name noise (people read, broadly from left to right). Imagine me:a la:version ge:of oe:English oe:where la:people wrote like that: it’s just horrible. If you are writing code which is hard to read you are writing bad code.</p>
<p>Instead you should do the work to construct a namespace in which the words you intend to use are directly present. This means constructing suitable packages: the files containing the package definitions are then almost the only place where package names occur, and are a minute fraction of the total code. Doing this is a good practice in itself because the package definition file is then a place which describes just what names your code needs, from where, and what names it provides. Things like conduit packages (shameless self-promotion) can help with this, which is why I wrote them: being able to say ‘this package exports the combination of the exports of these packages, except …’ or ‘this package exports just the following symbols from these packages’ in an explicit way is very useful.</p>
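<p>A package definition in this style might look like the following (all the package and symbol names here are invented for the example):</p>
<pre class="brush: lisp"><code>(defpackage :org.tfeb.playground
  (:use :cl)
  ;; import just the names we want, under the names we will read in
  ;; the code, so the code itself never needs package prefixes
  (:import-from :com.example.regex
   #:match #:compile-pattern)
  (:export
   #:process-file))

(in-package :org.tfeb.playground)</code></pre>
<p>The package definition is then the one place which says what names the code needs, from where, and what it provides.</p>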
<p>If you are now rehearsing a litany of things that can go wrong with this approach in rare cases<sup><a href="#2022-10-14-package-local-nicknames-footnote-1-definition" name="2022-10-14-package-local-nicknames-footnote-1-return">1</a></sup>, please don’t: this is not my first rodeo and, trust me, I know about these cases. Occasionally, the CL package system can make it hard or impossible to construct the namespace you need, with the key term here being being <em>occasionally</em>: people who give up because something is occasionally hard or impossible have what Erik Naggum famously called ‘one-bit brains’<sup><a href="#2022-10-14-package-local-nicknames-footnote-2-definition" name="2022-10-14-package-local-nicknames-footnote-2-return">2</a></sup>: the answer is to <em>get more bits for your brain</em>.</p>
<p>Once you write code like this then the only place package-local nicknames can matter is, perhaps, the package definition file. And the only reason they can matter there is because people think that picking a name like ‘XML’ or ‘RPC’ or ‘SQL’ for their packages is a good idea. When people in the programming section of my hollowed-out-volcano lair do this they are … well, I will not say, but my sharks are well-fed and those things on spikes surrounding the crater are indeed their heads.</p>
<p>People should use long, unique names for packages. Java, astonishingly, got this right: use domains in big-endian order (<code>org.tfeb.conduit-packages</code>, <code>org.tfeb.hax.metatronic</code>). Do not use short nicknames. Never use names without at least one dot, which should be reserved for implementations and perhaps KMP-style substandards. Names will now not clash. Names will be longer and require more typing, but this will not matter because the only place package names are referred to are in package definition files and in <code>in-package</code> forms, which are a minute fraction of your code.</p>
<p>I have no idea where or when the awful plague of using package-qualified names in code arose: it’s not something people used to do, but it seems to happen really a lot now. I think it may be because people also tend to do this in Python and other dotty languages, although, significantly, in Python you never actually need to do this if you bother, once again, to actually go to the work of constructing the namespace you want: rather than the awful</p>
<pre class="brush: python"><code>import sys
... sys.argv ...
...
sys.exit(...)</code></pre>
<p>you can simply say</p>
<pre class="brush: python"><code>from sys import argv, exit
... argv ...
exit(...)</code></pre>
<p>and now the very top of your module lets anyone reading it know exactly what functionality you are importing and from where it comes.</p>
<p>It may also be because the whole constructing namespaces thing is a bit hard. Yes, it is indeed a bit hard, but designing programs, of which it is a small but critical part, <em>is</em> a bit hard.</p>
<p>OK, enough.</p>
<hr />
<p>If, after reading the above, you think you should mail me about how wrong it all is and explain some detail of the CL package system to me: don’t, I do not want to hear from you. Really, I don’t.</p>
<hr />
<div class="footnotes">
<ol>
<li id="2022-10-14-package-local-nicknames-footnote-1-definition" class="footnote-definition">
<p>in particular, if your argument is that someone has used, for instance, the name <code>set</code> in some package to mean, for instance, a set in the sense it is used in maths, and that this clashes with <code>cl:set</code> and perhaps some other packages, don’t. If you are writing a program and you think, ‘I know, I’ll use a symbol with the same name as a symbol exported from CL to mean something else’ in a context where users of your code also might want to use the symbol exported by CL (which in the case of <code>cl:set</code> is ‘almost never’, of course), then my shark pool is just over here: please throw yourself in. <a href="#2022-10-14-package-local-nicknames-footnote-1-return">↩</a></p></li>
<li id="2022-10-14-package-local-nicknames-footnote-2-definition" class="footnote-definition">
<p>Curiously, I think that quote was about Scheme, which I am sure Erik hated. But, for instance, Racket’s module system lets you do just the things which are hard in the package system: renaming things on import, for instance. <a href="#2022-10-14-package-local-nicknames-footnote-2-return">↩</a></p></li></ol></div>Bradshaw's lawsurn:https-www-tfeb-org:-fragments-2022-10-03-bradshaw-s-laws2022-10-03T19:50:51Z2022-10-03T19:50:51ZTim Bradshaw
<p>There are two laws.</p>
<!-- more-->
<h2 id="the-laws">The laws</h2>
<ol>
<li><strong>Bradshaw’s law.</strong> All sufficiently large software systems end up being programming languages.</li>
<li><strong>Zyni’s corollary.</strong> Whenever you think the point is at which the first law will apply, it will apply before that.</li></ol>
<h2 id="implications-of-the-laws">Implications of the laws</h2>
<p>When building software systems you should design them as programming languages. You should do this however small you think they will be. In order to make this practical for small systems you should therefore use a language which allows seamless extension into other languages with insignificant zero-point cost.</p>
<p>But because the laws are not widely known, most large software systems are built without understanding that what is being built is in fact a programming language. Because people don’t know they are building a programming language, don’t know how to build programming languages, and do not use languages which make the seamless construction of programming languages easy, the languages they build are usually terrible: they are hard to use, have opaque and inconsistent semantics and are almost always insecure.</p>Simple logging in Common Lispurn:https-www-tfeb-org:-fragments-2022-09-26-simple-logging-in-common-lisp2022-09-26T11:26:32Z2022-09-26T11:26:32ZTim Bradshaw
<p><code>slog</code> is a simple logging framework for Common Lisp based on the observation that conditions can represent log events.</p>
<!-- more-->
<p><code>slog</code> is based on two observations about the Common Lisp condition system:</p>
<ul>
<li>conditions do not have to represent errors or warnings: they can just be a way for a program to say ‘look, something interesting happened’;</li>
<li>handlers can decline to handle a condition, and in particular handlers are invoked <em>before the stack is unwound</em>.</li></ul>
<p>Well, saying ‘look, something interesting happened’ is really quite similar to what logging systems do, and <code>slog</code> is built on this idea.</p>
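<p>The idea can be sketched in a few lines of portable CL, without using <code>slog</code>’s actual API: a condition type represents a log event, and a handler established with <code>handler-bind</code> writes the entry and then simply returns, declining to handle the condition, so execution carries on:</p>
<pre class="brush: lisp"><code>(define-condition log-event (condition)   ;an event, not an error
  ((message :initarg :message :reader log-event-message)))

(defun compute (x)
  ;; say 'look, something interesting happened' and keep going
  (signal 'log-event :message (format nil "computing with ~S" x))
  (* x 2))

(handler-bind ((log-event
                (lambda (e)
                  ;; write the entry; returning from the handler
                  ;; declines to handle the condition
                  (format *debug-io* "~&log: ~A~%" (log-event-message e)))))
  (compute 21))</code></pre>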
<p><code>slog</code> is the <em>simple</em> logging system: it provides a framework on which logging can be built but does not itself provide a vast category of log severities &c. Such a thing could be built on top of <code>slog</code>, which aims to provide mechanism, not policy.</p>
<p><code>slog</code> provides a couple of conditions representing log entries, which are designed to be subclassed in real life. Log entries are created using a <code>slog</code> function (this is why <code>slog</code> is called <code>slog</code>: <code>log</code> is already taken) which simply signals an appropriate condition. Handlers are set up by a <code>logging</code> form (this should really be called <code>slogging</code> but it is not), which associates conditions with handlers. There is fairly flexible file handling for logging to files: in particular you can refer to files by name, with each name associated with the appropriate stream; streams get closed automagically (you can also close them manually, after which they will be reopened if need be); and the underlying mechanism for writing entries is exposed by a <code>slog-to</code> generic function which could be extended. Log entry formats can be controlled in various ways.</p>
<p>In addition <code>slog</code> tries to associate log entries with ‘precision time’, which is CL’s universal time expanded to the precision of a millisecond, or of internal time if it is less precise than a millisecond. Setting this up means that <code>slog</code> takes a second or so to load.</p>
<p>Once again: <code>slog</code> is a <em>framework</em>: it has no dealings with log severities, categories, or anything like that. All that is meant to be provided on top of what <code>slog</code> provides.</p>
<p>Documentation is <a href="https://tfeb.github.io/tfeb-lisp-hax/#simple-logging-slog">here</a>, source code is <a href="https://github.com/tfeb/tfeb-lisp-hax">here</a>. It will be available from Quicklisp in due course.</p>Metatronic macrosurn:https-www-tfeb-org:-fragments-2022-09-26-metatronic-macros2022-09-26T10:54:25Z2022-09-26T10:54:25ZTim Bradshaw
<p>Metatronic macros are a simple hack which makes it a little easier to write less unhygienic macros in Common Lisp.</p>
<!-- more-->
<p>Common Lisp macros require you to avoid variable name capture yourself. So, for a macro which iterates over the lines in a file, this is wrong:</p>
<pre class="brush: lisp"><code>(defmacro with-file-lines ((line file) &body forms)
  ;; wrong
  `(with-open-file (in ,file)
     (do ((,line (read-line in nil in)
                 (read-line in nil in)))
         ((eq ,line in))
       ,@forms)))</code></pre>
<p>It’s wrong because it binds <code>in</code> to the stream open to the file, and user code could perfectly legitimately refer to a variable of the same name.</p>
<p>The standard approach to dealing with this is to use gensyms:</p>
<pre class="brush: lisp"><code>(defmacro with-file-lines ((line file) &body forms)
  ;; righter
  (let ((inn (gensym)))
    `(with-open-file (,inn ,file)
       (do ((,line (read-line ,inn nil ,inn)
                   (read-line ,inn nil ,inn)))
           ((eq ,line ,inn))
         ,@forms))))</code></pre>
<p>This binds <code>inn</code> (so named to echo <code>in</code>) to a fresh uninterned symbol, which is then used as the name of the variable bound to the stream. User code can’t refer to any variable with this unique name.</p>
<p>This works, but it’s ugly. Metatronic macros let you write the above like this:</p>
<pre class="brush: lisp"><code>(defmacro/m with-file-lines ((line file) &body forms)
  ;; righter, easier
  `(with-open-file (<in> ,file)
     (do ((,line (read-line <in> nil <in>)
                 (read-line <in> nil <in>)))
         ((eq ,line <in>))
       ,@forms)))</code></pre>
<p>In this macro all symbols which look like <code><</code>…<code>></code> (in any package) are rewritten to unique names, but all references to symbols with the same original name are to the same symbol<sup><a href="#2022-09-26-metatronic-macros-footnote-1-definition" name="2022-09-26-metatronic-macros-footnote-1-return">1</a></sup>. This makes this common case more pleasant to do: macros written using <code>defmacro/m</code> have less noise around their expansion.</p>
<p>Metatronic macros go to some lengths to avoid leaking the rewritten symbols. Given this silly macro</p>
<pre class="brush: lisp"><code>(defmacro/m silly ()
  ''<silly>)</code></pre>
<p>then <code>(eq (silly) (silly))</code> is false. Similarly given this:</p>
<pre class="brush: lisp"><code>(defmacro/m also-silly (f)
  `(eq ,f '<silly>))</code></pre>
<p>Then <code>(also-silly '<silly>)</code> will be false of course.</p>
<p>There is <code>defmacro/m</code>, <code>macrolet/m</code> and <code>define-compiler-macro/m</code>, and the implementation of metatronization is exposed if you need it.</p>
<p>Documentation is <a href="https://tfeb.github.io/tfeb-lisp-hax/#metatronic-macros">here</a>, source code is <a href="https://github.com/tfeb/tfeb-lisp-hax">here</a>. It will be available in Quicklisp in due course.</p>
<hr />
<div class="footnotes">
<ol>
<li id="2022-09-26-metatronic-macros-footnote-1-definition" class="footnote-definition">
<p>in fact, a symbol whose name is <code><></code> is rewritten as a unique gensym as a special case. I am not sure if this is a good thing but it’s what happens. <a href="#2022-09-26-metatronic-macros-footnote-1-return">↩</a></p></li></ol></div>Macros (from Zyni)urn:https-www-tfeb-org:-fragments-2022-08-27-macros-from-zyni2022-08-27T10:12:33Z2022-08-27T10:12:33ZTim Bradshaw
<blockquote>
<p>It is the business of the future to be dangerous; and it is among the merits of science that it equips the future for its duties. — Alfred Whitehead</p></blockquote>
<!-- more-->
<p>Once upon a time, long ago in a world far away, Lisp had many features which other languages did not have. Automatic storage management, dynamic typing, an interactive environment, lists, symbols … and macros, which allow you to seamlessly extend the language you have into the language you want and need.</p>
<p>But that was long long ago in a world far away where giants roamed the earth, trolls lurked under every bridge and, they say, gods yet lived on certain distant mountains.</p>
<p>Today, and in this world, many many languages have automatic storage management, are dynamically typed, have symbols, lists, interactive environments, and so and so and so. More of these languages arise from the thick, evil-smelling sludge that coats every surface each day: hundreds, if not thousands of them, like flies breeding on bad meat which must be swatted before they lay their eggs on your eyes.</p>
<p>Lisp, today and in this world not another, has <em>exactly one</em> feature which still distinguishes it from the endless buzz of these insect languages. That feature is seamless language extension by macros.</p>
<p>So yes, macros are dangerous, and they are hard and they are frightening. They are dangerous and hard and frightening because all powerful magic is dangerous and hard and frightening. They are dangerous because they are a thing which has escaped here from the future and it is the business of the future to be dangerous.</p>
<p>If macros are too dangerous, too hard and too frightening for you, <em>do not use Lisp</em> because <em>macros are what Lisp is about</em>.</p>
<hr />
<p>This originated as a comment by my friend Zyni: it is used with her permission.</p>Two simple pattern matchers for Common Lispurn:https-www-tfeb-org:-fragments-2022-07-21-two-simple-pattern-matchers-for-common-lisp2022-07-21T09:17:45Z2022-07-21T09:17:45ZTim Bradshaw
<p>I’ve written two pattern matchers for Common Lisp:</p>
<ul>
<li><code>destructuring-match</code>, or <code>dsm</code>, is a <code>case</code>-style construct which can match <code>destructuring-bind</code>-style lambda lists with a couple of extensions;</li>
<li><code>spam</code>, the simple pattern matcher, does not bind variables but lets you match based on assertions about, for instance, the contents of lists.</li></ul>
<p>Both <code>dsm</code> and <code>spam</code> strive to be simple and correct.</p>
<!-- more-->
<h2 id="simplicity">Simplicity</h2>
<p>Both <code>dsm</code> and <code>spam</code> are <em>simple</em>: they do exactly one thing, and try to do that one thing well.</p>
<p>You could think of <code>dsm</code> as being to some other CL pattern matchers as Unix once was to Multics: <code>dsm</code> is the result of me looking at those other systems and thinking ‘please, not that’.</p>
<p>Those systems are vast, have several levels, and are extensible: some subset of them might do what I wanted to be able to do — make writing macros less unpleasant — but I’m not sure<sup><a href="#2022-07-21-two-simple-pattern-matchers-for-common-lisp-footnote-1-definition" name="2022-07-21-two-simple-pattern-matchers-for-common-lisp-footnote-1-return">1</a></sup>. They are obsessed with performance.</p>
<p><code>dsm</code> does one thing, and exports a single macro. If you know how to use <code>destructuring-bind</code> and <code>case</code> you already know almost all there is to know about <code>dsm</code>: it’s a <code>case</code> construct whose cases are <code>destructuring-bind</code> lambda lists. <code>dsm</code> doesn’t care about performance at all, because macroexpansion performance never matters.</p>
<p>At least one of those matchers has almost as many commits in its repo as <code>dsm</code> has lines of code.</p>
<p>Like Multics was, those hairy pattern matchers are fine systems. But there was a good reason that Thompson and Ritchie wrote something very different<sup><a href="#2022-07-21-two-simple-pattern-matchers-for-common-lisp-footnote-2-definition" name="2022-07-21-two-simple-pattern-matchers-for-common-lisp-footnote-2-return">2</a></sup>.</p>
<h2 id="destructuring-match--dsm"><code>destructuring-match</code> / <code>dsm</code></h2>
<p>In CL, <code>destructuring-bind</code> and, mostly equivalently, macro argument lists are both a blessing and a curse. They’re a blessing because they support destructuring, so you can write, for instance</p>
<pre class="brush: lisp"><code>(defmacro with-foo ((var &optional init) &body forms)
  ...)</code></pre>
<p>They’re a curse because they are so fragile: <code>with-foo</code> can <em>only</em> support that syntax and will fail with an ugly error message from the implementation when it is fed anything else.</p>
<p>Writing robust macros in CL, especially macros which expect various different argument patterns, then turns into a great saga of manually checking argument patterns before using <code>destructuring-bind</code> to actually bind things. The result of that, of course, is that very many CL macros are not robust and have terrible error reporting.</p>
<p><code>destructuring-match</code> does away with all this unpleasantness. It supports a slightly extended version of the lambda lists that <code>destructuring-bind</code> supports, has ‘guard’ clauses which allow additional checks, and will match a form against any number of lambda lists until one matches, with a fallback case.</p>
<p>As an example here is a version of <code>with-foo</code> which allows two patterns:</p>
<pre class="brush: lisp"><code>(defmacro with-foo (&body forms)
  (destructuring-match forms
    (((var &optional init) &body body)
     (:when (symbolp var))
     ...)
    ((((var &optional type) &optional init) &body body)
     (:when (symbolp var))
     ...)
    (otherwise
     (error ...))))</code></pre>
<p>The guard clauses check that <code>var</code> is a symbol before the match succeeds, and will therefore ensure that the second match is the one chosen for <code>(with-foo ((x y) 1) ...)</code>.</p>
<p><code>destructuring-match</code> also supports ‘blank’ variables: any variable whose name is <code>_</code> (in any package) is ignored, and all such variables are distinct. So for instance</p>
<pre><code>(destructuring-match l
  ((_ _ _) ...))</code></pre>
<p>will match if <code>l</code> is a proper list with exactly three elements.</p>
<p>Using <code>destructuring-match</code> it’s easy to write this macro<sup><a href="#2022-07-21-two-simple-pattern-matchers-for-common-lisp-footnote-3-definition" name="2022-07-21-two-simple-pattern-matchers-for-common-lisp-footnote-3-return">3</a></sup>:</p>
<pre class="brush: lisp"><code>(defmacro define-matching-macro (name &body clauses)
  (let ((<whole> (make-symbol "WHOLE"))
        (<junk> (make-symbol "JUNK")))
    (destructuring-match clauses
      ((doc . the-clauses)
       (:when (stringp doc))
       `(defmacro ,name (&whole ,<whole> &rest ,<junk>)
          ,doc
          (destructuring-match ,<whole> ,@the-clauses)))
      (the-clauses
       `(defmacro ,name (&whole ,<whole> &rest ,<junk>)
          (destructuring-match ,<whole> ,@the-clauses))))))</code></pre>
<p>And this then allows the above <code>with-foo</code> macro to be written like this:</p>
<pre class="brush: lisp"><code>(define-matching-macro with-foo
  ((_ (var &optional init) &body forms)
   (:when (symbolp var))
   ...)
  ((_ ((var &optional type) &optional init) &body forms)
   (:when (symbolp var))
   ...)
  (form
   (error "~S is bad syntax for with-foo" form)))</code></pre>
<p><code>dsm</code> was not written with performance in mind but it seems to be, typically, around a tenth to a half of the speed of <code>destructuring-bind</code>, while of course being far more powerful.</p>
<p><code>dsm</code> can be found <a href="https://tfeb.github.io/#destructuring-match-for-common-lisp">here</a>. It will probably end up in Quicklisp in due course but currently it isn’t there, and some of its dependencies are also not up to date there.</p>
<h2 id="spam-the-simple-pattern-matcher"><code>spam</code>, the simple pattern matcher</h2>
<p><code>dsm</code> has a lot of cases where it needs to check what the lambda list it is parsing and compiling looks like. To do this I wrote a bunch of predicate constructors and combinators, which return predicates which will check things. So for example:</p>
<ul>
<li><code>(is 'foo)</code> returns a function which checks its argument is <code>eql</code> to <code>foo</code>;</li>
<li><code>(some-of p1 ... pn)</code> returns a function of one argument which succeeds if any of the predicates which are its arguments succeeds on that argument: <code>(some-of (is 'foo) (is 'bar))</code> matches either <code>foo</code> or <code>bar</code>;</li>
<li><code>(head-matches p1 ... pn)</code> will succeed if the predicates which are its arguments succeed on the first elements of a list.</li></ul>
<p>There are several other predicate constructors and predicate combinators, but <code>spam</code> can use any predicate.</p>
<p>There is then a <code>matching</code> macro which uses these to match things, and a <code>matchp</code> function which simply invokes a predicate.</p>
<p>As an example, here’s part of a matcher for <code>&rest</code> specifications in lambda lists.</p>
<pre class="brush: lisp"><code>(matching ll
  ((head-matches (some-of (is '&rest) (is '&body))
                 (var)
                 (is '&key))
   ;; &rest x &key ...
   ...)
  ((head-matches (some-of (is '&rest) (is '&body))
                 (var)
                 (any))
   ;; &rest x with something else
   ...)
  ((list-matches (some-of (is '&rest) (is '&body))
                 (var))
   ;; &rest x and no more
   ...)
  (otherwise
   (error "oops")))</code></pre>
<p><code>spam</code> is pretty useful, and code written using it is much easier to read than doing the equivalent checks manually. It is used extensively in the implementation of <code>dsm</code>.</p>
<p><code>spam</code> is now one of <a href="https://tfeb.github.io/#some-common-lisp-hacks">my CL hax</a>.</p>
<hr />
<div class="footnotes">
<ol>
<li id="2022-07-21-two-simple-pattern-matchers-for-common-lisp-footnote-1-definition" class="footnote-definition">
<p>At the time of writing <a href="https://github.com/guicho271828/trivia">Trivia</a> supports lambda lists I think, but not destructuring-lambda lists: <code>(match '(1 (1)) ((lambda-list a (b)) (values a b)))</code> will fail, for instance. I don’t know whether it is <em>meant</em> to support destructuring lambda lists — comments in the sources imply it is, but it clearly does not in fact. <a href="#2022-07-21-two-simple-pattern-matchers-for-common-lisp-footnote-1-return">↩</a></p></li>
<li id="2022-07-21-two-simple-pattern-matchers-for-common-lisp-footnote-2-definition" class="footnote-definition">
<p>I am aware of <a href="https://dreamsongs.com/WIB.html">Gabriel’s ‘worse is better’ paper</a> and its various afterthoughts. <code>dsm</code> is not like that: it is smaller and simpler, but is not intended to be worse. <code>dsm</code> is to these other systems perhaps as Scheme was to CL. Gabriel also talks about these two options, of course. <a href="#2022-07-21-two-simple-pattern-matchers-for-common-lisp-footnote-2-return">↩</a></p></li>
<li id="2022-07-21-two-simple-pattern-matchers-for-common-lisp-footnote-3-definition" class="footnote-definition">
<p>Note this macro is 12 lines, half of which are handling the possible docstring. <a href="#2022-07-21-two-simple-pattern-matchers-for-common-lisp-footnote-3-return">↩</a></p></li></ol></div>Macroexpansion in Common Lispurn:https-www-tfeb-org:-fragments-2022-07-05-macroexpansion-in-common-lisp2022-07-05T15:16:29Z2022-07-05T15:16:29ZTim Bradshaw
<p>Yet another description of macroexpansion in Common Lisp. There is nothing particularly new here and it partly duplicates some previous articles: I just wanted to rescue the text.</p>
<!-- more-->
<p>The following description is of how macroexpansion works in Common Lisp<sup><a href="#2022-07-05-macroexpansion-in-common-lisp-footnote-1-definition" name="2022-07-05-macroexpansion-in-common-lisp-footnote-1-return">1</a></sup>. It is slightly simplified and I have not always mentioned when it is<sup><a href="#2022-07-05-macroexpansion-in-common-lisp-footnote-2-definition" name="2022-07-05-macroexpansion-in-common-lisp-footnote-2-return">2</a></sup>. It is at least a partial duplicate of <a href="../../../../2021/11/11/the-proper-use-of-macros-in-lisp/">this previous article</a>.</p>
<h2 id="what-macros-are">What macros are</h2>
<p><strong>Macros in CL are functions, written in ordinary CL, whose argument is source code, and whose value is other source code.</strong></p>
<p>Source code is represented as s-expressions: symbols, conses, and so on. Macros don’t do string-rewriting.</p>
<p>The way to think slightly more abstractly about macros is that they are <em>functions between languages</em>: a macro is a function which takes as an argument fragments of a language which includes that macro, and returns as a value either a fragment of a language which <em>doesn’t</em> include the macro, or a fragment of a language which includes it in some weaker way.</p>
<p>The aim of macros is to build, on top of the language you are given, another language which is closer to the language in which you want to express your programs. CL itself is one such language, built-up using a number of standard macros on top of a substrate language.</p>
<p>People often think of macros as ‘functions which do not evaluate their arguments’: that’s really not right. They are functions — perfectly ordinary functions, written in CL — but their argument is source code, and their value is source code.</p>
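<p>Both points are visible at the REPL: the macro function of a standard macro like <code>when</code> is an ordinary function, and its value is just more source code (the exact expansion is implementation-dependent; this is a typical one):</p>
<pre class="brush: lisp"><code>> (functionp (macro-function 'when))
t
> (macroexpand-1 '(when (> x 0) (print x)))
(if (> x 0) (progn (print x)))
t</code></pre>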
<h2 id="how-macroexpansion-happens">How macroexpansion happens</h2>
<p>[This is simplified.]</p>
<p>Given some initial compound form <code>(m ...)</code>, macroexpansion proceeds like this.</p>
<p><strong>Start.</strong> Given a form, it should be one of</p>
<ul>
<li>a compound form <code>(m ...)</code>,</li>
<li>or a non-compound form.</li></ul>
<p><strong>Compound form.</strong> The form is <code>(m ...)</code></p>
<ol>
<li>Look at <code>m</code>: if it has an associated macro function (found using <code>macro-function</code>) then simply call that function on the whole form <code>(m ...)</code>: its result is a new form<sup><a href="#2022-07-05-macroexpansion-in-common-lisp-footnote-3-definition" name="2022-07-05-macroexpansion-in-common-lisp-footnote-3-return">3</a></sup>. Recurse on this form from <strong>Start</strong>.</li>
<li>If <code>m</code> is not a macro, then it may be a special operator, such as <code>setq</code> or <code>if</code>. Consider appropriate forms in the body of this form for expansion: which forms are known by the rules of the special operator. For instance all the forms in <code>(if ...)</code> are considered for expansion, while in <code>(setq <x> <y>)</code> only <code><y></code> is, and so on.</li>
<li>If it is not a macro and not a special form, then <code>(m ...)</code> is assumed to be a function call, with <code>m</code> denoting a function. All the forms in the body are now considered for macro expansion. Once that is done the expansion process is complete.</li>
<li>As a special case of the last case, <code>m</code> may be <code>(lambda (...) ...)</code>, so the whole form will be <code>((lambda (...) ...) ...)</code>. In this case the forms in the body of the <code>lambda</code> are considered for macroexpansion; otherwise this is the same as the last case<sup><a href="#2022-07-05-macroexpansion-in-common-lisp-footnote-4-definition" name="2022-07-05-macroexpansion-in-common-lisp-footnote-4-return">4</a></sup>.</li>
<li>There are no other cases.</li></ol>
<p><strong>Non-compound form.</strong> There is nothing to do here.</p>
<p>As I said, this is simplified: there are local macros for instance, and various other things. However one critical thing is that when expanding some macro form <code>(m ...)</code>, the expansion carries on until it gets something which is not a macro form <em>before</em> looking at whatever is in the body of the form. That’s critical: although it’s tempting to think that expansion should happen inside-out, it can’t work that way, because until the outer macro has done its work you can’t know if the things in its body even <em>should</em> be candidates for macro expansion. There’s an example of this below.</p>
<h2 id="macros-the-hard-way">Macros the hard way</h2>
<p>OK, I said that macros were just functions, and I meant that. Let’s write a macro <code>with-debugging</code> which is like <code>progn</code> but it will perhaps print what it is doing.</p>
<p>So let’s write the macro function:</p>
<pre class="brush: lisp"><code>(defvar *debugging* t)

(defun expand-with-debugging (form environment)
  (declare (ignore environment))        ;I'm not mentioning environments
  `(progn
     ,@(loop for thing in (rest form)
             collect `(when *debugging*
                        (format *debug-io* "~&~S~%" ',thing))
             collect thing)))</code></pre>
<p>And we can test it:</p>
<pre class="brush: lisp"><code>> (expand-with-debugging '(with-debugging (cons 1 2) 4) nil)
(progn
  (when *debugging* (format *debug-io* "~&~S~%" '(cons 1 2)))
  (cons 1 2)
  (when *debugging* (format *debug-io* "~&~S~%" '4))
  4)</code></pre>
<p>And now we can install it as the macro function for <code>with-debugging</code>:</p>
<pre><code>(setf (macro-function 'with-debugging) #'expand-with-debugging)</code></pre>
<p>And now</p>
<pre><code>> (with-debugging
    (cons 1 2)
    4)
(cons 1 2)
4
4</code></pre>
<p>Or</p>
<pre><code>> (setf *debugging* nil)
nil
> (with-debugging
    (cons 1 2)
    4)
4</code></pre>
<p>OK, here’s another macro done this way, and the purpose of this one is to show you why macroexpansion has to happen outside in. Let’s say we want to be able to denote functions by <code>(fun (arg ...) form ...)</code>, but we’d like to be able to debug the body with <code>with-debugging</code>. We can do that:</p>
<pre><code>(defun expand-fun (form environment)
  (declare (ignore environment))        ;still not mentioning environments
  `(function (lambda ,(second form)
               ;; Not dealing with declarations
               (with-debugging ,@(cddr form)))))

(setf (macro-function 'fun) #'expand-fun)</code></pre>
<p>And now</p>
<pre class="brush: lisp"><code>> (let ((*debugging* t))
    (funcall (fun (a) (+ a a)) 1))
(+ a a)
2</code></pre>
<p>Now you can see why the macro expander has to work the way it does: the first form in the body of <code>fun</code> should not be macroexpanded at all, and the remaining forms are going to get wrapped in a macro which isn’t there in the source at all. So macroexpansion has to go outside in, as described above.</p>
<h2 id="a-better-way">A better way</h2>
<p>Well, you could write macros like that. Probably once they were written like that. But it’s a pain, because you almost never care about the first element of the form — the macro’s own name — and you have to manually take the rest of the form apart yourself. And also you need to deal with questions about making sure macros are defined at compile time and so on.</p>
<p>That’s what <code>defmacro</code> does. It is itself a macro, and its expansion will involve setting the <code>macro-function</code> of the macro to some appropriate thing. So using <code>defmacro</code> I can write the <code>fun</code> macro:</p>
<pre class="brush: lisp"><code>(defmacro fun ((&rest args) &body forms)
  ;; still not dealing with declarations
  `(function (lambda (,@args) (with-debugging ,@forms))))</code></pre>
<p>This is easier to understand of course. But all it is is a (fairly elaborate!) wrapper around what I did above.</p>
<h2 id="watching-the-detectives">Watching the detectives</h2>
<p>Using <a href="https://tfeb.github.io/tfeb-lisp-hax/#tracing-macroexpansion-trace-macroexpand"><code>trace-macroexpand</code></a> you can watch macroexpansion happen.</p>
<pre><code>> (needs (:org.tfeb.hax.trace-macroexpand :compile t :use t))
; Loading [...]
((:org.tfeb.hax.trace-macroexpand t))
> (trace-macroexpand t)
nil
> (setf *trace-macroexpand-print-length* nil
        *trace-macroexpand-print-level* nil)
nil
> (trace-macro fun with-debugging)
(fun with-debugging)
> (setf *debugging* nil)
nil
> (funcall (fun (a) a) 1)
(fun (a) a)
-> #'(lambda (a) (with-debugging a))
(with-debugging a)
-> (progn (when *debugging* (format *debug-io* "~&~S~%" 'a)) a)
(with-debugging a)
-> (progn (when *debugging* (format *debug-io* "~&~S~%" 'a)) a)
1</code></pre>
<p>Note that <code>with-debugging</code> is expanded twice: this is an artifact of the implementation: there’s no promise that macros only get expanded once in interpreted code.</p>
<hr />
<div class="footnotes">
<ol>
<li id="2022-07-05-macroexpansion-in-common-lisp-footnote-1-definition" class="footnote-definition">
<p>This was once going to be a Stack Overflow answer, and I didn’t want to throw it away. <a href="#2022-07-05-macroexpansion-in-common-lisp-footnote-1-return">↩</a></p></li>
<li id="2022-07-05-macroexpansion-in-common-lisp-footnote-2-definition" class="footnote-definition">
<p>And of course I might just be wrong about some details. <a href="#2022-07-05-macroexpansion-in-common-lisp-footnote-2-return">↩</a></p></li>
<li id="2022-07-05-macroexpansion-in-common-lisp-footnote-3-definition" class="footnote-definition">
<p>I am not talking about the environment objects which get passed to macro functions. <a href="#2022-07-05-macroexpansion-in-common-lisp-footnote-3-return">↩</a></p></li>
<li id="2022-07-05-macroexpansion-in-common-lisp-footnote-4-definition" class="footnote-definition">
<p>Another way of thinking about <code>((lambda (...) ...) ...)</code> is that it is the same as <code>(funcall (function (lambda (...) ...)) ...)</code> and, since <code>function</code> is a special operator, its rules apply, and include expanding the forms in the body of the <code>(lambda (...) ...)</code> form (and of course <code>lambda</code> is itself a macro, so <code>(lambda (...) ...)</code> expands to <code>(function (lambda (...) ...))</code> and then the rules for <code>function</code> apply again). I am old enough to remember adding the macro for <code>lambda</code> to various antique CLs. <a href="#2022-07-05-macroexpansion-in-common-lisp-footnote-4-return">↩</a></p></li></ol></div>Avoiding circularity: a simple exampleurn:https-www-tfeb-org:-fragments-2022-03-23-avoiding-circularity-a-simple-example2022-03-23T17:54:40Z2022-03-23T17:54:40ZTim Bradshaw
<p>Here’s a simple example of dealing with a naturally circular function definition.</p>
<!-- more-->
<p>Common Lisp has a predicate called <a href="http://www.lispworks.com/documentation/HyperSpec/Body/f_everyc.htm"><code>some</code></a>. Here is what looks like a natural definition of a slightly more limited version of this predicate, which only works on lists, in Racket:</p>
<pre class="brush: racket"><code>(define (some? predicate . lists)
  ;; Just avoid the spread/nospread problem
  (some*? predicate lists))

(define (some*? predicate lists)
  (cond
    [(null? lists)
     ;; if there are no elements the predicate is not true
     #f]
    [(some? null? lists)
     ;; if any of the lists is empty we've failed
     #f]
    [(apply predicate (map first lists))
     ;; The predicate is true on the first elements
     #t]
    [else
     (some*? predicate (map rest lists))]))</code></pre>
<p>Well, that looks neat, right? Except it is very obviously doomed because <code>some*?</code> falls immediately into an infinite recursion.</p>
<p>Well, the trick to avoid this is to check whether the predicate is <code>null?</code> and handle that case explicitly:</p>
<pre class="brush: racket"><code>(define (some*? predicate lists)
  (cond
    [(null? lists)
     (error 'some? "need at least one list")]
    [(eq? predicate null?)
     ;; Catch the circularity and defang it
     (match lists
       [(list (? list? l))
        (cond
          [(null? l)
           #f]
          [(null? (first l))
           #t]
          [else
           (some? null? (rest l))])]
       [_ (error 'some? "~S bogus for null?" lists)])]
    [(some? null? lists)
     ;; if any of the lists is empty we've failed
     #f]
    [(apply predicate (map first lists))
     ;; The predicate is true on the first elements
     #t]
    [else
     (some*? predicate (map rest lists))]))</code></pre>
<p>And this now works fine.</p>
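<p>For instance (a few examples of my own, using the definitions above):</p>
<pre class="brush: racket"><code>;; Ordinary uses:
(some? even? '(1 2 3))        ; => #t
(some? even? '(1 3 5))        ; => #f
(some? &lt; '(1 2 3) '(2 3 3))   ; => #t

;; And the formerly-circular case now terminates:
(some? null? '((1) ()))       ; => #t
(some? null? '((1) (2)))      ; => #f</code></pre>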
<p>Of course this is a rather inefficient version of such a predicate, but it’s nice. Well, I think it is.</p>
<hr />
<p>Note: a previous version of this had an extremely broken version of <code>some*?</code> which worked, by coincidence, sometimes.</p>Two understandable deficiencies in Common Lispurn:https-www-tfeb-org:-fragments-2022-03-22-two-understandable-deficiencies-in-common-lisp2022-03-22T09:58:28Z2022-03-22T09:58:28ZTim Bradshaw
<p>Common Lisp is, I think, a remarkably pleasant language, despite what some people like to say. Here are two small deficiencies, both of which are understandable in terms of the history of CL, and both of which ultimately hurt naïve programmers working in CL.</p>
<!-- more-->
<h2 id="the-default-floating-point-type-is-single-float">The default floating-point type is <code>single-float</code></h2>
<p>There are two things that make this true:</p>
<ul>
<li><a href="http://www.lispworks.com/documentation/HyperSpec/Body/v_rd_def.htm"><code>*read-default-float-format*</code></a> is initially <code>single-float</code>, which means that, unless it is changed, <code>1.0</code> reads as <code>1.0f0</code>, a single float<sup><a href="#2022-03-22-two-understandable-deficiencies-in-common-lisp-footnote-1-definition" name="2022-03-22-two-understandable-deficiencies-in-common-lisp-footnote-1-return">1</a></sup>;</li>
<li>The <a href="http://www.lispworks.com/documentation/HyperSpec/Body/f_float.htm"><code>float</code></a> function will convert to a single float unless it is given a prototype which is not a single float: <code>(float 1)</code> is <code>1.0f0</code>, while to get a double float you would need <code>(float 1 1.0d0)</code>.</li></ul>
<p>In addition things like <a href="http://www.lispworks.com/documentation/HyperSpec/Body/m_w_std_.htm"><code>with-standard-io-syntax</code></a> bind <code>*read-default-float-format*</code> to <code>single-float</code>, so you have to do a little more work to make doubles the default.</p>
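<p>Concretely (printed representations shown as a typical implementation would print them):</p>
<pre class="brush: lisp"><code>;; By default the reader makes single floats:
(type-of 1.0)                      ; => SINGLE-FLOAT
(float 1)                          ; => 1.0, a single float
(float 1 1.0d0)                    ; => 1.0D0, prototype needed for a double

;; Making doubles the default for the reader:
(setf *read-default-float-format* 'double-float)
(type-of (read-from-string "1.0")) ; => DOUBLE-FLOAT</code></pre>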
<p>I think there are probably several historical reasons why this default was chosen:</p>
<ul>
<li>a long time ago memory was very expensive and single floats take, usually, half the memory of double floats, thus pushing people towards single floats;</li>
<li>a long time ago, perhaps, on some machines, single float operations were significantly faster than double float operations even before possible float consing was taken into account;</li>
<li>Lisp hardware companies with significant influence on the standard, notably Symbolics, made hardware which allowed single (32 bit) floats to be immediate objects, while double floats were not, and had simple-minded compilers which were not capable of optimizing double float operations, thus making double float arithmetic extremely slow compared to single float arithmetic, and these companies wanted their machines to seem fast (they never, really, were) for naïve users;</li>
<li>it was not clear that implementations would choose <code>single-float</code> to mean ‘single precision IEEE 754 float’ and <code>double-float</code> to mean ‘double precision IEEE 754 float’, for instance it’s perfectly legal to have the <code>short-float</code> type mean single precision IEEE 754 and all of the <code>single-float</code>, <code>double-float</code> and <code>long-float</code> types mean double precision IEEE 754;</li>
<li>it wasn’t even clear that <a href="https://en.wikipedia.org/wiki/IEEE_754-1985">IEEE 754</a> would come to dominate how machines implement floating-point: VAXes didn’t, and other machines of interest at the time also did not.</li></ul>
<p>So there are good historical reasons for this. However all implementations I’m aware of now translate <code>short-float</code> to mean <code>single-float</code>, <code>single-float</code> to mean IEEE 754 single precision, <code>double-float</code> to mean IEEE 754 double precision and <code>long-float</code> to be the same as <code>double-float</code>.</p>
<p>So what is the problem with the default float type being <code>single-float</code> in the modern world? The answer is</p>
<pre class="brush: lisp"><code>> (log (/ 1 single-float-epsilon) 10)
7.22472</code></pre>
<p>In other words, single precision IEEE 754 arithmetic has about 7 significant figures of precision. For many purposes, and <em>especially</em> for naïvely-written code that’s at best marginal and at worst less than that. On the other hand</p>
<pre class="brush: lisp"><code>> (log (/ 1 double-float-epsilon) 10)
15.954589770191001D0</code></pre>
<p>which is almost 16 significant figures of precision, more than twice that of single precision.</p>
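<p>To see what about seven significant figures means in practice: \(2^{24} + 1 = 16777217\) is not representable as a single float, so it collapses onto its neighbour, while a double represents it exactly. (A sketch; any implementation with IEEE 754 float types should agree.)</p>
<pre class="brush: lisp"><code>;; 2^24 + 1 needs 25 bits of significand: a single float (24 bits)
;; cannot hold it, a double float (53 bits) can.
(= (float (1+ (expt 2 24)))       (float (expt 2 24)))       ; => T
(= (float (1+ (expt 2 24)) 1.0d0) (float (expt 2 24) 1.0d0)) ; => NIL</code></pre>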
<p>That’s why the default should have been double precision: it makes naïve code more likely to work, and people who are writing non-naïve code can use single precision if they need it.</p>
<h2 id="the-cl-user-package-is-defined-in-an-implementation-dependent-way">The <code>CL-USER</code> package is defined in an implementation-dependent way</h2>
<p>From <a href="http://www.lispworks.com/documentation/HyperSpec/Body/11_abb.htm">the spec</a>:</p>
<blockquote>
<p>The <code>COMMON-LISP-USER</code> package is the current package when a Common Lisp system starts up. This package uses the <code>COMMON-LISP</code> package. The <code>COMMON-LISP-USER</code> package has the nickname <code>CL-USER</code>. <em>The <code>COMMON-LISP-USER</code> package can have additional symbols interned within it; it can use other implementation-defined packages.</em></p></blockquote>
<p>(My emphasis.)</p>
<p>What this means is that when you start a CL environment, the current package may have all sorts of implementation-dependent symbols visible in it. You can see why this happened: if you’re implementing Super-Whizz-Bang CL which has all sorts of magic extra features, you want at least some of those features to be immediately available to users, rather than requiring them to pore over boring manuals to find them.</p>
<p>But for users, and especially for naïve users, it’s a terrible choice: naïve users don’t know about packages so they write their programs in <code>CL-USER</code>. And they also don’t really know which symbols available in <code>CL-USER</code> come from <code>CL</code> and are thus standard parts of the language, and which come from one of Super-Whizz-Bang CL’s implementation packages, and are <em>not</em> standard parts of the language. So their programs turn into a mess where the portable parts are not distinct from the non-portable parts. The way the <code>CL-USER</code> package is defined thus makes it harder to write programs whose non-portable parts are well-isolated, and ultimately hurts the language.</p>
<p>This is a direct conflict between implementors and users: implementors both want their extra features immediately available so their implementation is shinier and want to encourage users to use these extra features in a way which makes it hard to move their programs to other implementations; users, when they think about it, generally don’t want this second thing, at least.</p>
<p>Instead, the language should have defined <code>CL-USER</code> as a package which <em>only</em> used <code>CL</code>, and perhaps have defined another standard package, perhaps <code>IMPL-USER</code>, which was defined the way <code>CL-USER</code> is today.</p>
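<p>Meanwhile, nothing stops users defending themselves: work in a package of your own which uses only <code>CL</code> (the name <code>MY-USER</code> here is, of course, made up):</p>
<pre class="brush: lisp"><code>;; A user-defined package which uses only the standard CL package:
;; anything else must then be imported explicitly, which keeps the
;; non-portable parts of a program easy to find.
(defpackage :my-user
  (:use :cl))

(in-package :my-user)</code></pre>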
<h2 id="can-these-be-fixed">Can these be fixed?</h2>
<p>While both of these problems could be fixed without changing the standard, I don’t think either can <em>realistically</em> be fixed.</p>
<p>For the <code>single-float</code> problem there is nothing to stop implementations simply defining <code>short-float</code> to mean IEEE 754 single precision and all the other types to mean IEEE 754 double precision. But all the existing code which assumes otherwise will then probably break in exciting ways. So this is unlikely to happen I expect.</p>
<p>The <code>CL-USER</code> problem could be fixed if implementations agreed to define <code>CL-USER</code> to use only <code>CL</code>, as it is allowed to do, and perhaps to define an <code>IMPL-USER</code> package as above. Of course that would make implementations slightly less convenient to use, so the chances of it happening would be small, even if implementors actually talked to each other in any useful way, which I suspect they no longer do. Worse than that, this change would break many programs written by naïve users which live in <code>CL-USER</code>, and there are almost certainly lots of those.</p>
<hr />
<p>A moment of convenience, a lifetime of regret, as the old saying goes.</p>
<hr />
<div class="footnotes">
<ol>
<li id="2022-03-22-two-understandable-deficiencies-in-common-lisp-footnote-1-definition" class="footnote-definition">
<p>An earlier version of this article had single floats written as, for instance <code>1.0s0</code>: that’s wrong, those are <em>short</em> floats, single floats are <code>1.0f0</code> for instance. These are almost certainly the same type on any current implementation (and I think on any implementation I have ever used, hence the mistake) but they don’t have to be. Thanks to Prem Nirved for finding this stupidity. <a href="#2022-03-22-two-understandable-deficiencies-in-common-lisp-footnote-1-return">↩</a></p></li></ol></div>The endless droning: corrections and clarificationsurn:https-www-tfeb-org:-fragments-2021-11-25-the-endless-droning-corrections-and-clarifications2021-11-25T13:05:57Z2021-11-25T13:05:57ZTim Bradshaw
<p>It seems that <a href="https://www.tfeb.org/fragments/2021/11/22/the-endless-droning">my article</a> about the existence in the Lisp community of rather noisy people who seem to enjoy complaining rather than fixing things has attracted some interest. Some things in it were unclear, and some other things seem to have been misinterpreted: here are some corrections and clarifications.</p>
<!-- more-->
<p>First of all, some people pointed out, correctly, that LispWorks is expensive if you live in a low-income country. That’s true: I should have been clearer that I believe the phenomenon I am describing is exclusively a rich-world one. I may be incorrect but I have never heard anyone from a non-rich-world country doing this kind of destructive whining.</p>
<p>It may also have appeared that I am claiming that <em>all</em> Lisp people do this: I’m not. I think the number of people is very small, and that it has always been small. But they are very noisy and even a small number of noisy people can be very destructive.</p>
<p>Some people seem to have interpreted what I wrote as saying that the current situation was fine and that Emacs / SLIME / SLY was in fact the best possible answer. Given that my second sentence was</p>
<blockquote>
<p>[Better IDEs] would obviously be desirable.</p></blockquote>
<p>this is a curious misreading. Just in case I need to make the point any more strongly: I don’t think that Emacs is some kind of be-all and end-all: better IDEs would be very good. But I also don’t think Emacs is this insurmountable barrier that people pretend it is, and I also very definitely think that some small number of people are claiming it is <em>because they want to lose</em>.</p>
<p>I should point out that this claim that it is not an insurmountable barrier comes from some experience: I have taught people Common Lisp, for money, and I’ve done so based on at least three environments:</p>
<ul>
<li>LispWorks;</li>
<li>Something based around Emacs and a CL running under it;</li>
<li>Genera.</li></ul>
<p>None of those environments presented any significant barrier. I think that LW was probably the most liked but none of them got in the way or put people off.</p>
<p>In summary: I don’t think that the current situation is ideal, and if you read what I wrote as saying that you need to read more carefully. I <em>do</em> think that the current situation is not going to deter anyone seriously interested and is very far from the largest barrier to becoming good at Lisp. I <em>do</em> think that, if you want to do something to make the situation better then you should do it, not hang around on reddit complaining about how awful it is, but that there are a small number of noisy people who do exactly that because, for them, <em>no</em> situation would be ideal because what they want is to <em>avoid</em> being able to get useful work done. Those people, unsurprisingly, often become extremely upset when you confront them with this awkward truth about themselves. They are also extremely destructive influences on any discussion around Lisp. (Equivalents of these noisy people exist in other areas, of course.) That’s one of the reasons I no longer participate in the forums where these people tend to exist.</p>
<hr />
<p>(Thanks to an ex-colleague for pointing out that I should perhaps post this.)</p>The endless droningurn:https-www-tfeb-org:-fragments-2021-11-22-the-endless-droning2021-11-22T12:36:25Z2021-11-22T12:36:25ZTim Bradshaw
<p>Someone <a href="https://www.reddit.com/r/lisp/comments/qz0a3j/why_there_is_no_new_modern_common_lisp_ide/">asked about better Lisp IDEs on reddit</a>. Such things would obviously be desirable. But the comments are entirely full of the usual sad endless droning from people who need there always to be something preventing them from doing what they pretend to want to do, and are happy to invent such barriers where none really exist. comp.lang.lisp lives on in spirit if not in fact.</p>
<p>[The rest of this article is a lot ruder than the above and I’ve intentionally censored it from the various feeds. See also <a href="https://www.tfeb.org/fragments/2021/11/25/the-endless-droning-corrections-and-clarifications">corrections and clarifications</a>.]</p>
<!-- more-->
<p>First of all it is nice to see people dismissing LispWorks because it’s ‘too expensive’. LW actually <em>has</em> an IDE and it actually <em>does</em> provide an editor which (while an Emacs inside) can pretend to be a native mac or windows editor. And it’s portable: you can develop on Windows and then build and deploy on Linux and that just works, and has done for at least two decades. But it’s ‘too expensive’: a new license for LW might cost the equivalent of a few days of employing a programmer, and the support on that license (which gets you upgrades for ever) might be a day or so. If that’s ‘too expensive’ then your costing is so fucked you might as well give up now and become a beggar. (The announcement of the Haskell IDE which triggered the post is for a commercial one, by the way, so let’s not have any ‘oh, but it’s not ideologically pure’ noise, thanks.)</p>
<p>And then we get the endless ‘things were better on ⟨<em>ancient technology of your choice</em>⟩’. Here’s the thing: I used both Symbolics and Interlisp-D based systems, extensively. They weren’t better than the LW IDE. They had one or two neat features that the LW IDE doesn’t because it’s hard to do on modern hardware, but they were not better. In the case of Interlisp-D systems it took a couple of weeks of practice before you could even use the thing for more than ten minutes without spending most of the time wondering what some front panel code meant (it always meant ‘I have crashed for reasons I cannot explain and you have lost your work and must now reload the sysout and that will take half an hour’) and how to restart it. That was … harder than learning Emacs. Those ancient systems might have been better than Emacs/SLIME … but they might not, I am not sure. But always, always there is the endless mindless droning from people mourning some distant lost golden age: well, I was <em>there</em> and that golden age never existed.</p>
<p>And then there’s the ‘but the new programmers find Emacs hard’. Seriously? Because people starting to learn Lisp are learning a language whose key idea is that it is a programming language <em>in which you write programming languages</em>. Lisp makes doing far more possible than other languages, but nothing is ever going to make it easy because designing programming languages turns out to be hard. Lisp is a language all of whose interesting features are intellectually difficult ideas. If you are put off Lisp by having to learn some different keys to press, <em>give up now</em> and learn Python or some other intellectually undemanding language instead, because Emacs is not remotely the hardest thing you are going to have to deal with. This is like people doing maths degrees complaining about the squiggly Greek characters: if that’s putting you off maths, <em>don’t do maths</em>. OK, ζ and ξ are kind of fiddly to write, but understanding what a Banach space is actually <em>is</em> hard. And, by the way, at some point you <em>are</em> going to have to learn LaTeX, and if you think Emacs is hard, you have a whole other think coming.</p>
<p>Oh, and by the way, I’ve worked somewhere where large numbers of people from non-programming backgrounds wrote vast masses of Python. How did they do it? They used Emacs: some of them probably used vi or vim. But they were actual scientists so they know what hard things are, and knew that learning Emacs was not one of those things.</p>
<p>And finally, there’s a long diatribe from someone listing all the steps they had to go through to get a CL IDE set up on a machine. This same person claims to have run teams of Lisp programmers. Well, there’s this idea called <em>programming</em>: if you have a long laborious set of tasks to do more than once <em>you write a program to do that for you</em>. And yes, I have done just that.</p>
<hr />
<p>All of these people <em>want to lose</em>: they need there always to be something in the way that prevents them getting whatever it is they pretend to want to do done. If such a barrier is removed <em>they will build a new one</em>: I know this because I have done just that and watched them build their new barrier so they could avoid actually doing anything and keep complaining. These barriers <em>do not exist</em>: if you want a cross-platform IDE for Lisp <a href="http://www.lispworks.com/"><em>that IDE exists</em></a>. If you don’t want to use a commercial product, Emacs and SLIME/SLY are free, and fine. And yes there is a learning curve which is somewhat steep, but <em>intellectually difficult things have steep learning curves</em>: if you’re going to become a productive mathematician you are going to go through four years of very steep learning curve indeed, and if you’re going to become a productive Lisp programmer you’re going to go through a learning curve perhaps a tenth or less as hard as that, of which Emacs is one tiny part. If you’re not up to that, <em>don’t write Lisp</em>.</p>
<p>And if what you enjoy doing is whining in public about how things are always in your way then <em>fuck off</em>.</p>The proper use of macros in Lispurn:https-www-tfeb-org:-fragments-2021-11-11-the-proper-use-of-macros-in-lisp2021-11-11T14:32:11Z2021-11-11T14:32:11ZTim Bradshaw
<p>People learning Lisp often try to learn how to write macros by taking an existing function they have written and turning it into a macro. This is a mistake: macros and functions serve different purposes and it is almost never useful to turn functions into macros, or macros into functions.</p>
<!-- more-->
<p>Let’s say you are learning Common Lisp<sup><a href="#2021-11-11-the-proper-use-of-macros-in-lisp-footnote-1-definition" name="2021-11-11-the-proper-use-of-macros-in-lisp-footnote-1-return">1</a></sup>, and you have written a fairly obvious factorial function based on the natural mathematical definition: if \(n \in \mathbb{N}\), then</p>
<p>\[
n! =
\begin{cases}
1 &n \le 1\\
n \times (n - 1)! &n > 1
\end{cases}
\]</p>
<p>So this gives you a fairly obvious recursive definition of <code>factorial</code>:</p>
<pre class="brush: lisp"><code>(defun factorial (n)
  (if (<= n 1)
      1
      (* n (factorial (1- n)))))</code></pre>
<p>And so, you think you want to learn about macros so can you write <code>factorial</code> as a macro? And you might end up with something like this:</p>
<pre class="brush: lisp"><code>(defmacro factorial (n)
  `(if (<= ,n 1)
       1
       (* ,n (factorial ,(1- n)))))</code></pre>
<p>And this superficially seems as if it works:</p>
<pre class="brush: lisp"><code>> (factorial 10)
3628800</code></pre>
<p>But it doesn’t, in fact, work:</p>
<pre class="brush: lisp"><code>> (let ((x 3))
    (factorial x))
Error: In 1- of (x) arguments should be of type number.</code></pre>
<p>Why doesn’t this work and can it be fixed so it does? If it can’t what has gone wrong and how are macros meant to work and what are they useful for?</p>
<p>It can’t be fixed so that it works: trying to rewrite functions as macros is a bad idea, and if you want to learn what is interesting about macros you should not start there.</p>
<p>To understand why this is true you need to understand what macros actually <em>are</em> in Lisp.</p>
<h2 id="what-macros-are-a-first-look">What macros are: a first look</h2>
<p><strong>A macro is a function whose domain and range is <em>syntax</em>.</strong></p>
<p>Macros <em>are</em> functions (quite explicitly so in CL: you can get at the function of a macro with <code>macro-function</code>, and this is something you can happily call the way you would call any other function), but they are functions whose domain and range is <em>syntax</em>. A macro is a function whose argument is a language whose syntax includes the macro and whose value, when called on an instance of that language, is a language whose syntax <em>doesn’t</em> include the macro. It may work recursively: its value may be a language which includes the same macro but in some simpler way, such that the process will terminate at some point.</p>
<p>So the job of macros is to provide a family of extended languages built on some core Lisp which has no remaining macros, only functions and function application, special operators & special forms involving them and literals. One of those languages is the language we call Common Lisp, but the macros written by people serve to extend this language into a multitude of variants.</p>
<p>As an example of this I often write in a language which is like CL, but is extended by the presence of a number of extra constructs, one of which is called ITERATE (but it predates the well-known one and is not at all the same):</p>
<pre class="brush: lisp"><code>(iterate next ((x 1))
  (if (< x 10)
      (next (1+ x))
      x))</code></pre>
<p>is equivalent to</p>
<pre class="brush: lisp"><code>(labels ((next (x)
           (if (< x 10)
               (next (1+ x))
               x)))
  (next 1))</code></pre>
<p>Once upon a time when I first wrote <code>iterate</code>, it used to manually optimize the recursive calls to jumps in some cases, because the Symbolics I wrote it on didn’t have tail-call elimination. That’s a non-problem in LispWorks<sup><a href="#2021-11-11-the-proper-use-of-macros-in-lisp-footnote-2-definition" name="2021-11-11-the-proper-use-of-macros-in-lisp-footnote-2-return">2</a></sup>. Anyone familiar with Scheme will recognise <code>iterate</code> as named <code>let</code>, which is where it came from (once, I think, it was known as <code>nlet</code>).</p>
<p><code>iterate</code> is implemented by a function which maps from the language which includes it to a language which doesn’t include it, by mapping the syntax as above.</p>
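<p>A minimal sketch of how such a macro might be written, assuming only the simple expansion shown above (the author’s real <code>iterate</code>, which once hand-optimized tail calls, is certainly more elaborate):</p>
<pre class="brush: lisp"><code>;; Hypothetical minimal ITERATE: rewrite into LABELS as shown above.
(defmacro iterate (name (&rest bindings) &body forms)
  `(labels ((,name ,(mapcar #'first bindings)
              ,@forms))
     (,name ,@(mapcar #'second bindings))))</code></pre>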
<p>So compare this with a factorial function: factorial is a function whose domain is natural numbers and whose range is also natural numbers, and it has an obvious recursive definition. Well, natural numbers are part of the syntax of Lisp, but they’re a tiny part of it. So implementing factorial as a macro is, really, a hopeless task. What should</p>
<pre class="brush: lisp"><code>(factorial (+ x y (f z)))</code></pre>
<p>actually do when considered as a mapping between languages? Assuming you are using the recursive definition of the factorial function then the answer is it can’t map to anything useful at all: a function which implements that recursive definition simply has to be called at run time. The very best you could do would seem to be this:</p>
<pre class="brush: lisp"><code>(defun fact (n)
  (if (<= n 1)
      1
      (* n (fact (1- n)))))

(defmacro factorial (expression)
  `(fact ,expression))</code></pre>
<p>And that’s not a useful macro (but see below).</p>
<p>So the answer is, again, that macros are functions which map between <em>languages</em> and they are useful where you want a new <em>language</em>: not just the same language with extra functions in it, but a language with new control constructs or something like that. If you are writing functions whose range is something which is not the syntax of a language built on Common Lisp, <em>don’t write macros</em>.</p>
<h2 id="what-macros-are-a-second-look">What macros are: a second look</h2>
<p><strong>Macroexpansion is compilation.</strong></p>
<p>A function whose domain is one language and whose range is another is a <em>compiler</em> for the language of the domain, especially when that language is somehow richer than the language of the range, which is the case for macros.</p>
<p>But it’s a simplification to say that <em>macros</em> are this function: they’re not, they’re only part of it. The actual function which maps between the two languages is made up of macros <em>and the macroexpander provided by CL itself</em>. The macroexpander is what arranges for the functions defined by macros to be called in the right places, and also it is the thing which arranges for various recursive macros to actually make up a recursive function. So it’s important to understand that the macroexpander is a critical part of the process: macros on their own only provide part of it.</p>
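<p>You can see the two halves of the machinery separately: <code>macroexpand-1</code> asks the macroexpander for a single step, and <code>macro-function</code> retrieves the underlying function, which can be called like any other. (The exact expansion of a standard macro such as <code>when</code> is implementation-dependent; the one shown is typical.)</p>
<pre class="brush: lisp"><code>;; One step of expansion, via the macroexpander:
(macroexpand-1 '(when (> x 1) (print x)))
;; => (IF (> X 1) (PROGN (PRINT X))), T   (typically)

;; The macro function itself: an ordinary function from syntax to
;; syntax, taking the whole form and an environment:
(funcall (macro-function 'when) '(when (> x 1) (print x)) nil)</code></pre>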
<h2 id="an-example-two-versions-of-a-recursive-macro">An example: two versions of a recursive macro</h2>
<p>People often say that you should not write recursive macros, but this prohibition on recursive macros is pretty specious: they’re just fine. Consider a language which only has <code>lambda</code> and doesn’t have <code>let</code>. Well, we can write a simple version of <code>let</code>, which I’ll call <code>bind</code> as a macro: a function which takes this new language and turns it into the more basic one. Here’s that macro:</p>
<pre class="brush: lisp"><code>(defmacro bind ((&rest bindings) &body forms)
  `((lambda ,(mapcar #'first bindings) ,@forms)
    ,@(mapcar #'second bindings)))</code></pre>
<p>And now</p>
<pre class="brush: lisp"><code>> (bind ((x 1) (y 2))
    (+ x y))
(bind ((x 1) (y 2)) (+ x y))
 -> ((lambda (x y) (+ x y)) 1 2)
3</code></pre>
<p>(These example expansions come via use of my <a href="https://tfeb.github.io/tfeb-lisp-hax/#tracing-macroexpansion-trace-macroexpand">trace-macroexpand package</a>, available in a good Lisp near you: see appendix for configuration).</p>
<p>So now we have a language with a binding form which is more convenient than <code>lambda</code>. But maybe we want to be able to bind sequentially? Well, we can write a <code>let*</code> version, called <code>bind*</code>, which looks like this:</p>
<pre class="brush: lisp"><code>(defmacro bind* ((&rest bindings) &body forms)
  (if (null (rest bindings))
      `(bind ,bindings ,@forms)
      `(bind (,(first bindings))
         (bind* ,(rest bindings) ,@forms))))</code></pre>
<p>And you can see how this works: it checks if there’s just one binding in which case it’s just <code>bind</code>, and if there’s more than one it peels off the first and then expands into a <code>bind*</code> form for the rest. And you can see this working (here both <code>bind</code> and <code>bind*</code> are being traced):</p>
<pre class="brush: lisp"><code>> (bind* ((x 1) (y (+ x 2)))
    (+ x y))
(bind* ((x 1) (y (+ x 2))) (+ x y))
 -> (bind ((x 1)) (bind* ((y (+ x 2))) (+ x y)))
(bind ((x 1)) (bind* ((y (+ x 2))) (+ x y)))
 -> ((lambda (x) (bind* ((y (+ x 2))) (+ x y))) 1)
(bind* ((y (+ x 2))) (+ x y))
 -> (bind ((y (+ x 2))) (+ x y))
(bind ((y (+ x 2))) (+ x y))
 -> ((lambda (y) (+ x y)) (+ x 2))
(bind* ((y (+ x 2))) (+ x y))
 -> (bind ((y (+ x 2))) (+ x y))
(bind ((y (+ x 2))) (+ x y))
 -> ((lambda (y) (+ x y)) (+ x 2))
4</code></pre>
<p>You can see that, in this implementation, which is LW again, some of the forms are expanded more than once: that’s not uncommon in interpreted code, and since macro functions should generally be pure (have no side-effects) it does not matter that they may be expanded multiple times. Compilation will expand macros and then compile the result, so all the overhead of macroexpansion happens ahead of run-time:</p>
<pre class="brush: lisp"><code>> (defun foo (x)
    (bind* ((y (1+ x)) (z (1+ y)))
      (+ y z)))
foo

> (compile *)
(bind* ((y (1+ x)) (z (1+ y))) (+ y z))
 -> (bind ((y (1+ x))) (bind* ((z (1+ y))) (+ y z)))
(bind ((y (1+ x))) (bind* ((z (1+ y))) (+ y z)))
 -> ((lambda (y) (bind* ((z (1+ y))) (+ y z))) (1+ x))
(bind* ((z (1+ y))) (+ y z))
 -> (bind ((z (1+ y))) (+ y z))
(bind ((z (1+ y))) (+ y z))
 -> ((lambda (z) (+ y z)) (1+ y))
foo
nil
nil

> (foo 3)
9</code></pre>
<p>There’s nothing wrong with macros like this, which expand into simpler versions of themselves. You just have to make sure that the recursive expansion process is producing successively simpler bits of syntax and has a well-defined termination condition.</p>
<p>Macros like this are often called ‘recursive’ but they’re actually not: the function associated with <code>bind*</code> does not call itself. What <em>is</em> recursive is the function implicitly defined by the combination of the macro function and the macroexpander: the <code>bind*</code> function simply expands into a bit of syntax which it knows will cause the macroexpander to call it again.</p>
<p>It is possible to write <code>bind*</code> such that the macro function <em>itself</em> is recursive:</p>
<pre class="brush: lisp"><code>(defmacro bind* ((&rest bindings) &body forms)
  (labels ((expand-bind (btail)
             (if (null (rest btail))
                 `(bind ,btail
                    ,@forms)
                 `(bind (,(first btail))
                    ,(expand-bind (rest btail))))))
    (expand-bind bindings)))</code></pre>
<p>And now compiling <code>foo</code> again results in this output from tracing macroexpansion:</p>
<pre class="brush: lisp"><code>(bind* ((y (1+ x)) (z (1+ y))) (+ y z))
 -> (bind ((y (1+ x))) (bind ((z (1+ y))) (+ y z)))
(bind ((y (1+ x))) (bind ((z (1+ y))) (+ y z)))
 -> ((lambda (y) (bind ((z (1+ y))) (+ y z))) (1+ x))
(bind ((z (1+ y))) (+ y z))
 -> ((lambda (z) (+ y z)) (1+ y))</code></pre>
<p>You can see that now all the recursion happens within the macro function for <code>bind*</code> itself: the macroexpander calls <code>bind*</code>’s macro function just once.</p>
<p>While it’s possible to write macros like this second version of <code>bind*</code>, it is normally easier to write the first version and to allow the combination of the macroexpander and the macro function to implement the recursive expansion.</p>
<hr />
<h2 id="two-historical-uses-for-macros">Two historical uses for macros</h2>
<p>There are two uses for macros — both now historical — where they <em>were</em> used where functions would be more natural.</p>
<p>The first of these is <em>function inlining</em>, where you want to avoid the overhead of calling a small function many times. This overhead was a lot on computers made of cardboard, as all computers were, and also if the stack got too deep the cardboard would tear and this was bad. It makes no real sense to inline a recursive function such as the above <code>factorial</code>: how would the inlining process terminate? But you could rewrite a factorial function to be explicitly iterative:</p>
<pre class="brush: lisp"><code>(defun factorial (n)
  (do* ((k 1 (1+ k))
        (f k (* f k)))
       ((>= k n) f)))</code></pre>
<p>And now, if you had very many calls to <code>factorial</code>, wanted to optimise the function call overhead away, <em>and it was 1975</em>, you might write this:</p>
<pre class="brush: lisp"><code>(defmacro factorial (n)
  `(let ((nv ,n))
     (do* ((k 1 (1+ k))
           (f k (* f k)))
          ((>= k nv) f))))</code></pre>
<p>And this has the effect of replacing <code>(factorial n)</code> by an expression which will compute the factorial of <code>n</code>. The cost of that is that <code>(funcall #'factorial n)</code> is not going to work, and <code>(funcall (macro-function 'factorial) ...)</code> is never what you want.</p>
<p>Well, that’s what you did in 1975, because Lisp compilers were made out of the things people found down the sides of sofas. Now it’s no longer 1975 and you just tell the compiler that you want it to inline the function, please:</p>
<pre class="brush: lisp"><code>(declaim (inline factorial))
(defun factorial (n) ...)</code></pre>
<p>and it will do that for you. So this use of macros is now purely historical.</p>
<p>The second reason for macros where you really want functions is computing things at compile time. Let’s say you have lots of expressions like <code>(factorial 32)</code> in your code. Well, you could do this:</p>
<pre class="brush: lisp"><code>(defmacro factorial (expression)
(typecase expression
((integer 0)
(factorial/fn expression))
(number
(error "factorial of non-natural literal ~S" expression))
(t
`(factorial/fn ,expression))))</code></pre>
<p>So the <code>factorial</code> macro checks to see if its argument is a literal natural number and will compute the factorial of it at macroexpansion time (so, at compile time or just before compile time). So a function like</p>
<pre class="brush: lisp"><code>(defun foo ()
(factorial 32))</code></pre>
<p>will now compile to simply return <code>263130836933693530167218012160000000</code>. And, even better, there’s some compile-time error checking: code which is, say, <code>(factorial 12.3)</code> will cause a compile-time error.</p>
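<p>You can watch this happening by hand. Assuming the <code>factorial</code> macro above together with a <code>factorial/fn</code> function defined as the earlier iterative version, a hypothetical REPL session might look like this:</p>
<pre class="brush: lisp"><code>> (macroexpand-1 '(factorial 5))
120
t
> (macroexpand-1 '(factorial (read)))
(factorial/fn (read))
t</code></pre>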
<p>Well, again, this is what you would do if it was 1975. It’s not 1975 any more, and CL has a special tool for dealing with just this problem: compiler macros.</p>
<pre class="brush: lisp"><code>(defun factorial (n)
(do* ((k 1 (1+ k))
(f k (* f k)))
((>= k n) f)))
(define-compiler-macro factorial (&whole form n)
(typecase n
((integer 0)
(factorial n))
(number
(error "literal number is not a natural: ~S" n))
(t form)))</code></pre>
<p>Now <code>factorial</code> is a function and works the way you expect — <code>(funcall #'factorial ...)</code> will work fine. But the compiler knows that if it comes across <code>(factorial ...)</code> then it should give the compiler macro for <code>factorial</code> a chance to say what this expression should actually be. And the compiler macro does an explicit check for the argument being a literal natural number, and if it is computes the factorial at compile time, and the same check for a literal number which is not a natural, and finally just says ‘I don’t know, call the function’. Note that the compiler macro itself calls <code>factorial</code>, but since the argument isn’t a literal there’s no recursive doom.</p>
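<p>If you want to see the compiler macro at work without invoking the compiler, the standard <code>compiler-macro-function</code> accessor returns the expander, which takes the whole form and an environment. A hypothetical session:</p>
<pre class="brush: lisp"><code>> (funcall (compiler-macro-function 'factorial) '(factorial 10) nil)
3628800
> (funcall (compiler-macro-function 'factorial) '(factorial x) nil)
(factorial x)</code></pre>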
<p>So this takes care of the other antique use of macros where you would expect functions. And of course you can combine this with inlining and it will all work fine: you can write functions which will handle special cases via compiler macros and will otherwise be inlined.</p>
<p>That leaves macros serving the purpose they are actually useful for: building languages.</p>
<hr />
<h2 id="appendix-setting-up-trace-macroexpand">Appendix: setting up <code>trace-macroexpand</code></h2>
<pre class="brush: lisp"><code>(use-package :org.tfeb.hax.trace-macroexpand)
;;; Don't restrict print length or level when tracing
(setf *trace-macroexpand-print-level* nil
*trace-macroexpand-print-length* nil)
;;; Enable tracing
(trace-macroexpand)
;;; Trace the macros you want to look at ...
(trace-macro ...)
;;; ... and untrace them
(untrace-macro ...)</code></pre>
<hr />
<div class="footnotes">
<ol>
<li id="2021-11-11-the-proper-use-of-macros-in-lisp-footnote-1-definition" class="footnote-definition">
<p>All the examples in this article are in Common Lisp except where otherwise specified. Other Lisps have similar considerations, although macros in Scheme are not explicitly functions in the way they are in CL. <a href="#2021-11-11-the-proper-use-of-macros-in-lisp-footnote-1-return">↩</a></p></li>
<li id="2021-11-11-the-proper-use-of-macros-in-lisp-footnote-2-definition" class="footnote-definition">
<p>This article originated as a message on the <code>lisp-hug</code> mailing list for <a href="http://www.lispworks.com/">LispWorks</a> users. References to ‘LW’ mean LispWorks, although everything here should apply to any modern CL. (In terms of tail call elimination I would define a CL which does not eliminate tail self-calls in almost all cases under reasonable optimization settings as pre-modern: I don’t use such implementations.) <a href="#2021-11-11-the-proper-use-of-macros-in-lisp-footnote-2-return">↩</a></p></li></ol></div>The best Lispurn:https-www-tfeb-org:-fragments-2021-11-03-the-best-lisp2021-11-03T12:03:44Z2021-11-03T12:03:44ZTim Bradshaw
<p>People sometimes ask <a href="https://www.reddit.com/r/lisp/comments/qlcza4/best_lisp_dialect/">which is the best Lisp dialect</a>? That’s a category error, and here’s why.</p>
<!-- more-->
<p>Programming in Lisp — any Lisp — is about <em>building languages</em>: in Lisp the way you solve a problem is by building a language — a jargon, or a dialect if you like — to talk about the problem and then solving the problem in that language. Lisps are, quite explicitly, language-building languages.</p>
<p>This is, in fact, how people solve large problems in <em>all</em> programming languages: <a href="https://en.wikipedia.org/wiki/Greenspun's_tenth_rule" title="Greenspun's tenth rule">Greenspun’s tenth rule</a> isn’t really a statement about Common Lisp, it’s a statement that all sufficiently large software systems end up having some hacked-together, informally-specified, half-working <em>language</em> in which the problem is actually solved. Often people won’t understand that the thing they’ve built is in fact a language, but that’s what it is. Everyone who has worked on large-scale software will have come across these things: often they are very horrible, and involve much use of language-in-a-string<sup><a href="#2021-11-03-the-best-lisp-footnote-1-definition" name="2021-11-03-the-best-lisp-footnote-1-return">1</a></sup>.</p>
<p>The Lisp difference is two things: when you start solving a problem in Lisp, you <em>know</em>, quite explicitly, that this is what you are going to do; and the language has wonderful tools which let you incrementally build a series of lightweight languages, ending up with one or more languages in which to solve the problem.</p>
<p>So, after that preface, why is this question the wrong one to ask? Well, if you are going to program in Lisp you are going to be building languages, and you want those languages not to be awful. Lisp makes it far easier to build languages which are not awful, but it doesn’t prevent you building awful ones if you want to. And again, anyone who has dealt with enough languages built on Lisps will have come across some which are, in fact, awful.</p>
<p>If you are going to build languages then you need to understand how languages work — what makes a language habitable to its human users (the computer does not care with very few exceptions). That means you will need to be a <em>linguist</em>. So the question then is: how do you become a linguist? Well, we know the answer to that, because there are lots of linguists and lots of courses on linguistics. You might say that, well, those people study <em>natural</em> languages, but that’s irrelevant: natural languages have been under evolutionary pressure for a very long time and they’re really <em>good</em> for what they’re designed for (which is not the same as what programming languages are designed for, but the users — humans — are the same).</p>
<p>So, do you become a linguist by learning French? Or German? Or Latin? Or Cuzco Quechua? No, you don’t. You become a linguist by learning enough about enough languages that you can understand how languages work. A linguist isn’t someone who speaks French really well: they’re someone who understands that French is a Romance language, that German isn’t but has many Romance loan words, that English is closer to German than it is French but got a vast injection of Norman French, which in turn wasn’t that close to modern French, that Swiss German has cross-serial dependencies but Hochdeutsch does not and what that means, and so on. A linguist is someone who understands things about the <em>structure</em> of languages: what do you see, what do you never see, how do different languages do equivalent things? And so on.</p>
<p>The way you become a linguist is not by picking a language and learning it: it’s by looking at lots of languages enough to understand how they work.</p>
<p>If you want to learn to program in Lisp, you will need to become a linguist. The very best way to ensure you fail at that is to pick a ‘best’ Lisp and learn that. There is no best Lisp, and in order to program well in <em>any</em> Lisp you must be exposed to as many Lisps and as many other languages as possible.</p>
<hr />
<p>If you think there’s a distinction between a ‘dialect’, a ‘jargon’ and a ‘language’ then I have news for you: there is. A language is a dialect with a standards committee. (This is stolen from a quote due to Max Weinreich that all linguists know:</p>
<blockquote>
<p>אַ שפּראַך איז אַ דיאַלעקט מיט אַן אַרמיי און פֿלאָט</p></blockquote>
<p>a shprakh iz a dyalekt mit an armey un flot.)</p>
<hr />
<div class="footnotes">
<ol>
<li id="2021-11-03-the-best-lisp-footnote-1-definition" class="footnote-definition">
<p>‘Language-in-a-string’ is where a programming language has another programming language embedded in strings in the outer language. Sometimes programs in that inner programming language will be made up by string concatenation in the outer language. Sometimes that inner language will, in turn, have languages embedded in its strings. It’s a terrible, terrible thing. <a href="#2021-11-03-the-best-lisp-footnote-1-return">↩</a></p></li></ol></div>Generic interfaces in Racketurn:https-www-tfeb-org:-fragments-2021-01-08-generic-interfaces-in-racket2021-01-08T18:25:59Z2021-01-08T18:25:59ZTim Bradshaw
<p>Or: things you do to distract yourself from watching an attempted fascist coup.</p>
<!-- more-->
<p>A thing that exists in many languages with a notion of a sequence of objects is a function variously known as <code>fold</code> or <code>reduce</code>: this takes another function of two arguments, some initial value, and walks along the sequence successively reducing it using the function. So, for instance:</p>
<ol>
<li><code>(fold + 0 '(1 2 3))</code> turns into <code>(fold + (+ 0 1) '(2 3))</code> which turns into …</li>
<li><code>(fold + 1 '(2 3))</code> turns into <code>(fold + (+ 1 2) '(3))</code> which turns into …</li>
<li><code>(fold + 3 '(3))</code> turns into <code>(fold + (+ 3 3) '())</code> which turns into …</li>
<li><code>6</code>.</li></ol>
<p>It’s pretty easy to write a version of <code>fold</code> for lists:</p>
<pre><code>(define (fold op initial l)
(if (null? l)
initial
(fold op (op initial (first l)) (rest l))))</code></pre>
<p>Racket calls this (or a more careful version of this) <code>foldl</code>: there is also <code>foldr</code> which works from the other end of the list and is more expensive as a result.</p>
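<p>The difference between the two is easy to see with <code>cons</code> as the operation (note that Racket’s <code>foldl</code> passes the element first and the accumulated value second, the other way round from the sketch above):</p>
<pre><code>> (foldl cons '() '(1 2 3))
'(3 2 1)
> (foldr cons '() '(1 2 3))
'(1 2 3)</code></pre>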
<p>Well, one thing you might want to do is have a version of <code>fold</code> which works on <em>trees</em> rather than just lists. One definition of a tree is:</p>
<ol>
<li>it’s a collection of nodes;</li>
<li>nodes have values;</li>
<li>nodes have zero or more unique children, which are nodes.</li>
<li>no node is the descendant of more than one node;</li>
<li>there is exactly one root node which is the descendant of no other nodes.</li></ol>
<p>A variant of this (which will matter below) is that the children of a node are either nodes or any other object, and there is some way of knowing if something is a node or not<sup><a href="#2021-01-08-generic-interfaces-in-racket-footnote-1-definition" name="2021-01-08-generic-interfaces-in-racket-footnote-1-return">1</a></sup>.</p>
<p>You can obviously represent trees as conses, with the value of a cons being its car, and the children being its cdr. Whatever builds the tree needs to make sure that (3), (4) and (5) are true, or you get a more general graph structure.</p>
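<p>As a small example of the cons representation (allowing leaves which are not nodes, as in the variant above — the names here are mine):</p>
<pre><code>;; root node with value 1; its children are the node (2 3 4) and the leaf 5
(define a-tree '(1 (2 3 4) 5))
(first a-tree) ; its value: 1
(rest a-tree)  ; its children: '((2 3 4) 5)</code></pre>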
<p>But you might want to have other sorts of trees, and you’d want the fold function not to care about what sort of tree it was processing: just that it was processing a tree. Indeed, it would be nice if it was possible to provide special implementations for, for instance, binary trees where rather than iterating over some sequence of children you’d know there were exactly two.</p>
<p>So, I wondered if there was a nice way of expressing this in Racket and it turns out there mostly is. Racket has a notion of <a href="https://docs.racket-lang.org/reference/struct-generics.html">generic interfaces</a> which are really intended as a way for different <a href="https://docs.racket-lang.org/reference/structures.html">structure types</a> to provide common interfaces, I think. But it turns out they can be (ab?)used to do this, as well.</p>
<p>Generic interfaces are not provided by <code>racket</code> but by <code>racket/generic</code>: everything below assumed <code>(require racket/generic)</code>.</p>
<h2 id="a-generic-treelike-interface">A generic <code>treelike</code> interface</h2>
<p>A treelike object supports two operations:</p>
<ul>
<li><code>node-value</code> returns the value of a node;</li>
<li><code>node-children</code> returns a list of the node’s children.</li></ul>
<p>The second of these is a bit nasty: it would be better perhaps to either provide an interface for mapping over a node’s children, or to return some general, possibly lazy, sequence of children. But this is just playing, so I don’t mind.</p>
<p>Here is a definition of a generic <code>treelike</code> interface, which includes default methods for lists:</p>
<pre><code>(define-generics treelike
;; treelike objects have values and children
(node-value treelike)
(node-children treelike)
#:fast-defaults
(((λ (t)
(and (cons? t) (list? t)))
;; non-null proper lists are trees: their value is their car;
;; their children are their cdr.
(define node-value car)
(define node-children cdr))))</code></pre>
<p>Notes:</p>
<ul>
<li>This uses <code>#:fast-defaults</code> instead of <code>#:defaults</code>, which means that the dispatch for objects which satisfy <code>list?</code> is fast. This is fine in this case: lists are never going to be confused with any other tree type.</li>
<li>This relies on Racket’s (and Scheme’s?) <code>list?</code> predicate returning true only for proper lists rather than CL’s cheap <code>listp</code> which just returns true for anything which is either <code>nil</code> or a cons.</li>
<li>There are lots of other options to <code>define-generics</code> which I’m not using and many of which I don’t understand.</li></ul>
<p>With this definition:</p>
<pre><code>> (treelike? '())
#f
> (treelike? '(1 2 3))
#t
> (treelike? '(1 2 . 3))
#f
> (node-children '(1 2 3))
'(2 3)</code></pre>
<p>So, OK.</p>
<h2 id="a-treelike-binary-tree">A <code>treelike</code> binary tree</h2>
<p>We could then define a <code>binary-tree</code> type which implements this generic interface:</p>
<pre><code>(struct binary-tree (value left right)
#:transparent
#:methods gen:treelike
((define (node-value bt)
(binary-tree-value bt))
(define (node-children bt)
(list (binary-tree-left bt)
(binary-tree-right bt)))))</code></pre>
<p>The <code>#:methods gen:treelike</code> tells the structure we’re defining the methods needed for this thing to be a <code>treelike</code> object.</p>
<p>And now we can check things:</p>
<pre><code>> (treelike? (binary-tree 1 2 3))
#t
> (node-value (binary-tree 1 2 3))
1
> (node-children (binary-tree 1 2 3))
'(2 3)</code></pre>
<p>OK.</p>
<h2 id="two-attempts-at-a-generic-foldable-interface">Two attempts at a generic <code>foldable</code> interface</h2>
<p>So now I want to define another interface for things which can be folded. And the first thing I tried is this:</p>
<pre><code>(define-generics foldable
;; broken
(fold operation initial foldable)
#:defaults
((treelike?
(define (fold op initial treelike)
(let ([current (op initial (node-value treelike))]
[children (node-children treelike)])
(if (null? children)
current
(fold op (fold op current (first children))
(rest children))))))
((const true)
(define (fold op initial any)
(op initial any)))))</code></pre>
<p>So this tries to define a <code>fold</code> generic function, which has two implementations: one for <code>treelike</code> objects and one for <em>all other objects</em>. So this means that <em>all</em> objects are foldable, and, for instance <code>(fold + 0 1)</code> simply turns into <code>(+ 0 1)</code>. This is a bit odd but it simplifies the implementation of the interface for <code>treelike</code> objects on the assumption that the children of nodes may not themselves be nodes (see above).</p>
<p>There is another complexity: if the list of a <code>treelike</code> node’s children isn’t null, then it’s a <code>treelike</code>, so it can safely be recursed over rather than explicitly iterated over. This is a slightly questionable pun I think, but, well, I am a slightly questionable programmer.</p>
<p>And this … doesn’t work:</p>
<pre><code>> (fold + 0 '(1 2 3))
; node-value: contract violation:
; expected: treelike?
; given: 2
; argument position: 1st</code></pre>
<p>It took me a long time to understand this, and the answer is that the definitions of <code>fold</code> inside the <code>define-generic</code> form <em>aren’t adding methods to a generic function</em>: what they are doing is defining a little local function, <code>fold</code>, which <em>then</em> gets glued into the generic function. So references to <code>fold</code> in the definition refer to the little local function. It is exactly as if you had done this, in fact:</p>
<pre><code>(define-generics foldable
;; this is why it's broken
(fold operation initial foldable)
#:defaults
((treelike?
(define fold
(letrec ([fold (λ (op initial treelike)
(let ([current (op initial (node-value treelike))]
[children (node-children treelike)])
(if (null? children)
current
(fold op (fold op current (first children))
(rest children)))))])
fold)))
((const true)
(define (fold op initial any)
(op initial any)))))</code></pre>
<p>And you can see why this can’t work: the <code>fold</code> bound by the <code>letrec</code> calls itself rather than going through the generic dispatch.</p>
<p>The way to fix this is to use the magic <code>define/generic</code> form to get a copy of the generic function, and then call <em>that</em>. This is syntactically horrid, but you can see why it is needed given the above. So a working version of this interface purports to be:</p>
<pre><code>(define-generics foldable
;; not broken
(fold operation initial foldable)
#:defaults
((treelike?
(define/generic fold/g fold)
(define (fold op initial treelike)
(let ([current (op initial (node-value treelike))]
[children (node-children treelike)])
(if (null? children)
current
(fold op (fold/g op current (first children))
(rest children))))))
((const true)
(define (fold op initial any)
(op initial any)))))</code></pre>
<p>And indeed it is not broken:</p>
<pre><code>> (fold + 0 '(1 2 3))
6</code></pre>
<p>and with some tracing added:</p>
<pre><code>> (fold + 0 '(1 2 3))
fold/treelike + 0 (1 2 3)
fold/any + 1 2
fold/treelike + 3 (3)
6</code></pre>
<h2 id="adding-a-special-case-to-fold-for-the-binary-tree">Adding a special case to <code>fold</code> for the binary tree</h2>
<p>So now, finally, we can add a special case to <code>fold</code> to the binary tree defined above, rather than needlessly consing a list of children. We will need the same explicit-generic-function hack as before as the children of a binary tree may not be binary trees.</p>
<pre><code>(struct binary-tree (value left right)
#:transparent
#:methods gen:treelike
((define (node-value bt)
(binary-tree-value bt))
(define (node-children bt)
(list (binary-tree-left bt)
(binary-tree-right bt))))
#:methods gen:foldable
((define/generic fold/g fold)
(define (fold op initial bt)
(fold/g op
(fold/g op (op initial (binary-tree-value bt))
(binary-tree-left bt))
(binary-tree-right bt)))))</code></pre>
<p>And now</p>
<pre><code>> (fold + 0 (binary-tree 1
(binary-tree 2 3 4)
(binary-tree 5 6 7)))
28</code></pre>
<p>and with some tracing</p>
<pre><code>> (fold + 0 (binary-tree 1
(binary-tree 2 3 4)
(binary-tree 5 6 7)))
fold/bt + 0 #(struct:binary-tree 1 #(struct:binary-tree 2 3 4) #(struct:binary-tree 5 6 7))
fold/bt + 1 #(struct:binary-tree 2 3 4)
fold/any + 3 3
fold/any + 6 4
fold/bt + 10 #(struct:binary-tree 5 6 7)
fold/any + 15 6
fold/any + 21 7
28</code></pre>
<h2 id="missing-clos">Missing CLOS</h2>
<p>In some ways this makes me miss CLOS: the explicit-generic-function hack is very annoying, single dispatch is annoying, not being able to define predicate-based methods separately from the <code>define-generics</code> form is annoying. But on the other hand predicate-based dispatch is pretty cool.</p>
<hr />
<div class="footnotes">
<ol>
<li id="2021-01-08-generic-interfaces-in-racket-footnote-1-definition" class="footnote-definition">
<p>Perhaps these should be called ‘sloppy trees’ or something. <a href="#2021-01-08-generic-interfaces-in-racket-footnote-1-return">↩</a></p></li></ol></div>The U combinatorurn:https-www-tfeb-org:-fragments-2020-03-09-the-u-combinator2020-03-09T17:45:22Z2020-03-09T17:45:22ZTim Bradshaw
<p>The U combinator allows you to define recursive functions and I think it is simpler to understand than the Y combinator.</p>
<hr />
<p>It’s not obvious how things like <code>letrec</code> get defined in Scheme, without using secret assignment. In fact I think they <em>are</em> defined using secret assignment:</p>
<pre><code>(letrec ([f (λ (...) ... (f ...) ...)])
...)</code></pre>
<p>turns into</p>
<pre><code>(let ([f ...])
(set! f (λ (...) ... (f ...) ...))
...)</code></pre>
<p>But it’s interesting to see how you can define recursive functions without relying on assignment, including mutually-recursive collections of functions. One way is using the U combinator.</p>
<p>I suspect that there is lots of information about this out there, but it’s seriously hard to search for anything which looks like ‘*-combinator’ now (in revenge I am starting a set of companies called ‘integration by parts’, ‘the quotient rule’ &c).</p>
<p>You can famously do this with the Y combinator, but I didn’t want to do that because Y is something I find I can understand for a few hours at a time and then I have to work it all out again. But it turns out that you can use something much simpler: the U combinator. It seems to be even harder to search for this than Y, but here is a quote about it:</p>
<blockquote>
<p>In the theory of programming languages, the U combinator, \(U\), is the mathematical function that applies its argument to its argument; that is \(U(f) = f(f)\), or equivalently, \(U = \lambda f \cdot f(f)\).</p></blockquote>
<blockquote>
<p>Self-application permits the simulation of recursion in the λ-calculus, which means that the U combinator enables universal computation. (The U combinator is actually more primitive than the more well-known fixed-point Y combinator.)</p></blockquote>
<blockquote>
<p>The expression \(U(U)\) is the smallest non-terminating program.</p></blockquote>
<p>(Text mildly edited from <a href="http://www.ucombinator.org/">here</a>, which unfortunately is not a site all about the U combinator other than this quote.)</p>
<h2 id="prerequisites">Prerequisites</h2>
<p>All of the following code samples are in <a href="https://racket-lang.org/">Racket</a>. The macros are certainly Racket-specific and some of the other code probably is as well. To make the macros work you will need <code>syntax-parse</code> via:</p>
<pre><code>(require (for-syntax syntax/parse))</code></pre>
<p>However note that my use of <code>syntax-parse</code> is naïve in the extreme: I’m really just an unfrozen CL caveman pretending to understand Racket’s macro system.</p>
<p>Also note I have not ruthlessly turned everything into λ: Rather than <code>((λ (...) ...) ...)</code> there is <code>(let ([... ...] ...) ...)</code> in this code; there is use of multiple values including <code>let-values</code>; there is <code>(define (f ...) ...)</code> rather than <code>(define f (λ (...) ...))</code> and so on.</p>
<h2 id="two-versions-of-u">Two versions of U</h2>
<p>The first version of U is the obvious one:</p>
<pre><code>(define (U f)
(f f))</code></pre>
<p>But this will run into some problems with an applicative-order language, which Racket is by default. To avoid that we can make the assumption that <code>(f f)</code> is going to be a function, and wrap that form in another function to delay its evaluation until it’s needed: this is the standard trick that you have to do for Y in an applicative-order language as well. I’m only going to use the applicative-order U when I have to, so I’ll give it a different name:</p>
<pre><code>(define (U/ao f)
(λ args (apply (f f) args)))</code></pre>
<p>Note also that I’m allowing more than one argument rather than doing the pure-λ-calculus thing.</p>
<h2 id="using-u-to-construct-a-recursive-functions">Using U to construct a recursive function</h2>
<p>To do this we do a similar trick that you do with Y: write a function which, if given a function as argument which deals with the recursive cases, will return a recursive function. And obviously I’ll use the Fibonacci function as the canonical recursive function.</p>
<p>So, consider this thing:</p>
<pre><code>(define fibber
(λ (f)
(λ (n)
(if (<= n 2)
1
(+ ((U f) (- n 1))
((U f) (- n 2)))))))</code></pre>
<p>This is a function which, given another function <code>f</code> such that <code>(U f)</code> computes smaller Fibonacci numbers, will return a function which computes the Fibonacci number for <code>n</code>.</p>
<p>In other words, <em><code>U</code> of this function is the Fibonacci function</em>!</p>
<p>And we can test this:</p>
<pre><code>> (define fibonacci (U fibber))
> (fibonacci 10)
55</code></pre>
<p>So that’s very nice.</p>
<h2 id="wrapping-u-in-a-macro">Wrapping U in a macro</h2>
<p>So, to hide all this the first thing to do is to remove the explicit calls to <code>U</code> in the recursion. We can lift them out of the inner function completely:</p>
<pre><code>(define fibber/broken
(λ (f)
(let ([fib (U f)])
(λ (n)
(if (<= n 2)
1
(+ (fib (- n 1))
(fib (- n 2))))))))</code></pre>
<p><em>Don’t try to compute <code>U</code> of this</em>: it will recurse endlessly because <code>(U fibber/broken)</code> -> <code>(fibber/broken fibber/broken)</code> and this involves computing <code>(U fibber/broken)</code>, and we’re doomed.</p>
<p>Instead we can use <code>U/ao</code>:</p>
<pre><code>(define fibber
(λ (f)
(let ([fib (U/ao f)])
(λ (n)
(if (<= n 2)
1
(+ (fib (- n 1))
(fib (- n 2))))))))</code></pre>
<p>And this is all fine <code>((U fibber) 10)</code> is <code>55</code> (and terminates!).</p>
<p>Purists can then turn <code>let</code> into <code>λ</code> in the usual way:</p>
<pre><code>(define fibber
(λ (f)
((λ (fib)
(λ (n)
(if (<= n 2)
1
(+ (fib (- n 1))
(fib (- n 2))))))
(U/ao f))))</code></pre>
<p>And this is really all you need to be able to write the macro:</p>
<pre><code>(define-syntax (with-recursive-binding stx)
(syntax-parse stx
[(_ (name:id value:expr) form ...+)
#'(let ([name (U (λ (f)
(let ([name (U/ao f)])
value)))])
form ...)]))</code></pre>
<p>Or, for the pure of heart:</p>
<pre><code>(define-syntax (with-recursive-binding stx)
(syntax-parse stx
[(_ (name:id value:expr) form ...+)
#'((λ (name)
form ...)
(U (λ (f)
((λ (name)
value)
(U/ao f)))))]))</code></pre>
<p>And this works fine:</p>
<pre><code>(with-recursive-binding (fib (λ (n)
(if (<= n 2)
1
(+ (fib (- n 1))
(fib (- n 2))))))
(fib 10))</code></pre>
<h2 id="a-caveat-on-bindings">A caveat on bindings</h2>
<p>One fairly obvious thing here is that there are <em>two</em> bindings constructed by this macro: the outer one, and an inner one of the same name. And these are not bound to the same function in the sense of <code>eq?</code>:</p>
<pre><code>(with-recursive-binding (ts (λ (it)
(eq? ts it)))
(ts ts))</code></pre>
<p>is <code>#f</code>. This matters only in a language where bindings can be mutated: a language with assignment in other words. Both the outer and inner bindings, unless they have been mutated, are to functions which are identical <em>as functions</em>: they compute the same values for all values of their arguments. In fact, it’s hard to see what purpose <code>eq?</code> would serve in a language without assignment.</p>
<p>This caveat will apply below as well.</p>
<h2 id="two-versions-of-u-for-many-functions">Two versions of U for many functions</h2>
<p>The obvious generalization of U, U*, to many functions is that \(U^*(f_1, \ldots, f_n)\) is the tuple \((f_1(f_1, \ldots, f_n), f_2(f_1, \ldots, f_n), \ldots)\). And a nice way of expressing that in Racket is to use multiple values:</p>
<pre><code>(define (U* . fs)
(apply values (map (λ (f)
(apply f fs))
fs)))</code></pre>
<p>And we need the applicative-order one as well:</p>
<pre><code>(define (U*/ao . fs)
(apply values (map (λ (f)
(λ args (apply (apply f fs) args)))
fs)))</code></pre>
<p>Note that U* is a true generalization of U: <code>(U f)</code> and <code>(U* f)</code> are the same.</p>
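<p>A quick check of that claim, using the <code>fibber</code> from earlier:</p>
<pre><code>> ((U fibber) 10)
55
> ((U* fibber) 10)
55</code></pre>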
<h2 id="using-u-to-construct-mutually-recursive-functions">Using U* to construct mutually-recursive functions</h2>
<p>I’ll work with a trivial pair of functions:</p>
<ul>
<li>an object is a <em>numeric tree</em> if it is a cons and its car and cdr are numeric objects;</li>
<li>an object is a <em>numeric object</em> if it is a number, or if it is a numeric tree.</li></ul>
<p>So we can define ‘maker’ functions (with an ’-er’ convention: a function which makes an <em>x</em> is an <em>x</em>er, or, if <em>x</em> has hyphens in it, an <em>x</em>-er) which will make suitable functions:</p>
<pre><code>(define numeric-tree-er
(λ (nter noer)
(λ (o)
(let-values ([(nt? no?) (U* nter noer)])
(and (cons? o)
(no? (car o))
(no? (cdr o)))))))
(define numeric-object-er
(λ (nter noer)
(λ (o)
(let-values ([(nt? no?) (U* nter noer)])
(cond
[(number? o) #t]
[(cons? o) (nt? o)]
[else #f])))))</code></pre>
<p>Note that for both of these I’ve raised the call to <code>U*</code> a little, simply to make the call to the appropriate value of <code>U*</code> less opaque.</p>
<p>And this works:</p>
<pre><code>(define-values (numeric-tree? numeric-object?)
(U* numeric-tree-er numeric-object-er))</code></pre>
<p>And now:</p>
<pre><code>> (numeric-tree? 1)
#f
> (numeric-object? 1)
#t
> (numeric-tree? '(1 . 2))
#t
> (numeric-tree? '(1 2 . (3 4)))
#f</code></pre>
<h2 id="wrapping-u-in-a-macro">Wrapping U* in a macro</h2>
<p>The same problem as previously happens when we raise the inner call to <code>U*</code> with the same result: we need to use <code>U*/ao</code>. In addition the macro becomes significantly more hairy and I’m moderately surprised that I got it right so easily. It’s not conceptually hard: it’s just not obvious to me that the pattern-matching works.</p>
<pre><code>(define-syntax (with-recursive-bindings stx)
(syntax-parse stx
[(_ ((name:id value:expr) ...) form ...+)
#:fail-when (check-duplicate-identifier (syntax->list #'(name ...)))
"duplicate variable name"
(with-syntax ([(argname ...) (generate-temporaries #'(name ...))])
#'(let-values
([(name ...) (U* (λ (argname ...)
(let-values ([(name ...)
(U*/ao argname ...)])
value)) ...)])
form ...))]))</code></pre>
<p>And now, in a shower of sparks, we can write:</p>
<pre><code>(with-recursive-bindings ((numeric-tree?
(λ (o)
(and (cons? o)
(numeric-object? (car o))
(numeric-object? (cdr o)))))
(numeric-object?
(λ (o)
(cond [(number? o) #t]
[(cons? o) (numeric-tree? o)]
[else #f]))))
(numeric-tree? '(1 2 3 (4 (5 . 6) . 7) . 8)))</code></pre>
<p>and get <code>#t</code>.</p>
<hr />
<p>As I said, I am sure there are well-known better ways to do this, but I thought this was interesting enough not to lose. This originated as an answer to <a href="https://stackoverflow.com/questions/60460322/implement-a-self-reference-pointer-in-a-pure-functional-language-elm-haskell">this Stack Overflow question</a>.</p>Function calling conventions and bindingsurn:https-www-tfeb-org:-fragments-2019-01-04-function-calling-conventions-and-bindings2019-01-04T10:19:36Z2019-01-04T10:19:36ZTim Bradshaw
<p>An attempt to describe three well-known function calling conventions in terms of bindings.</p>
<!-- more-->
<p>A little while ago I wrote an <a href="../../../../2018/12/11/call-by-value-in-scheme-and-lisp">article on bindings</a> which, in turn, was based on my answer to <a href="https://stackoverflow.com/questions/53694761/pass-by-value-confusion-in-scheme">this Stack Overflow question</a>. I have since written another answer to <a href="https://stackoverflow.com/questions/54018077/in-common-lisp-when-are-objects-referenced-and-when-are-they-directly-accessed">a more recent question</a> and I thought it would be worth summarising part of that to describe how three famous function calling conventions can be described in terms of bindings<sup><a href="#2019-01-04-function-calling-conventions-and-bindings-footnote-1-definition" name="2019-01-04-function-calling-conventions-and-bindings-footnote-1-return">1</a></sup>.</p>
<h2 id="bindings-in-brief">Bindings in brief</h2>
<p>A <em>binding</em> is an association between a name (a variable) and a value, where the value can be any object the language can talk about. In most Lisps (and other languages) bindings are not first-class: the language can not talk about bindings directly, and in particular bindings can not be values. Bindings are, or may be, <em>mutable</em>: their values (but not their names) can be changed by assignment. Many bindings can share the same value. Bindings have scope (where they are accessible) and extent (how long they are accessible for) and there are rules about that.</p>
<h2 id="call-by-value">Call by value</h2>
<p>In call by value the <em>value</em> of a binding is passed to a procedure. This means that the procedure can not mutate the binding itself. If the value is a mutable object it can be altered by the procedure, but the binding can not be.</p>
<p>Call by value is the convention used by all Lisps I know of. Here is a function which demonstrates that call by value can not mutate bindings:</p>
<pre><code>(defun pbv (&optional (fn #'identity))
;; If FN returns then the first value of this function will be T
(let ((c (cons 0 0))) ;first binding
(let ((cc c)) ;second binding, shares value with first
(funcall fn c) ;FN gets the *value* of C
(values (eq c cc) c)))) ;C and CC still refer to the same object</code></pre>
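<p>Python uses the same convention, so the guarantee can be demonstrated the same way: a callee can mutate a shared mutable <em>value</em>, but never the caller's <em>binding</em>. This sketch is mine, not from the original post:</p>

```python
def rebind(x):
    x = ["new"]        # changes only the callee's own binding of x

def mutate(x):
    x.append(1)        # mutates the shared value itself

c = [0]                # first binding
cc = c                 # second binding, sharing the same value
rebind(c)              # the caller's binding is untouched
mutate(c)              # but the shared value is changed
```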
<h2 id="call-by-reference">Call by reference</h2>
<p>In call by reference, procedures get <em>the bindings themselves</em> as arguments. If a procedure modifies the binding by assignment, then it is modified in the calling procedure as well.</p>
<p>Lisp does not use call by reference: Fortran does, or can, use a calling mechanism which is equivalent to call by reference<sup><a href="#2019-01-04-function-calling-conventions-and-bindings-footnote-2-definition" name="2019-01-04-function-calling-conventions-and-bindings-footnote-2-return">2</a></sup>.</p>
<p>It is possible to implement what is essentially call by reference in Lisp (here Common Lisp, but any Lisp with lexical scope, indefinite extent & macros can do this) using some macrology:</p>
<pre><code>(defmacro capture-binding (var)
;; Construct an object which captures a binding
`(lambda (&optional (new-val nil new-val-p))
(when new-val-p
(setf ,var new-val))
,var))
(declaim (inline captured-binding-value
(setf captured-binding-value)))
(defun captured-binding-value (cb)
;; value of a captured binding
(funcall cb))
(defun (setf captured-binding-value) (new cb)
;; change the value of a captured binding
(funcall cb new))</code></pre>
<p>And now, given</p>
<pre><code>(defun mutate-binding (b v)
(setf (captured-binding-value b) v))
(defun sort-of-call-by-reference ()
(let ((c (cons 1 1)))
(let ((cc c))
(mutate-binding (capture-binding cc) 3)
(values c cc))))
> (sort-of-call-by-reference)
(1 . 1)
3</code></pre>
<p>The trick here is that the procedure created by the <code>capture-binding</code> macro has access to the binding being captured, and can mutate it.</p>
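<p>The same trick works in any language with lexical scope and first-class functions. Here is a hypothetical Python rendering, where <code>nonlocal</code> plays the part of the macro's access to the binding:</p>

```python
def sort_of_call_by_reference():
    c = (1, 1)
    cc = c                        # shares the value of c

    def captured_cc(*new_value):
        # captures the binding of cc: call with no arguments to read it,
        # with one argument to assign it
        nonlocal cc
        if new_value:
            (cc,) = new_value
        return cc

    def mutate_binding(binding, value):
        binding(value)            # assigns through the captured binding

    mutate_binding(captured_cc, 3)
    return c, cc                  # cc was mutated, c was not
```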
<h2 id="call-by-name">Call by name</h2>
<p>Call by name is like call by value, except that the value of a binding is only computed at the point it is needed. Call by name is a form of delayed evaluation, or normal-order evaluation strategy; when the value, once computed, is remembered so that it is computed at most once, the convention is usually called <em>call by need</em>.</p>
<p>Lisp (at least Common Lisp: Lisps which have normal-order evaluation strategies exist) does not have call by name, but again it can be emulated with some macrology:</p>
<pre><code>(defmacro delay (form)
;; simple-minded DELAY. FORM is assumed to return a single value,
;; and will be evaluated no more than once.
(let ((fpn (make-symbol "FORCEDP"))
(vn (make-symbol "VALUE")))
`(let ((,fpn nil) ,vn)
(lambda ()
(unless ,fpn
(setf ,fpn t
,vn ,form))
,vn))))
(declaim (inline force))
(defun force (thunk)
;; force a thunk
(funcall thunk))
(defmacro funcall/delayed (fn &rest args)
;; call a function with a bunch of delayed arguments
`(funcall ,fn ,@(mapcar (lambda (a)
`(delay ,a))
args)))</code></pre>
<p>And now</p>
<pre><code>(defun return-first-thunk-value (t1 t2)
(declare (ignorable t2))
(force t1))
(defun surprisingly-quick ()
(funcall/delayed #'return-first-thunk-value
(cons 1 2)
(loop repeat 1000000
collect
(loop repeat 1000000
collect
(loop repeat 1000000
collect 1)))))
> (time (surprisingly-quick))
Timing the evaluation of (surprisingly-quick)
User time = 0.000
System time = 0.000
Elapsed time = 0.001
Allocation = 224 bytes
3 Page faults
(1 . 2)</code></pre>
<p>The second argument to <code>return-first-thunk-value</code> was never forced, and so the function completes in reasonable time.</p>
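<p>The emulation translates directly into Python, with closures standing in for the macrology. The names below are mine; like the Lisp <code>delay</code> above, this memoizes, so strictly it is call by need:</p>

```python
def delay(form):
    # form is a zero-argument callable whose value is computed at most once
    forced, value = False, None
    def thunk():
        nonlocal forced, value
        if not forced:
            value, forced = form(), True
        return value
    return thunk

def force(thunk):
    return thunk()

def return_first_thunk_value(t1, t2):
    return force(t1)              # t2 is never forced

result = return_first_thunk_value(
    delay(lambda: (1, 2)),
    delay(lambda: sum(range(10**15))))   # ruinously slow if ever forced
```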
<hr />
<div class="footnotes">
<ol>
<li id="2019-01-04-function-calling-conventions-and-bindings-footnote-1-definition" class="footnote-definition">
<p>This, in turn, is distantly descended from <a href="https://www.xach.com/naggum/articles/3229347076995853@naggum.net.html">a post on <code>comp.lang.lisp</code> by Erik Naggum</a>. <a href="#2019-01-04-function-calling-conventions-and-bindings-footnote-1-return">↩</a></p></li>
<li id="2019-01-04-function-calling-conventions-and-bindings-footnote-2-definition" class="footnote-definition">
<p>I think Fortran is allowed to implement its ‘by reference’ calls by copying any modified bindings back to the bindings in the parent procedure, and this is largely equivalent, at least for single-threaded code. <a href="#2019-01-04-function-calling-conventions-and-bindings-footnote-2-return">↩</a></p></li></ol></div>
<h1>Call by value in Scheme and Lisp</h1>
<p>2018-12-11, Tim Bradshaw</p>
<p>I find the best way to think about this is to think in terms of <em>bindings</em>, rather than environments or frames, which are simply containers for bindings.</p>
<!-- more-->
<h2 id="bindings">Bindings</h2>
<p>A binding is an association between a <em>name</em> and a <em>value</em>. The name is often called a ‘variable’ and the value is, well, the value of the variable. The value of a binding can be any object that the language can talk about at all. Bindings, however, are behind-the-scenes things (sometimes this is called ‘not being first-class objects’): they’re not things that can be represented in the language but rather things that you can use as part of the model of how the language works. So <em>the value of a binding can’t be a binding</em>, because bindings are not first-class: the language can’t talk about bindings.</p>
<p>There are some rules about bindings:</p>
<ul>
<li>there are forms which create them, of which the most important two are <code>lambda</code> and <code>define</code>;</li>
<li>bindings are not first-class — the language can not represent bindings as values;</li>
<li>bindings are, or may be, <em>mutable</em> — you can change the value of a binding once it exists — and the form that does this is <code>set!</code>;</li>
<li>there is no operator which destroys a binding;</li>
<li>bindings have <em>lexical scope</em> — the bindings available to a bit of code are the ones you can see by looking at it, not ones you have to guess by running the code and which may depend on the dynamic state of the system;</li>
<li>only one binding for a given name is ever accessible from a given bit of code — if more than one is lexically visible then the innermost one shadows any outer ones;</li>
<li>bindings have <em>indefinite extent</em> — if a binding is ever available to a bit of code, it is always available to it.</li></ul>
<p>Obviously these rules need to be elaborated significantly (especially with regard to global bindings & forward-referenced bindings) and made more formal, but these are enough to understand what happens. In particular I don’t really think you need to spend a lot of time worrying about environments: the environment of a bit of code is just the set of bindings accessible to it, so rather than worry about the environment just worry about the bindings.</p>
<h2 id="call-by-value">Call by value</h2>
<p>So, what ‘call by value’ means is that when you call a procedure with an argument which is a variable (a binding) what is passed to it is the <em>value</em> of the variable binding, not the binding itself. The procedure then creates a <em>new</em> binding with the same value. Two things follow from that:</p>
<ul>
<li>the original binding can not be altered by the procedure — this follows because the procedure only has the value of it, not the binding itself, and bindings are not first-class so you can’t cheat by passing the binding itself as the value;</li>
<li>if the value is itself a mutable object (arrays & conses are example of objects which usually are mutable, numbers are examples of objects which are not) then the procedure can mutate that object.</li></ul>
<h2 id="examples-of-the-rules-about-bindings">Examples of the rules about bindings</h2>
<p>So, here are some examples of these rules.</p>
<pre><code>(define (silly x)
(set! x (+ x 1))
x)
(define (call-something fn val)
(fn val)
val)
> (call-something silly 10)
10</code></pre>
<p>So, here we are creating two top-level bindings, for <code>silly</code> and <code>call-something</code>, both of which have values which are procedures. The value of <code>silly</code> is a procedure which, when called:</p>
<ol>
<li>creates a new binding whose name is <code>x</code> and whose value is the argument to <code>silly</code>;</li>
<li>mutates this binding so its value is incremented by one;</li>
<li>returns the value of this binding, which is one more than the value it was called with.</li></ol>
<p>The value of <code>call-something</code> is a procedure which, when called:</p>
<ol>
<li>creates two bindings, one named <code>fn</code> and one named <code>val</code>;</li>
<li>calls the value of the <code>fn</code> binding with the value of the <code>val</code> binding;</li>
<li>returns the value of the <code>val</code> binding.</li></ol>
<p>Note that <em>whatever</em> the call to <code>fn</code> does, it can not mutate the binding of <code>val</code>, because it has no access to it. So what you can <em>know</em>, by looking at the definition of <code>call-something</code> is that, if it returns at all (it may not return if the call to <code>fn</code> does not return), it will return the value of its second argument. This guarantee is what ‘call by value’ means: a language (such as Fortran) which supports other call mechanisms can’t always promise this.</p>
<pre><code>(define (outer x)
(define (inner x)
(+ x 1))
(inner (+ x 1)))</code></pre>
<p>Here there are four bindings: <code>outer</code> is a top-level binding whose value is a procedure which, when it is called, creates a binding for <code>x</code> whose value is its argument. It then creates another binding called <code>inner</code> whose value is another procedure, which, when it is called, creates a <em>new</em> binding for <code>x</code> to <em>its</em> argument, and then returns the value of that binding plus one. <code>outer</code> then calls this inner procedure with the value of its binding for <code>x</code>.</p>
<p>The important thing here is that, in <code>inner</code>, there are two bindings for <code>x</code> which are potentially lexically visible, but the closest one — the one established by <code>inner</code> — wins, because only one binding for a given name can ever be accessible at one time.</p>
<p>Here is the previous code (this would not be equivalent if <code>inner</code> was recursive) expressed with explicit <code>lambda</code>s:</p>
<pre><code>(define outer
(λ (x)
((λ (inner)
(inner (+ x 1)))
(λ (x)
(+ x 1)))))</code></pre>
<p>And finally an example of mutating bindings:</p>
<pre><code>(define (make-counter val)
(λ ()
(let ((current val))
(set! val (+ val 1))
current)))
> (define counter (make-counter 0))
> (counter)
0
> (counter)
1
> (counter)
2</code></pre>
<p>So, here, <code>make-counter</code> (is the name of a binding whose value is a procedure which, when called,) establishes a new binding for <code>val</code> and then returns a procedure it has created. This procedure makes a new binding called <code>current</code> which catches the current value of <code>val</code>, <em>mutates</em> the binding for <code>val</code> to add one to it, and returns the value of <code>current</code>. This code exercises the ‘if you can ever see a binding, you can always see it’ rule: the binding for <code>val</code> created by the call to <code>make-counter</code> is visible to the procedure it returns for as long as that procedure exists (and that procedure exists at least as long as there is a binding for it), and it also mutates a binding with <code>set!</code>.</p>
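<p>The counter translates almost word-for-word into Python, where <code>nonlocal</code> makes the mutation of the captured binding explicit:</p>

```python
def make_counter(val):
    def counter():
        nonlocal val              # mutate the binding made by make_counter
        current = val             # catch the current value of val
        val = val + 1
        return current
    return counter

counter = make_counter(0)         # successive calls return 0, 1, 2, ...
```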
<h2 id="why-not-environments">Why not environments?</h2>
<p><a href="https://mitpress.mit.edu/sites/default/files/sicp/index.html">SICP</a>, in <a href="https://mitpress.mit.edu/sites/default/files/sicp/full-text/book/book-Z-H-19.html#%_chap_3">chapter 3</a>, introduces the ‘environment model’, where at any point there is an environment, consisting of a sequence of frames, each frame containing bindings. Obviously this is a fine model, but it introduces three kinds of thing — the environment, the frames in the environment and the bindings in the frame — two of which are utterly intangible. At least for a binding you can get hold of it in some way: you can see it being created in the code and you can see references to it. So I prefer not to think in terms of these two extra sorts of thing which you can never get any kind of handle on.</p>
<p>However this is a choice which makes no difference in practice: thinking purely in terms of bindings helps me, thinking in terms of environments, frames & bindings may well help other people more.</p>
<h2 id="shorthands">Shorthands</h2>
<p>In what follows I am going to use a shorthand for talking about bindings, especially top-level ones:</p>
<ul>
<li>’<code>x</code> is a procedure which …’ means ’<code>x</code> is the name of a binding whose value is a procedure which, when called, …’;</li>
<li>’<code>y</code> is …’ means ’<code>y</code> is the name of a binding the value of which is …’;</li>
<li>’<code>x</code> is called with <code>y</code>’ means ‘the value of the binding named by <code>x</code> is called with the value of the binding named by <code>y</code>’;</li>
<li>’… binds <code>x</code> to …’ means ’… creates a binding whose name is <code>x</code> and whose value is …’;</li>
<li>’<code>x</code>’ means ‘the value of <code>x</code>’;</li>
<li>and so on.</li></ul>
<p>Describing bindings like this is common, as the fully-explicit way is just painful: I’ve tried (but probably failed in places) to be fully explicit above.</p>
<h2 id="the-answer">The answer</h2>
<p>And finally, after this long preamble, here’s the answer to the question you asked<sup><a href="#2018-12-11-call-by-value-in-scheme-and-lisp-footnote-1-definition" name="2018-12-11-call-by-value-in-scheme-and-lisp-footnote-1-return">1</a></sup>.</p>
<pre><code>(define (make-withdraw balance)
(λ (amount)
(if (>= balance amount)
(begin (set! balance (- balance amount))
balance)
"Insufficient funds")))</code></pre>
<p><code>make-withdraw</code> binds <code>balance</code> to its argument and returns a procedure it makes. This procedure, when called:</p>
<ol>
<li>binds <code>amount</code> to its argument;</li>
<li>compares <code>amount</code> with <code>balance</code> (which it can still see because it could see it when it was created);</li>
<li>if there’s enough money then it mutates the <code>balance</code> binding, decrementing its value by the value of the <code>amount</code> binding, and returns the new value;</li>
<li>if there’s not enough money it returns <code>"Insufficient funds"</code> (but does <em>not</em> mutate the <code>balance</code> binding, so you can try again with a smaller amount: a real bank would probably suck some money out of the <code>balance</code> binding at this point as a fine).</li></ol>
<p>Now</p>
<pre><code>(define x (make-withdraw 100))</code></pre>
<p>creates a binding for <code>x</code> whose value is one of the procedures described above: in that procedure <code>balance</code> is initially <code>100</code>.</p>
<pre><code>(define (f y) (y 25))</code></pre>
<p><code>f</code> is a procedure (is the name of a binding whose value is a procedure, which, when called) which binds <code>y</code> to its argument and then calls it with an argument of <code>25</code>.</p>
<pre><code>(f x)</code></pre>
<p>So, <code>f</code> is called with <code>x</code>, <code>x</code> being (bound to) the procedure constructed above. In <code>f</code>, <code>y</code> is bound to this procedure (not to a copy of it, to it), and this procedure is then called with an argument of <code>25</code>. This procedure then behaves as described above, and the results are as follows:</p>
<pre><code>> (f x)
75
> (f x)
50
> (f x)
25
> (f x)
0
> (f x)
"Insufficient funds"</code></pre>
<p>Note that:</p>
<ul>
<li>no first-class objects are copied anywhere in this process: there is no ‘copy’ of a procedure created;</li>
<li>no first-class objects are mutated anywhere in this process;</li>
<li>bindings are created (and later become inaccessible and so can be destroyed) in this process;</li>
<li>one binding is mutated repeatedly in this process (once for each call);</li>
<li>I have not anywhere needed to mention ‘environments’, which are just the set of bindings visible from a certain point in the code and I think not a very useful concept.</li></ul>
<p>I hope this makes some kind of sense.</p>
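<p>For readers who find another notation helpful, the whole example can be rendered in Python, whose closures behave the same way here:</p>

```python
def make_withdraw(balance):
    def withdraw(amount):
        nonlocal balance                  # the one binding that gets mutated
        if balance >= amount:
            balance = balance - amount
            return balance
        return "Insufficient funds"       # balance is left alone
    return withdraw

x = make_withdraw(100)

def f(y):
    return y(25)                          # calls its argument with 25
```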
<hr />
<h2 id="a-more-elaborate-version-of-the-above-code">A more elaborate version of the above code</h2>
<p>Something you might want to be able to do is to back out a transaction on your account. One way to do that is to return, as well as the new balance, a procedure which undoes the last transaction. Here is a procedure which does that (this code is in <a href="http://racket-lang.org/">Racket</a>):</p>
<pre><code>(define (make-withdraw/backout
balance
(insufficient-funds "Insufficient funds"))
(λ (amount)
(if (>= balance amount)
(let ((last-balance balance))
(set! balance (- balance amount))
(values balance
(λ ()
(set! balance last-balance)
balance)))
(values
insufficient-funds
(λ () balance)))))</code></pre>
<p>When you make an account with this procedure, then calling it returns two values: the first is the new balance, or the value of <code>insufficient-funds</code> (by default <code>"Insufficient funds"</code>), the second is a procedure which will undo the transaction you just did. Note that it undoes it by explicitly putting back the old balance, because I don’t think you can necessarily rely on <code>(= (- (+ x y) y) x)</code> being true in the presence of floating-point arithmetic. If you understand how this works then you probably understand bindings.</p>
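<p>A rough Python equivalent of the backout version, again with my own names, shows the same explicit restoration of the old balance:</p>

```python
def make_withdraw_backout(balance, insufficient_funds="Insufficient funds"):
    def withdraw(amount):
        nonlocal balance
        if balance >= amount:
            last_balance = balance        # remember the old balance
            balance = balance - amount
            def backout():
                nonlocal balance
                balance = last_balance    # put the old balance back explicitly
                return balance
            return balance, backout
        return insufficient_funds, (lambda: balance)
    return withdraw

account = make_withdraw_backout(100)
new_balance, undo = account(30)
```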
<hr />
<div class="footnotes">
<ol>
<li id="2018-12-11-call-by-value-in-scheme-and-lisp-footnote-1-definition" class="footnote-definition">
<p>This originated as an answer to <a href="https://stackoverflow.com/questions/53694761/pass-by-value-confusion-in-scheme">this Stack Overflow question</a>. <a href="#2018-12-11-call-by-value-in-scheme-and-lisp-footnote-1-return">↩</a></p></li></ol></div>
<h1>Dynamic scope and macros</h1>
<p>2017-01-26, Tim Bradshaw</p>
<p>I’ve recently been writing some <a href="https://en.wikipedia.org/wiki/Emacs_Lisp">Emacs Lisp</a> code to do some massaging of files. Quite apart from having forgotten how primitive elisp is, I hadn’t realised before how hostile dynamic scope was for macros in particular.</p>
<!-- more-->
<p>A very common pattern for macros is <code>call-with-*</code> / <code>with-*</code>, in which there is a functional level which is wrapped by a more syntactically-friendly macro level. For instance, in Common Lisp you can map over lists with <code>mapcar</code>:</p>
<pre><code>(mapcar
(lambda (e)
...)
...)</code></pre>
<p>but you might want to map over them with a syntax like</p>
<pre><code>(mapping (e ...)
...)</code></pre>
<p>Well, it’s easy to implement this:</p>
<pre><code>(defmacro mapping ((e l) &body forms)
`(mapcar (lambda (,e) ,@forms) ,l))</code></pre>
<p>Even with CL’s unhygienic macro system & without a mass of gensymmery such a macro is safe.</p>
<p>A good example where CL exposes one side of a pattern like this is <code>with-open-file</code>: you can easily see how to implement this in terms of a function:</p>
<pre><code>(defun call/open-file (fn filespec &rest keys
&key &allow-other-keys)
(let ((s nil))
(unwind-protect
(progn
(setf s (apply #'open filespec keys))
(funcall fn s))
(when s (close s)))))
(defmacro with-open-file* ((sn filespecn &rest keysn
&key &allow-other-keys)
&body forms)
`(call/open-file (lambda (,sn) ,@forms)
,filespecn ,@keysn))</code></pre>
<p>(This is probably not completely robust code: it’s just meant to get the idea across.)</p>
<p>Scheme exposes the other side of this pattern with <code>call/cc</code>:</p>
<pre><code>(define-syntax-rule (with-cc (c) form ...)
(call/cc (λ (c) form ...)))</code></pre>
<p>(<code><a href="http://docs.racket-lang.org/reference/stx-patterns.html#(form._((lib._racket/private/misc..rkt)._define-syntax-rule))" style="color: inherit">define-syntax-rule</a></code> may be specific to Racket but, again, this is just meant to get the idea across.)</p>
<p>Well, now think about something like the above <code>call/open-file</code> / <code>with-open-file*</code> in a Lisp dialect with dynamic scope. In particular, what does this do:</p>
<pre><code>(let ((s t))
(with-open-file* (h ...)
(when s ...)))</code></pre>
<p>This expands to</p>
<pre><code>(let ((s t))
(call/open-file (lambda (h) (when s ...))))</code></pre>
<p>But <em><code>call/open-file</code> binds <code>s</code></em>: so the binding of <code>s</code> in the called function is <em>different</em> from the outer binding, and nothing works.</p>
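<p>Python is lexically scoped, so the failure cannot be reproduced directly, but it can be imitated by letting a module-level variable stand in for a dynamic binding. Every name in this sketch is hypothetical:</p>

```python
s = None                          # stands in for a dynamically-scoped variable

def call_with_resource(fn, resource):
    # the helper 'binds' s around the call, as a dynamically-scoped
    # with-open-file* would
    global s
    saved, s = s, resource
    try:
        return fn()
    finally:
        s = saved

def caller():
    global s
    s = True                      # the caller's own use of s ...
    # ... is shadowed inside the helper, just as in the elisp expansion
    return call_with_resource(lambda: s, "a file handle")

result = caller()                 # sees the helper's s, not the caller's
```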
<p>Well, of course, this is something that happens pervasively with dynamically-scoped languages: every binding above you (or below you, depending on your viewpoint) matters, and can infect your namespace. But it’s particularly toxic for macros, because macros very often interpose bits of code into your code, and that code can include bindings which are dynamically, but not lexically, visible, even in the expansion of the macro. Dynamic scope enormously increases the hygiene problems of a macro system.</p>
<p>Dynamic scope is really useful as an option, and systems written in languages which don’t have it generally have to reinvent it, usually badly. But it’s just toxic and horrible as the <em>only</em> option. I can’t understand any more how I managed to use lisps with dynamic scope at all: perhaps I never wrote macros or just expected things to behave in a mysterious and strange way occasionally. Fortunately, even elisp <a href="https://www.gnu.org/software/emacs/manual/html_node/elisp/Lexical-Binding.html#Lexical-Binding">now has the option of being lexically scoped</a>.</p>
<h1>Python instead of Lisp</h1>
<p>2016-06-09, Tim Bradshaw</p>
<p>Lots of people, even <a href="http://norvig.com/python-lisp.html">famous Lisp hackers</a>, like to claim that ‘Python can be seen as a dialect of Lisp with “traditional” syntax’.</p>
<p>Being famous does not make them right.</p>
<!-- more-->
<h2 id="python-is-nothing-like-lisp">Python is <em>nothing like</em> Lisp</h2>
<p><strong>Expression language.</strong> Lisp is an expression language: everything in the language is an expression and has a value, and there is no distinction between expressions and statements, because there are no statements. Python is not: it has expressions, such as <code>2+3</code>, <code>lambda x: x*2</code> and statements such as <code>x = 3</code>. If expressions and statements are different things then writing macros and any kind of general-purpose <code>lambda</code> becomes very difficult.</p>
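<p>The distinction is easy to demonstrate: <code>eval</code> accepts only expressions, so an assignment, being a statement, is rejected outright (Python's much later <code>:=</code> operator is a narrow exception):</p>

```python
assert eval("2 + 3") == 5          # an expression has a value

is_statement = False
try:
    eval("x = 3")                  # assignment is a statement, not an expression
except SyntaxError:
    is_statement = True
```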
<p><strong>Conses.</strong> Lisp has conses, Python does not. Conses are not everything<sup><a href="#2016-06-09-python-instead-of-lisp-footnote-1-definition" name="2016-06-09-python-instead-of-lisp-footnote-1-return">1</a></sup>, but unless you have them you can’t implement them reasonably, and they are extremely useful data structures for many purposes. In particular for conses to be useful you need two things:</p>
<ul>
<li>a good syntax for them and for lists built from them;</li>
<li>good performance — conses should be extremely cheap, so you can’t implement them as a special case of some heavyweight data structure such as a Python list, because there is an enormous header.</li></ul>
<p>This means that conses need to be wired into the language: you can’t take a language without conses and add them, because even if you can get the first (you can’t in Python) you can’t get the second.</p>
<p><strong>Symbols.</strong> Lisp has symbols, Python does not. You can use strings, and this works sometimes.</p>
<p><strong>Lambda.</strong> Lisp has lambda, Python has an extremely limited version. Not being an expression language (see above) and the lack of scoping and block constructs in Python cripples its lambda.</p>
<p><strong>Source code available as a low-commitment data structure.</strong> Lisp has this, Python does not. ‘Low-commitment’ means that it is available before it has been decided what it means, but after it has been turned from a stream of characters into something more interesting. This matters because it makes macros possible: macros which work by transforming streams of characters are doomed to the sort of unspeakable horror of which <a href="http://jinja.pocoo.org/">Jinja2</a> is a good example, while macros which work after it has been decided what the code means then can’t make their <em>own</em> decision about what it means, which is half the point of macros.</p>
<p><strong>Scoping.</strong> Lisp has a multiplicity of scoping constructs and all modern Lisps have lexical scope, with some (Scheme) extending this to control constructs. Binding and assignment are irreparably confused in Python: scope does not work properly and this can never be fixed. A language which requires a <code>global</code> declaration is not going to be fixed by adding <code>nonlocal</code>.</p>
<p><strong>Macros.</strong> Lisp has them, Python doesn’t. Since macros are <em>the point</em> of Lisp, it is really hard to see how the above quote makes any kind of sense.</p>
<p>There is a terrible truth about the perceived arrogance of Lisp hackers that it has taken me a long time to understand. The arrogance is justified: Lisp is, in fact, a better programming language.</p>
<hr />
<div class="footnotes">
<ol>
<li id="2016-06-09-python-instead-of-lisp-footnote-1-definition" class="footnote-definition">
<p>In particular conses are not a useful universal data structure in the way that, perhaps, early Lisp people thought they were. <a href="#2016-06-09-python-instead-of-lisp-footnote-1-return">↩</a></p></li></ol></div>
<h1>Macros in Racket, part three: checking boolean operators</h1>
<p>2015-12-12, Tim Bradshaw</p>
<p>I wanted to see if I could write a mildly complicated macro in <a href="http://racket-lang.org/">Racket</a> without becoming too confused. I can, although I am not sure it is terribly idiomatic.</p>
<p>This is the third part of a series on writing macros in Racket for someone used to Common Lisp, although it is mostly independent of the previous parts. The previous parts are <a href="../../../../2015/01/13/macros-in-racket-part-one/">part one</a> & <a href="../../../../2015/01/28/macros-in-racket-part-two">part two</a>.</p>
<!-- more-->
<p>One of the nice things about Lisp-family languages is that you can write your own control constructs, and it’s essentially easy to do so: if <code><a href="http://docs.racket-lang.org/reference/when_unless.html#(form._((lib._racket/private/letstx-scheme..rkt)._when))" style="color: inherit">when</a></code> did not exist then you could write it:</p>
<pre><code>(define-syntax-rule (when test form ...)
(and test
(begin form ...)))</code></pre>
<p>This kind of extensibility is one of the wonders of Lisp and Scheme: it’s tempting to say that it makes them better than programming languages which can’t do this but that’s not correct: it makes them <em>incomparable</em> to such languages: Lisp<sup><a href="#2015-12-12-macros-in-racket-part-three-footnote-1-definition" name="2015-12-12-macros-in-racket-part-three-footnote-1-return">1</a></sup> programs can reason about <em>themselves</em> and often do<sup><a href="#2015-12-12-macros-in-racket-part-three-footnote-2-definition" name="2015-12-12-macros-in-racket-part-three-footnote-2-return">2</a></sup>. Everything about Lisp really leads to this ability.</p>
<p>When I taught (Common) Lisp to people one of the things I would try to get across was this ability of macros to extend the control constructs in the language: people often thought of macros as a way of essentially inlining code<sup><a href="#2015-12-12-macros-in-racket-part-three-footnote-3-definition" name="2015-12-12-macros-in-racket-part-three-footnote-3-return">3</a></sup>, but that’s not what they’re actually good for. If you can add control constructs to your language, then you can make a <em>new language</em>, and <em>that’s</em> what Lisp macros are about, and therefore what <em>Lisp</em> is about.</p>
<p>A good way to get this across to people is to pretend that Lisp doesn’t have some control construct, and write it as a macro. This is easier than inventing new control constructs both because it doesn’t require thinking of a domain where they might be useful and because the existing control constructs have clear semantics. Reimplementing existing control constructs also demonstrates how the language is already built up from a more primitive language by macros and how the approach to solving problems in Lisp is to <em>design and implement a language</em> in which to talk about the problem, where that language is seamlessly built on the underlying Lisp, and can inherit all of its power and flexibility, <em>including the ability to extend the language</em>.</p>
<p>An advantage of reimplementing existing control constructs for teaching Lisp is that you can compare the new construct to the existing one, and with some small constraints you can do this exhaustively, so you can know whether you have actually implemented it right. This is, obviously, not possible in general, but if the operator has trivial syntax (so not <code><a href="http://docs.racket-lang.org/reference/if.html#(form._((lib._racket/private/letstx-scheme..rkt)._cond))" style="color: inherit">cond</a></code>) and if you limit the arguments of the operator to booleans then you can enumerate all the possible arguments in the obvious way, and so long as it returns a result for all combinations of arguments (does not fail to halt in other words) and is deterministic then there are only two things you need to check:</p>
<ol>
<li>does the operator produce the same result for all combinations of arguments (\(2^n\) possibilities for \(n\) arguments) as the existing one?</li>
<li>does the operator evaluate its arguments the same number of times as the existing one for all these combinations?</li></ol>
<p>So, for instance, <code><a href="http://docs.racket-lang.org/reference/if.html#(form._((quote._~23~25kernel)._if))" style="color: inherit">if</a></code> takes three arguments (in Racket) and should evaluate the first exactly once, and the others at most once, as well as returning the correct value.</p>
<p>Obviously such a check is not a full check of the operator — it does not tell you what it does with non-boolean arguments for instance. But I was interested in writing the check largely because it’s clearly a reasonably hairy macro which I know how to write in CL and wanted to see if I could write it in Racket (I’m not very likely to teach people Lisp again).</p>
<h2 id="what-the-macro-needs-to-do">What the macro needs to do</h2>
<p>The idea is that to compare two boolean operators <code>o1</code> and <code>o2</code> which take <code>n</code> arguments you need to generate code which looks like this:</p>
<pre><code>(for/and ([c (expt 2 n)])
  (let ([a1 (bitwise-bit-set? c 0)] ...)
    (let ([o1c1 0] ...)
      (let ([o2c1 0] ...)
        (and (eq? (o1 (begin (set! o1c1 (+ o1c1 1)) a1) ...)
                  (o2 (begin (set! o2c1 (+ o2c1 1)) a1) ...))
             (= o1c1 o2c1) ...)))))</code></pre>
<p>So <code>a1</code> is the first argument, <code>o1c1</code> counts how many times <code>o1</code> evaluates it, and <code>o2c1</code> counts how many times <code>o2</code> evaluates it, and so on. I decided to compare the operators with <code><a href="http://docs.racket-lang.org/reference/Equality.html#(def._((quote._~23~25kernel)._eq~3f))" style="color: inherit">eq?</a></code> rather than <code><a href="http://docs.racket-lang.org/reference/Equality.html#(def._((quote._~23~25kernel)._eqv~3f))" style="color: inherit">eqv?</a></code> for no very good reason except that it works for operators whose results are booleans, which is what I was interested in. I should almost certainly use <code>eqv?</code> I think — certainly the <code>-equivalent</code> in the name would imply that — but I’m not.</p>
<p>It’s clear that a loop like that checks all of the \(2^n\) possibilities for the arguments, where each argument can be either <code>#f</code> or <code>#t</code> only. So this does an exhaustive check of all the possibilities, and provided <code>o1</code> and <code>o2</code> are deterministic and halt on all their arguments it will tell you whether they are equivalent.</p>
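<p>The bit-twiddling part of this can be seen in isolation (this snippet is mine): each natural number below \(2^n\) encodes one combination of \(n\) booleans via its bits, which is exactly what the generated <code>for/and</code> loop relies on:</p>
<pre><code>;; enumerating all combinations of two booleans from the numbers 0-3
> (for/list ([c (expt 2 2)])
    (list (bitwise-bit-set? c 0) (bitwise-bit-set? c 1)))
'((#f #f) (#t #f) (#f #t) (#t #t))</code></pre>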
<p>And finally, this must be written as a macro, because the operators it is testing are themselves not generally functions: in particular things like <code><a href="http://docs.racket-lang.org/reference/if.html#(form._((quote._~23~25kernel)._if))" style="color: inherit">if</a></code> and <code><a href="http://docs.racket-lang.org/reference/if.html#(form._((lib._racket/private/letstx-scheme..rkt)._or))" style="color: inherit">or</a></code> are obviously themselves not functions.</p>
<h2 id="things-i-did-not-know-how-to-do">Things I did not know how to do</h2>
<p>The big thing I didn’t know how to do here was to make up new identifiers: all the counters need to be created, and possibly also the argument names. In CL you’d do this with <code>make-symbol</code> or <code>gensym</code> or something like that. Assuming I want to use <code><a href="http://docs.racket-lang.org/reference/stx-patterns.html#(form._((lib._racket/private/stxcase-scheme..rkt)._syntax-case))" style="color: inherit">syntax-case</a></code> rather than writing a CL-style construct-the-form-with-backquote-and-use-<code><a href="http://docs.racket-lang.org/reference/stxops.html#(def._((quote._~23~25kernel)._datum-~3esyntax))" style="color: inherit">datum->syntax</a></code> macro (which I very much do want to do) then there are two problems:</p>
<ol>
<li>constructing the names of the counters;</li>
<li>making them available as pattern variables.</li></ol>
<p>Well, (2) is easy: you can use nested <code><a href="http://docs.racket-lang.org/reference/stx-patterns.html#(form._((lib._racket/private/stxcase-scheme..rkt)._syntax-case))" style="color: inherit">syntax-case</a></code>s, or equivalently but much more prettily, <code><a href="http://docs.racket-lang.org/reference/stx-patterns.html#(form._((lib._racket/private/stxcase-scheme..rkt)._with-syntax))" style="color: inherit">with-syntax</a></code> to bind the pattern variables. And it turns out that <code>with-syntax</code> is willing to do a lot of work on your behalf: if you give it something which is not a syntax object it will massage it into one for you. So, in particular, this works:</p>
<pre><code>(with-syntax ([(o1c ...) (list ...)])
...)</code></pre>
<p>It takes the list it is given, turns it into a syntax object (with <code>datum->syntax</code> I suppose) and then does the matching. So you can be really lazy here: all you need to invent is a list of identifier syntax objects, and <code>with-syntax</code> will do the rest, making the program a lot less noisy. This is a really neat feature, although it might lead you to get confused about what is, and what is not, a syntax object I suppose. Anyway, I used it ruthlessly.</p>
<p>So this leaves (1). You could obviously do this with something like <code>(datum->syntax ctx (string->symbol (format ...)))</code>, but Racket provides a nice shorthand for that in the form of <code><a href="http://docs.racket-lang.org/reference/syntax-util.html#(def._((lib._racket/syntax..rkt)._format-id))" style="color: inherit">format-id</a></code>: <code>(format-id ctx "~a-count" v)</code> will construct an identifier syntax object from <code>v</code> using <code>ctx</code> as lexical context. And it will do the appropriate magic if <code>v</code> is an identifier syntax object: extract the symbol from it and use it as the argument to <code><a href="http://docs.racket-lang.org/reference/Writing.html#(def._((quote._~23~25kernel)._format))" style="color: inherit">format</a></code> in the appropriate way.</p>
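<p>For instance, here is a toy macro of mine which uses <code>format-id</code> to define a derived identifier: here the identifier is deliberately given the caller’s context (the context of <code>#'name</code>), so the caller can see it — the opposite choice from the hygienic one made below:</p>
<pre><code>(require (for-syntax racket/base racket/syntax))

;; (defcounter foo) defines foo-count, bound to 0: the name is made
;; with format-id from the identifier the user supplied.
(define-syntax (defcounter stx)
  (syntax-case stx ()
    [(_ name)
     (identifier? #'name)
     (with-syntax ([counter (format-id #'name "~a-count" #'name)])
       #'(define counter 0))]))

> (defcounter foo)
> foo-count
0</code></pre>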
<p>So it looks pretty straightforward to construct lists of identifiers and bind them to pattern variables. The final thing that confuses me is what lexical context to use for the identifiers. The macro should be hygienic, which means they <em>can’t</em> have the context of the syntax object it is working on, but I think can have more-or-less any other context where they have no existing meaning: I just invented an object for them, which I think is safe, although I am a bit confused about this.</p>
<h2 id="what-users-see">What users see</h2>
<p>I spent a really long time stuck on what the syntax of the macro should be: this is entirely stupid because it just does not matter that much. The reason I got stuck is that it <em>would</em> matter if this was a real library and I am constitutionally incapable of writing things without worrying about that kind of thing. Eventually I decided that it would be best if the user provided the argument names as a list, because they generally make sense to users and because I didn’t want to get into something which looked as if you could pass it an integer when in fact what it needs is a <em>literal</em> integer. So I decided on a syntax like this:</p>
<pre><code>(boolean-operators-equivalent? o1 o2 (a1 ...))</code></pre>
<p>So, for instance:</p>
<pre><code>(boolean-operators-equivalent? if my-if (test then else))</code></pre>
<p>I still don’t really like this; but I’m just playing so, well, it will do.</p>
<h2 id="additional-cleverness">Additional cleverness</h2>
<p>I wanted to report syntax errors in a reasonable way: apparently the proper way to do this is using <code><a href="http://docs.racket-lang.org/syntax/Parsing_Syntax.html#(form._((lib._syntax/parse..rkt)._syntax-parse))" style="color: inherit">syntax-parse</a></code> but I am not ready to understand that yet, so I used <code><a href="http://docs.racket-lang.org/reference/syntax-util.html#(def._((lib._racket/syntax..rkt)._wrong-syntax))" style="color: inherit">wrong-syntax</a></code> and the <code><a href="http://docs.racket-lang.org/reference/syntax-util.html#(def._((lib._racket/syntax..rkt)._current-syntax-context))" style="color: inherit">current-syntax-context</a></code> parameter to get reasonable-looking errors.</p>
<p>I thought it would be nice to be able to report failures of equivalence, so there is a parameter which controls that and the expansion of the macro includes a check for the parameter and prints the failed cases if it’s true. All this happens at run time (phase 0) of course.</p>
<h2 id="the-macro-itself">The macro itself</h2>
<p>So, finally, here it is.</p>
<pre><code>(require (for-syntax (only-in racket/syntax format-id
                              current-syntax-context wrong-syntax)))

(define boe-report-failure? (make-parameter #f))

(define-syntax (boolean-operators-equivalent? stx)
  ;; Given the names of two boolean operators and a list of argument
  ;; names, expand to a form which tests that they are equivalent, by
  ;; evaluating them with the arguments bound to all the combinations
  ;; of #t and #f, and also checking that they evaluate the same
  ;; arguments in each case.
  ;;
  (parameterize ([current-syntax-context stx])
    (syntax-case stx ()
      [(_ o1 o2 (v ...))
       (let* ([vars (syntax->list #'(v ...))]
              [nvars (length vars)])
         ;; This check could be a guard, but we need the bindings
         ;; anyway, so.
         (for ([var vars])
           (unless (identifier? var)
             (wrong-syntax var "not an identifier")))
         ;; vars is now a list of identifiers, and nvars is how many
         ;; there are.  We need to construct syntax for check
         ;; variables for each var and operator, as well as
         ;; construct 2^n and a list of bit numbers.  This is being
         ;; fairly fast and loose: it turns out that various things
         ;; get automagically converted into syntax objects, and I
         ;; have not cared about the context for numbers (what is
         ;; it?).  In general I am a bit confused about what the
         ;; context should be here, but it clearly should *not* be
         ;; stx.
         ;;
         (with-syntax ([(o1c ...) (for/list ([v vars])
                                    (format-id #'boe "~a-1-eval-count" v))]
                       [(o2c ...) (for/list ([v vars])
                                    (format-id #'boe "~a-2-eval-count" v))]
                       [2^n (expt 2 nvars)]
                       [(b ...) (for/list ([i nvars]) i)])
           ;; And now just write the pattern we want.  '...' is
           ;; pretty clever, it turns out.
           #'(for/and ([c 2^n])
               (let ([v (bitwise-bit-set? c b)] ...)
                 (let ([o1c 0] ...)
                   (let ([o2c 0] ...)
                     (or (and (eq? (o1 (begin (set! o1c (+ o1c 1)) v) ...)
                                   (o2 (begin (set! o2c (+ o2c 1)) v) ...))
                              (= o1c o2c) ...)
                         (begin
                           (when (boe-report-failure?)
                             (eprintf "Not equivalent:~% ~a~% ~a~%"
                                      (list 'o1 `(,v ,o1c) ...)
                                      (list 'o2 `(,v ,o2c) ...)))
                           #f))))))))]
      [else
       (wrong-syntax #'else "expecting o1 o2 (a1 ...)")])))</code></pre>
<p>To my astonishment, this worked pretty much first time (it did not initially have the <code>wrong-syntax</code> stuff, but this was easy compared to the rest of it):</p>
<pre><code>> (define-syntax-rule (if/broken test then else)
    (or (and test then) else))
> (boe-report-failure? #t)
> (boolean-operators-equivalent? if if/broken (test then else))
Not equivalent:
 (if (#t 1) (#f 1) (#f 0))
 (if/broken (#t 1) (#f 1) (#f 1))
#f</code></pre>
<p>The macro, complete with some tests and other infrastructure can be found <a href="https://gist.github.com/tfeb/3d535a2fc755e4ee5dfb">here</a><sup><a href="#2015-12-12-macros-in-racket-part-three-footnote-4-definition" name="2015-12-12-macros-in-racket-part-three-footnote-4-return">4</a></sup>.</p>
<h2 id="notes-and-queries">Notes and queries</h2>
<p>I still don’t know whether this is really idiomatic Racket, although I am reasonably happy that I understand what is going on. There are a couple of things I am not sure about:</p>
<ul>
<li>is the context for the count variables right? I think it is, but I am not sure;</li>
<li>the macro relies heavily on Racket’s extremely smart behaviour with <code>...</code> — I am still unclear just <em>how</em> smart this is and whether I am relying on things which are not actually specified to happen;</li>
<li>similarly it relies on <code>with-syntax</code> being willing to convert things to syntax objects for you, which I am not sure is safe.</li></ul>
<p>However, even with these worries, I think it’s pretty clear that Racket macros are significantly nicer than CL macros, if also significantly more opaque.</p>
<hr />
<div class="footnotes">
<ol>
<li id="2015-12-12-macros-in-racket-part-three-footnote-1-definition" class="footnote-definition">
<p>I am going to use ‘Lisp’ to mean ‘Lisp-family’ from now on. This is not meant to denigrate Scheme — this post is about Racket, after all — I just need a term which is not too clumsy. <a href="#2015-12-12-macros-in-racket-part-three-footnote-1-return">↩</a></p></li>
<li id="2015-12-12-macros-in-racket-part-three-footnote-2-definition" class="footnote-definition">
<p>Of course, programs in other languages often do end up reasoning about themselves: people end up writing little languages all the time. But you only have to look at most examples of this sort of thing to realise how far ahead Lisp is: I’m currently having to deal with a system whose configuration files are in a mutant version of Windows ini file syntax, with a preprocessor which is entirely unaware of that syntax, and an entire other language which lives <em>in strings in the base language</em>. The preprocessor does not know about the string syntax so it pokes down into this inner language as well. I’d like to say that <a href="https://en.wikipedia.org/wiki/Greenspun's_tenth_rule">Greenspun’s tenth law</a> applies, but that would imply a level of sophistication entirely missing in this horrible thing: all I want to do is leave this job and never think about it again. <a href="#2015-12-12-macros-in-racket-part-three-footnote-2-return">↩</a></p></li>
<li id="2015-12-12-macros-in-racket-part-three-footnote-3-definition" class="footnote-definition">
<p>Macros were often used to inline code in the days of primitive compilers of course, but that’s a long time ago now. <a href="#2015-12-12-macros-in-racket-part-three-footnote-3-return">↩</a></p></li>
<li id="2015-12-12-macros-in-racket-part-three-footnote-4-definition" class="footnote-definition">
<p>I may move it somewhere more permanent in due course, so bookmark this at your peril. <a href="#2015-12-12-macros-in-racket-part-three-footnote-4-return">↩</a></p></li></ol></div>
<h1>Greenspunning</h1>
<p>Tim Bradshaw, 2015-10-08</p>
<p>Three approaches to solving problems on computers.</p>
<!-- more-->
<p>When faced with a computational problem there are three common approaches:</p>
<ol>
<li>write a program to solve the problem;</li>
<li>write a tool to solve the problem and other problems of the same kind;</li>
<li>write a programming language in which you can then write tools which solve problems of the same, and other, kinds.</li></ol>
<p>Most people start by doing the first. Bradshaw’s corollary to <a href="https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule">Greenspun’s tenth law</a> states:</p>
<ol>
<li>for problems of size \(s \ge s_1\), regardless of the initial approach, the final result is as if the third approach had been taken, even if this is not understood by the people solving the problem;</li>
<li>there is a problem size \(s_0\) above which it is most efficient to take the third approach from the beginning;</li>
<li>\(s_0 \lt s_1\).</li></ol>
<p>What this means is that, if you have a sufficiently large problem (\(s \ge s_1\)) to solve then, whatever your intentions, you will inevitably end up creating a programming language as part of the solution. And there is a range of problems smaller than this (\(s \in (s_0, s_1)\)) for which the <em>quickest</em> way to solve the problem is to design and implement a programming language.</p>
<p>So, when approaching a problem, it is important to understand the values of \(s_0\) and \(s_1\) and how they compare to \(s\). These values are hard to discover: a good trick is to start with a platform which makes \(s_0\) very small and always take the third approach.</p>
<h1>Rumours of my death</h1>
<p>Tim Bradshaw, 2015-02-01</p>
<p>When I first used Lisp, the common refrain was that Lisp was dead.</p>
<!-- more-->
<p>There was a single free implementation of CL (which required you to physically sign a license of some kind and return it, in exchange for a tape) which was deficient in many respects. The two or three commercial implementations cost about a year’s salary each. Enormous effort had been spent on implementations which ran on special hardware. One variant of these cost more than your house: the other rather less, but turned out to have been implemented by the fey — you seriously did not want to spend too much time with it if you did not want problems involving having your firstborn somehow changed into a strange and <em>absent</em> creature.</p>
<p>(And there was a terrible, unspeakable truth about even the expensive hardware: the people who implemented it didn’t understand computer performance very well with the result you would expect. The systems were faster than a VAX, but <em>everything</em> was faster than a VAX, including some PDP–11s. A Sun 3/260 ate them alive, and you could buy several of those for the cost of a house, with bundled licenses.)</p>
<p>Performance was pretty grim: of course nothing was fast on machines that, on a good day, could execute a few million instructions a second, but Lisp implementations were problematic at best. You spent a lot of time turning recursive code into iterative code by hand and writing macros (no inlining) to get performance to be reasonable and worrying about the primitive garbage collectors.</p>
<p>There was no standard: existing implementations differed in basic details like error handling (not in the aluminium book) and a standard object system was a distant dream. The news from the standards committee was ominous: the special-hardware people were exerting pressure and there were serious worries that the object system would not be efficiently implementable on stock hardware. The language was going to be huge.</p>
<p>Standard or semi-standard libraries were not really thought of.</p>
<p>Everyone knew Lisp was dead: the coming thing was, perhaps, Scheme — tail-call elimination <em>in the language</em>, a small language (yet MIT Scheme somehow had a bigger footprint than the CLs we used) — or C++ or some functional language whose name no-one now remembers. But Lisp was dead: no question about it.</p>
<hr />
<p>Fast forward.</p>
<hr />
<p>I have two high-quality CL implementations on my machine and one Scheme-derived system, also of very high quality, which created this blog: I have long ago stopped counting the number of good-quality free implementations. One of the implementations I use is commercial: the annual support is about 10% of my monthly rent. I can run dozens of instances of each without the machine noticing, and I could happily run a full CL development system on a system less powerful and smaller than my phone. Performance is a solved problem: yes, highly-optimised code is, perhaps, slower than optimised C or Fortran but since almost all performance problems are design problems no-one older than about 19 cares any more. CL has an advanced, performant and standard object system and, in effect, a standard metaobject system as well. The library problem has been solved by Quicklisp and a large number of good-quality standard libraries. I am still using code I wrote over twenty-five years ago with essentially no modification: meanwhile the Python code I wrote ten years ago is long rendered obsolete by gratuitous changes in the language (the Perl code I wrote at the same time is doing fine, however).</p>
<p>And yet still the cry goes up: Lisp is dead; Lisp is dead.</p>
<h1>Macros in Racket, part two</h1>
<p>Tim Bradshaw, 2015-01-28</p>
<p>The second part of my notes on writing macros in Racket.</p>
<!-- more-->
<p>This is the second part of at least three: the first part is <a href="../../../../2015/01/13/macros-in-racket-part-one/">here</a>, and the third part is <a href="../../../../2015/12/12/macros-in-racket-part-three/">here</a>. This won’t make much sense unless you’ve read the first part. As before I make no claims to be an expert in Racket’s macro system, although I am familiar with Lisp macros in general: this is just some more notes I wrote while learning it.</p>
<h2 id="the-unwashed-lisp-hackers-version-of-collecting">The unwashed Lisp hacker’s version of <code>collecting</code></h2>
<p>So, we can write <code>clet</code>: can we write <code>collecting</code>? Yes, we can:</p>
<pre><code>(require (for-syntax racket/list))

(define-syntax (collecting stx)
  (datum->syntax
   (quote-syntax collecting)
   `(let ([r '()])
      (define (,(datum->syntax stx 'collect) it)
        (set! r (cons it r)) it)
      ,@(rest (syntax->list stx))
      (reverse r))))</code></pre>
<p>This works because, in the internal definition of <code>collect</code>, we’ve intentionally given it a name which uses the context of the syntax object we’re transforming, not the context of the macro. It’s easy to confirm that this works the way you would expect, and in particular that it’s safe in both directions: for instance</p>
<pre><code>> (let ((reverse (λ (x) x)))
    (collecting (collect 1) (collect 2)))
'(1 2)</code></pre>
<p>shows that the binding of <code>reverse</code> when the macro is called has not ‘infected’ the macro definition.</p>
<p>It seems as if that should be all you need: so long as you are careful about which context you choose, and make sure that the ‘default’ context is the one from the macro rather than from where it is used, everything works out. In fact that isn’t quite enough: see <a href="#macro-composition">below</a>. However even if it were, it’s clearly a pain to write macros this way.</p>
<h2 id="pattern-matching">Pattern matching</h2>
<p>Pretty much all macros do two things:</p>
<ol>
<li>deconstruct their arguments in some more-or-less complicated way, but almost always in a way which is significantly more complicated than anything that needs to be done for the arguments of a function;</li>
<li>construct a form which is the result of the macro and which, again, may be complicated.</li></ol>
<p>The beauty of traditional Lisp macros is that since the arguments and results of the macro were just what the reader spat out — lists and symbols and so on — and since Lisp was kind of good at doing things to these structures as it was designed for that, and finally since the whole power of the language was available in the macro, this was not horrible even without special tools, although it was not particularly pleasant for complicated macros.</p>
<p>Hygienic macros make this much less pleasant because the objects that need to be deconstructed and constructed are now opaque syntax objects, and there is additional worrying about context to do. The answer to this is to provide special tools which do the boring bits for you: this makes everything simpler, at the cost of making it still more opaque what is actually happening. In almost all cases that’s a tradeoff worth making. Pattern matching is also a fashionable thing amongst the young and hip, of course.</p>
<p>The way this is done in Racket is via <code><a href="http://docs.racket-lang.org/reference/stx-patterns.html#(form._((lib._racket/private/stxcase-scheme..rkt)._syntax-case))" style="color: inherit">syntax-case</a></code>, its slightly simpler friend <code><a href="http://docs.racket-lang.org/reference/stx-patterns.html#(form._((lib._racket/private/stxcase-scheme..rkt)._syntax-rules))" style="color: inherit">syntax-rules</a></code>, and by <code><a href="http://docs.racket-lang.org/reference/stx-patterns.html#(form._((lib._racket/private/stxcase-scheme..rkt)._syntax))" style="color: inherit">syntax</a></code> and variants on it.</p>
<p><code><a href="http://docs.racket-lang.org/reference/stx-patterns.html#(form._((lib._racket/private/stxcase-scheme..rkt)._syntax-case))" style="color: inherit">syntax-case</a></code> takes a bit of syntax and matches it against patterns, binding matches, which can then be used in <code><a href="http://docs.racket-lang.org/reference/stx-patterns.html#(form._((lib._racket/private/stxcase-scheme..rkt)._syntax))" style="color: inherit">syntax</a></code> forms lexically within it to return syntax objects, whose context is that of the <code>syntax-case</code> form (so hygienic). There is syntactic sugar for <code>syntax</code>: <code>(syntax ...)</code> can be written <code>#'...</code> in the same way that <code>(quote ...)</code> can be written <code>'...</code>. There is also <code><a href="http://docs.racket-lang.org/reference/stx-patterns.html#(form._((lib._racket/private/qqstx..rkt)._quasisyntax))" style="color: inherit">quasisyntax</a></code> which works the same way as <code><a href="http://docs.racket-lang.org/reference/quasiquote.html#(form._((lib._racket/private/letstx-scheme..rkt)._quasiquote))" style="color: inherit">quasiquote</a></code>, except that the various unquoting things are preceded with <code>#</code>. <code>quasisyntax</code>, unsurprisingly, also has syntactic sugar: <code>(quasisyntax ...)</code> can be written <code>#`...</code>.</p>
<p>I’m not going to describe the patterns in any detail, largely because I only understand the simple cases. However the simple cases are relatively easy to understand and pleasant to use.</p>
<p>Once a case has matched in <code>syntax-case</code> the corresponding expression is evaluated, and its value is the value of the form. Generally that wants to be a bit of syntax.</p>
<p>The first important thing to understand is that <code>syntax</code> is not <code>quote</code>-for-syntax: it interpolates things which matched in a lexically surrounding <code>syntax-case</code>, if there is one (if there isn’t, then I think it <em>is</em> <code>quote</code>-for-syntax).</p>
<p>The second important thing to understand is that <code>syntax-case</code> and <code>syntax</code> turn Racket into a sort of bodged Lisp–2: the things matched by <code>syntax-case</code> can be used <em>only</em> in <code>syntax</code> forms. But it’s not actually a separate namespace, because if you refer to them outwith such a form you get a compile-time error. I don’t know why this is — perhaps to avoid accidentally naming matches outside a <code>syntax</code> form — but it is certainly annoying.</p>
<p>So, here are some examples.</p>
<p>A simple <code>while</code> form:</p>
<pre><code>(define-syntax (while stx)
  (syntax-case stx ()
    [(_ test body ...)
     #'(let loop ()
         (when test
           body ...
           (loop)))]))</code></pre>
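<p>Just to show this working (the transcript is mine):</p>
<pre><code>> (define i 0)
> (while (< i 3)
    (printf "~a~%" i)
    (set! i (+ i 1)))
0
1
2</code></pre>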
<p>A simple implementation of <code>let</code>, leaving out the named-<code>let</code> case, which shows how good the pattern matching is:</p>
<pre><code>(define-syntax (with stx)
  (syntax-case stx ()
    [(_ ([var val] ...) body ...)
     #'((λ (var ...) body ...) val ...)]))</code></pre>
<p>A better implementation which deals with the empty body case (<code>(λ (...))</code> is illegal in Racket) and also optimises a simple case:</p>
<pre><code>(define-syntax (with stx)
  (syntax-case stx ()
    [(_ () body ...)
     ;; no vars: trivial case
     #'(begin body ...)]
    [(_ ([var val] ...))
     ;; null body: make sure the vals are evaluated
     #'(begin val ... (void))]
    [(_ ([var val] ...) body ...)
     #'((λ (var ...) body ...) val ...)]))</code></pre>
<p>One thing which <code>syntax-case</code> allows is the notion of literal names which must occur in the source. So for instance let’s say I wanted to write some mutant <code>loop</code> macro whose syntax was <code>(loop for x in y do ...)</code>: where <code>for</code>, <code>in</code>, <code>do</code> are literals. Well, I can write something to match this:</p>
<pre><code>> (define-syntax (loop stx)
    (syntax-case stx (for in do)
      [(_ for v in l do body ...)
       #'(for ([v (in-list l)]) body ...)]))
> (loop for x in '(1 2 3) do (print x))
123
> (loop with x in '(1 2 3) do (print x))
loop: bad syntax in: (loop with x in (quote (1 2 3)) do (print x))</code></pre>
<p>The syntax object that corresponds to <code>stx</code> here is the whole form: the equivalent to CL’s <code>&WHOLE</code>. It’s almost never necessary to worry about the <code>car</code> of this since it will obviously be <code>loop</code>. However I’m always tempted to provide it as a literal.</p>
<p><code><a href="http://docs.racket-lang.org/reference/stx-patterns.html#(form._((lib._racket/private/stxcase-scheme..rkt)._syntax-rules))" style="color: inherit">syntax-rules</a></code> is (almost: there is some complexity I think) a wrapper around <code>syntax-case</code> which provides the function wrapper for it and which implicitly wraps the right hand side of the cases, which must be just one form, in a <code>syntax</code> form. So the above definition of <code>with</code> could be written:</p>
<pre><code>(define-syntax with
  (syntax-rules ()
    [(_ () body ...)
     ;; no vars: trivial case
     (begin body ...)]
    [(_ ([var val] ...))
     ;; null body: make sure the vals are evaluated
     (begin val ... (void))]
    [(_ ([var val] ...) body ...)
     ((λ (var ...) body ...) val ...)]))</code></pre>
<p><code>syntax-rules</code> can be defined something like this (this is due to <a href="https://gist.github.com/tfeb/0b8531c94cf685824626">bmastenbrook</a>):</p>
<pre><code>(require (for-syntax
          (rename-in racket
                     [syntax-rules racket:syntax-rules])))

(begin-for-syntax
  (define-syntax syntax-rules
    (racket:syntax-rules ()
      [(_ literals (pattern expansion) ...)
       (lambda (s)
         (syntax-case s literals
           (pattern #'expansion) ...))])))</code></pre>
<p><code><a href="http://docs.racket-lang.org/reference/stx-patterns.html#(form._((lib._racket/private/misc..rkt)._define-syntax-rule))" style="color: inherit">define-syntax-rule</a></code> combines <code>define-syntax</code> and a single rule for <code>syntax-rules</code>. I <em>think</em> it might be equivalent to this:</p>
<pre><code>(define-syntax define-syntax-rule
  (syntax-rules ()
    [(_ (name pat ...) expansion)
     (define-syntax name
       (syntax-rules ()
         [(name pat ...) expansion]))]))</code></pre>
<p>although I am probably missing some complexity here.</p>
<p>There is a useful variant on <code>syntax-case</code> called <code><a href="http://docs.racket-lang.org/reference/stx-patterns.html#(form._((lib._racket/private/stxcase-scheme..rkt)._with-syntax))" style="color: inherit">with-syntax</a></code>: it looks more like <code>let</code>-style thing, and <em>all</em> the patterns in the clauses must match, when all the pattern variables will be bound.</p>
<p>So, what about our desirable macros?</p>
<p><code>collect</code> is pretty easy. Here are two different versions. The first uses <code>quasisyntax</code>:</p>
<pre><code>(define-syntax (collecting stx)
  (syntax-case stx ()
    [(_) #'(void)]
    [(_ body ...)
     #`(let ([r '()])
         (define (#,(datum->syntax stx 'collect) it)
           (set! r (cons it r)) it)
         body ...
         (reverse r))]))</code></pre>
<p>The second uses <code>with-syntax</code>:</p>
<pre><code>(define-syntax (collecting stx)
  (syntax-case stx ()
    [(_) #'(void)]
    [(_ body ...)
     (with-syntax ([collect (datum->syntax stx 'collect)])
       #'(let ([r '()])
           (define (collect it)
             (set! r (cons it r)) it)
           body ...
           (reverse r)))]))</code></pre>
<p>This is pretty nice, I think. Note that you could not do this with <code>syntax-rules</code>, or at least I can’t see how to do it: <code>syntax-rules</code> is quite a lot less general than <code>syntax-case</code>.</p>
<p><code>clet</code> is harder, because each element of the binding list may be either an identifier or a two-element list. If we insisted on a two-element list it would be easy (see above). Here is the best I can do:</p>
<pre><code>(require racket/undefined)

(define-syntax (clet stx)
  (syntax-case stx ()
    [(_ ()) #'(void)]
    [(_ () body ...) #'(begin body ...)]
    [(_ (b ...) body ...)
     (let-values ([(vars vals)
                   (for/lists (as vs) ([binding (syntax->list #'(b ...))])
                     (syntax-case binding ()
                       [(var val)
                        (identifier? #'var)
                        (values #'var #'val)]
                       [var
                        (identifier? #'var)
                        (values #'var #'undefined)]
                       [_ (raise-syntax-error #f "bad binding" stx)]))])
       #`((λ #,vars body ...) #,@vals))]))</code></pre>
<p>Well, this is still quite hairy, but almost all of the hair involves processing the binding list, which is done using <code>syntax-case</code> again, using an additional feature of it whereby a clause can have a ‘guard’ expression which decides whether it matches: <code>identifier?</code> returns true if a syntax object refers to an identifier. I think there must be a way of using <code>with-syntax</code> to avoid the <code>quasisyntax</code> form.</p>
<p>Even with all this hair, this version of <code>clet</code> is far easier to read than the previous one, and not harder to read than the CL equivalent.</p>
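<p>A quick check of this <code>clet</code> (my example): a bare identifier gets the undefined value, and a malformed binding such as <code>(clet (1) 1)</code> now hits the <code>raise-syntax-error</code> clause rather than failing obscurely.</p>
<pre><code>> (clet ([x 1] y) (list x y))
'(1 #<undefined>)</code></pre>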
<p>A better version of <code>clet</code> would, I think, need a proper parser for syntax. I think that is what <code><a href="http://docs.racket-lang.org/syntax/Parsing_Syntax.html#(form._((lib._syntax/parse..rkt)._syntax-parse))" style="color: inherit">syntax-parse</a></code> is, although I have not investigated that.</p>
<h2 id="macro-composition">Macro composition</h2>
<p>As mentioned above, we don’t yet have quite all the tools we need to write some kinds of macros: specifically macros which are intentionally slightly unhygienic, such as <code>collecting</code>. As an example, let’s suppose we wanted a general purpose, intentionally-unhygienic, <code>with-abort</code> macro which provided an <code>abort</code> function which would, well, abort. Without thinking too hard about the implications of <code><a href="http://docs.racket-lang.org/reference/cont.html#(def._((lib._racket/private/more-scheme..rkt)._call/cc))" style="color: inherit">call/cc</a></code> we could write this as:</p>
<pre><code>(define-syntax (with-abort stx)
(syntax-case stx ()
[(_ body ...)
#`(call/cc (λ (#,(datum->syntax stx 'abort))
body ...))]))</code></pre>
<p>So now <code>(with-abort (abort 2) (end-the-world))</code> returns <code>2</code> and does not end the world.</p>
<p>Well, we might want to use this macro in another macro:</p>
<pre><code>(define-syntax-rule (while/abort test body ...)
(with-abort
(let loop ([r test])
(when r
body ...
(loop test)))))</code></pre>
<p>Now something like the following will work:</p>
<pre><code>> (let ([x 0])
(while/abort (< x 10) (set! x (+ x 1)) (print x)))
12345678910</code></pre>
<p>But the whole point was to be able to use <code>abort</code> in the body, and that <em>doesn’t</em> work:</p>
<pre><code>> (let ([x 0])
(while/abort (< x 10) (set! x (+ x 1)) (when (> x 1) (abort 'done))))
abort: undefined;
cannot reference an identifier before its definition</code></pre>
<p>Oh, dear. The problem here is that <code>while/abort</code> is hygienic, so the <code>abort</code> binding that is introduced by <code>with-abort</code> is not visible in the body.</p>
<p>We could fix this by better design:</p>
<pre><code>(define-syntax-rule (with-named-abort (abort) body ...)
;; a better macro
(call/cc (λ (abort) body ...)))
(define-syntax (with-abort stx)
;; backwards compatible
(syntax-case stx ()
[(_ body ...)
#`(with-abort (#,(datum->syntax stx 'abort)) body ...)]))
(define-syntax (while/abort stx)
;; the end result
(syntax-case stx ()
[(_ test body ...)
#`(with-named-abort (#,(datum->syntax stx 'abort))
(let loop ([r test])
(when r
body ...
(loop test))))]))</code></pre>
<p>But that’s not the solution we’re after.</p>
<p>Racket’s answer to this is <a href="http://www.schemeworkshop.org/2011/papers/Barzilay2011.pdf">syntax parameters</a>. I don’t completely understand these, but they are at least close to dynamic variables, except at macro-expansion time. What you do is to define a syntax parameter, and then rebind it during the expansion: the rebound value is visible to macros which are expanded dynamically within the rebinding form. As with Racket’s <a href="http://docs.racket-lang.org/guide/parameterize.html">ordinary special variables</a> these look like functions (yet another namespace in disguise).</p>
<p>So we can define a syntax parameter called <code>abort</code> using <code><a href="http://docs.racket-lang.org/reference/stxparam.html#(form._((lib._racket/stxparam..rkt)._define-syntax-parameter))" style="color: inherit">define-syntax-parameter</a></code>:</p>
<pre><code>(require racket/stxparam)
(define-syntax-parameter abort
(λ (stx)
(raise-syntax-error #f "not available" stx)))</code></pre>
<p>So now any reference to <code>abort</code> will result in a syntax error:</p>
<pre><code>> (abort)
abort: not available in: (abort)
> abort
abort: not available in: abort</code></pre>
<p>And we can now try to use <code><a href="http://docs.racket-lang.org/reference/stxparam.html#(form._((lib._racket/stxparam..rkt)._syntax-parameterize))" style="color: inherit">syntax-parameterize</a></code>, to rebind <code>abort</code> as a macro:</p>
<pre><code>(define-syntax with-abort
(syntax-rules (with-abort)
[(with-abort) (void)]
[(with-abort body ...)
(call/cc
(λ (a)
(syntax-parameterize ([abort
(syntax-rules ()
[(_ ...) (a ...)])])
body ...)))]))</code></pre>
<p>And this fails horribly, because the outer <code>syntax-rules</code> thinks it owns the patterns and sees <code>...</code>s that it does not expect. So much for that.</p>
<p>Well, we could at least check this works with a specific number of arguments:</p>
<pre><code>(define-syntax with-abort
(syntax-rules (with-abort)
[(with-abort) (void)]
[(with-abort body ...)
(call/cc
(λ (a)
(syntax-parameterize ([abort
(λ (stx)
(syntax-case stx (abort)
[(abort) #'(a)]
[(abort x) #'(a x)]
[_ (raise-syntax-error #f "I give up" stx)]))])
body ...)))]))</code></pre>
<p>But this is obviously just a rubbish answer.</p>
<p>Well, there is an answer to this: all we really need to do is to make the <code>abort</code> macro attach itself to <code>a</code>, and there is a special hack, <code><a href="http://docs.racket-lang.org/reference/stxtrans.html#(def._((quote._~23~25kernel)._make-rename-transformer))" style="color: inherit">make-rename-transformer</a></code>, to do this:</p>
<pre><code>(define-syntax with-abort
(syntax-rules (with-abort)
[(with-abort) (begin)]
[(with-abort body ...)
(call/cc
(λ (a)
(syntax-parameterize ([abort (make-rename-transformer #'a)])
body ...)))]))</code></pre>
<p>And this now works:</p>
<pre><code>> (with-abort (abort 1 2 3))
1
2
3</code></pre>
<p>And we can use this to write a really robust version of <code>collecting</code>:</p>
<pre><code>(require racket/stxparam)
(define-syntax-parameter collect
(λ (stx)
(raise-syntax-error #f "not collecting" stx)))
(define-syntax collecting
(syntax-rules ()
[(_) (void)]
[(_ body ...)
(let ([r '()])
(define (clct it)
(set! r (cons it r)) it)
(syntax-parameterize ([collect (make-rename-transformer #'clct)])
body ...
(reverse r)))]))</code></pre>
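<p>With the syntax parameter, <code>collect</code> even survives being wrapped in other, entirely hygienic, macros — the very thing that defeated the <code>while/abort</code> attempt above (my example):</p>
<pre><code>(define-syntax-rule (twice body ...)
  ;; a hygienic macro which duplicates its body
  (begin body ... body ...))

> (collecting (twice (collect 'x)))
'(x x)</code></pre>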
<p>As far as I can see there is still a problem, however: it is very hard to write macros which expand to other macros which themselves do pattern-matching, since the patterns get acquired by the outer macros. There must be some answer to this, but I can’t see what it is.</p>
<p>On the other hand, this is also extremely painful in CL: here is a version of <code>collecting</code> where <code>collect</code> is a local macro:</p>
<pre><code>(defmacro collecting (&body forms)
;; collect lists forwards using a tail pointer
;; local macro version
(let ((rn (make-symbol "R"))
(rtn (make-symbol "RT"))
(itn (make-symbol "IT")))
`(let ((,rn '())
(,rtn nil))
(macrolet ((collect (form)
`(let ((,',itn ,form))
(if (not (null ,',rn))
(setf (cdr ,',rtn) (cons ,',itn nil)
,',rtn (cdr ,',rtn))
(setf ,',rn (cons ,',itn nil)
,',rtn ,',rn))
,',itn)))
,@forms)
,rn)))</code></pre>
<p>This is not easy to understand.</p>
<p>Additionally, the problem almost always comes from ellipses, and in many interesting cases they can be avoided by using dotted pairs as patterns — here is yet another version of <code>with-abort</code> that does this:</p>
<pre><code>(require racket/stxparam)
(define-syntax-parameter abort
(λ (stx)
(raise-syntax-error #f "not available" stx)))
(define-syntax with-abort
(syntax-rules (with-abort)
[(with-abort) (void)]
[(with-abort body ...)
(call/ec
(λ (a)
(syntax-parameterize ([abort
(syntax-rules (abort)
[(abort . args) (a . args)])])
body ...)))]))</code></pre>
<p>This is clearly better than the CL version.</p>
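<p>A couple of checks that the dotted-pair pattern really does accept any number of arguments (my examples):</p>
<pre><code>> (with-abort (+ 1 (abort 41)))
41
> (with-abort (abort 1 2))
1
2</code></pre>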
<h2 id="summary">Summary</h2>
<p>Well, I think I now know enough about Racket’s macros to be going on with: I can certainly now write the macros I need without it just being cargo-cult programming. There are still things I don’t understand, and the whole system smells to me as if, by trying to remain ideologically pure, it has become vast and essentially incomprehensible. This seems to be a common problem with Scheme, unfortunately.</p>
<h2 id="small-notes">Small notes</h2>
<p>Macro definitions scope properly, so you can define a local macro the same way you can define a local function, so this works:</p>
<pre><code>(define (foo ...)
(define-syntax-rule (while test body ...)
(let loop ()
(when test
body ...
(loop))))
... (while ... ...) ...)</code></pre>
<p>This makes the equivalent of CL’s <code>MACROLET</code> easy to do.</p>
<p>For fun, here is a version of <code>with</code> which can deal with named-<code>let</code>. There must be a way of implementing this without assignment, but I can never work out what it is.</p>
<pre><code>(define-syntax (with stx)
(syntax-case stx ()
[(_ ())
;; all null
#'(void)]
[(_ () body ...)
;; no vars: trivial case
#'(begin body ...)]
[(_ ([var val] ...))
;; null body: make sure vars are evaluated
#'(begin val ... (void))]
[(_ ([var val] ...) body ...)
;; normal let
#'((λ (var ...) body ...) val ...)]
[(_ n ())
(identifier? #'n)
;; named null
#'(void)]
[(_ n ([var val] ...))
(identifier? #'n)
;; named null body
#'(begin val ... (void))]
[(_ n ([var val] ...) body ...)
;; named let with arguments
;; (is there an implementation without assignment?)
(identifier? #'n)
#'((λ (n)
((λ (l)
(set! n l)
(l val ...))
(λ (var ...) body ...)))
#f)]
[_ (raise-syntax-error #f "bad syntax" stx)]))</code></pre>
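<p>Both the ordinary and named forms in use (my examples):</p>
<pre><code>> (with ([x 1] [y 2]) (+ x y))
3
> (with loop ([i 0] [acc '()])
    (if (= i 3) acc (loop (+ i 1) (cons i acc))))
'(2 1 0)</code></pre>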
<h2 id="things-i-still-do-not-know-or-understand">Things I still do not know or understand</h2>
<p>At this point I’m mostly comfortable writing macros in Racket, but there are things I still do not understand:</p>
<ul>
<li>protecting and arming syntax objects — I just don’t understand what this is about at all;</li>
<li><code><a href="http://docs.racket-lang.org/syntax/Parsing_Syntax.html#(form._((lib._syntax/parse..rkt)._syntax-parse))" style="color: inherit">syntax-parse</a></code> is, I think, not difficult but I have not bothered to learn about it as it seems to add yet another layer.</li>
<li>there are probably other things that I don’t even know I don’t know.</li></ul>
<p>At some point I might write a further part of this series on some of that.</p>
<hr />
<h2 id="pointers">Pointers</h2>
<p><a href="http://www.schemeworkshop.org/2011/papers/Barzilay2011.pdf">Eli Barzilay’s paper on <code>syntax-parameterize</code></a>.</p>
<p><a href="http://www.greghendershott.com/fear-of-macros/index.html">Fear of Macros</a>, again.</p>
<h1 id="macros-in-racket-part-one">Macros in Racket, part one</h1>
<p>Tim Bradshaw, 2015-01-13</p>
<p>I’ve written in Lisp for a long time, but I’ve never used a hygienic macro system in any way other than the most simple. Here are some initial notes on my experiences learning <a href="http://racket-lang.org/">Racket</a>’s macro system.</p>
<!-- more-->
<p>This is the first part of several: see <a href="../../../../2015/01/28/macros-in-racket-part-two">part two</a> and <a href="../../../../2015/12/12/macros-in-racket-part-three/">part three</a>. I’m not completely fluent with Racket macros yet: there are almost certainly mistakes and confusions here. Despite appearances, I also have no axe to grind: I’m learning Racket because I want to and I have time. Finally this is not a tutorial: look at Greg Hendershott’s <a href="http://www.greghendershott.com/fear-of-macros/index.html">Fear of Macros</a> for something closer to that. This is just some notes which were useful to me, and might be useful to other CL people.</p>
<h2 id="macros-in-common-lisp">Macros in Common Lisp</h2>
<p><a href="http://www.lispworks.com/documentation/common-lisp.html">Common Lisp</a>’s macro system is, in essence, simple: it’s what you’d end up writing if you had to write a macro system for a Lisp. That’s not surprising because it <em>is</em> the descendent of the first macro systems people wrote for Lisp. In CL what happens is this:</p>
<ol>
<li>the reader ingests the source text and produces data structures which represent the source of the program;</li>
<li>these structures are possibly transformed by macros, which are simply Lisp functions which are given the Lisp representation of the source and return some other representation;</li>
<li>once all macros are expanded, then the code is compiled, evaluated or both.</li></ol>
<p>(I have missed out some subtleties here, but they don’t matter for my purposes.)</p>
<p>In CL, what the reader produces is exactly what you would expect. If it reads <code>"(defun foo (a) a)"</code> then, with standard settings, it returns a list whose car is the symbol <code>DEFUN</code> (in the <code>CL</code> package) and so on. It is this structure that macros transform.</p>
<p>CL provides relatively limited support for writing macros: there is backquote, which is critical to being able to write macros which are even slightly readable, limited pattern matching in the form of destructuring, and there are mechanisms to generate unique names as well as a few other things. There is a semi-standard way of enquiring about bindings in the environment at macro expansion time, although this is not in the standard.</p>
<p>In practice, CL’s macro system has turned out to work very well; in theory it has all sorts of problems, the most important being that the programmer is entirely responsible for making sure that macros don’t introduce or accidentally use names they should not. Consider this:</p>
<pre><code>(defmacro collecting (&body forms)
;; collect lists forwards using a tail pointer
;; polluting version
`(let ((r '())
(rt nil))
(flet ((collect (form)
(if (not (null r))
(setf (cdr rt) (cons form nil)
rt (cdr rt))
(setf r (cons form nil)
rt r))
form))
,@forms)
r))</code></pre>
<p>This intentionally introduces a function binding, <code>collect</code>, but also accidentally introduces bindings for <code>r</code> and <code>rt</code>.</p>
<pre><code>(let ((r 2))
(collecting
(+ r r)))</code></pre>
<p>Does not do what it should. One right way to write the <code>collecting</code> macro is like this:</p>
<pre><code>(defmacro collecting (&body forms)
;; collect lists forwards using a tail pointer
;; non-polluting version
(let ((rn (make-symbol "R"))
(rtn (make-symbol "RT")))
`(let ((,rn '())
(,rtn nil))
(flet ((collect (form)
(if (not (null ,rn))
(setf (cdr ,rtn) (cons form nil)
,rtn (cdr ,rtn))
(setf ,rn (cons form nil)
,rtn ,rn))
form))
,@forms)
,rn)))</code></pre>
<p>And now the above form does not signal an error and correctly returns <code>()</code>.</p>
<p>Note that the problem is with <em>names</em> and not just bindings. Consider this CL code:</p>
<pre><code>(defvar *stashes* '())
(defvar *mark* nil)
(defun stash (name thing)
;; Stash something under a name
(setf *stashes* (acons name thing *stashes*))
(values name thing))
(defun retrieve (name)
;; Retrieve the value of a name, dropping everything stashed more
;; recently, and stopping at the mark, if any.
(let ((mark *mark*))
(labels ((rl (tail)
(if (or (null tail)
(eq (first tail) mark))
(values nil nil)
(destructuring-bind ((n . v) . r) tail
(if (eql n name)
(progn
(setf *stashes* r)
(values v t))
(rl r))))))
(rl *stashes*))))
(defmacro with-marked-stash (&body forms)
;; mark the stack of stashes for the dynamic extent of FORMS
(let ((mn (make-symbol "MARK")))
`(let ((*stashes* (cons ',mn *stashes*))
(*mark* ',mn))
,@forms)))</code></pre>
<p>In this code the marks on the stack of stashes established by <code>with-marked-stash</code> are not bound anywhere: they are just names. But it’s important to the correct functioning of the code that they are <em>unique</em> names. (There are better ways of doing this such as using a fresh cons for the mark: I just wanted an example where a name mattered other than as the name of a variable.)</p>
<p>The politically correct way of saying that we’re talking about names is to talk about ‘lexical context’ or ‘lexical information’: it’s the same thing but more confusing to those not initiated into the cult, which is always good.</p>
<p>The disadvantages of the CL macro system are this problem with hygiene and the lack of any clever tools to do pattern matching on macro forms. The second of these is easily overcome by using any of a number of tools, while the first is generally not a problem in practice: CL being a Lisp–2 (separate namespaces for functions and variables) helps here.</p>
<p>The advantage of the CL macro system is that there is no magic: macros get passed the things that the source code looks like — generally a structure whose interesting parts are lists and symbols — which you process using the normal list-processing tools to produce some other structure which is the expansion of the macro. It’s easy enough that you could write it yourself: there are no special opaque objects being handed around.</p>
<p>That being said, having a <em>standard</em> set of tools for pattern matching in macros and a way of dealing with the hygiene problems which is less ugly than in CL might well be worth the cost in transparency.</p>
<h2 id="macros-in-scheme">Macros in Scheme</h2>
<p>I am not a native <a href="https://en.wikipedia.org/wiki/Scheme_%28programming_language%29">Scheme</a> person, but it has clearly taken the whole hygiene thing very seriously: Scheme, as a set of languages, takes purity much more seriously than CL, which revels in being a fairly grungy language. However these posts are not about Scheme: the only reason I am mentioning it is to say that I have not cared at all whether anything here applies generally to Scheme or is specific to Racket.</p>
<h2 id="macros-in-racket-baby-steps">Macros in Racket: baby steps</h2>
<p>For a long time the only kind of macros that I’ve really been able to define in Racket are annoyingly trivial ones using <code><a href="http://docs.racket-lang.org/reference/stx-patterns.html#(form._((lib._racket/private/misc..rkt)._define-syntax-rule))" style="color: inherit">define-syntax-rule</a></code>, things like:</p>
<pre><code>(define-syntax-rule (while test body ...)
(let loop ()
(when test
body ...
(loop))))</code></pre>
<p>That’s all very well, but the ‘obvious’ (and obviously wrong) definition of <code>collect</code> then looks like this:</p>
<pre><code>(define-syntax-rule (collecting body ...)
;; horribly wrong
(let ([s '()])
(define (collect it)
(set! s (cons it s))
it)
body ...
(reverse s)))</code></pre>
<p>(There’s no obvious way to build lists backwards in Racket: reversing the list is probably as cheap as anything). This is either introducing a spurious binding for <code>s</code> or not introducing a deliberate one for <code>collect</code>, and in fact, of course, it’s the latter.</p>
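<p>The failure is easy to demonstrate (my example; the error message mirrors the one hygiene gives us for <code>abort</code> later in this series):</p>
<pre><code>> (collecting (collect 1) (collect 2))
collect: undefined;
cannot reference an identifier before its definition</code></pre>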
<p>Quite apart from this, <code>define-syntax-rule</code> gives the strong impression that it lets you write only the sort of macros that would give people who write C++ great pride: simple ones. (Actually you can do reasonably hairy things even with this because the pattern matching is very competent:</p>
<pre><code>(define-syntax-rule (mlet ([var val] ...) body ...)
((λ (var ...) body ...) val ...))</code></pre>
<p>is an implementation of simple <code>let</code>, for instance. Indeed we can defined named <code>let</code> as well:</p>
<pre><code>(define-syntax-rule (nlet label ([var val] ...) body ...)
(mlet ()
(define (label var ...) body ...)
(label val ...)))</code></pre>
<p>What I <em>can’t</em> work out how to do is to make <code>mlet</code> do both things: I think this is too hard for <code>define-syntax-rule</code> although I might be wrong.)</p>
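<p>Both of these do work as advertised (my examples):</p>
<pre><code>> (mlet ([x 1] [y 2]) (+ x y))
3
> (nlet fact ([n 5] [acc 1])
    (if (zero? n) acc (fact (- n 1) (* n acc))))
120</code></pre>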
<p>But for a long time I was stuck with that: whenever I looked at Racket macros in more detail I walked into a wall of opaque terminology and just decided that I had better things to do that year. This year, I don’t.</p>
<h2 id="two-desirable-macros">Two desirable macros</h2>
<p>There are many ways people use macros in Lisp: some of them are good. I decided that if I could write two macros <em>and understand them</em> then I would be well on my way.</p>
<ul>
<li><code>collecting</code> / <code>collect</code>. This is the macro given above in CL. It’s interesting not for what it does — the tail-pointer stuff is less interesting now than it once was and is hard to implement in Racket anyway — but because it introduces a binding: it is intentionally not completely hygienic, while having an essentially trivial expansion: no complicated destructuring is needed.</li>
<li>CL’s <code>let</code>, which I’ll call <code>clet</code>. This is interesting because it requires destructuring of arguments which is not completely simple, but it does not present problems of hygiene. The reason it’s not just a subset of Racket’s <code><a href="http://docs.racket-lang.org/reference/let.html#(form._((lib._racket/private/letstx-scheme..rkt)._let))" style="color: inherit">let</a></code> is that CL allows variables with no initial value, which get bound to <code>nil</code> and should, I think, become <code>undefined</code> in Racket. So <code>(clet ((x 1) y) body ...)</code> should expand to <code>(let ([x 1] [y undefined]) body ...)</code> or something equivalent to that.</li></ul>
<p>Here is a simple implementation of <code>clet</code> in CL, missing any error checking:</p>
<pre><code>(defmacro clet (bindings &body forms)
(multiple-value-bind (args vals)
(loop for binding in bindings
for consp = (consp binding)
collect (if consp (first binding) binding) into as
collect (if consp (second binding) nil) into vs
finally (return (values as vs)))
`((lambda (,@args) ,@forms) ,@vals)))</code></pre>
<p>Like most macros in CL it’s not particularly pretty but it is reasonably clear what it does.</p>
<p>I will use these two macros as examples below.</p>
<h2 id="phases">Phases</h2>
<p>To understand macros in any Lisp you need to develop a strong idea of the various ‘times’ that things happen and the relationships between them: for CL these are things like read time, macro expansion time, compilation time (compiler-macro expansion time), load time, run time and so on. Racket has formalised the parts of this after read time into a notion of ‘phase’:</p>
<ul>
<li>phase 0 is run-time;</li>
<li>phase 1 is macro expansion time;</li>
<li>phase 2 would, I think, be macros used in macro expansion;</li>
<li>and so on.</li></ul>
<p>However I am not sure how this ties in to read time: is that phase 1? For CL read time is <em>before</em> macro expansion time although the two are, or may be, interleaved at the granularity of forms (rather than per file or per compilation unit). Also there are negative phases which I don’t understand, although I think they must be to do with code which exists at macro expansion time (phase 1) wanting to make things available at run time (phase 0). All of this is integrated into the module system (and CL gets away without it mostly because it does not have a formalised module system).</p>
<p>Bindings exist at a phase, and the same name can have different bindings at different phases.</p>
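<p>For instance (my sketch, in a module body), the same name can be defined at both phases, and a macro’s transformer — which runs at phase 1 — sees the phase 1 binding:</p>
<pre><code>(begin-for-syntax
  (define where 'expansion-time))  ; a phase 1 binding
(define where 'run-time)           ; a phase 0 binding of the same name

(define-syntax (report stx)
  ;; the transformer is phase 1 code, so this is 'expansion-time
  (datum->syntax stx `',where))

;; (report) evaluates to 'expansion-time; where to 'run-time</code></pre>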
<p>Modules can say what they <code><a href="http://docs.racket-lang.org/reference/require.html#(form._((lib._racket/private/base..rkt)._provide))" style="color: inherit">provide</a></code> at which phase, and, importantly, the <code>racket</code> module does indeed provide different things at different phases: if you look at it you’ll find:</p>
<pre><code>(provide ...
(for-syntax (all-from-out racket/base)))</code></pre>
<p>Which means that, at phase 1, what is available is <code>racket/base</code>: a significantly smaller language than <code>racket</code> itself. If you need things in macros which are in <code>racket</code> but not <code>racket/base</code> you need to <code><a href="http://docs.racket-lang.org/reference/require.html#(form._((lib._racket/private/base..rkt)._require))" style="color: inherit">require</a></code> them:</p>
<pre><code>(require (for-syntax ...))</code></pre>
<p>An example of this is <code><a href="http://docs.racket-lang.org/reference/pairs.html#(def._((lib._racket/list..rkt)._first))" style="color: inherit">first</a></code> & <code><a href="http://docs.racket-lang.org/reference/pairs.html#(def._((lib._racket/list..rkt)._rest))" style="color: inherit">rest</a></code>, both of which are provided at phase 0 by <code>racket</code> but <em>not</em> at phase 1: if you want them you need to say <code>(require (for-syntax racket/list))</code>.</p>
<h2 id="syntax-objects">Syntax objects</h2>
<p>As in CL, Racket macros are source-to-source functions. The difference is that in Racket the source is represented by a <a href="http://docs.racket-lang.org/reference/syntax-model.html">syntax object</a> and a macro needs to produce another syntax object, while in CL source is represented as it looks: usually as nested lists.</p>
<p>So then a Racket macro is simply a function which maps from syntax objects to other syntax objects. The reason for having an opaque syntax object is that it can carry around all sorts of information around with it, and in particular it can carry information about <em>names</em>, which help the system maintain hygiene. (There is also information about source location and so on, but this isn’t so important.)</p>
<p>So the Racket macro system needs tools to transform syntax objects into other syntax objects, ultimately by digging around inside them to find out what the source code actually was. This is necessarily more complicated than it is in CL both because the objects are opaque and because they contain information which is not present at all in the objects CL macros get.</p>
<p>Additionally, and mostly independently, there is a layer on top of this which does not exist in CL (without libraries) at all: pattern matching and template filling. This means that for many purposes you can write macros in Racket simply by specifying patterns that the source must match and filling templates with the results of those matches. This is a very nice way of writing macros, although it renders what is actually going on even more opaque. For a CL person, used to feeling the bits between their toes, this can be quite disconcerting at first since what is actually <em>happening</em> can become entirely obscure.</p>
<h2 id="syntax-objects-for-the-unwashed-lisp-hacker">Syntax objects for the unwashed Lisp hacker</h2>
<p>Well, of course it is possible to ignore all this terrifyingly modern pattern matching stuff and write macros almost the way you do in CL, and it’s worth doing that at least once, perhaps. So here is <code>clet</code>:</p>
<pre><code>(require (for-syntax racket/list)
racket/undefined)
(define-syntax clet
(λ (stx)
(define ctx (quote-syntax clet))
(define top-level (syntax->list stx))
(define bindings (second top-level))
(define body (rest (rest top-level)))
(define-values (args vals)
(for/lists (as vs) ([binding (syntax->list bindings)])
(define it (syntax->list binding))
(if it
(values (first it) (second it))
(values binding (datum->syntax ctx 'undefined)))))
(datum->syntax
ctx
`((λ (,@args) ,@body) ,@vals))))</code></pre>
<p>So how does this work? Well, it uses some functions provided by Racket to look inside the syntax object (getting the ‘datum’ in the syntax object) and in turn to construct a new one:</p>
<ul>
<li><code><a href="http://docs.racket-lang.org/reference/stxops.html#(def._((quote._~23~25kernel)._syntax-~3elist))" style="color: inherit">syntax->list</a></code> takes a syntax object which wraps a proper list and unpacks one level of it, returning a list of syntax objects, or <code>#f</code> if it does not wrap a proper list;</li>
<li><code><a href="http://docs.racket-lang.org/reference/stxops.html#(def._((quote._~23~25kernel)._datum-~3esyntax))" style="color: inherit">datum->syntax</a></code> takes a context object and a datum and wraps it into a syntax object, leaving any syntax objects in the datum as they are;</li>
<li><code><a href="http://docs.racket-lang.org/reference/Syntax_Quoting__quote-syntax.html#(form._((quote._~23~25kernel)._quote-syntax))" style="color: inherit">quote-syntax</a></code> is like <code><a href="http://docs.racket-lang.org/reference/quote.html#(form._((quote._~23~25kernel)._quote))" style="color: inherit">quote</a></code> but it creates a syntax object, and this object contains the lexical information present in the source.</li></ul>
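<p>These functions are easy to experiment with directly (my examples, with results as comments; <code>syntax->datum</code>, which I haven’t mentioned above, strips the syntax wrapping entirely):</p>
<pre><code>(define stx #'(a (b 1) c))
(syntax->list stx)             ; a list of three syntax objects
(syntax->list #'not-a-list)    ; #f: it does not wrap a proper list
(syntax->datum stx)            ; '(a (b 1) c), the underlying datum
(datum->syntax stx 'undefined) ; syntax for undefined, in stx's context</code></pre>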
<p>So the macro pulls apart the syntax object in a fairly straightforward way: making it into a list, extracting the second element and all the remaining elements, which will be the binding specifications, and then grinding over the binding specifications, using <code>syntax->list</code> both to work out if the bindings are a list or not and to extract the variable and value if it is, and then reassembles everything as a call to an anonymous function.</p>
<p>The critical trick is that the context that <code>datum->syntax</code> needs <em>is a syntax object</em> and you need to pick the right one: you can use the syntax object you got given, which provides the context of the place where the macro was expanded, or you can use a syntax object of your own devising which provides that object’s context. And in this case we want our own context, not the context of place where the macro was expanded. This is what <code>ctx</code> is for: providing a suitable context.</p>
<p>Notice the <code>require</code>:</p>
<ul>
<li>we need <code>racket/list</code> at phase 1 (macro expansion time) because the macro uses <code>first</code> and so on;</li>
<li>we need <code>racket/undefined</code> at phase 0 (run time) as the expansion of the macro uses <code>undefined</code>.</li></ul>
<p>So we can try this:</p>
<pre><code>> (clet ((x 12) y) (values x y))
12
#<undefined>
> (let ((undefined 'hello)) (clet (x) x))
#<undefined>
> (clet ((undefined 'hello)) (clet (x) x))
#<undefined>
> (clet ((x 1)))
λ: bad syntax in: (λ (x))
> (clet (1) 1)
λ: not an identifier, identifier with default, or keyword in: 1</code></pre>
<p>The second and third examples show why we need the macro context: we don’t want a binding of <code>undefined</code> to alter what the <code>clet</code> picks as the undefined value. The fourth and fifth examples show that the macro isn’t very robust, and has terrible error reporting.</p>
<p>Some notes:</p>
<ul>
<li>I’ve deliberately written <code>(define-syntax clet (λ (stx) ...)</code> rather than the more pleasant <code>(define-syntax (clet stx) ...)</code> to make it clear that <code>clet</code> is a function which transforms a syntax object;</li>
<li>but I’ve used internal <code><a href="http://docs.racket-lang.org/reference/define.html#(form._((lib._racket/private/base..rkt)._define))" style="color: inherit">define</a></code> where in CL there would be <code>let*</code> or nested <code>let</code>s — I’m not sure why other than reducing indentation;</li>
<li>the destructuring of the syntax object is done in a way which is primitive even by the standards of CL;</li>
<li>it should be evident that the macro is not very robust — something like <code>(clet ((x 1) 2) ...)</code> will fail horribly;</li>
<li>it’s not <em>much</em> less clear than the CL version, although I think it is a bit less clear.</li></ul>
<p>I am fairly, but not completely, sure that this macro is right: I am slightly confused by the handling of <code>undefined</code>. Although it is easy to check, by wrapping <code>clet</code> into a module, that clients of that module don’t themselves need to import <code>racket/undefined</code> and do get the right initial values in forms like <code>(clet (x) ...)</code>, I am still a bit queasy about what it’s doing.</p>
<p>What is very clear is that this macro is just horrible: even by the standards of CL macros it’s horrible, because there is so much explicit unpacking and repacking going on. Things would be even worse if there was any significant error checking. Something better than this is needed to deal with syntax objects, in a way that it isn’t needed for CL macros. In <a href="../../../../2015/01/28/macros-in-racket-part-two">next week’s exciting episode</a> I’ll look at ways of making this better.</p>
<hr />
<h2 id="pointers">Pointers</h2>
<p><a href="http://blog.racket-lang.org/2011/04/writing-syntax-case-macros.html">Writing ‘syntax-case’ Macros</a> by Eli Barzilay. This was the article that first helped me understand what was going on.</p>
<p><a href="http://www.greghendershott.com/fear-of-macros/index.html">Fear of Macros</a> by Greg Hendershott. This is an introduction to macros, and macros in Racket in particular, by the author of Frog.</p>
<hr />
<h1>The cult of programming</h1>
<p>Tim Bradshaw, 2015-01-05</p>
<p>Programming is <em>not meant to be easy</em> and it’s important to make sure that it is as cryptic as possible otherwise people other than cult members might be able to understand it. Of course, you also need to make sure it’s <em>pure</em>, because otherwise cult members will laughingly throw you into a pit full of spikes and the rotting remains of other heretics.</p>
<!-- more-->
<p>For instance, you can’t be writing this sort of thing:</p>
<pre><code>(defun ss (n)
  (let ((s 0) (i 0))
    (tagbody
     loop
       (when (> i n) (go done))
       (setf s (+ s (* i i))
             i (+ i 1))
       (go loop)
     done
       (return-from ss s))))</code></pre>
<p>This is just terrible code. Non cult members may well be able to understand it, and the cultists will have you in the pit before you know it.</p>
<p>You might think this was better:</p>
<pre><code>(defun ss (n)
  (loop for i from 0 to n
        summing (* i i)))</code></pre>
<p>But in fact it’s far worse. Fellow cultists will definitely still be at the laughing and pit-throwing, and the others will certainly understand it <em>and laugh at you</em> because you don’t know the closed form.</p>
<p>Instead, you must write this:</p>
<pre><code>(define (ss n)
  (let-values ([(a i l) (call/cc (λ (c) (values 0 0 c)))])
    (l (+ a (* i i))
       (+ i 1)
       (if (< i (- n 1))
           l
           (λ (a i l) a)))))</code></pre>
<p>This is almost a perfect solution. It’s so achingly pure and cryptic that you will be immediately appointed king of the cult and be able to do your own laughing, and throw other members into pits you have first made them dig, for which they will thank you as they slide down the spikes. Non cult members stand essentially no chance of understanding what it does and sniping about the whole silly closed-form thing: certainly the only way they will be able to learn what it does is by first joining the cult, at which point, as king, you can just throw them straight into the pit.</p>
<p>It’s important you understand this.</p>