Democracy

:: politics, doomed

Sometime in the middle of 2019, the UK will have a new prime minister. He1 will have considerable power to control whether, when and how the UK leaves the EU.

Function calling conventions and bindings

:: computer, lisp

An attempt to describe three well-known function calling conventions in terms of bindings.


A little while ago I wrote an article on bindings which, in turn, was based on my answer to this Stack Overflow question. I have since written another answer to a more recent question and I thought it would be worth summarising part of that to describe how three famous function calling conventions can be described in terms of bindings1.

Bindings in brief

A binding is an association between a name (a variable) and a value, where the value can be any object the language can talk about. In most Lisps (and other languages) bindings are not first-class: the language can not talk about bindings directly, and in particular bindings can not be values. Bindings are, or may be, mutable: their values (but not their names) can be changed by assignment. Many bindings can share the same value. Bindings have scope (where they are accessible) and extent (how long they are accessible for) and there are rules about that.

Call by value

In call by value the value of a binding is passed to a procedure. This means that the procedure can not mutate the binding itself. If the value is a mutable object it can be altered by the procedure, but the binding can not be.

Call by value is the convention used by all Lisps I know of. Here is a function which demonstrates that call by value can not mutate bindings:

(defun pbv (&optional (fn #'identity))
  ;; If FN returns then the first value of this function will be T
  (let ((c (cons 0 0)))                 ;first binding
    (let ((cc c))                       ;second binding, shares value with first
      (funcall fn c)                    ;FN gets the *value* of C
      (values (eq c cc) c))))           ;C and CC still refer to the same object

Call by reference

In call by reference, procedures get the bindings themselves as arguments. If a procedure modifies the binding by assignment, then it is modified in the calling procedure as well.

Lisp does not use call by reference: Fortran does, or can, use a calling mechanism which is equivalent to call by reference2.

It is possible to implement what is essentially call by reference in Lisp (here Common Lisp, but any Lisp with lexical scope, indefinite extent & macros can do this) using some macrology:

(defmacro capture-binding (var)
  ;; Construct an object which captures a binding
  `(lambda (&optional (new-val nil new-val-p))
     (when new-val-p
       (setf ,var new-val))
     ,var))

(declaim (inline captured-binding-value
                 (setf captured-binding-value)))

(defun captured-binding-value (cb)
  ;; value of a captured binding
  (funcall cb))

(defun (setf captured-binding-value) (new cb)
  ;; change the value of a captured binding
  (funcall cb new))

And now, given

(defun mutate-binding (b v)
  (setf (captured-binding-value b) v))

(defun sort-of-call-by-reference ()
  (let ((c (cons 1 1)))
    (let ((cc c))
      (mutate-binding (capture-binding cc) 3)
      (values c cc))))

> (sort-of-call-by-reference)
(1 . 1)
3

The trick here is that the procedure created by the capture-binding macro has access to the binding being captured, and can mutate it.

Call by name

Call by name is the same as call by value, except the value of a binding is only computed at the point it is needed. Call by name is a form of delayed evaluation or normal-order evaluation strategy.

Lisp (at least Common Lisp: Lisps which have normal-order evaluation strategies exist) does not have call by name, but again it can be emulated with some macrology:

(defmacro delay (form)
  ;; simple-minded DELAY.  FORM is assumed to return a single value,
  ;; and will be evaluated no more than once.
  (let ((fpn (make-symbol "FORCEDP"))
        (vn (make-symbol "VALUE")))
    `(let ((,fpn nil) ,vn)
       (lambda ()
         (unless ,fpn
           (setf ,fpn t
                 ,vn ,form))
         ,vn))))

(declaim (inline force))

(defun force (thunk)
  ;; forcd a thunk
  (funcall thunk))

(defmacro funcall/delayed (fn &rest args)
  ;; call a function with a bunch of delayed arguments
  `(funcall ,fn ,@(mapcar (lambda (a)
                            `(delay ,a))
                          args)))

And now

(defun return-first-thunk-value (t1 t2)
  (declare (ignorable t2))
  (force t1))

(defun surprisingly-quick ()
  (funcall/delayed #'return-first-thunk-value
                   (cons 1 2)
                   (loop repeat 1000000
                         collect
                         (loop repeat 1000000
                               collect
                               (loop repeat 1000000
                                     collect 1)))))

> (time (surprisingly-quick))
Timing the evaluation of (surprisingly-quick)

User time    =        0.000
System time  =        0.000
Elapsed time =        0.001
Allocation   = 224 bytes
3 Page faults
(1 . 2)

The second argument to return-first-thunk-value was never forced, and so the function completes in reasonable time.


  1. This, in turn, is distantly descended from a post on comp.lang.lisp by Erik Naggum

  2. I think Fortran is allowed to implement its ‘by reference’ calls by copying any modified bindings back to the bindings in the parent procedure, and this is largely equivalent, at least for single-threaded code. 

Call by value in Scheme and Lisp

:: computer, lisp

I find the best way to think about this is to think in terms of bindings, rather than environments or frames, which are simply containers for bindings.


Bindings

A binding is an association between a name and a value. The name is often called a ‘variable’ and the value is, well, the value of the variable. The value of a binding can be any object that the language can talk about at all. Bindings, however, are behind-the-scenes things (sometimes this is called ‘not being first-class objects’): they’re not things that can be represented in the language but rather things that you can use as part of the model of how the language works. So the value of a binding can’t be a binding, because bindings are not first-class: the language can’t talk about bindings.

There are some rules about bindings:

  • there are forms which create them, of which the most important two are lambda and define;
  • bindings are not first-class — the language can not represent bindings as values;
  • bindings are, or may be, mutable — you can change the value of a binding once it exists — and the form that does this is set!;
  • there is no operator which destroys a binding;
  • bindings have lexical scope — the bindings available to a bit of code are the ones you can see by looking at it, not ones you have to guess by running the code and which may depend on the dynamic state of the system;
  • only one binding for a given name is ever accessible from a given bit of code — if more than one is lexically visible then the innermost one shadows any outer ones;
  • bindings have indefinite extent — if a binding is ever available to a bit of code, it is always available to it.

Obviously these rules need to be elaborated significantly (especially with regards to global bindings & forward-referenced bindings) and mare formal, but these are enough to understand what happens. In particular I don’t really think you need to spend a lot of time worrying about environments: the environment of a bit of code is just the set of bindings accessible to it, so rather than worry about the environment just worry about the bindings.

Call by value

So, what ‘call by value’ means is that when you call a procedure with an argument which is a variable (a binding) what is passed to it is the value of the variable binding, not the binding itself. The procedure then creates a new binding with the same value. Two things follow from that:

  • the original binding can not be altered by the procedure — this follows because the procedure only has the value of it, not the binding itself, and bindings are not first-class so you can’t cheat by passing the binding itself as the value;
  • if the value is itself a mutable object (arrays & conses are example of objects which usually are mutable, numbers are examples of objects which are not) then the procedure can mutate that object.

Examples of the rules about bindings

So, here are some examples of these rules.

(define (silly x)
  (set! x (+ x 1))
  x)

(define (call-something fn val)
  (fn val)
  val))

> (call-something silly 10)
10

So, here we are creating two top-level bindings, for silly and call-something, both of which have values which are procedures. The value of silly is a procedure which, when called:

  1. creates a new binding whose name is x and whose value is the argument to silly;
  2. mutates this binding so its value is incremented by one;
  3. returns the value of this binding, which is one more than the value it was called with.

The value of call-something is a procedure which, when called:

  1. creates two bindings, one named fn and one named val;
  2. calls the value of the fn binding with the value of the val binding;
  3. returns the value of the val binding.

Note that whatever the call to fn does, it can not mutate the binding of val, because it has no access to it. So what you can know, by looking at the definition of call-something is that, if it returns at all (it may not return if the call to fn does not return), it will return the value of its second argument. This guarantee is what ‘call by value’ means: a language (such as Fortran) which supports other call mechanisms can’t always promise this.

(define (outer x)
  (define (inner x)
    (+ x 1))
  (inner (+ x 1)))

Here there are four bindings: outer is a top-level binding whose value is a procedure which, when it is called, creates a binding for x whose value is its argument. It then creates another binding called inner whose value is another procedure, which, when it is called, creates a new binding for x to its argument, and then returns the value of that binding plus one. outer then calls this inner procedure with the value of its binding for x.

The important thing here is that, in inner, there are two bindings for x which are potentially lexically visible, but the closest one — the one established by inner — wins, because only one binding for a given name can ever be accessible at one time.

Here is the previous code (this would not be equivalent if inner was recursive) expressed with explicit lambdas:

(define outer
  (λ (x)
    ((λ (inner)
       (inner (+ x 1)))
     (λ (x)
       (+ x 1)))))

And finally an example of mutating bindings:

(define (make-counter val)
  (λ ()
    (let ((current val))
      (set! val (+ val 1))
      current)))

> (define counter (make-counter 0))
> (counter)
0
> (counter)
1
> (counter)
2

So, here, make-counter (is the name of a binding whose value is a procedure which, when called,) establishes a new binding for val and then returns a procedure it has created. This procedure makes a new binding called current which catches the current value of val, mutates the binding for val to add one to it, and returns the value of current. This code exercises the ‘if you can ever see a binding, you can always see it’ rule: the binding for val created by the call to make-counter is visible to the procedure it returns for as long as that procedure exists (and that procedure exists at least as long as there is a binding for it), and it also mutates a binding with set!.

Why not environments?

SICP, in chapter 3, introduces the ‘environment model’, where at any point there is an environment, consisting of a sequence of frames, each frame containing bindings. Obviously this is a fine model, but it introduces three kinds of thing — the enviromnent, the frames in the environment and the bindings in the frame — two of which are utterly intangible. At least for a binding you can get hold of it in some way: you can see it being created in the code and you can see references to it. So I prefer not to think in terms of these two extra sorts of thing which you can never get any kind of handle on.

However this is a choice which makes no difference in practice: thinking purely in terms of bindings helps me, thinking in terms of environments, frames & bindings may well help other people more.

Shorthands

In what follows I am going to use a shorthand for talking about bindings, especially top-level ones:

  • x is a procedure which …’ means ’x is the name of a binding whose value is a procedure which, when called, …’;
  • y is …’ means ’y is the name of a binding the value of which is …’;
  • x is called with y’ means ‘the value of the binding named by x is called with the value of the binding named by y’;
  • ’… binds x to …’ means ’… creates a binding whose name is x and whose value is …’;
  • x’ means ‘the value of x’;
  • and so on.

Describing bindings like this is common, as the fully-explicit way is just painful: I’ve tried (but probably failed in places) to be fully explicit above.

The answer

And finally, after this long preamble, here’s the answer to the question you asked1.

(define (make-withdraw balance)
  (λ (amount)
    (if (>= balance amount)
        (begin (set! balance (- balance amount))
               balance)
        "Insufficient funds")))

make-withdraw binds balance to its argument and returns a procedure it makes. This procedure, when called:

  1. binds amount to its argument;
  2. compares amount with balance (which it can still see because it could see it when it was created);
  3. if there’s enough money then it mutates the balance binding, decrementing its value by the value of the amount binding, and returns the new value;
  4. if there’s not enough money it returns "Insuficient funds" (but does not mutate the balance binding, so you can try again with a smaller amount: a real bank would probably suck some money out of the balance binding at this point as a fine).

Now

(define x (make-withdraw 100))

creates a binding for x whose value is one of the procedures described above: in that procedure balance is initially 100.

(define (f y) (y 25))

f is a procedure (is the name of a binding whose value is a procedure, which, when called) which binds y to its argument and then calls it with an argument of 25.

(f x)

So, f is called with x, x being (bound to) the procedure constructed above. In f, y is bound to this procedure (not to a copy of it, to it), and this procedure is then called with an argument of 25. This procedure then behaves as described above, and the results are as follows:

> (f x)
75
> (f x)
50
> (f x)
25
> (f x)
0
> (f x)
"Insufficient funds"

Note that:

  • no first-class objects are copied anywhere in this process: there is no ‘copy’ of a procedure created;
  • no first-class objects are mutated anywhere in this process;
  • bindings are created (and later become inacessible and so can be destroyed) in this process;
  • one binding is mutated repeatedly in this process (once for each call);
  • I have not anywhere needed to mention ‘environments’, which are just the set of bindings visible from a certain point in the code and I think not a very useful concept.

I hope this makes some kind of sense.


A more elaborate version of the above code

Something you might want to be able to do is to back out a transaction on your account. One way to do that is to return, as well as the new balance, a procedure which undoes the last transaction. Here is a procedure which does that (this code is in Racket):

(define (make-withdraw/backout
         balance
         (insufficient-funds "Insufficient funds"))
  (λ (amount)
    (if (>= balance amount)
        (let ((last-balance balance))
          (set! balance (- balance amount))
              (values balance
                      (λ ()
                       (set! balance last-balance)
                       balance)))
            (values
             insufficient-funds
             (λ () balance)))))

When you make an account with this procedure, then calling it returns two values: the first is the new balance, or the value of insufficient-funds (defaultly "Insufficient funds"), the second is a procedure which will undo the transaction you just did. Note that it undoes it by explicitly putting back the old balance, because you can’t necessarily rely on (= (- (+ x y) y) x) being true in the presence of floating-point arithmetic I think. If you understand how this works then you probably understand bindings.


  1. This originated as an answer to this Stack Overflow question

No excuses

:: politics

As card-carrying members of the liberal elite we have to understand why so many people are so cross. Obviously it is our fault: with our awful progressive views we have prospered at their expense and it is only natural that they should express their anger by supporting politicians who are explicitly racist and misogynistic. That’s just a natural reaction: the people we have oppressed so horribly aren’t actually racists and misogynists, no, they just support politicians who are. It’s all our fault1.