The U combinator
The U combinator allows you to define recursive functions and I think it is simpler to understand than the Y combinator.
It’s not obvious how things like letrec
get defined in Scheme, without using secret assignment. In fact I think they are defined using secret assignment:
(letrec ([f (λ (...) ... (f ...) ...)])
...)
turns into
(let ([f ...])
(set! f (λ (...) ... (f ...) ...))
...)
But it’s interesting to see how you can define recursive functions without relying on assignment, including mutually-recursive collections of functions. One way is using the U combinator.
I suspect that there is lots of information about this out there, but it’s seriously hard to search for anything which looks like ’*-combinator’ now (even now I am starting a set of companies called ‘integration by parts’, ‘the quotient rule’ &c).
You can famously do this with the Y combinator, but I didn’t want to do that because Y is something I find I can understand for a few hours at a time and then I have to work it all out again. But it turns out that you can use something much simpler: the U combinator. It seems to be even harder to search for this than Y, but here is a quote about it:
In the theory of programming languages, the U combinator, \(U\), is the mathematical function that applies its argument to its argument; that is \(U(f) = f(f)\), or equivalently, \(U = \lambda f \cdot f(f)\).
Self-application permits the simulation of recursion in the λ-calculus, which means that the U combinator enables universal computation. (The U combinator is actually more primitive than the more well-known fixed-point Y combinator.)
The expression \(U(U)\) is the smallest non-terminating program.
(Text mildly edited from here, which unfortunately is not a site all about the U combinator other than this quote.)
Prerequisites
All of the following code samples are in Racket. The macros are certainly Racket-specific and some of the other code probably is as well. To make the macros work you will need syntax-parse
via:
(require (for-syntax syntax/parse))
However note that my use of syntax-parse
is naïve in the extreme: I’m really just an unfrozen CL caveman pretending to understand Racket’s macro system.
Also note I have not ruthlessly turned everything into λ: Rather than ((λ (...) ...) ...)
there is (let ([... ...] ...) ...)
in this code; there is use of multiple values including let-values
; there is (define (f ...) ...)
rather than (define f (λ (...) ...))
and so on.
Two versions of U
The first version of U is the obvious one:
(define (U f)
(f f))
But this will run into some problems with an applicative-order language, which Racket is by default. To avoid that we can make the assumption that (f f)
is going to be a function, and wrap that form in another function to delay its evaluation until it’s needed: this is the standard trick that you have to do for Y in an applicative-order language as well. I’m only going to use the applicative-order U when I have to, so I’ll give it a different name:
(define (U/ao f)
(λ args (apply (f f) args)))
Note also that I’m allowing more than one argument rather than doing the pure-λ-calculus thing.
Using U to construct a recursive functions
To do this we do a similar trick that you do with Y: write a function which, if given a function as argument which deals with the recursive cases, will return a recursive function. And obviously I’ll use the Fibonacci function as the canonical recursive function.
So, consider this thing:
(define fibber
(λ (f)
(λ (n)
(if (<= n 2)
1
(+ ((U f) (- n 1))
((U f) (- n 2)))))))
This is a function which, given another function, U
of which computes smaller Fibonacci numbers, will return a function which will compute the Fibonacci number for n
.
In other words, U
of this function is the Fibonacci function!
And we can test this:
> (define fibonacci (U fibber))
> (fibonacci 10)
55
So that’s very nice.
Wrapping U in a macro
So, to hide all this the first thing to do is to remove the explicit calls to U
in the recursion. We can lift them out of the inner function completely:
(define fibber/broken
(λ (f)
(let ([fib (U f)])
(λ (n)
(if (<= n 2)
1
(+ (fib (- n 1))
(fib (- n 2))))))))
Don’t try to compute U
of this: it will recurse endlessly because (U fibber/broken)
-> (fibber/broken fibber/broken)
and this involves computing (U fibber/broken)
, and we’re doomed.
Instead we can use U/ao
:
(define fibber
(λ (f)
(let ([fib (U/ao f)])
(λ (n)
(if (<= n 2)
1
(+ (fib (- n 1))
(fib (- n 2))))))))
And this is all fine ((U fibber) 10)
is 55
(and terminates!).
Purists can then turn let
into λ
in the usual way:
(define fibber
(λ (f)
((λ (fib)
(λ (n)
(if (<= n 2)
1
(+ (fib (- n 1))
(fib (- n 2))))))
(U/ao f))))
And this is really all you need to be able to write the macro:
(define-syntax (with-recursive-binding stx)
(syntax-parse stx
[(_ (name:id value:expr) form ...+)
#'(let ([name (U (λ (f)
(let ([name (U/ao f)])
value)))])
form ...)]))
Or, for the pure of heart:
(define-syntax (with-recursive-binding stx)
(syntax-parse stx
[(_ (name:id value:expr) form ...+)
#'((λ (name)
form ...)
(U (λ (f)
((λ (name)
value)
(U/ao f)))))]))
And this works fine:
(with-recursive-binding (fib (λ (n)
(if (<= n 2)
1
(+ (fib (- n 1))
(fib (- n 2))))))
(fib 10))
A caveat on bindings
One fairly obvious thing here is that there are two bindings constructed by this macro: the outer one, and an inner one of the same name. And these are not bound to the same function in the sense of eq?
:
(with-recursive-binding (ts (λ (it)
(eq? ts it)))
(ts ts))
is #f
. This matters only in a language where bindings can be mutated: a language with assignment in other words. Both the outer and inner bindings, unless they have been mutated, are to functions which are identical as functions: they compute the same values for all values of their arguments. In fact, it’s hard to see what purpose eq?
would serve in a language without assignment.
This caveat will apply below as well.
Two versions of U for many functions
The obvious generalization of U, U*, to many functions is that \(U^*(f_1, \ldots, f_n)\) is the tuple \((f_1(f_1, \ldots, f_n), f_2(f_1, \ldots, f_n), \ldots)\). And a nice way of expressing that in Racket is to use multiple values:
(define (U* . fs)
(apply values (map (λ (f)
(apply f fs))
fs)))
And we need the applicative-order one as well:
(define (U*/ao . fs)
(apply values (map (λ (f)
(λ args (apply (apply f fs) args)))
fs)))
Note that U* is a true generalization of U: (U f)
and (U* f)
are the same.
Using U* to construct mutually-recursive functions
I’ll work with a trivial pair of functions:
- an object is a numeric tree if it is a cons and its car and cdr are numeric objects;
- an objct is a numeric object if it is a number, or if it is a numeric tree.
So we can define ‘maker’ functions (with an ’-er’ convention: a function which makes an x is an xer, or, if x has hyphens in it, an x-er) which will make suitable functions:
(define numeric-tree-er
(λ (nter noer)
(λ (o)
(let-values ([(nt? no?) (U* nter noer)])
(and (cons? o)
(no? (car o))
(no? (cdr o)))))))
(define numeric-object-er
(λ (nter noer)
(λ (o)
(let-values ([(nt? no?) (U* nter noer)])
(cond
[(number? o) #t]
[(cons? o) (nt? o)]
[else #f])))))
Note that for both of these I’ve raised the call to U*
a little, simply to make the call to the appropriate value of U*
less opaque.
And this works:
(define-values (numeric-tree? numeric-object?)
(U* numeric-tree-er numeric-object-er))
And now:
> (numeric-tree? 1)
#f
> (numeric-object? 1)
#t
> (numeric-tree? '(1 . 2))
#t
> (numeric-tree? '(1 2 . (3 4)))
#f
Wrapping U* in a macro
The same problem as previously happens when we raise the inner call to U*
with the same result: we need to use U*/ao
. In addition the macro becomes significantly more hairy and I’m moderately surprised that I got it right so easily. It’s not conceptually hard: it’s just not obvious to me that the pattern-matching works.
(define-syntax (with-recursive-bindings stx)
(syntax-parse stx
[(_ ((name:id value:expr) ...) form ...+)
#:fail-when (check-duplicate-identifier (syntax->list #'(name ...)))
"duplicate variable name"
(with-syntax ([(argname ...) (generate-temporaries #'(name ...))])
#'(let-values
([(name ...) (U* (λ (argname ...)
(let-values ([(name ...)
(U*/ao argname ...)])
value)) ...)])
form ...))]))
And now, in a shower of sparks, we can write:
(with-recursive-bindings ((numeric-tree?
(λ (o)
(and (cons? o)
(numeric-object? (car o))
(numeric-object? (cdr o)))))
(numeric-object?
(λ (o)
(cond [(number? o) #t]
[(cons? o) (numeric-tree? o)]
[else #f]))))
(numeric-tree? '(1 2 3 (4 (5 . 6) . 7) . 8)))
and get #t
.
As I said, I am sure there are well-known better ways to do this, but I thought this was interesting enough not to lose. This originated as an answer to this Stack Overflow question.