Generic interfaces in Racket
Or: things you do to distract yourself from watching an attempted fascist coup.
A thing that exists in many languages with a notion of a sequence of objects is a function variously known as fold
or reduce
: this takes another function of two arguments, some initial value, and walks along the sequence successively reducing it using the function. So, for instance:
(fold + 0 '(1 2 3))
turns into(fold + (+ 0 1) '(2 3))
which turns into …(fold + 1 '(2 3))
turns into(fold + (+ 1 2) '(3))
which turns into …(fold + 3 '(3))
turns into(fold + (+ 3 3) '())
which turns into …6
.
It’s pretty easy to write a version of fold
for lists:
(define (fold op initial l)
(if (null? l)
initial
(fold op (op initial (first l)) (rest l))))
Racket calls this (or a more careful version of this) foldl
: there is also foldr
which works from the other end of the list and is more expensive as a result.
Well, one thing you might want to do is have a version of fold
which works on trees rather than just lists. One definition of a tree is:
- it’s a collection of nodes;
- nodes have values;
- nodes have zero or more unique children, which are nodes.
- no node is the descendant of more than one node;
- there is exactly one root node which is the descendant of no other nodes.
A variant of this (which will matter below) is that the children of a node are either nodes or any other object, and there is some way of knowing if something is a node or not1.
You can obviously represent trees as conses, with the value of a cons being its car, and the children being its cdr. Whatever builds the tree needs to make sure that (3), (4) and (5) are true, or you get a more general graph structure.
But you might want to have other sorts of trees, and you’d want the fold function not to care about what sort of tree it was processing: just that it was processing a tree. Indeed, it would be nice if it was possible to provide special implementations for, for instance, binary trees where rather than iterating over some sequence of children you’d know there were exactly two.
So, I wondered if there was a nice way of expressing this in Racket and it turns out there mostly is. Racket has a notion of generic interfaces which are really intended as a way for different structure types to provide common interfaces, I think. But it turns out they can be (ab?)used to do this, as well.
Generic interfaces are not provided by racket
but by racket/generic
: everything below assumed (require racket/generic)
.
A generic treelike
interface
A treelike object supports two operations:
node-value
returns the value of a node;node-children
returns a list of the node’s children.
The second of these is a bit nasty: it would be better perhaps to either provide an interface for mapping over a node’s children, or to return some general, possibly lazy, sequence of children. But this is just playing, so I don’t mind.
Here is a definition of a generic treelike
interface, which includes default methods for lists:
(define-generics treelike
;; treelike objects have values and children
(node-value treelike)
(node-children treelike)
#:fast-defaults
(((λ (t)
(and (cons? t) (list? t)))
;; non-null proper lists are trees: their value is their car;
;; their children are their cdr.
(define node-value car)
(define node-children cdr))))
Notes:
- This uses
#:fast-defaults
instead of#:defaults
, which means that the dispatch to objects which satisfylist?
happens. This is fine in this case: lists are never going to be confused with any other tree type. - This relies on Racket’s (and Scheme’s?)
list?
predicate returning true only for proper lists rather than CL’s cheaplistp
which just returns true for anything which is eithernil
or a cons. - There are lots of other options to
define-generics
which I’m not using and many of which I don’t understand.
With this definition:
> (treelike? '())
#f
> (treelike? '(1 2 3))
#t
> (treelike? '(1 2 . 3))
#f
> (node-children '(1 2 3))
'(2 3)
So, OK.
A treelike
binary tree
We could then define a binary-tree
type which implements this generic interface:
(struct binary-tree (value left right)
#:transparent
#:methods gen:treelike
((define (node-value bt)
(binary-tree-value bt))
(define (node-children bt)
(list (binary-tree-left bt)
(binary-tree-right bt)))))
The #:methods gen:treelike
tells the structure we’re defining the methods needed for this thing to be a treelike
object.
And now we can check things:
> (treelike? (binary-tree 1 2 3))
#t
> (node-value (binary-tree 1 2 3))
1
> (node-children (binary-tree 1 2 3))
'(2 3)
OK.
Two attempts at a generic foldable
interface
So now I want to define another interface for things which can be folded. And the first thing I tried is this:
(define-generics foldable
;; broken
(fold operation initial foldable)
#:defaults
((treelike?
(define (fold op initial treelike)
(let ([current (op initial (node-value treelike))]
[children (node-children treelike)])
(if (null? children)
current
(fold op (fold op current (first children))
(rest children))))))
((const true)
(define (fold op initial any)
(op initial any)))))
So this tries to define a fold
generic function, which has two implementations: one for treelike
objects and one for all other objects. So this means that all objects are foldable, and, for instance (fold + 0 1)
simply turns into (+ 0 1)
. This is a bit odd but it simplifies the implementation of the interface for treelike
objects on the assumption that the children of nodes may not themselves be nodes (see above).
There is another complexity: if the list of a treelike
node’s children isn’t null, then it’s a treelike
, so it can safely be recursed over rather than explicitly iterated over. This is a slightly questionable pun I think, but, well, I am a slightly questionable programmer.
And this … doesn’t work:
> (fold + 0 '(1 2 3))
; node-value: contract violation:
; expected: treelike?
; given: 2
; argument position: 1st
It took me a long time to understand this, and the answer is that the definitions of fold
inside the define-generic
form aren’t adding methods to a generic function: what they are doing is defining a little local function, fold
which then gets glued into the generic function. So references to fold
in the definition refer to the little local function. It is exactly as if you had done this, in fact:
(define-generics foldable
;; this is why it's broken
(fold operation initial foldable)
#:defaults
((treelike?
(define fold
(letrec ([fold (λ (op initial treelike)
(let ([current (op initial (node-value treelike))]
[children (node-children treelike)])
(if (null? children)
current
(fold op (fold op current (first children))
(rest children)))))])
fold)))
((const true)
(define (fold op initial any)
(op initial any)))))
And you can see why this can’t work: the fold
bound by the letrec
calls itself rather than going through the generic dispatch.
The way to fix this is to use the magic define/generic
form to get a copy of the generic function, and then call that. This is syntactically horrid, but you can see why it is needed given the above. So a working version of this interface purports to be:
(define-generics foldable
;; not broken
(fold operation initial foldable)
#:defaults
((treelike?
(define/generic fold/g fold)
(define (fold op initial treelike)
(let ([current (op initial (node-value treelike))]
[children (node-children treelike)])
(if (null? children)
current
(fold op (fold/g op current (first children))
(rest children))))))
((const true)
(define (fold op initial any)
(op initial any)))))
And indeed it is not broken:
> (fold + 0 '(1 2 3))
6
and with some tracing added:
> (fold + 0 '(1 2 3))
fold/treelike + 0 (1 2 3)
fold/any + 1 2
fold/treelike + 3 (3)
6
Adding a special case to fold
for the binary tree
So now, finally, we can add a special case to fold
to the binary tree defined above, rather than needlessly consing a list of children. We will need the same explicit-generic-function hack as before as the children of a binary tree may not be binary trees.
(struct binary-tree (value left right)
#:transparent
#:methods gen:treelike
((define (node-value bt)
(binary-tree-value bt))
(define (node-children bt)
(list (binary-tree-left bt)
(binary-tree-right bt))))
#:methods gen:foldable
((define/generic fold/g fold)
(define (fold op initial bt)
(fold/g op
(fold/g op (op initial (binary-tree-value bt))
(binary-tree-left bt))
(binary-tree-right bt)))))
And now
> (fold + 0 (binary-tree 1
(binary-tree 2 3 4)
(binary-tree 5 6 7)))
28
and with some tracing
> (fold + 0 (binary-tree 1
(binary-tree 2 3 4)
(binary-tree 5 6 7)))
fold/bt + 0 #(struct:binary-tree 1 #(struct:binary-tree 2 3 4) #(struct:binary-tree 5 6 7))
fold/bt + 1 #(struct:binary-tree 2 3 4)
fold/any + 3 3
fold/any + 6 4
fold/bt + 10 #(struct:binary-tree 5 6 7)
fold/any + 15 6
fold/any + 21 7
28
Missing CLOS
In some ways this makes me miss CLOS: the explicit-generic-function hack is very annoying, single dispatch is annoying, not being able to define predicate-based methods separately from the define-generics
form is annoying. But on the other hand predicate-based dispatch is pretty cool.
-
Perhaps these should be called ‘sloppy trees’ or something. ↩