Two understandable deficiencies in Common Lisp
Common Lisp is, I think, a remarkably pleasant language, despite what some people like to say. Here are two small deficiencies, both of which are understandable in terms of the history of CL, and both of which ultimately hurt naïve programmers working in CL.
The default floating-point type is single-float
There are two things that make this true:
*read-default-float-format*
is initiallysingle-float
, which means that, unless it is changed,1.0
reads as1.0f0
, a single float1;- The
float
function will convert to a single float unless it is given a prototype which is not a single float:(float 1)
is1.0f0
, while to get a double float you would need(float 1 1.0d0)
.
In addition things like with-standard-io-syntax
bind *read-default-float-format*
to single-float
, so you have to do a little more work to make doubles the default.
I think there are probably several historical reasons why this default was chosen:
- a long time ago memory was very expensive and single floats take, usually, half the memory of double floats, thus pushing people towards single floats;
- a long time ago, perhaps, on some machines, single float operations were significantly faster than double float operations even before possible float consing was taken into account;
- Lisp hardware companies with significant influence on the standard, notably Symbolics, made hardware which allowed single (32 bit) floats to be immediate objects, while double floats were not, and had simple-minded compilers which were not capable of optimizing double float operations, thus making double float arithmetic extremely slow compared to single float arithmetic, and these companies wanted their machines to seem fast (they never, really, were) for naïve users;
- it was not clear that implementations would choose
single-float
to mean ‘single precision IEEE 754 float’ anddouble-float
to mean ‘double precision IEEE 754 float’, for instance it’s perfectly legal to have theshort-float
type mean single precision IEEE 754 and all of thesingle-float
,double-float
andlong-float
types mean double precision IEEE 754; - it wasn’t even even clear that IEEE 754 would come to dominate how machines implement floating-point: VAXes didn’t, and other machines of interest at the time also did not.
So there are good historical reasons for this. However all implementations I’m aware of now translate short-float
to mean single-float
, single-float
to mean IEEE 754 single precision, double-float
to mean IEEE 754 double precision and long-float
to be the same as double-float
.
So what is the problem with the default float type being single-float
in the modern world? The answer is
> (log (/ 1 single-float-epsilon) 10)
7.22472
In other words, single precision IEEE 754 arithmetic has about 7 significant figures of precision. For many purposes, and especially for naïvely-written code that’s at best marginal and at worst less than that. On the other hand
> (log (/ 1 double-float-epsilon) 10)
15.954589770191001D0
which is almost 16 significant figures of precision, more than twice that of single precision.
That’s why the default should have been double precision: it makes naïve code more likely to work, and people who are writing non-naïve code can use single precision if they need it.
The CL-USER
package is defined in an implementation-dependent way
From the spec:
The
COMMON-LISP-USER
package is the current package when a Common Lisp system starts up. This package uses theCOMMON-LISP
package. TheCOMMON-LISP-USER
package has the nicknameCL-USER
. TheCOMMON-LISP-USER
package can have additional symbols interned within it; it can use other implementation-defined packages.
(My emphasis.)
What this means is that when you start a CL environment, the current package may have all sorts of implementation-dependent symbols visible in it. You can see why this happened: if you’re implementing Super-Whizz-Bang CL which has all sorts of magic extra features, you want at least some of those features to be immediately available to users, rather than requiring them to pore over boring manuals to find them.
But for users, and especially for naïve users, it’s a terrible choice: naïve users don’t know about packages so they write their programs in CL-USER
. And they also don’t really know which symbols available in CL-USER
come from CL
and are thus standard parts of the language, and which come from one of Super-Whizz-Bang CL’s implementation packages, and are not standard parts of the language. So their programs turn into a mess where the portable parts are not distinct from the non-portable parts. The way the CL-USER
package is defined thus makes it harder for to write programs whose non-portable parts are well-isolated, and ultimately hurts the language.
This is a direct conflict between implementors and users: implementors both want their extra features immediately available so their implementation is shinier and want to encourage users to use these extra features in a way which makes it hard to move their programs to other implementations; users, when they think about it, generally don’t want this second thing, at least.
Instead, the language should have defined CL-USER
as a package which only used CL
, and perhaps have defined another standard package, perhaps IMPL-USER
, which was defined the way CL-USER
is today.
Can these be fixed?
While both of these problems could be fixed without changing the standard, I don’t think either can realistically be fixed.
For the single-float
problem there is nothing to stop implementations simply defining short-float
to mean IEEE 754 single precision and all the other types to mean IEEE 754 double precision. But all the existing code which assumes otherwise will then probably break in exciting ways. So this is unlikely to happen I expect.
The CL-USER
problem could be fixed if implementations agree to define CL-USER
to use only CL
as it is allowed to do, and perhaps to define an IMPL-USER
package as above. Of course that will make implementations slightly less convenient to use, so the chances of it happening would be small, even if implementors actually talked to each other in any useful way which I suspect they no longer do. Worse than that, this change will break many programs written by naïve users which live in CL-USER
, and there are almost certainly lots of those.
A moment of convenience, a lifetime of regret, as the old saying goes.
-
An earlier version of this article had single floats written as, for instance
1.0s0
: that’s wrong, those are short floats, single floats are1.0f0
for instance. These are almost certainly the same type on any current implementation (and I think on any implementation I have ever used, hence the mistake) but they don’t have to be. Thanks to Prem Nirved for finding this stupidity. ↩