Wild pathnames in Common Lisp

Common Lisp’s pathname system has many problems. Here is a proposal to make the situation a little better in one respect. This is not a general fix: it just tries to solve one problem.

The problem

The underlying problem is that on many platforms, pathnames which ‘look like’ they contain wildcards are perfectly legal pathnames to the filesystem. So, on Unix and related systems, [foo].* is a legal filename. On these platforms wildcard handling is generally implemented in a library, or often in multiple semi-compatible libraries [1].

CL then has two problems:

  1. there is no portable way to construct pathnames which look wild but are not;
  2. there is no portable way to parse a string which looks like a wild pathname but in fact should not be interpreted as one, for instance a string coming from some other application or library, or a filename stored in some file, such as an archive.

(1) happens because 19.2.2.3 says, in part:

“When examining wildcard components of a wildcard pathname, conforming programs must be prepared to encounter any of the following additional values in any component or any element of a list that is the directory component: […] A string containing implementation-dependent special wildcard characters. […]”

So implementations are allowed to represent wildcard components of pathnames as strings. In turn, that means you can’t portably construct a non-wildcard pathname: a string you supply as a component may be treated as wild.

(2) happens because there’s no way to tell parse-namestring or pathname that the string you’ve handed to them is not wild, even though it looks like it is. That in turn means that to deal with this case you need to either write or find a pathname-parsing library which doesn’t have this problem.

These problems arise in practice. For instance, some programs create filenames which look like [foo].xml, and SBCL at least parses strings like this as wild, as it is allowed to do. This then breaks programs which want to, for instance, process zip files, tar files or other archive formats.
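To see how a given implementation treats such a name, something like the following sketch can be evaluated. DESCRIBE-PARSE is a hypothetical helper, not a standard function, and what it returns for a wild-looking string is implementation-dependent.

```lisp
;; DESCRIBE-PARSE is a hypothetical helper used only for illustration.
(defun describe-parse (namestring)
  "Parse NAMESTRING and return three values: whether the resulting
pathname is wild, its name component and its type component."
  (let ((pathname (parse-namestring namestring)))
    (values (wild-pathname-p pathname)
            (pathname-name pathname)
            (pathname-type pathname))))

;; Implementation-dependent: the first value is true where the square
;; brackets are read as wildcard syntax, and the name component may
;; then be something other than a string.
(describe-parse "[foo].xml")
```

A name with no special characters, such as "plain.txt", should come back as non-wild everywhere.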

A proposed solution

For (1), change 19.2.2.3 to say that wildcard components are never strings. Change the description of make-pathname to say that if the component arguments given to it are strings (or suitably constrained lists for the directory component) then the resulting pathname is not wild, unless the defaults supply a component which is wild.
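As a rough illustration of what this change would guarantee, consider the two pathnames below. The variable names are made up for the example. The first result is implementation-dependent today and would be guaranteed non-wild only under the proposal; the second is wild either way, because the wild type comes from the defaults.

```lisp
;; String components only: non-wild under the proposal, but
;; implementation-dependent as the standard stands.
(defparameter *literal*
  (make-pathname :name "[foo]" :type "xml"))

;; No :TYPE is supplied, so the :WILD type is taken from the defaults
;; and the result is wild whatever the name string looks like.
(defparameter *from-wild-defaults*
  (make-pathname :name "[foo]"
                 :defaults (make-pathname :type :wild)))

(wild-pathname-p *literal*)
(wild-pathname-p *from-wild-defaults*)
```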

For (2) add an extra argument to both parse-namestring and pathname named wild with a default of true. If given as nil this will force string parsing to construct a non-wild pathname. If that is not possible, such as when pathname is handed a pathname which is already wild, then an error will be signalled.
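The interface might look roughly like the sketch below. PARSE-NAMESTRING* is a hypothetical name, and the wrapper is only an approximation: portable code cannot force a non-wild parse, so when wild is nil all it can do is detect a wild result and signal an error. A real implementation of the proposal would instead parse the string literally.

```lisp
;; PARSE-NAMESTRING* is a hypothetical wrapper, not part of the standard.
(defun parse-namestring* (thing &key (wild t))
  "Parse THING as PARSE-NAMESTRING does. If WILD is false, signal an
error when the result is a wild pathname."
  (let ((pathname (parse-namestring thing)))
    (when (and (not wild) (wild-pathname-p pathname))
      ;; The proposal would build a literal pathname here where it can;
      ;; this sketch can only refuse.
      (error "~S is wild, but a non-wild pathname was requested." pathname))
    pathname))

;; Signals an error: the argument is already a wild pathname.
;; (parse-namestring* (make-pathname :name :wild) :wild nil)
```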

Notes

This is the smallest change I can think of which will solve the problem. Some implementations, SBCL for instance, already solve (1) in the suggested way. None, I think, solve (2).

For added value, it might be useful to specify that wildcard components can be given either as symbols or as lists whose first element is a symbol, and to encourage implementations to return them as such if possible. So, for instance, (:sequence "foo-" (:alternation "bar" "zap")) might represent a wild name which matches "foo-bar" and "foo-zap". I am not suggesting this particular notation, however.
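To make the example concrete, here is a minimal sketch which lists the strings such a form would match. EXPAND-PATTERN is a hypothetical function which understands only the two illustrative operators used above, so it handles only patterns with finitely many matches.

```lisp
;; EXPAND-PATTERN is hypothetical and covers only :SEQUENCE and
;; :ALTERNATION, the two operators in the illustrative notation.
(defun expand-pattern (pattern)
  "Return a list of every string matched by PATTERN."
  (etypecase pattern
    (string (list pattern))
    (cons
     (ecase (first pattern)
       (:alternation
        ;; Any one of the alternatives may match.
        (loop for alternative in (rest pattern)
              append (expand-pattern alternative)))
       (:sequence
        ;; Concatenate one match from each element, in order.
        (reduce (lambda (prefixes element)
                  (loop for prefix in prefixes
                        append (loop for suffix in (expand-pattern element)
                                     collect (concatenate 'string
                                                          prefix suffix))))
                (rest pattern)
                :initial-value (list "")))))))

(expand-pattern '(:sequence "foo-" (:alternation "bar" "zap")))
;; => ("foo-bar" "foo-zap")
```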


  1. Let me introduce you to the joys of Unix.