Issue: PATHNAME-COMPONENT-CASE (version 4)
- To: CL-Cleanup@sail.stanford.edu
- Subject: Issue: PATHNAME-COMPONENT-CASE (version 4)
- From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
- Date: Tue, 23 May 89 13:19 EDT
This issue is on the agenda for the June X3J13 meeting. KMP and I
have prepared a revised writeup which we think is ready for release.
I'd like to distribute this to X3J13 as soon as discussion, if any,
in the cleanup subcommittee is completed.
Issue: PATHNAME-COMPONENT-CASE
References: Pathnames (pp410-413),
MAKE-PATHNAME (p416),
PATHNAME-HOST (p417),
PATHNAME-DEVICE (p417),
PATHNAME-DIRECTORY (p417),
PATHNAME-NAME (p417),
PATHNAME-TYPE (p417)
Related-issues: PATHNAME-WILD-TRANSLATE
Category: CHANGE
Edit history: 1-Jul-88, Version 1 by Pitman
22-Mar-89, Version 2 by Moon, update and rewrite
9-May-89, Version 3 by Moon, remove alternate proposals
9-May-89, Version 4 by Moon, respond to discussion with KMP
Problem Description:
Issues of alphabetic case in pathnames are a major source of problems.
In some file systems, the customary case is lowercase, in some uppercase,
in some mixed. In some file systems, case matters, in others it does
not.
There are two kinds of pathname case portability problems: moving
programs from one Common Lisp to another, and moving pathname component
values from one file system to another. To solve the first problem, all
Common Lisp implementations that support a particular file system must
use compatible representations for pathname component values. To solve
the second problem, there must be a common representation for the least
common denominator pathname component values that exist on all
interesting file systems.
This desire for a common representation directly conflicts with the
desire among programmers who only use one file system to work with the
local conventions and not think about issues of porting to other file
systems. The common representation cannot be the same as every local
convention, since they vary.
In the current anarchy of pathname component case conventions:
(NAMESTRING (MAKE-PATHNAME :NAME "FOO" :TYPE "LISP"))
will produce foo.lisp in some Unix Common Lisp implementations
and will produce FOO.LISP in other Unix Common Lisp implementations.
(NAMESTRING (MAKE-PATHNAME :NAME "foo" :TYPE "lisp"))
will produce FOO.LISP in some Tops-20 Common Lisp implementations
and will produce "^Vf^Vo^Vo.^Vl^Vi^Vs^Vp" in other Tops-20 Common
Lisp implementations.
Problems like this make it difficult to use MAKE-PATHNAME for much of
anything without corrective (non-portable) code.
Other problems occur in merging because doing
(NAMESTRING (MERGE-PATHNAMES (MAKE-PATHNAME :HOST "MY-TOPS-20" :NAME "FOO")
(PARSE-NAMESTRING "MY-UNIX:x.lisp")))
should probably return "MY-TOPS-20:FOO.LISP" but in fact might return
"MY-TOPS-20:FOO.^Vl^Vi^Vs^Vp" in some implementations.
Problems like this make it difficult to use any merging primitives for
much of anything without corrective (non-portable) code.
Proposal (PATHNAME-COMPONENT-CASE:KEYWORD-ARGUMENT):
Add a keyword argument :CASE to MAKE-PATHNAME, PATHNAME-HOST,
PATHNAME-DEVICE, PATHNAME-DIRECTORY, PATHNAME-NAME, and PATHNAME-TYPE.
The possible values for the argument are :COMMON and :LOCAL.
:LOCAL means strings input to MAKE-PATHNAME or output by PATHNAME-xxx
follow the local file system's conventions for alphabetic case.
:COMMON means those strings follow this common convention:
- all uppercase means to use a file system's customary case.
- all lowercase means to use the opposite of the customary case.
- mixed case represents itself.
The second and third bullets exist so that translation from local to
common and back to local is information-preserving.
The default is :COMMON.
Namestrings always use local file system case conventions.
MERGE-PATHNAMES and TRANSLATE-WILD-PATHNAME map customary case in the
input pathnames into customary case in the output pathname.
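The common convention can be sketched for one file system as follows. This is only an illustration, assuming a file system whose customary case is lowercase. COMPONENT-CASE, INVERT-COMPONENT-CASE, COMMON-TO-LOCAL, and LOCAL-TO-COMMON are hypothetical helper names and are not part of the proposal.

```lisp
;; Hypothetical helpers, not part of the proposal. They model a file system
;; whose customary case is lowercase (for example Unix).

(defun component-case (string)
  "Return :UPPER, :LOWER, or :MIXED according to the cased characters in STRING.
A string with no cased characters is reported as :MIXED, so it is left alone."
  (let ((upper (some #'upper-case-p string))
        (lower (some #'lower-case-p string)))
    (cond ((and upper (not lower)) :upper)
          ((and lower (not upper)) :lower)
          (t :mixed))))

(defun invert-component-case (string)
  "Swap all-uppercase and all-lowercase strings; return mixed case unchanged."
  (ecase (component-case string)
    (:upper (string-downcase string))
    (:lower (string-upcase string))
    (:mixed string)))

;; With lowercase as the customary case, the same inversion serves in both
;; directions, which is why local -> common -> local loses no information.
(defun common-to-local (string) (invert-component-case string))
(defun local-to-common (string) (invert-component-case string))
```

Under these assumptions, (COMMON-TO-LOCAL "FOO") gives "foo", (COMMON-TO-LOCAL "foo") gives "FOO", and "TeX" passes through unchanged in either direction.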
Implications of the proposal:
Unix is case-sensitive and prefers lowercase, so it translates between
common and local by inverting the case of non-mixed-case strings.
Tops-20 is case-sensitive and prefers uppercase, so it uses identical
representations for common and local.
VAX/VMS is upper-case-only, so it translates common to local by upcasing,
and translates local to common with no change.
Macintosh is case-insensitive and prefers lowercase, so it translates
between common and local by inverting the case of non-mixed-case strings,
and ignores case in EQUAL of two pathnames.
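The four cases above can be tabulated in code. This is a rough sketch, not an implementation of any particular Lisp. The keywords :UNIX, :TOPS-20, :VMS, and :MACINTOSH and the functions INVERT-UNLESS-MIXED, COMMON->LOCAL, and LOCAL->COMMON are names assumed for illustration only. Case-insensitive comparison for the Macintosh is not modeled.

```lisp
;; Hypothetical sketch of the per-file-system translations listed above.

(defun invert-unless-mixed (string)
  "Downcase an all-uppercase STRING, upcase an all-lowercase one,
and return a mixed-case STRING unchanged."
  (cond ((notany #'lower-case-p string) (string-downcase string))
        ((notany #'upper-case-p string) (string-upcase string))
        (t string)))

(defun common->local (string file-system)
  "Translate a common-case component to FILE-SYSTEM's local case."
  (ecase file-system
    ((:unix :macintosh) (invert-unless-mixed string)) ; customary lowercase
    (:tops-20 string)                                 ; identical representations
    (:vms (string-upcase string))))                   ; upper-case-only

(defun local->common (string file-system)
  "Translate a local-case component of FILE-SYSTEM to common case."
  (ecase file-system
    ((:unix :macintosh) (invert-unless-mixed string))
    ((:tops-20 :vms) string)))                        ; no change
```

For example, (COMMON->LOCAL "FOO" :UNIX) gives "foo", (COMMON->LOCAL "foo" :VMS) gives "FOO", and (LOCAL->COMMON "FOO" :VMS) gives "FOO".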
Test Case/Examples:
Under PATHNAME-COMPONENT-CASE:KEYWORD-ARGUMENT:
(PATHNAME-NAME (PARSE-NAMESTRING "MY-UNIX:/me/foo.lisp")
:CASE :COMMON) => "FOO"
(PATHNAME-NAME (PARSE-NAMESTRING "MY-TOPS-20:<ME>FOO.LISP")
:CASE :COMMON) => "FOO"
(PATHNAME-NAME (PARSE-NAMESTRING "MY-UNIX:/me/foo.lisp")
:CASE :LOCAL) => "foo"
(PATHNAME-NAME (PARSE-NAMESTRING "MY-TOPS-20:<ME>FOO.LISP")
:CASE :LOCAL) => "FOO"
(PATHNAME-NAME (PARSE-NAMESTRING "MY-UNIX:/me/TeX.lisp")
:CASE :COMMON) => "TeX"
(PATHNAME-NAME (PARSE-NAMESTRING "MY-UNIX:/me/TeX.lisp")
:CASE :LOCAL) => "TeX"
(NAMESTRING (MAKE-PATHNAME :HOST "MY-UNIX" :NAME "FOO"
:CASE :COMMON)) => "MY-UNIX:foo"
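A program that stays entirely in common case should see the same component values on any host. The following sketch assumes an implementation that accepts the proposed :CASE argument on MAKE-PATHNAME and the PATHNAME-xxx accessors. COMMON-COMPONENTS is a hypothetical name used only for this example.

```lisp
;; Assumes the implementation accepts the :CASE argument proposed above.
;; COMMON-COMPONENTS is a hypothetical helper, not part of the proposal.

(defun common-components (name type)
  "Build a pathname from common-case NAME and TYPE, then read both
components back in common case."
  (let ((pathname (make-pathname :name name :type type :case :common)))
    (list (pathname-name pathname :case :common)
          (pathname-type pathname :case :common))))
```

Under the proposal, (COMMON-COMPONENTS "FOO" "LISP") should return ("FOO" "LISP") whatever the local file system's customary case is. The values seen with :CASE :LOCAL depend on the host, so they are not shown here.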
Rationale:
This does not solve the whole pathname problem, but it does improve
the situation for a clearly defined set of very common problems.
Together with the other pathname proposals, the behavior of pathnames
should be sufficiently consistent across Common Lisp implementations
and across file systems to allow portability of pathname-manipulating
programs.
Upper case is chosen as the common case for no better reason than
consistency with Lisp symbols.
The :CASE keyword argument provides access to both common and local
conventions without introducing any new functions. The default
convention is the common one, assuming that most programs are fully
portable and therefore :COMMON will be more frequently used.
Current Practice:
There are no known implementations of exactly what is proposed.
Symbolics Genera uses common case normally, and provides a way to
access the local case (called "raw") that in practice is rarely used.
Symbolics Genera's own file system uses lower case as the customary
case, but transparent network access is available to file systems
using all known case conventions.
Several Common Lisp implementations behave as if :CASE :LOCAL were
specified (but accept no :CASE argument).
Cost to Implementors:
The :CASE feature is easily added, but some implementations may have
to change the default behavior when :CASE is not specified. No
implementation need change its internal representation, nor the way
pathnames print, just the interface functions listed above.
Cost to Users:
Technically, this change is upward compatible.
In fact, since the existing CLtL spec is so poor, nearly everyone relies
heavily on implementation-specific behavior since there is little other
choice. As such, any change is almost certain to break lots of programs,
in usually superficial but nevertheless important ways. However, if we
really make the pathname facility more portable, the user community may
be willing to bear the consequences of these changes.
Cost of Non-Adoption:
We would be contributing to the perpetuation of the existing fiasco of a
pathname system.
Performance Impact:
None.
Benefits:
One step closer to a usable pathname system.
Aesthetics:
Anything that simplifies the user model of pathnames is an improvement.
Discussion:
Some people would rather use lowercase as the common case. The
decision is essentially arbitrary. Everywhere else in Common Lisp
where case matters, uppercase was chosen.
It has been proposed that the Common Lisp specification should include
specifications of the exact behavior of pathnames for several popular
operating systems, so that multiple implementations for those
operating systems would be compatible with each other. This proposal
does that for alphabetic case.
Some people want the default for :CASE to be :LOCAL instead of :COMMON.
See Rationale.
There should probably be a remark somewhere that says that portable
programs shouldn't expect to be able to create and/or access distinct
files whose pathname components differ only in case.