
Re: Issue: PATHNAME-COMPONENT-CASE (Version 1)



    Date: Thu, 21 Jul 88 16:00:02 EDT
    From: Scott.Fahlman@B.GP.CS.CMU.EDU

    ... I still think that the principle of least astonishment suggests
    that these field names ought to be used verbatim unless the user
    specifically asks for canonicalization of case.  We can make it easy
    to ask for the canonicalizing behavior (my proposal 2 does that), but
    it shouldn't be the default.

There's another theory that says that if he's going to be astonished,
let it happen early. The stated purpose of CL is to provide for the
development of portable programs. If you do a non-portable thing that
kinda feels right, and implementations are encouraged to accept it without
warning, and you have to know to say something magic to get something
portable, then you don't have a portable language and you don't encourage
portable programs.

Consider that we could have had a :PORTABLE keyword argument on all 
functions. Any time you wanted to opt for portable behavior, you could
ask for it. Then everyone could write

 (DEFUN FOO (X Y Z)
   (* (+ X Y :PORTABLE T) Z :PORTABLE T))

and when people didn't ask for portability, they wouldn't have to put up
with it. You'd get a lot jazzier functionality out of things, and people
would stop complaining about implementations that provided `gratuitous
extra functionality' because that would be the encouraged thing. For example,

 (DEFUN FOO (X Y Z)
   (* (+ X Y) Z))

-- without the :PORTABLE T arguments -- could be a lot faster because then
you could always use native arithmetic. Don't think there aren't CL
implementations which do almost just this (eg, they don't implement fixnums).

Plenty of people find it unintuitive that in present-day CL
 (* (THE FIXNUM X) (THE FIXNUM Y))
doesn't use a fixnum-multiply instruction (because you didn't type-declare
the return value), but we just went and defined it to work the way we knew
was important, even if it didn't come up too often.
Sorry about the short-sighted people who get confused, but having the language be
well-defined in a portable way is just more important. Some people are mad,
too, because they think they know the processor type they're on and know
what to expect from the fixnum-multiply instruction, but we've already made
the decision that supporting that activity just wasn't CL's priority.
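
To make the THE example concrete, here is a minimal sketch, assuming any
conforming implementation. The two function names are made up for
illustration and appear nowhere in the discussion.

```lisp
;; Hypothetical names, for illustration only.

;; Only the arguments are declared.  The product must still be
;; mathematically correct, so it may overflow into a bignum.
(defun product-of-fixnums (x y)
  (* (the fixnum x) (the fixnum y)))

;; The result is declared too.  A compiler may now use a fixnum
;; multiply, but the consequences are undefined if the product
;; does not fit in a fixnum.
(defun fixnum-product (x y)
  (the fixnum (* (the fixnum x) (the fixnum y))))

;; The first definition stays well defined at the edge of the range:
(typep (product-of-fixnums most-positive-fixnum 2) 'fixnum)   ; => NIL
(= (product-of-fixnums most-positive-fixnum 2)
   (* 2 most-positive-fixnum))                                ; => T
```

The second definition is what you write when you really do want the
machine instruction; the point is that you have to ask for it.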

We invented CL just -exactly- to get away from nonsense where an
implementor's interpretations were preferred over the needs of a community.
The default just has to be the thing which promotes portable applications.
If it's not, you can't test a program in one environment and have any
hope that it will therefore run in another.

It's true that in general there is no reference implementation of CL,
nor is one possible, and running your program in one implementation 
cannot be a guarantee that it will run in another, but that's nothing to
cheer about. That's just a sad thing we should be trying to minimize
rather than institutionalize.

And you just never know when your company or university might fold,
you might get tired of what you're doing and decide to move, or your
company might find itself on different hardware/os/file-system than
it ever thought possible ... and you might be happy you were made to
design in a feature that you never originally thought had a personal
meaning to you. [Certainly this happened to me when I moved from 
ITS/Tops-20 file servers at MIT to Lispm/VMS/Unix ones at Symbolics.]

    ... Whether we go with KMP's proposal or something like my proposal 2,
    I think that we should use all-lower-case to indicate canonical case,
    and all-upper to indicate anti-canonical case. ...

There are already several places where this arbitrary decision has been
made in Lisp. The decision has been made consistently, and I think that's
useful. I would hate to go against the grain:

 * Uppercase is the canonical case for non-backslashed symbols seen by
   the reader. eg, (symbol-name 'xyz) => "XYZ". (p168, p367, ...)
 
 * Uppercase is the canonical case for dispatch readmacros. eg, the chars
   received as arguments by # readmacro functions will have been upcased.
   The "a" in #a will be seen by the function supporting #a as #\A, not
   #\a. eg, see SET-DISPATCH-MACRO-CHARACTER (p364).
 
 * Uppercase is the default case for the Lisp printer. The default value 
   of *PRINT-CASE* is :UPCASE (p372).
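
The reader and printer conventions in this list can be checked at a
listener. A small sketch, assuming the standard reader settings and that
the symbol is read and printed in the same package:

```lisp
;; Unescaped symbol characters are upcased by the reader.
(symbol-name 'xyz)      ; => "XYZ"
(symbol-name 'XyZ)      ; => "XYZ"

;; Escaped characters keep the case they were typed in.
(symbol-name '|xyz|)    ; => "xyz"   vertical bars protect the whole name
(symbol-name 'x\yz)     ; => "XyZ"   backslash protects one character

;; The printer's case is controlled by *PRINT-CASE*.
(let ((*print-case* :upcase))
  (prin1-to-string 'xyz))   ; => "XYZ"
(let ((*print-case* :downcase))
  (prin1-to-string 'xyz))   ; => "xyz"
```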

You may be surprised to learn that I almost exclusively use file systems
where lowercase or mixed case is culturally preferred. My intent here is
not to force people to use uppercase filenames, or to assert that using
uppercase as a canonical internal case is no problem when I have no
experience with it. I use it daily and can't recall any problem with using
uppercase because
 - the only situations where it comes up are when I'm dealing with things
   I conceptualize as abstract primitives. I write (load "s:>kmp>foo.lisp")
   or (load (make-pathname :name "FOO" :type :lisp)), depending on what is 
   appropriate for the application.
 - it really does feel consistent with the canonical case for other things
   elsewhere in the language.
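
For readers trying this on a current implementation: ANSI Common Lisp
later standardized a :CASE argument for MAKE-PATHNAME and the pathname
component accessors, and its :COMMON setting follows the
uppercase-is-canonical convention argued for here. The following is only
a sketch under that assumption; it is not the mechanism under discussion
in this message, the variable name is made up, and the type is given as
a string rather than the :LISP keyword used above.

```lisp
;; Assumes an ANSI Common Lisp implementation with the :CASE argument.
;; With :CASE :COMMON, an all-uppercase string means "whatever case is
;; customary on the host file system".
(defvar *abstract-foo*   ; hypothetical name
  (make-pathname :name "FOO" :type "LISP" :case :common))

;; Reading the components back in common case returns what was supplied.
(pathname-name *abstract-foo* :case :common)   ; => "FOO"
(pathname-type *abstract-foo* :case :common)   ; => "LISP"

;; In local case the result depends on the host file system, so no
;; value is shown here.
(pathname-name *abstract-foo* :case :local)
```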

You can make whatever arguments you want about the prevalence of Unix, but
I don't think our language design has any business catering to a particular
style. The argument we make for the choice of canonical case should be
defensible in the abstract.

To me, the really compelling argument in this regard is the following:

 * Uppercase is the canonical case for spoken language.

   this is not a syntactically well-formed english sentence.
   THIS IS A SYNTACTICALLY WELL-FORMED ENGLISH SENTENCE.

Compatible with Unix or not, this is at least a position which can be defended
in the abstract.