Issue: READ-CASE-SENSITIVITY (Version 1)
- To: CL-Cleanup@sail.stanford.edu
- Subject: Issue: READ-CASE-SENSITIVITY (Version 1)
- From: Jeff Dalton <jeff%aiai.edinburgh.ac.uk@NSS.Cs.Ucl.AC.UK>
- Date: Wed, 15 Feb 89 20:34:12 GMT
- Cc: richard%aiai.edinburgh.ac.uk@NSS.Cs.Ucl.AC.UK
It may be rather late for new issues, but I think this one should
still be considered. The proposal below is fairly minimal. More
far-reaching proposals, or ones that are better in other ways, are
also possible; but this seems a good place to start.
A longstanding problem in Common Lisp is that the reader always
converts unescaped characters in symbol names to upper case.
This makes a number of applications more difficult than they
should be and looks like an oversight (especially given the
existence of *PRINT-CASE*).
This problem is an easy target for critics of Common Lisp and
tends to inflame opinion. Here are three comments I have seen
recently in the UK:
[...] particular small lunacies (e.g. the printer can be parameterised
to case fold everything to lower case, but the reader ALWAYS folds
to upper case, which is a real nasty for people who would like
to write their own code in case sensitive styles with Car, CAR and car
being different. E.g. it means that using the Lisp reader to accept
(as a cheap way of so doing) natural language stuff from the user
necessarily loses capitalisation, which is STUPID. I also happen to
consider it ill-judged and a throw-back to the TTY33s and punched cards
to work in UPPER CASE FOR ALL LISP INTERNAL DATASTRUCTURES, SINCE I
RATHER EXPECT MOST CURRENT PROGRAMMING STYLE IS BASED AROUND USE OF
LOWER OR MIXED CASE.
I also have a feeling that it is perpetuating the punched card era to
have a so called modern language work entirely in upper case inside
itself, and amazing to go to the trouble for case folding on both
input and output to paper over this, and even more bizarre to make the
output folding optional but not the input...)
The single thing which is left which most upsets me is the bloody
upper-case only reader - I *can't believe* that anyone thinks that's
a good idea.
Of course, it is not just to answer such criticism that I propose
this change.
-----
Issue: READ-CASE-SENSITIVITY
Forum: Cleanup
References: CLtL p 334 ff: What the Read Function Accepts;
especially p 337.
*PRINT-CASE* (CLtL, p 372)
Category: ADDITION/CHANGE
Edit history: 15-Feb-89, Version 1 by Dalton
Status: For Internal Discussion
Problem Description:
The Common Lisp reader always converts unescaped constituent
characters to upper case. (See CLtL, p 337, step 8, point 1.)
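The current behaviour is easy to confirm with standard functions. The
following is a minimal sketch; NAME-AS-READ is a hypothetical helper
introduced only for illustration, and it assumes the standard readtable.

```lisp
;; NAME-AS-READ is a hypothetical helper, not part of Common Lisp.
(defun name-as-read (text)
  "Return the print name of the symbol the standard reader builds from TEXT."
  (let ((*readtable* (copy-readtable nil)))   ; fresh copy of the standard readtable
    (symbol-name (read-from-string text))))

;; Unescaped letters are folded to upper case.
(name-as-read "Zebra")     ; => "ZEBRA"
;; Vertical bars escape the whole name, so case survives.
(name-as-read "|Zebra|")   ; => "Zebra"
;; A backslash escapes only the character that follows it.
(name-as-read "Z\\ebra")   ; => "ZeBRA"
```

Only the escaped forms keep lower-case letters, which is the difficulty
the proposal addresses.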
Proposal (READ-CASE-SENSITIVITY:LIKE-PRINT-CASE):
Add a new read parameter, *READ-CASE*, to control case
sensitivity. By analogy with *PRINT-CASE*, it may take the
following values:
:UPCASE -- convert unescaped characters to upper-case, as now.
:DOWNCASE -- don't convert, leaving lower-case letters in lower
case.
This proposal does not provide a way to tell the reader to convert
upper-case letters to lower-case. This is consistent with
*PRINT-CASE*, which does not provide a way to print lower-case
letters in upper-case.
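The asymmetry of *PRINT-CASE* mentioned above can be seen in a short
sketch. PRINC-WITH-CASE is a hypothetical helper; the example assumes
the standard readtable and uses PRINC so that no escape characters are
printed.

```lisp
;; PRINC-WITH-CASE is a hypothetical helper, not part of Common Lisp.
(defun princ-with-case (symbol print-case)
  "Return SYMBOL printed without escapes under the given *PRINT-CASE*."
  (let ((*readtable* (copy-readtable nil))
        (*print-case* print-case))
    (princ-to-string symbol)))

;; Upper-case letters in a name follow *PRINT-CASE*.
(princ-with-case 'zebra :upcase)      ; => "ZEBRA"
(princ-with-case 'zebra :downcase)    ; => "zebra"
;; Lower-case letters in a name are printed as they are.
(princ-with-case '|zebra| :upcase)    ; => "zebra"
```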
Note that an isomorphic proposal that would not emphasise an
analogy with *PRINT-CASE* would be to add a parameter called
*READ-CASE-SENSITIVE*, taking the values T (to preserve case)
or NIL (to convert to upper case). Yet another proposal might
be to add a function READ-PRESERVING-CASE (hee hee).
Rationale:
There are several reasons for this proposal.
1. Lisp applications often use the Lisp reader to read their data.
This is often significantly easier than writing input routines
from scratch, especially if the input can be structured as lists.
However, certain applications want to make use of case distinctions,
and Common Lisp makes this unreasonably difficult. (You must define
every letter as a read macro and have the macro function read the
rest of the symbol.)
2. Some programming languages distinguish between upper and lower
case in identifiers, and useful conventions are often built around
such distinctions. For example, in C, constants are often written
in upper case and variables in lower. In Mesa(?) and Smalltalk(?),
a capital letter is used to indicate the beginning of a new word
in identifiers made up of several words. In Edinburgh Prolog,
variables begin with upper-case letters and constant symbols do
not. The case-insensitivity of the Common Lisp reader makes
it difficult to use conventions of this sort.
3. Among Lisp dialects, Common Lisp gives an unusual degree of
control over the case of output. However, there is no control over
the treatment of case on input. This makes the language unbalanced.
We live in a mixed-case world, and it should be possible to make use
of case distinctions in Common Lisp.
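To show what the read-macro workaround in point 1 involves, here is a
rough sketch. All of the names below are hypothetical, and the token
syntax is deliberately simplified: it assumes symbols that start with
an ASCII letter and contain only letters, digits and hyphens, and it
ignores package prefixes and escape characters.

```lisp
;; All names defined here are hypothetical and for illustration only.
(defparameter *ascii-letters*
  "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")

(defun read-symbol-keeping-case (stream first-char)
  "Read macro function: collect the rest of the token and intern it as written."
  (let ((chars (list first-char)))
    (loop for next = (peek-char nil stream nil nil t)
          while (and next (or (alphanumericp next) (char= next #\-)))
          do (push (read-char stream t nil t) chars))
    (intern (coerce (nreverse chars) 'string))))

(defun make-letter-macro-readtable ()
  "Copy the standard readtable and make every ASCII letter a read macro."
  (let ((table (copy-readtable nil)))
    (loop for letter across *ascii-letters*
          do (set-macro-character letter #'read-symbol-keeping-case t table))
    table))

(defun read-with-letter-macros (text)
  "Read one object from TEXT with the letter-macro readtable in effect."
  (let ((*readtable* (make-letter-macro-readtable)))
    (read-from-string text)))

(mapcar #'symbol-name (read-with-letter-macros "(Car CAR car)"))
;; => ("Car" "CAR" "car")
```

Even this much only handles the simplest tokens. A symbol such as
*foo* does not begin with a letter, so it still goes through the
ordinary token reader and is converted to upper case.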
Test Case:
(let ((*read-case* :downcase))
  (read-from-string "Zebra"))
= ZEBRA ;under CLtL
= |Zebra| ;under this proposal
Current Practice:
I do not know of any implementation that implements this proposal.
Franz Inc's ExCL has (or at least had) a function, excl:set-case-mode,
that set both the "preferred case" (the case of characters in the print
names of standard symbols such as CAR) and whether or not the reader
was case-sensitive.
Cost to Implementors:
Fairly small.
Cost to Users:
None; this is a compatible change from the user's standpoint.
Cost of Non-Adoption:
Applications that want to read mixed-case expressions will not
be able to use the Common Lisp reader to do so (except, perhaps,
by tortuous use of read macros).
Programming styles that rely on case distinctions (without escape
characters) will be impossible.
Benefits:
Applications will be able to read mixed-case expressions.
Programmers will be able to make use of case distinctions.
Aesthetics:
For the proposal:
The language will have greater symmetry.
The language will look less old-fashioned.
Against the proposal:
The addition of another global parameter to control yet another
aspect of I/O is inelegant and increases clutter.
In favor of additional or more far-reaching changes:
Anyone wishing to make use of case distinctions in Lisp programs
will have to write the names of symbols in the "LISP" package in
upper case.
Discussion:
Larry Masinter suggested at one point that case sensitivity could
be an aspect of read tables rather than a separate parameter.
There may be several ways of doing this. For example, it might
be a property of each character or of the table as a whole.
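For comparison, treating case as a property of the table as a whole is
the form that later appeared in ANSI Common Lisp as READTABLE-CASE. It
is not part of this proposal or of CLtL. The sketch below assumes an
implementation that provides READTABLE-CASE; NAME-READ-WITH-TABLE-CASE
is a hypothetical helper.

```lisp
;; Assumes ANSI READTABLE-CASE; NAME-READ-WITH-TABLE-CASE is hypothetical.
(defun name-read-with-table-case (text mode)
  "Read TEXT with a copy of the standard readtable set to case MODE."
  (let ((*readtable* (copy-readtable nil)))
    (setf (readtable-case *readtable*) mode)
    (symbol-name (read-from-string text))))

(name-read-with-table-case "Zebra" :upcase)     ; => "ZEBRA"
(name-read-with-table-case "Zebra" :preserve)   ; => "Zebra"
```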
An interesting possibility would be to disguise the preferred
internal case by defining a value for *READ-CASE* called :INVERT.
If the value were :INVERT, mixed-case symbols would remain the same
(or perhaps they would be inverted too) but all upper case input
would specify a lower-case name internally, and vice versa.
You may recall that something similar was suggested for pathnames.
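The :INVERT idea can be tried in implementations that provide the later
ANSI function READTABLE-CASE, which accepts :INVERT as a readtable
property rather than as a value of *READ-CASE*. This is a sketch under
that assumption, with the hypothetical helper NAME-READ-INVERTED; there
mixed-case names are left unchanged.

```lisp
;; Assumes ANSI READTABLE-CASE; NAME-READ-INVERTED is hypothetical.
(defun name-read-inverted (text)
  "Read TEXT with a copy of the standard readtable in :INVERT mode."
  (let ((*readtable* (copy-readtable nil)))
    (setf (readtable-case *readtable*) :invert)
    (symbol-name (read-from-string text))))

(name-read-inverted "ZEBRA")   ; => "zebra"
(name-read-inverted "zebra")   ; => "ZEBRA"
(name-read-inverted "Zebra")   ; => "Zebra"
```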
Dalton supports the proposal LIKE-PRINT-CASE.