
Re: Issue: READ-CASE-SENSITIVITY (Version 3)



> 			     Unfortunately, I don't know which
>     proposal (keyword or function) most people prefer.
> 
> I don't either, but I know that at Symbolics we strongly prefer
> the keyword proposal.  Since the discussion section says you prefer
> the keyword proposal, perhaps as the author you could use your
> discretion to opt for that proposal and send out a version 4
> omitting the function proposal and the discussion of the relative
> merits of the two proposals.

OK, I'll do that unless people start speaking up for the function
proposal.

>     It has also been remarked that :INVERT is somewhat strange in that it
>     would have Zebra read as zEBRA; and it was suggested that inversion
>     should happen only if the entire name were single-case.
> 
>     Unfortunately, the processing has to happen a character at a time,
>     because READ has to do it only for characters that are not escaped.
> 
> I don't believe that the conclusion follows from the premise.

You're probably right, since it looks like you have a better
understanding of this than I do.

What I wanted to do was just to change point 1 of step 8 on page 337
of CLtL, which is where unescaped constituents are converted to upper
case.  At that point, characters are being considered one at a time.
If :INVERT's going to work on entire extended tokens, it looks like it
requires a more complicated change to the specification of READ.

>     For example, |zebra| should always read as zebra.
>     However, READ has to take escape characters into account (so that,
>     for example, |zebra| always reads as zebra), and then it is
>     difficult to know what rules to apply to the entire token.
>     Moreover, the description of READ in CLtL does not provide a
>     convenient place to insert processing of that sort (by the time
>     the full token is considered, the escape characters have been
>     forgotten).
> 
> The parenthesized clause is clearly false, since escapedness is
> remembered for purposes of package syntax and number/potential number
> syntax.

It seemed to me that escapedness was remembered only to the
extent that escaped characters were considered to have only the
attribute alphabetic.  But maybe that's enough, since letters
normally have the attribute alphadigit.  Strictly speaking,
though, that attribute depends on the setting of *READ-BASE*.

> I believe it would be possible to make :INVERT apply only if
> the entire name is single-case.  It remains true that we have to decide
> whether "the entire name" includes escaped characters or not (in either
> case, only the unescaped characters would be inverted).  The Genera
> reader does case processing character at a time, as you outlined, but
> could very easily implement :INVERT based on either the escaped
> characters or all the characters, so that's an existence proof (hint:
> one state in its finite state machine would be split into four states,
> depending on whether any letters have been seen so far and on whether
> they are all upper case, all lower case, or mixed case).  Unless someone
> cleverer than me comes up with an algorithm, it would be necessary to
> retain the symbol-name in both inverted and uninverted form until
> the entire token has been parsed, but the cost of that is negligible.

> I think it would be best to base the decision on only the unescaped
> characters.  That just seems more consistent with the rest of the
> reader.
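
To make the one-pass idea in the hint above concrete, here is a minimal
sketch in Python (not Lisp, and not the Genera reader).  It assumes the
token arrives as (character, escaped) pairs; that representation and the
name read_inverting are hypothetical.  Two flags stand in for the four
states (no letters yet, all upper, all lower, mixed), and both forms of
the name are kept until the token ends.

```python
def read_inverting(chars):
    """One pass over (character, escaped) pairs, deciding at the end.

    Hypothetical sketch: the pair representation is an assumption.
    """
    plain = []      # the name exactly as written
    flipped = []    # the name with unescaped letters case-inverted
    seen_upper = False
    seen_lower = False
    for c, escaped in chars:
        plain.append(c)
        if escaped or not c.isalpha():
            # Escaped characters and non-letters never change.
            flipped.append(c)
            continue
        flipped.append(c.swapcase())
        if c.isupper():
            seen_upper = True
        elif c.islower():
            seen_lower = True
    # Exactly one flag set means all unescaped letters share one case.
    single_case = seen_upper != seen_lower
    return "".join(flipped if single_case else plain)


print(read_inverting([(c, False) for c in "zebra"]))
print(read_inverting([(c, False) for c in "Zebra"]))
```

The decision here is based only on the unescaped letters, which is the
variant preferred in the paragraph above.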

The CLtL model seems to be that the characters are accumulated and then
processed as an "extended token".  So the rule would be something like
this:

   If the read case of the current readtable is :INVERT and
   if all of the unescaped letters in the extended token are
   of the same case, those letters are converted to the opposite
   case.
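
As a rough illustration of the rule above, here is a minimal sketch in
Python (not Lisp, and not taken from any implementation).  It assumes the
extended token has already been accumulated as (character, escaped)
pairs; the names invert_token and token are hypothetical.

```python
def invert_token(chars):
    """Apply the proposed :INVERT rule to a whole extended token.

    chars is a list of (character, escaped) pairs; this representation
    is an assumption made for illustration only.
    """
    # Only unescaped letters take part in the single-case decision.
    letters = [c for c, escaped in chars if not escaped and c.isalpha()]
    all_upper = bool(letters) and all(c.isupper() for c in letters)
    all_lower = bool(letters) and all(c.islower() for c in letters)
    if not (all_upper or all_lower):
        # Mixed case, or no unescaped letters: leave the token alone.
        return "".join(c for c, _ in chars)
    # Single case: flip unescaped letters, keep escaped ones as written.
    return "".join(c if escaped or not c.isalpha() else c.swapcase()
                   for c, escaped in chars)


def token(text, escaped_positions=()):
    """Hypothetical helper: mark the given positions as escaped."""
    return [(c, i in escaped_positions) for i, c in enumerate(text)]


print(invert_token(token("zebra")))      # ZEBRA
print(invert_token(token("Zebra")))      # Zebra
print(invert_token(token("abc", {0})))   # aBC
```

The last call corresponds to a token written as \abc: the escaped "a" is
left as written, and the remaining unescaped letters are all lower case,
so they are inverted.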

I'm a bit worried about the use of "unescaped" at this point,
since the other parts of the processing of an extended token depend
only on attributes such as alphabetic or digit.

I'm also a bit worried about things like this:

  \abc reads as aBC, but prints as Abc

> Can you / will you write a version 4 that contains only the keywords
> proposal and that specifies :INVERT to invert case only when all the
> unescaped letters are in a single case?

Yes to at least the first.  I'd like to have some more feedback on the
second.

-- Jeff