
[masinter.pa: Issue: COERCE-INCOMPLETE]



This issue is for discussion in the subcommittee.

     ----- Begin Forwarded Messages -----

 From: masinter.pa
Date: 28 Feb 88 8:07:18 PST
Subject: Issue: COERCE-INCOMPLETE
To: cl-cleanup@sail.stanford.edu
Message-ID: <880228-080805-2138@Xerox>

I'm not really back until next week, but this came in for consideration...


----------

Return-Path: <@RELAY.CS.NET:a37078%tansei.cc.u-tokyo.junet@UTOKYO-RELAY.CSNET>
Received: from RELAY.CS.NET by Xerox.COM ; 25 FEB 88 23:39:52 PST
Received: from relay2.cs.net by RELAY.CS.NET id aa07232; 26 Feb 88 2:16 EST
Received: from utokyo-relay by RELAY.CS.NET id aa26262; 26 Feb 88 2:06 EST
Received: by ccut.cc.u-tokyo.junet (5.51/6.3Junet-1.0/CSNET-JUNET)
	id AA19019; Fri, 26 Feb 88 15:54:40 JST
Received: by tansei.cc.u-tokyo.junet (4.12/6.3Junet-1.0)
	id AA12493; Fri, 26 Feb 88 15:55:57+0900
Date: Fri, 26 Feb 88 15:55:57+0900
 From: Masayuki Ida <a37078%tansei.cc.u-tokyo.junet@UTOKYO-RELAY.CSNET>
Return-Path: <a37078@tansei.cc.u-tokyo.junet>
Message-Id: <8802260655.AA12493@tansei.cc.u-tokyo.junet>
To: ida%tansei.cc.u-tokyo.junet@UTOKYO-RELAY.CSNET, masinter.pa
Subject: Coercion

Dear Larry Masinter,

I have written up my opinion on coercion, attached below, following your suggestion on the format.
Please use it if you think it is worth having.

Masayuki Ida

--------------------------------------------------------------------
Issue:	Coerce
Reference:	coerce (CLtL p50)
Category:	change
Edit history:	version 1 by M.Ida,  26-Feb-1988

Problem Description:
--------------------
Problem 1:
COERCE is neither symmetric nor generic across data types.
CLtL (pages 50 and 51) defines COERCE as follows:
1) A sequence type may be converted to any other sequence type. 
2) Some strings, symbols, and integers may be converted to characters. 
 2-1) If object is a string of length 1,
      then the sole element of the string is returned.
 2-2) If object is a symbol whose print name is of length 1,
      then the sole element of the print name is returned. 
 2-3) If object is an integer n,
      then (int-char n) is returned.
3) Any non-complex number can be converted to an XXX-float.
4) Any number can be converted to a complex number.

The following table shows how coerce fails to be symmetric among character,
string, symbol and integer.

    TABLE 1. Possible conversions among character, string, symbol, integer
    (O: coerce performs the conversion under CLtL, X: it does not)
type of conversion      provided functions              coercion under the CLtL
 character -> string    string                                      X
 character <- string    coerce (if the string has only one char.)   O
 character -> symbol    (intern (string @i[ch]))                    X
 character <- symbol    coerce (if pname length is 1)               O
 character -> integer   char-code, char-int                         X
 character <- integer   code-char (zero font-, zero bits- attrib.)  O 
                        int-char (any font- and any bits-)
 string -> symbol       intern, make-symbol                         X
 string <- symbol       string, symbol-name                         X
 string -> integer      (char-code (coerce @i[string] 'character))  X
 string <- integer      (string (code-char @i[integer]))            X
 symbol -> integer      (char-code (coerce @i[symbol] 'character))  X
 symbol <- integer      (intern (string (code-char @i[integer])))   X

Problem 2:
For characters, coerce is defined to act as char-int/int-char,
 not as char-code/code-char.

Proposal: Coerce :replace
-------------------------

COERCE should be generalized further for the string, symbol, integer, and character
data types. The observations show that there is
no problem if the extensions are fully written out in detail.
Here is an extension to the current coerce definition, written using CLOS.

(defmethod coerce ((x character) (y (eql string))) (string x))
(defmethod coerce ((x character) (y (eql symbol))) (intern (string x)))
(defmethod coerce ((x character) (y (eql integer))) (char-code x))
(defmethod coerce ((x string) (y (eql symbol))) (intern x))
(defmethod coerce ((x symbol) (y (eql string))) (string x))
(defmethod coerce ((x string) (y (eql integer))) 
          (char-code (coerce x 'character)))
(defmethod coerce ((x integer) (y (eql string))) (string (code-char x)))
(defmethod coerce ((x symbol) (y (eql integer))) 
          (char-code (coerce x 'character)))
(defmethod coerce ((x integer) (y (eql symbol))) 
          (intern (string (code-char x))))
(defmethod coerce ((x integer) (y (eql character)))
   (code-char x)) ; redefinition. CLtL defines as int-char

The key points are:
a) Ignore char-bits and char-font when converting characters,
assuming the font attribute will be removed from the language spec.
b) Ignore the package name when converting symbols.
(The package name plays no role in the conversion.)
c) The created symbol is interned in the current package.
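
The methods above can be sketched in a form that loads in an ANSI Common Lisp
without redefining the standard COERCE. This is only an illustration under
stated assumptions, not part of the proposal: the name EXTENDED-COERCE is
hypothetical, the EQL specializers are written in the quoted form
(eql 'string) that ANSI Common Lisp expects, and every case not listed is
passed to the standard COERCE.

```lisp
;; EXTENDED-COERCE is a hypothetical name; it is not part of CLtL or of the proposal.
(defgeneric extended-coerce (object result-type)
  (:documentation "Sketch of the proposed coercions; defers to COERCE otherwise."))

;; Default: anything not listed below goes to the standard COERCE.
(defmethod extended-coerce (object result-type)
  (coerce object result-type))

;; Characters to the other three types.
(defmethod extended-coerce ((x character) (type (eql 'string)))
  (string x))
(defmethod extended-coerce ((x character) (type (eql 'symbol)))
  (intern (string x)))
(defmethod extended-coerce ((x character) (type (eql 'integer)))
  (char-code x))

;; Strings and symbols; the package of a symbol is ignored (key point b),
;; and new symbols are interned in the current package (key point c).
(defmethod extended-coerce ((x string) (type (eql 'symbol)))
  (intern x))
(defmethod extended-coerce ((x symbol) (type (eql 'string)))
  (string x))
(defmethod extended-coerce ((x string) (type (eql 'integer)))
  (char-code (coerce x 'character)))
(defmethod extended-coerce ((x symbol) (type (eql 'integer)))
  (char-code (coerce x 'character)))

;; Integers; CODE-CHAR is used, so bits and font play no part (key point a).
(defmethod extended-coerce ((x integer) (type (eql 'character)))
  (code-char x))
(defmethod extended-coerce ((x integer) (type (eql 'string)))
  (string (code-char x)))
(defmethod extended-coerce ((x integer) (type (eql 'symbol)))
  (intern (string (code-char x))))
```

With these definitions, (extended-coerce #\a 'string) returns "a", while
(extended-coerce '(1 2) 'vector) falls through to the standard COERCE.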

Rationale:
----------
By extending the definition as this document suggests, the functionality
of coerce becomes symmetric among characters, strings, symbols and integers.


Current Practice:


Cost to implementors:

Cost to users:

Benefits:

Aesthetics:

Discussion:

Among the functions in Table 1, the @t[STRING] function deserves particular attention.
@t[STRING] has an odd definition, and it was also
the starting point of the discussions described below.

The problems, or awkward points, lie mainly in the lack of symmetry among
the function names.
We start the discussion with the following two observations.@*
 i) @t[(string @i(x))] is legal, but @t[(coerce @i(x) 'string)] is illegal.@*
In contrast, @t[(character @i(x))] and @t[(coerce @i(x) 'character)] are both legal.@*
ii) To convert a symbol to a string, use @t[SYMBOL-NAME] or @t[STRING].
To convert a string to a symbol, on the other hand,
 use @t[MAKE-SYMBOL] for an uninterned symbol, or
 use @t[INTERN] for an interned symbol.@*
@*
@*
@b[ Discussions on Coercion in Common-Lisp E-mails 1986]

These awkward points have already been discussed in the Common Lisp e-mails.
The author checked the 10M bytes of e-mail on disk.
Most of the discussion of @t[coerce] took place in 1986, with little in 1985 or before.
The thread that concerns us began with a mail from fateman@@dali.berkeley.edu,
dated Fri, 16 May 1986 15:40 PDT, as follows.@*

1) fateman@@dali.berkeley.edu fri 16 may 1986 15:40 PDT:@*
This mail raises the same issue for the STRING function.

2) jeff@@aiva.edinburgh.ac.uk sun 18 may 17:17 GMT@*
@t[ ...  'string' applied to a sequence of characters gives an error, typically 
saying the sequence can't be coerced to a string, but 'coerce' will in fact coerce it...]

3) gls@@think.com, Mon 19 may 1986 12:20 EDT@*
@begin[t]Research shows that the Common Lisp archives up to a couple of months
ago contains 18 messages that mention @t[COERCE].  None explicitly addresses
this issue, but the general tone of the messages is one of conservatism.
I now remember that this issue was tied up with the design of the
sequence functions.  There was real resistance to letting symbols be
treated as general sequences, and so the general decision was made that
string-specific functions would accept symbols, but general sequence
functions would not. ...@end[t] To check his account, @b[3.3] lists all the
early discussions on @t[coerce] that the author could find.

4) fahlman@@c.cs.cmu.edu  Mon, 19 May 1986 20:44 EDT@*
@t[... I would not object to generalizing coerce to handle some of the 
additional cases that people imagine it ought to handle.]
@begin[verbatim]
5) cfry@@oz.ai.mit.edu, Tue, 20 May 1986 03:21 EDT@*
... Coercion is a powerful, easy to remember concept. I think it should be 
extended as much as possible.  ...  :
   (coerce #\a 'string) => "a"
   (coerce 123 'string) => "123"
   (coerce #\a 'integer) => (char-code #\a) ; explicitly not provided CLtL p52.
 It appears that the only reason is that no one could decide on using 
char-code or char-int for the conversion so they chose not to do it 
at all. This reasoning is odd. Pick the most frequently used way, 
document it, and permit it. Same argument for coercion of numeric types.
Further out extensions might be:
 (coerce #'foo 'compiled-function) => returns a compiled function object
 ...
 (coerce string-1 'pathname)
 (coerce bit-vector-1 'integer) ...
Undoubtedly there are other coercions which would make sense. ... 
Users would save a lot of manual searching if coerce was extended.
@end[verbatim]
6) Eliot@@umass-cs.csnet,  Tue 20 May 1986 15:31 EST@*
@t[Coercing symbols to strings is fine, as long as NIL is treated as
the empty SEQUENCE, rather than as a symbol.]
@begin[verbatim]
7) REM@@IMSSS, 19 May 09:34 PST  referring to the mail of gls saying
that "@t[COERCE] was limited to ... sequence and numerical coercions".
This is one of the bad points of many LISPs, including CL, functions that
are misnamed or otherwise don't do what you'd expect from their name.
 ... I hope the international standards committee will fix this kind of
problem so programmers reading somebody else's code can have the meaning
apparent in most cases from general programming tradition rather than
having to constantly check the manual to see if the function does what
it seems to say it would do.

8) DCP@@scrc-quabbin.arpa, Wed, 21 May 1986 16:45 EDT,@*
Does (coerce @i(symbol) 'string) return
     (symbol-name @i(symbol)), or   (format nil "~S" @i(symbol)), 
or   (format nil "~A::~A"
       (package-name (symbol-package @i(symbol))) (symbol-name @i(symbol)))
or what ?  If this weren't enough, added in with my personal views of
style and functionality, for me to want to help veto this coercion, the
special casing of NIL certainly would.  Programs should reflect their
meaning.  A string is a sequence, a symbol is not.  Why shouldn't
@#  (coerce :ascii-EOT 'integer)
work?  The answer is that the requested behavior is not a coercion
between compatible types, it is a functional translation between human
understandable names for the ascii control characters and the integers
that are their corresponding values.
@end[verbatim]
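
The three candidate results that message 8) lists for coercing a symbol to a
string can be compared side by side. The following is a small sketch only:
SYMBOL-STRING-CANDIDATES is a hypothetical helper introduced here for
illustration, and it assumes the symbol has a home package.

```lisp
;; SYMBOL-STRING-CANDIDATES is a hypothetical helper, not part of any proposal.
;; It assumes SYMBOL is interned, so that SYMBOL-PACKAGE returns a package.
(defun symbol-string-candidates (symbol)
  "Return the three candidate strings for SYMBOL as a list."
  (list
   ;; 1. The print name alone; the package is ignored.
   (symbol-name symbol)
   ;; 2. The printed representation; this depends on the current
   ;;    package and on the printer control variables.
   (format nil "~S" symbol)
   ;; 3. Always qualified with the name of the home package.
   (format nil "~A::~A"
           (package-name (symbol-package symbol))
           (symbol-name symbol))))
```

The proposal in this issue corresponds to the first candidate, since it
ignores the package name when converting symbols.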
We found that it is possible to extend the semantics of @t[coerce]
to cope with more generic types. It should be noted that the two key designers of Common Lisp
gave their opinions, and they were not always against the extension.
The discussion stopped at this point, and we have not yet found any continuation.
From this exchange we conclude that @*
1) If we don't care about the package, we may extend coerce
to cover the conversion from a symbol to a string, @*
2) If we need not care about the font and bits attributes,
we may extend coerce to cover the conversion from a character to other types.@*
@*
@*
@b[ Early discussions on coercion]

The following sequence was picked out of the archives. It covers almost all of the meaningful
discussion the author could find. The messages date from 1983, one year before @i(CLtL) was published.
@begin(verbatim)
1) Guy Steele, dated 27 May 83 01:23:14 EDT, (in the summary of the discussion 
with DLW and moon upon the laser edition update) 
a) No.38: it is noted that DLW said "coercing of characters to integers
should probably be allowed too." and this point was marked as {controversial}.
b) Moon's comment. "if (string x) <=> (coerce x 'string) exactly, say so.
Both of these should coerce a character into a 1-element string; neither says 
so now. The old argument against this, that people might expect string of a 
number to give not numbers." and Guy Steele agreed.
N.197: {gloss} string is required to signal an error if its argument is not a
string and not a symbol.  This is wrong; it should convert a character to a
one-character string.  Also it says that string will not convert a sequence of
characters into a string, but coerce will.  This might be all right, except
under coerce it says that when coercing to a string the implementation is
allowed to return a more general type. ... Also the coerce writeup
doesn't say anything for or against coercing from a character to a 1-long string.
{controversial} {will clarify}

2) Fahlman, dated Sat, 28 May 1983 22:34 EDT;
At least, I would like you to consider dropping these. ...
Page 38: Coercing of characters to integers is not a very useful operation in 
portable code if the language spec does not specify the correspondence (and 
Common Lisp does not).  Do you still want to propose that we add this coercion?
I'm not particularly opposed, I just worry that users will find it too easy to 
inadvertently do non-portable things if given this coercion.

3) Moon, date: sat, 28 May 1983, 22:40-EDT
I don't think coercion of characters to integers should be allowed, because
how do you know whether you want the char-code or what.  Dan was probably
just stuck in our old mindset from not having character objects on the Lisp
machine. Coercion of characters to strings, however, I believe is useful.

4) Daniel L.Weinreb, Date: Tuesday, 31 May 1983, 15:40-EDT
... It's OK with me if COERCE doesn't coerce characters to integers. However, 
I strongly suggest putting in a clarification note to that effect... Or maybe
a Rationale note saying that it doesn't because it wouldn't be clear what to do
about the char-code versus whole character, and telling you to use the particular
function instead. This note is for the benefit of users more than of implementors.

5) Scott E. Fahlman, Date: Wed, 1 Jun 1983  01:47 EDT <referring to 4)>
    It's OK with me if COERCE doesn't coerce characters to integers.
    However, I strongly suggest putting in a clarification note to that ...
<< I assume Guy will do this. >>
@end(verbatim)
  As far as we can see from this exchange, coercion between
characters and integers was restricted so that integer-to-character conversion is provided
while character-to-integer conversion is not. Coercions from characters to integers are
purposely not provided; @t[char-code] or @t[char-int] may be used explicitly to
perform such conversions (see @b[Appendix] for the definitions of @t[char-code]
and @t[char-int]).
The difference between @t(char-int) and @t(char-code) lies in the treatment of the
@i(font) and @i(bits) attributes.
If these two attributes have no significant meaning and are ignored by everyone,
we can make the story much simpler. (And @b[4.] shows that at least the font attribute is
no longer in use.)
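
The simpler story can be illustrated with a short check. This is a sketch
under the assumption that the implementation attaches no bits or font
attributes to characters, in which case converting a character to its code
and back returns the same character. CODE-ROUND-TRIP-P is a hypothetical
helper name used only for this example.

```lisp
;; CODE-ROUND-TRIP-P is a hypothetical helper, named here for illustration.
(defun code-round-trip-p (ch)
  "True if CH is unchanged after conversion to its code and back."
  (let ((back (code-char (char-code ch))))
    ;; CODE-CHAR may return NIL when no character has the given code.
    (and back (char= back ch))))
```

Under the same assumption, (char-int ch) and (char-code ch) return the same
integer, so the choice between the two no longer matters for coerce.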


     ----- End Forwarded Messages -----