
Issue: COERCE-INCOMPLETE (Version 2)



Issue:		 COERCE-INCOMPLETE
Reference:	 COERCE (p50)
Category:	 ADDITION
Edit history:	 Version 1 of COERCE-INCOMPLETE, 26-Feb-88 by M. Ida
		 Version 1 of COERCE-FROM-TYPE,  20-Jun-88 by Pitman
		 Version 2 of COERCE-INCOMPLETE, 21-Nov-88 by Pitman
Required-Issues: TYPE-OF-UNDERCONSTRAINED

Problem Description:

  COERCE is difficult to extend because ambiguities arise about the
  source type of the coercion.

  For example, if the symbol STRING were permitted as a second argument
  to COERCE, as in (COERCE NIL 'STRING), there would be two possible
  return values: "" or "NIL". The choice would be arbitrary and would
  have to be specified by the documentation. No matter which was chosen,
  it would probably turn out to be a problem for some applications at
  some times.

  Another example is (COERCE (CHAR-CODE #\A) 'STRING). This might
  return the same as (FORMAT NIL "~D" (CHAR-CODE #\A)) -- "65" in
  most ASCII-based implementations -- or it might return "A". Again,
  the choice would be arbitrary.
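
  As a rough illustration, the competing readings can be written with
  operators that already exist. The pairings below are only a sketch of
  the ambiguity; the sequence reading of NIL assumes an implementation
  in which COERCE already converts sequences to STRING.

```lisp
;; Each pair shows two defensible readings of one "coerce to STRING"
;; request, spelled with operators that already exist.

;; Reading NIL as a symbol versus as the empty list:
(symbol-name 'nil)                     ; the symbol's name, "NIL"
(coerce '() 'string)                   ; the empty sequence, ""

;; Reading a character code as a number versus as a character:
(format nil "~D" (char-code #\A))      ; decimal digits of the code
(string (code-char (char-code #\A)))   ; the one-character string "A"
```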

  There is a clear desire on the part of the user community to lift some of
  the existing restrictions on arguments to COERCE, but because of legitimate
  concerns about ambiguities, the Common Lisp designers have thus far
  refused to do so.

  Unfortunately, because COERCE does not handle these cases, it is
  difficult to learn. That in turn makes Common Lisp harder to learn:
  instead of a single coercion operator with general-purpose semantics,
  a number of very special-purpose coercion operators must be learned.

  Some middle ground needs to be found, which neither compromises the
  clear semantics and portable nature of COERCE nor complicates COERCE
  in a way that makes it unlearnable.

  Also, some people have expressed a desire for COERCE to be more 
  `symmetric.' Usually they seem to mean that they want it to be the case
  that if (COERCE x y) works, then (COERCE (COERCE x y) (TYPE-OF x)) 
  should also work. Although this is not an essential desire, it would
  certainly be nice to achieve.

Proposal:

  Add an extra optional argument to COERCE which specifies the type
  from which the coercion is to be done. The new syntax would be:

   COERCE object to &optional (from (TYPE-OF object))

  Require that the FROM argument be such that (TYPEP OBJECT FROM)
  is true.

  Define new types as follows:

	CHAR-CODE					[Type]

	The subrange of the integers which are valid character codes.
	This type could be defined by:

	  (DEFTYPE CHAR-CODE () `(INTEGER 0 (,CHAR-CODE-LIMIT)))

	CHAR-INT					[Type]

	The subrange of the integers which are valid char ints.
	This would probably want to be defined in an 
	implementation-dependent way since there is no
	CHAR-INT-LIMIT, but a crude approximation might be:

	  (DEFTYPE CHAR-INT ()
	    `(INTEGER 0 ,(CHAR-INT (CODE-CHAR (- CHAR-CODE-LIMIT 1)
					      (- CHAR-BITS-LIMIT 1)
					      (- CHAR-FONT-LIMIT 1)))))

	ACCESSIBLE-SYMBOL				[Type]

	The set of symbols accessible in the current package.
	This type could be defined by:
	
	  (DEFUN SYSTEM::ACCESSIBLE-SYMBOL-P (X)
	    (AND (SYMBOLP X)
	         (MULTIPLE-VALUE-BIND (SYMBOL FOUND)
		     (FIND-SYMBOL (SYMBOL-NAME X) *PACKAGE*)
		   (AND FOUND (EQ SYMBOL X)))))

	  (DEFTYPE ACCESSIBLE-SYMBOL () 
	    `(SATISFIES SYSTEM::ACCESSIBLE-SYMBOL-P))

	INTEGER-STRING					[Type]

	The set of strings which can be successfully parsed by
	PARSE-INTEGER. This type could be defined by:

	  (DEFUN SYSTEM::INTEGER-STRING-P (X)
	    (AND (STRINGP X)
	         (LET ((START (IF (AND (PLUSP (LENGTH X))
	                               (MEMBER (AREF X 0) '(#\+ #\-)))
	                          1
	                          0)))
	           ;; Require at least one digit after the optional sign.
	           (AND (< START (LENGTH X))
	                (EVERY #'DIGIT-CHAR-P (SUBSEQ X START))))))

	  (DEFTYPE INTEGER-STRING () `(SATISFIES SYSTEM::INTEGER-STRING-P))

	INTEGRAL-FLOAT					[Type]

	The set of floats which have no fraction. Put another way, the set of
	floats X for which there is some integer Y such that (= X Y).
	This could be defined by:

	  (DEFUN SYSTEM::INTEGRAL-FLOAT-P (X)
	    (AND (TYPEP X 'FLOAT)
		 (OR (ZEROP X)
	             (MULTIPLE-VALUE-BIND (QUOTIENT REMAINDER)
		         (TRUNCATE X)
		       (DECLARE (IGNORE QUOTIENT))
		       (ZEROP REMAINDER)))))

	  (DEFTYPE INTEGRAL-FLOAT () `(SATISFIES SYSTEM::INTEGRAL-FLOAT-P))

  Extend COERCE to handle at least the following cases:

   1. CHARACTER <-> STRING

      a. (COERCE x 'STRING 'CHARACTER) == (STRING x)
      b. (COERCE x 'CHARACTER 'STRING) == (CHARACTER x)

   2. CHAR-CODE <-> CHARACTER

      a. (COERCE x 'CHARACTER 'CHAR-CODE) == (CODE-CHAR x)
      b. (COERCE x 'CHAR-CODE 'CHARACTER) == (CHAR-CODE x)

   3. CHAR-INT <-> CHARACTER

      a. (COERCE x 'CHARACTER 'CHAR-INT) == (INT-CHAR x)
      b. (COERCE x 'CHAR-INT 'CHARACTER) == (CHAR-INT x)

   4. STRING <-> SYMBOL

      a. (COERCE x 'SYMBOL 'STRING) == (MAKE-SYMBOL x)
      b. (COERCE x 'STRING 'SYMBOL) == (SYMBOL-NAME x)

   5. STRING <-> ACCESSIBLE-SYMBOL

      a. (COERCE x 'ACCESSIBLE-SYMBOL 'STRING) == (INTERN x)
      b. (COERCE x 'STRING 'ACCESSIBLE-SYMBOL) == (SYMBOL-NAME x)

   6. INTEGER <-> INTEGER-STRING

      a. (COERCE x 'INTEGER-STRING 'INTEGER)
	 == (WRITE-TO-STRING x :BASE 10 :RADIX NIL)

      b. (COERCE x 'INTEGER 'INTEGER-STRING)
	 == (PARSE-INTEGER x)

   7. CHAR-CODE <-> STRING

      a. (COERCE x 'STRING 'CHAR-CODE)
	 == (COERCE (COERCE x 'CHARACTER 'CHAR-CODE) 'STRING 'CHARACTER)
	 == (STRING (CODE-CHAR x))

      b. (COERCE x 'CHAR-CODE 'STRING)
	 == (COERCE (COERCE x 'CHARACTER 'STRING) 'CHAR-CODE 'CHARACTER)
	 == (CHAR-CODE (CHARACTER x))

   8. CHAR-INT <-> STRING

      a. (COERCE x 'STRING 'CHAR-INT)
	 == (COERCE (COERCE x 'CHARACTER 'CHAR-INT) 'STRING 'CHARACTER)
	 == (STRING (INT-CHAR x))

      b. (COERCE x 'CHAR-INT 'STRING)
	 == (COERCE (COERCE x 'CHARACTER 'STRING) 'CHAR-INT 'CHARACTER)
	 == (CHAR-INT (CHARACTER x))

   9. CHAR-CODE <-> SYMBOL

      a. (COERCE x 'SYMBOL 'CHAR-CODE)
	 == (COERCE (COERCE x 'STRING 'CHAR-CODE) 'SYMBOL 'STRING)
	 == (MAKE-SYMBOL (STRING (CODE-CHAR x)))

      b. (COERCE x 'CHAR-CODE 'SYMBOL)
	 == (COERCE (COERCE x 'STRING 'SYMBOL) 'CHAR-CODE 'STRING)
	 == (CHAR-CODE (CHARACTER (SYMBOL-NAME x)))

  10. CHAR-INT <-> SYMBOL

      a. (COERCE x 'SYMBOL 'CHAR-INT)
	 == (COERCE (COERCE x 'STRING 'CHAR-INT) 'SYMBOL 'STRING)
	 == (MAKE-SYMBOL (STRING (INT-CHAR x)))

      b. (COERCE x 'CHAR-INT 'SYMBOL)
	 == (COERCE (COERCE x 'STRING 'SYMBOL) 'CHAR-INT 'STRING)
	 == (CHAR-INT (CHARACTER (SYMBOL-NAME x)))

  11. CHAR-CODE <-> ACCESSIBLE-SYMBOL

      a. (COERCE x 'ACCESSIBLE-SYMBOL 'CHAR-CODE)
	 == (COERCE (COERCE x 'STRING 'CHAR-CODE) 'ACCESSIBLE-SYMBOL 'STRING)
	 == (INTERN (STRING (CODE-CHAR x)))

      b. (COERCE x 'CHAR-CODE 'ACCESSIBLE-SYMBOL)
	 == (COERCE (COERCE x 'STRING 'ACCESSIBLE-SYMBOL) 'CHAR-CODE 'STRING)
	 == (CHAR-CODE (CHARACTER (SYMBOL-NAME x)))

  12. CHAR-INT <-> ACCESSIBLE-SYMBOL

      a. (COERCE x 'ACCESSIBLE-SYMBOL 'CHAR-INT)
	 == (COERCE (COERCE x 'STRING 'CHAR-INT) 'ACCESSIBLE-SYMBOL 'STRING)
	 == (INTERN (STRING (INT-CHAR x)))

      b. (COERCE x 'CHAR-INT 'ACCESSIBLE-SYMBOL)
	 == (COERCE (COERCE x 'STRING 'ACCESSIBLE-SYMBOL) 'CHAR-INT 'STRING)
	 == (CHAR-INT (CHARACTER (SYMBOL-NAME x)))

  13. PATHNAME <-> STRING

      a. (COERCE x 'STRING 'PATHNAME)
	 == (NAMESTRING x)

      b. (COERCE x 'PATHNAME 'STRING)
	 == (PARSE-NAMESTRING x)

  14. INTEGRAL-FLOAT <-> INTEGER

      a. (COERCE x 'INTEGER 'INTEGRAL-FLOAT)
	 == (TRUNCATE x)

      b. (COERCE x 'INTEGRAL-FLOAT 'INTEGER)
	 == (FLOAT x)

  Note that restrictions on the X argument to COERCE are as they would
  be for the corresponding function. For example, in 1b only strings of
  length 1 can be converted by CHARACTER.

  Observe that in some cases, such as (COERCE NIL 'STRING) where the first
  argument has a type which is a subtype of more than one of the above
  cases, the result may not be what the user expects. To get a safe result
  from COERCE, use of the third argument is strongly recommended.
  (COERCE NIL 'STRING 'SYMBOL) and (COERCE NIL 'STRING 'LIST) are not
  subject to the confusion that (COERCE NIL 'STRING) is.
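
  A minimal sketch of how a few of the cases above might be dispatched
  follows. It is not a definitive implementation. COERCE-FROM and
  CHAR-CODE-INTEGER are hypothetical names chosen so that the standard
  COERCE and the symbol CHAR-CODE are left alone, and the use of
  SUBTYPEP to match the FROM argument against each rule is an
  assumption of the sketch, not something the proposal specifies.

```lisp
;;; Hypothetical sketch: COERCE-FROM and CHAR-CODE-INTEGER are illustrative
;;; names, not part of the proposal. A separate function is used so that the
;;; standard COERCE is not redefined.

(deftype char-code-integer ()
  "Stand-in for the proposed CHAR-CODE type."
  `(integer 0 (,char-code-limit)))

(defun coerce-from (object to &optional (from (type-of object)))
  "Convert OBJECT, viewed as being of type FROM, to type TO."
  ;; The proposal requires (TYPEP OBJECT FROM) to be true.
  (unless (typep object from)
    (error "~S is not of type ~S." object from))
  (flet ((rule-p (to-type from-type)
           ;; A rule applies when TO names its target type and FROM is a
           ;; recognizable subtype of its source type.
           (and (eq to to-type) (subtypep from from-type))))
    (cond ((rule-p 'string 'character) (string object))              ; 1a
          ((rule-p 'character 'string) (character object))           ; 1b
          ((rule-p 'character 'char-code-integer) (code-char object)) ; 2a
          ((rule-p 'symbol 'string) (make-symbol object))            ; 4a
          ((rule-p 'string 'symbol) (symbol-name object))            ; 4b
          ((rule-p 'string 'pathname) (namestring object))           ; 13a
          ((rule-p 'pathname 'string) (parse-namestring object))     ; 13b
          ;; Anything else falls through to the existing COERCE.
          (t (coerce object to)))))

(coerce-from nil 'string 'symbol)      ; => "NIL"
(coerce-from nil 'string 'list)        ; => ""
(coerce-from #\a 'string 'character)   ; => "a"
```

  With the third argument omitted, the result of this sketch depends on
  what TYPE-OF returns and on the order of the rules, which is an
  artifact of the sketch and one more reason to supply FROM explicitly.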

Rationale:

  These proposed extensions make COERCE able to deal with a much larger
  space of type coercions without the problems of ambiguity raised in
  the Problem Description.

  The proposed extensions are, for the most part, fairly symmetric.
  Nearly every coercion that COERCE could do could be undone in a 
  fairly straightforward and reasonably reliable way.

  Compatibility with the old style of coerce is handled by making the
  third argument optional.

  This proposal is upward compatible with the existing semantics of COERCE.
  (The previous version of this proposal, by M. Ida, proposed incompatible
  changes.)

Current Practice:

  Probably no one implements the proposed behavior at this time.

Cost to Implementors:

  The more optimization a compiler does (or might do) of COERCE, the more
  work might be necessary. In general, however, the changes would probably
  not involve a major amount of work.

Cost to Users:

  This change is upward compatible.

Cost of Non-Adoption:

  Various proposals to extend COERCE would probably not pass because
  not everyone can agree on how to view the type of the first argument.

Benefits:

  Currently, whenever documentation refers to something being `coerced'
  from one type to another, it might mean that COERCE is called, or it
  might mean that some more specialized operator is called. This proposal
  brings us much closer to having the meaning of "is coerced to" be
  "the COERCE function is called", which would make things easier on
  the consumers of that documentation.

Aesthetics:

  This proposal regularizes the semantics of COERCE by making it more
  predictable and more 

  Pitman thinks this proposal would greatly improve the aesthetics of
  COERCE. Does anyone want to present  

Discussion:

  For purposes of getting this proposal through in parallel, we have not
  made any assumptions about the impending character proposal. Nothing proposed
  in this document is inherently in conflict with any potential output of
  the Characters Committee. The fact that characters may have multiple
  representations as integers is an important way to highlight the value of
  this proposal, which is why character issues are included.

  Needless to say, the issue TYPE-OF-UNDERCONSTRAINED will need to be
  dealt with. If a CL implementation is permitted to always return T for 
  (TYPE-OF anything), then the defaulting of the third argument in this
  proposal is not going to work so well. That's why that issue is listed
  as a `required issue' in the heading above.
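
  The concern can be sketched as follows, assuming (and this is only an
  assumption) that an implementation would match the defaulted FROM
  type against each conversion's source type with SUBTYPEP.
  FROM-SELECTS-RULE-P is a hypothetical helper, not part of the
  proposal.

```lisp
;; FROM-SELECTS-RULE-P is a hypothetical helper: it asks whether a FROM type
;; is specific enough to pick a conversion whose source type is SOURCE-TYPE.
(defun from-selects-rule-p (from source-type)
  (values (subtypep from source-type)))

(from-selects-rule-p 'null 'symbol)   ; true: NULL is a subtype of SYMBOL
(from-selects-rule-p 't 'symbol)      ; false: T is too general
(from-selects-rule-p 't 'character)   ; false for the same reason
```

  Under that assumption, a TYPE-OF that always returned T would satisfy
  the (TYPEP OBJECT FROM) constraint but would never select any of the
  new conversions.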

  Pitman supports this proposal.

  Although this proposal is not the same in detail as M. Ida's original
  proposal, it does address the same issues.  Pitman is hopeful that the
  alternative solutions proposed here would still be satisfactory to the
  Japanese community, and looks forward to feedback from that community
  about this issue.