FORMAT-OP-C (Version 3)



Changes since version 2:
 * Pavel's request to strike reference to ~Q at the end of the
   Problem Description field.
 * Masinter's request to reword proposal to not talk directly about
   modifying CLtL in the Proposal field.
 * The last paragraph of the Discussion field is new.
-kmp

-----Proposal Follows-----
Issue:	      FORMAT-OP-C
References:   WRITE-CHAR (p384), ~C (p389)
Category:     CHANGE/CLARIFICATION
Edit History: 23-Feb-87, Version 1 by Pitman
	      29-Apr-87, Version 2 by Pitman (merge Moon's suggestion)
	      29-Apr-87, Version 3 by Pitman (misc editing)
Status:	      For Internal Discussion

Problem Description:

  The manual is not adequately specific about the function of the format
  operation ~C. The description on p389 says that "~C prints the character 
  in an implementation-dependent abbreviated format. This format should
  be culturally compatible with the host environment." This description
  is not very useful in practice.

  Presumably the authors intended the `cultural compatibility' part to
  gloss issues like how the SAIL character set printed, but unfortunately
  another completely reasonable (albeit unplanned) interpretation arose:
    (FORMAT NIL "~C" #\Space) might return "Space" rather than " ".
  [Anyone who would argue that the word `abbreviated' in the definition
  was supposed to prevent this should just be happy that some implementors 
  didn't choose to interpret that word to mean that "Sp" should come back.]

  Some implementations have (FORMAT NIL "~C" #\Space) => "Space".
  Others have the same form return " ".

  Users can use (FORMAT NIL "~:C" #\Space) to get "Space" if they want it.
  It seems as if the implementations which return "Space" treat ~C and
  ~:C equivalently or very similarly, which seems like a waste of a FORMAT op.

  Since the behavior of ~A is also vague on characters (a separate 
  proposal will address this), the only way to safely output a literal
  character is to WRITE-CHAR; FORMAT does not suffice.
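
  As an illustration of the WRITE-CHAR workaround, here is a minimal
  sketch. The helper names CHAR-TO-STRING and BRACKET-CHAR are
  hypothetical and are not taken from the manual; the sketch assumes
  only that ~A prints a string's characters as-is.

```lisp
;; Hypothetical helper: build a one-character string with WRITE-CHAR,
;; so the result does not depend on how ~C or ~A render characters.
(defun char-to-string (ch)
  (with-output-to-string (s)
    (write-char ch s)))

;; Hypothetical usage: hand FORMAT a string rather than a character,
;; and let ~A print that string.
(defun bracket-char (ch)
  (format nil "[~A]" (char-to-string ch)))
```

  With these definitions, (BRACKET-CHAR #\Space) returns "[ ]".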

Proposal (FORMAT-OP-C:WRITE-CHAR):

  Specify that ~C, when given a character with zero bits, performs the
  same action as WRITE-CHAR. Leave the behavior of ~C on characters with
  non-zero bits incompletely specified. For example, the description of
  ~C on p389 of CLtL might read:

       ~C prints the character using WRITE-CHAR if it has zero bits.
     Characters with bits are not necessarily printed as WRITE-CHAR
     would do, but are displayed in an implementation-dependent
     abbreviated format that is culturally compatible with the host
     environment.

  Clarify that WRITE-CHAR puts only one character on its argument stream,
  but that the stream may perform arbitrary destination-dependent
  actions based upon that character:

     Note: The glyphs used to present characters which are not in
     the standard character set may vary from implementation to
     implementation or output device to output device. WRITE-CHAR
     will always output a single character to the indicated stream.
     On some streams, super-quoting, character substitution, or
     substitution of a string for a single character may be 
     necessary; it is appropriate for the stream to decide to do
     this, but WRITE-CHAR itself will never do this.
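
  The intended effect for characters with zero bits could be checked
  with a sketch like the following. Both predicate names are
  hypothetical, and the check uses a string output stream, which is
  only one kind of destination.

```lisp
;; Hypothetical predicate: true when ~C and WRITE-CHAR produce the
;; same output for CH. Under the proposal this would hold for every
;; character with zero bits.
(defun tilde-c-matches-write-char-p (ch)
  (string= (format nil "~C" ch)
           (with-output-to-string (s)
             (write-char ch s))))

;; Hypothetical predicate: true when WRITE-CHAR deposits exactly one
;; character on a string output stream.
(defun write-char-single-p (ch)
  (= 1 (length (with-output-to-string (s)
                 (write-char ch s)))))
```

  In an implementation where ~C currently spells out character names,
  (TILDE-C-MATCHES-WRITE-CHAR-P #\Space) would return NIL today and
  true once the proposal is adopted.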

Rationale:

  This was probably the intent of the authors. 

  It makes things clear enough that programmers can know what to
  expect in the normal case (standard characters with zero bits)
  while leaving some flexibility to implementors about what to do in
  the case of bits (which are not particularly well-defined across
  different implementations anyway).

Current Practice:

  Implementations are divided. Some implementations have
     (FORMAT NIL "~C" #\Space) => "Space".
  Others have the same form return " ".

Adoption Cost:

  Those implementations which did not already implement ~C as WRITE-CHAR
  would suffer an incompatible change.

Benefits:

  User code that uses ~C would have a chance of being portable.
  As things stand, users who use ~C can't reliably port their code.

  ~C and ~:C would perform usefully distinct operations.

Conversion Cost:

  Standard ``Query Replace'' technology for finding occurrences of
  "~C" and changing them to "~:C" semi-automatically should suffice.

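  For code that keeps its control strings as data, the same rewrite
  could be sketched in Lisp. The function name CONVERT-TILDE-C is
  hypothetical, and the scan is deliberately naive: it does not parse
  prefix parameters, so its output should still be reviewed by hand.

```lisp
;; Hypothetical helper: rewrite bare ~C directives in a FORMAT control
;; string as ~:C. Only the two-character sequences "~C" and "~c" are
;; changed. "~~" is copied through unchanged so that an escaped tilde
;; followed by a literal C is left alone, and directives that already
;; carry a modifier (such as "~:C" or "~@C") are not touched.
(defun convert-tilde-c (control-string)
  (with-output-to-string (out)
    (do ((i 0)
         (n (length control-string)))
        ((>= i n))
      (let ((ch (char control-string i))
            (next (and (< (1+ i) n)
                       (char control-string (1+ i)))))
        (cond ((and (char= ch #\~) next (char= next #\~))
               ;; Escaped tilde: copy both characters and skip past them.
               (write-string "~~" out)
               (incf i 2))
              ((and (char= ch #\~) next (char-equal next #\C))
               ;; Bare ~C (either case): emit the colon form instead.
               (write-string "~:C" out)
               (incf i 2))
              (t
               (write-char ch out)
               (incf i)))))))
```

  For example, (CONVERT-TILDE-C "Got ~C and ~~C and ~:C") returns
  "Got ~:C and ~~C and ~:C".
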
Aesthetics:

  Making ~C do something well-defined will probably be perceived as
  a simplification.

Discussion:

  KMP and Pavel support this proposal.

  KMP thinks it's important to get this cleared up as soon as possible.

  Moon's comment on Version 1 (which tried to make WRITE-CHAR and ~C
  identical in all cases) was:
    I believe the error in CLtL is that it was not stated explicitly
    that the "implementation-dependent abbreviated format" applies only
    to characters with non-zero char-bits. Thus instead of removing the
    mumbling about cultural compatibility, I suggest simply adding a
    sentence saying that ~C is the same as write-char for characters
    with zero char-bits.  I don't think we want to require ~C and
    write-char to do the same thing for characters with bits.

  Steele and Fahlman seemed to like the idea of the proposal if amended
  as Moon suggested. Pitman did the merge, creating Version 2. If he didn't
  blow it somehow, they should now be happy.

  Moon and Fahlman voiced support for Version 2.
  Fahlman thinks the problem description is too long.
  KMP isn't sure if he agrees that the problem description is too long,
  but doesn't think it's worth anyone's time to edit it.