
Issue: DATA-IO (version 6)



This is a new issue.  It arose from an investigation of features
that are plausibly needed but missing from draft ANSI Common Lisp.
This issue seems sufficiently simple and noncontroversial that
I would like to see it on the agenda for the June X3J13 meeting.
Let's use the cleanup subcommittee to test the assertion that this
is a simple and noncontroversial issue.  If it's controversial,
let's just drop it; otherwise, let's give X3J13 a chance to vote
for or against it.

Issue:          DATA-IO
References:     CLtL pp.360, 370, 382
Related issues: none
Category:       ADDITION
Edit history:   Version 1,  9-May-89, by Moon
                Version 2, 10-May-89, by Moon
                        (clarify ambiguities, add PRINT-UNREADABLE-OBJECT)
                Version 3, 18-May-89, by Moon (respond to KMP's comments)
                Version 4, 21-May-89, by Moon (almost-final cleanup)
                Version 5, 22-May-89, by Pitman (``never say never'')
                Version 6, 23-May-89, by Moon (final cleanup)

Problem description:

  Storing data in textual form in files, as Lisp expressions, is common
  practice but has some pitfalls.  Files can be unreadable if #<...> syntax
  is written by the printer, or if the reader syntax or package varies
  between writing and reading.  Files of data intended to be carried from
  one Lisp implementation to another can fail to read correctly if
  implementation-dependent syntax extensions get used when not intended.

  CLtL p.370 recommends that unreadable objects be printed with #<...>
  syntax including implementation-dependent information.  Now that users
  can write their own PRINT-OBJECT methods, a way is needed for such
  methods to print this syntax without any implementation-dependent coding.

Proposal (DATA-IO:ADD-SUPPORT):

  Add a new variable *PRINT-READABLY*.  Add a corresponding keyword
  argument :READABLY to WRITE.  The default value of *PRINT-READABLY* is
  NIL.  If :READABLY (which defaults to *PRINT-READABLY*) is true, then
  printing any object produces a printed representation that the reader
  will accept.  If this is not possible, the printer signals an error of
  type PRINT-NOT-READABLE rather than using an unreadable syntax such as
  #<...>.  The printed representation produced when :READABLY is true might
  or might not be the same as the printed representation produced when
  :READABLY is false.

  All methods for PRINT-OBJECT must obey *PRINT-READABLY*.  This includes
  both user-defined methods and implementation-defined methods.

  Printed representations produced when *PRINT-READABLY* is true and
  *PRINT-ESCAPE* is false might or might not be readable.

  Setting *PRINT-ESCAPE* to false might or might not prevent errors of type
  PRINT-NOT-READABLE from being signalled.
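
  The behavior of the :READABLY argument described above might be
  exercised roughly as follows.  This is only a sketch, assuming an
  implementation that provides the proposed features (as ANSI Common
  Lisp implementations later did).  The class SESSION and both function
  names are hypothetical and exist only for illustration.

```lisp
;; Hypothetical class, used only to obtain an object with no readable syntax.
(defclass session () ())

(defun write-to-string-readably (object)
  "Return OBJECT's readable printed representation as a string, or NIL
if the printer signals PRINT-NOT-READABLE."
  (handler-case
      (with-output-to-string (out)
        (write object :stream out :readably t))
    (print-not-readable () nil)))

;; With :READABLY false, the same instance is printed using some
;; implementation-dependent #<...> syntax and no error is signalled.
(defun write-to-string-for-humans (object)
  (with-output-to-string (out)
    (write object :stream out :readably nil)))
```

  Under these assumptions, a list of numbers, strings, and characters
  yields a string that READ accepts, while an instance of SESSION yields
  NIL from the first function and some #<...> text from the second.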

  Add two new macros:

    WITH-STANDARD-IO-SYNTAX &body body                             [Macro]

    Within the dynamic extent of <body>, all reader/printer control
    variables, including any implementation-defined ones not specified by
    Common Lisp, are bound to values that produce standard read/print
    behavior.  The values for Common Lisp specified variables are:

      *PACKAGE*                            The USER package
      *PRINT-ARRAY*                        T
      *PRINT-BASE*                         10
      *PRINT-CASE*                         :UPCASE
      *PRINT-CIRCLE*                       NIL
      *PRINT-ESCAPE*                       T
      *PRINT-GENSYM*                       T
      *PRINT-LENGTH*                       NIL
      *PRINT-LEVEL*                        NIL
      *PRINT-PRETTY*                       NIL
      *PRINT-RADIX*                        NIL
      *PRINT-READABLY*                     T
      *READ-BASE*                          10
      *READ-DEFAULT-FLOAT-FORMAT*          SINGLE-FLOAT
      *READ-SUPPRESS*                      NIL
      *READTABLE*                          The standard readtable

    PRINT-UNREADABLE-OBJECT (object stream &key type identity)      [Macro]
                            &body body

    Output a printed representation of <object> on <stream>, beginning with
    "#<" and ending with ">".  Everything output to <stream> by the <body>
    forms is enclosed in the angle brackets.  If :type is true, the body
    output is preceded by a brief description of the object's type and a
    space character.  If :identity is true, the body output is followed by
    a space character and a representation of the object's identity,
    typically a storage address.

    If *PRINT-READABLY* is true, PRINT-UNREADABLE-OBJECT signals an error
    of type PRINT-NOT-READABLE without printing anything.

    The <object>, <stream>, :type, and :identity arguments are all evaluated
    normally.  :type and :identity default to false.  It is valid to omit
    the <body> forms.  If :type and :identity are both true and there are no
    <body> forms, only one space character separates the type and the identity.
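
    The two less obvious cases above, omitting the keyword arguments and
    omitting the <body> forms, might look like this.  It is a sketch
    only: the class BEACON and the function names are hypothetical, and
    the text produced for the type and identity is
    implementation-dependent.

```lisp
;; BEACON is a hypothetical class used only for illustration.
(defclass beacon ()
  ((label :initarg :label :reader beacon-label)))

(defun show-plain (beacon stream)
  ;; No :TYPE or :IDENTITY: only the body output appears inside #<...>.
  (print-unreadable-object (beacon stream)
    (write-string (beacon-label beacon) stream)))

(defun show-type-and-identity (beacon stream)
  ;; No body forms: only the type and the identity appear inside #<...>.
  (print-unreadable-object (beacon stream :type t :identity t)))
```

    Both functions assume *PRINT-READABLY* is false when they are
    called; otherwise PRINT-UNREADABLE-OBJECT signals an error instead
    of printing.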

  Add a new condition type:

    PRINT-NOT-READABLE                                             [Type]

    Errors that occur during output while *PRINT-READABLY* is true, as a
    result of attempting to output a printed representation that cannot be
    read back, should inherit from this type.  This is a subtype of ERROR.
    The init keyword :OBJECT initializes the slot containing the object
    being printed; that object can be retrieved from the condition with
    PRINT-NOT-READABLE-OBJECT.
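
    A handler can use PRINT-NOT-READABLE-OBJECT to report which part of
    a larger structure could not be written.  The following is a sketch
    under the assumption that the implementation provides the proposed
    condition type and accessor; FILE-HANDLE and FIND-UNREADABLE-OBJECT
    are hypothetical names.

```lisp
;; FILE-HANDLE is a hypothetical class with no readable representation.
(defclass file-handle () ())

(defun find-unreadable-object (data)
  "Return the object that prevents DATA from being printed readably,
or NIL if DATA prints without a PRINT-NOT-READABLE error."
  (handler-case
      (with-standard-io-syntax          ; binds *PRINT-READABLY* to T
        ;; A broadcast stream with no components discards its output.
        (write data :stream (make-broadcast-stream))
        nil)
    (print-not-readable (condition)
      (print-not-readable-object condition))))
```

    Checking data this way before opening the output file is one way to
    avoid leaving a partially written file behind.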

Examples:

  ;; Example #1: Reliable Write-Read

  (WITH-OPEN-FILE (FILE pathname :DIRECTION :OUTPUT)
    (WITH-STANDARD-IO-SYNTAX
      (PRINT DATA FILE)))

  ; ... Later, in another Lisp:

  (WITH-OPEN-FILE (FILE pathname :DIRECTION :INPUT)
    (WITH-STANDARD-IO-SYNTAX
      (SETQ DATA (READ FILE))))

  ;; Example #2: Use of PRINT-UNREADABLE-OBJECT
  ;; Note that in this example, the precise form of the output
  ;; is really implementation-dependent.

  (DEFMETHOD PRINT-OBJECT ((OBJ AIRPLANE) STREAM)
    (PRINT-UNREADABLE-OBJECT (OBJ STREAM :TYPE T :IDENTITY T)
      (PRINC (TAIL-NUMBER OBJ) STREAM)))

  (PRINT MY-AIRPLANE)
  #<Airplane NW0773 36000123135>        ;in Implementation A
                                        ;or
  #<FAA:AIRPLANE NW0773 17>             ;in Implementation B

Rationale:

  *PRINT-READABLY* is important so that errors involving data with no
  readable printed representation are detected when writing the file, not
  later on when the file is read.

  *PRINT-READABLY* is different from *PRINT-ESCAPE* because output printed
  with escapes only has to be generally recognizable by humans, whereas
  output printed readably has to be reliably recognizable by computers.

  Providing the WITH-STANDARD-IO-SYNTAX macro to bind all the variables,
  instead of using LET and explicit bindings of the existing variables,
  ensures that nothing is overlooked and avoids problems with
  implementation-defined reader/printer control variables.

  If the user wishes to use a non-standard value for some variable, most
  commonly *PACKAGE*, it can be bound by LET inside the body of
  WITH-STANDARD-IO-SYNTAX.  Similarly, if the user dislikes the somewhat
  arbitrary choices of values for *PRINT-CIRCLE* and *PRINT-PRETTY*, they
  can be bound to the preferred values inside the body.
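
  Rebinding inside the body might be sketched as follows.  The function
  names and the :PACKAGE keyword are hypothetical; the only point is
  that the LET appears inside WITH-STANDARD-IO-SYNTAX, so it overrides
  the standard values for just the variables the user cares about.

```lisp
(defun write-data-to-string (data &key (package *package*))
  "Print DATA under standard syntax, but relative to PACKAGE and with
circularity detection enabled."
  (with-standard-io-syntax
    (let ((*package* package)
          (*print-circle* t))
      (prin1-to-string data))))

(defun read-data-from-string (string &key (package *package*))
  "Read one object from STRING using the same package as the writer."
  (with-standard-io-syntax
    (let ((*package* package))
      (values (read-from-string string)))))
```

  The reader must bind *PACKAGE* to the same package as the writer did,
  since symbols are printed relative to that package.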

Current practice:

  Symbolics Genera has had these features for many years, except that
  WITH-STANDARD-IO-SYNTAX is named WITH-STANDARD-IO-ENVIRONMENT and binds
  *PACKAGE* to a non-standard package.  The new name is more accurate
  and also avoids compatibility problems for Genera.

  Genera's WITH-STANDARD-IO-ENVIRONMENT also disables #., to prevent Trojan
  horses, since #. could evaluate an arbitrary form.  This is particularly
  important for network protocols.  This feature is not being proposed for
  Common Lisp at this time, as it would prevent using #. in the printer for
  common datatypes, which is current practice in some implementations.
  Suppression of #. could be a separate reader/printer control variable.

  In Genera, PRINT-UNREADABLE-OBJECT is called SYS:PRINTING-RANDOM-OBJECT
  and takes slightly different arguments. In PCL, PRINT-UNREADABLE-OBJECT
  is called PCL:PRINTING-RANDOM-THING.

Cost to Implementors:

  Very small.

Cost to Users:

  None if they don't use the feature.  Otherwise just the cost of
  supporting *PRINT-READABLY* or using PRINT-UNREADABLE-OBJECT in their
  PRINT-OBJECT methods.

Cost of non-adoption:

  There will be no reliable, standard way to write data into a file.

Performance impact:

  Negligible.  Entering WRITE may be slightly slower since there is
  one more keyword argument to parse and one more special variable
  to bind before calling PRINT-OBJECT.

Benefits:

  Data can be written into files reliably without resorting to
  implementation-specific programming.

Esthetics:

  Mildly improved.

Discussion:

  Pitman and Moon support this proposal.