[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Issue: HASH-TABLE-PRINTED-REPRESENTATION (Version 1)



Its almost too late to do anything about it, but I had misfiled the
following message about HASH-TABLE-PRINTED-REPRESENTATION. My only excuse
is that there are 104 pending issues.

If you have or want to make a revision to your proposal, which I will
include, and can get to it *today*, let me know. Otherwise, this will wait
for the next meeting or a letter ballot or something.


Included is a message from JonL about HASH-TABLE-PRINTED-REPRESENTATION and
the last version of it.

!

Return-Path: <edsel!jonl@labrea.stanford.edu>
Received: from labrea.stanford.edu by Xerox.COM ; 09 JUN 88 21:36:17 PDT
Received: by labrea.stanford.edu; Thu, 9 Jun 88 21:35:32 PDT
Received: from bhopal.lucid.com by edsel id AA18027g; Thu, 9 Jun 88
21:27:54 PDT
Received: by bhopal id AA26522g; Thu, 9 Jun 88 21:26:35 PDT
Date: Thu, 9 Jun 88 21:26:35 PDT
 From: Jon L White <edsel!jonl@labrea.stanford.edu>
Message-Id: <8806100426.AA26522@bhopal.lucid.com>
To: masinter.pa
Cc: X3J13@sail.stanford.edu, cl-cleanup@sail.stanford.edu
In-Reply-To: Masinter.pa@Xerox.COM's message of 8 Jun 88 20:24 PDT
<880608-202925-3823@Xerox>
Subject: Issue: HASH-TABLE-PRINTED-REPRESENTATION

Touretzky actually said he would alter his proposal to account for the
:rehash-size and :rehash-threshold omissions; but this version of the 
proposaldoesn't show that.  I remember remarking that you still can't
call the objects "first class" if the printed representation cannot be
read in as an equivalent copy; and the fact that CL has some other
datatypes
that aren't "first class" doesn't argue for doing something substandard
for hash-tables.  I don't seem to have a copy of the mail from Dave in
which he said he would alter his proposal.


  Date: Mon, 23 May 88 15:37:48 PDT
  From: Jon L White <edsel!jonl@labrea.stanford.edu>
  To: labrea!Dave.Touretzky@CS.CMU.EDU
  Cc: cl-cleanup@sail.stanford.edu
  In-Reply-To: Dave.Touretzky@B.GP.CS.CMU.EDU's message of Mon, 23 May 88
11:27:58 EDT <361.580404478@DST.BOLTZ.CS.CMU.EDU>
  Subject: HASH-TABLE-PRINTED-REPRESENTATION

  re: One problem with the currently proposed #H notation is that it
provides 
      no way to specify a rehash-size or rehash-threshold.  This should not
be 
      a fatal flaw, though.  The #() notation is also incomplete: it cannot
      indicate whether the vector has a fill pointer, nor can it show when
the
      element-type is something more specific than T.  The latter problem
is 
      also shared by #nA notation.

  I think this is a fatal flaw.  The fact that *some* complex classes of
  arrays also share this fatal flaw is no argument for retaining it.  It
  is still the case that simple arrays of the more common element types
  do not have the flaw; and several years ago there was some discussion
  on how to fix other manifestations of the flaw on multi-dimensional
arrays.


  -- JonL --



!
Status:		DRAFT

Issue:		HASH-TABLE-PRINTED-PREPRESENTATION

Category:		ENHANCEMENT

Edit history:	23-May-88, Version 1 by Touretzky
			 8-Jun-88, Version 2 by Masinter (as per cl-cleanup discussion)

Description:

Hash tables are currently second-class data structures when compared to
lists, vectors, and structures, because they have no READable printed
representation.  This proposal introduces a #H reader syntax for hash
tables and a switch to control when hash tables will be printed this way.


Proposal (HASH-TABLES-PRINTED-REPRESENTATION:#H-NOTATION) :

1) Introduce the following reader notation for hash tables:

	 #nH(type (k1 v1) (k2 v2) ...)

   "n" is the size of the table; it is analagous to the :size argument to
   MAKE-HASH-TABLE.  If omitted, the system picks some reasonable size.

   "type" is one of EQ, EQL, or EQUAL.  If omitted it defaults to EQL.

   The (ki vi) pairs consist of a key and a value.  There may be any number
of
   such pairs, including zero.  Order is not significant.  It is an error
for
   two keys to be identical (using the EQ, EQL, or EQUAL test, as
   appropriate.)


2) Introduce a switch called *PRINT-HASH* whose initial value is
implementation-dependent.  If *PRINT-HASH* is T, hash tables are printed
using the #H syntax (with all optional components supplied), subject to the
usual limits imposed by *PRINT-LEVEL* and *PRINT-LENGTH*.  If *PRINT-HASH*
is NIL, hash tables are printed using the current #<HASH-TABLE ...> syntax.

Rationale:

This is a useful upward compatible extension (except in
implementations that have usurped #H for other purposes), with very
low adoption cost.

Cost to Implementors:

A simple change to PRIN1 and the pretty printer.  Most of the code
will be similar to existing routines for printing vectors in #()
notation and arrays in #nA() notation. The reader would change to
read this notation.

Cost to Users:

Small. Programs that want to control all *PRINT- parameters will need
to know about yet another parameter.


Benefits:

This proposal makes hash tables first class objects.  If
*PRINT-HASH* is T, their contents become visible in all the normal ways,
e.g., if FOO is bound to a hash table object, then typing FOO to a
read-eval-print loop will display the contents of the hash table.  Hash
table contents may also be displayed by TRACE if the table is passed as an
argument; they may also be displayed by the debugger.  Finally, hash tables
may be appear as literal objects in programs and be read or written to
files.

Current practice: 

We know of no current implementations of this proposal.
Although some implementations allow the user to see hash table contents
with DESCRIBE or INSPECT, not all do.  CMU Common Lisp's DESCRIBE, for
example, does not show hash table contents.  This reinforces the need for
a standard #H notation to guarantee that users can manipulate a hash table
as easily as a vector, array, or structure.

Discussion:

Several alternatives have been suggested for the syntax of #H.

 - preferred notation:    #H(EQL (FOO 37) (BAR 42))

 - dotted pair notation:  #H(EQL (FOO . 37) (BAR . 42))

 - property list:         #H(EQL FOO 37 BAR 42)

 - pseudo-structure:      #S(HASH-TABLE TYPE EQL SIZE 20 INITIAL-CONTENTS
((FOO 37) (BAR 42)))


One problem with the currently proposed #H notation is that it provides no
way to specify a rehash-size or rehash-threshold.  This should not be a
fatal flaw, though.  The #() notation is also incomplete: it cannot
indicate whether the vector has a fill pointer, nor can it show when the
element-type is something more specific than T.  The latter problem is also
shared by #nA notation. Some object that the fact that #A is flawed is no
reason to introduce the same flaw elsewhere.

This prompted yet another proposal:

	#[size]H([type] [rehash-size] [rehash-threshold] (ki vi)*)

 e.g.	#65H(EQL 101 65 (FOO 37) (BAR 42))


along with alternative settings for *PRINT-HASH*, NIL, T, :BRIEF, where the
latter would leave out all of the options.