[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


Here's the draft of my hash table proposal.



Edit history:	23-May-88, Version 1 by Touretzky


Hash tables are currently second-class data structures when compared to
lists, vectors, and structures, because they have no READable printed
representation.  This proposal introduces a #H reader syntax for hash
tables and a switch to control when hash tables will be printed this way.


1) Introduce the following reader notation for hash tables:

	 #nH(type (k1 v1) (k2 v2) ...)

   "n" is the size of the table; it is analagous to the :size argument to
   MAKE-HASH-TABLE.  If omitted, the system picks some reasonable size.

   "type" is one of EQ, EQL, or EQUAL.  If omitted it defaults to EQL.

   The (ki vi) pairs consist of a key and a value.  There may be any number of
   such pairs, including zero.  Order is not significant.  It is an error for
   two keys to be identical (using the EQ, EQL, or EQUAL test, as

2) Introduce a switch called *PRINT-HASH* whose initial value is
implementation-dependent.  If *PRINT-HASH* is T, hash tables are printed
using the #H syntax (with all optional components supplied), subject to the
usual limits imposed by *PRINT-LEVEL* and *PRINT-LENGTH*.  If *PRINT-HASH*
is NIL, hash tables are printed using the current #<HASH-TABLE ...> syntax.

Rationale:  this is a useful upward compatible extension (except in
implementations that have usurped #H for other purposes), with very
low adoption cost.

Adoption Cost: A simple change to PRIN1 and the pretty printer.  Most of
the code will be similar to existing routines for printing vectors in #()
notation and arrays in #nA() notation.

Benefits: this proposal makes hash tables first class objects.  If
*PRINT-HASH* is T, their contents become visible in all the normal ways,
e.g., if FOO is bound to a hash table object, then typing FOO to a
read-eval-print loop will display the contents of the hash table.  Hash
table contents may also be displayed by TRACE if the table is passed as an
argument; they may also be displayed by the debugger.  Finally, hash tables
may be appear as literal objects in programs and be read or written to files.

Current practice: we know of no current implementations of this proposal.
Although some implementations allow the user to see hash table contents
with DESCRIBE or INSPECT, not all do.  CMU Common Lisp's DESCRIBE, for
example, does not show hash table contents.  This reinforces the need for
a standard #H notation to guarantee that users can manipulate a hash table
as easily as a vector, array, or structure.


Several alternatives have been suggested for the syntax of #H.

 - preferred notation:    #H(EQL (FOO 37) (BAR 42))

 - dotted pair notation:  #H(EQL (FOO . 37) (BAR . 42))

 - property list:         #H(EQL FOO 37 BAR 42)

 - pseudo-structure:      #S(HASH-TABLE TYPE EQL SIZE 20 INITIAL-CONTENTS ((FOO 37) (BAR 42)))

One problem with the currently proposed #H notation is that it provides no
way to specify a rehash-size or rehash-threshold.  This should not be a
fatal flaw, though.  The #() notation is also incomplete: it cannot
indicate whether the vector has a fill pointer, nor can it show when the
element-type is something more specific than T.  The latter problem is also
shared by #nA notation.

-- Dave