[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Semantics of hash tables



    Date: Mon, 29 May 1995 00:16 EDT
    From: psz@MIT.EDU (Peter Szolovits)

    I am confused about whether the MCL implementation of EQUAL hash tables
    matches CLtL2, and (perhaps) about what the definition means.  The question
    is what should happen if one hashes a data item and then changes
    (side-effects) it.  In the particular instance I am working with (a
    midnight hack), I use an adjustable, fill-pointered vector of characters to
    build up strings, and I would then like to associate numbers with those
    strings in a hash table.  The code goes something like this:

    ? (defparameter ht (make-hash-table :test 'equal))
    ht
    ? (defparameter st (make-array 10 :element-type 'character
				   :initial-element #\Space
				   :adjustable t :fill-pointer 10))
    st
    ? st
    "          "
    ? (sxhash st)
    121419597
    ? (setf (gethash st ht) 1)
    1
    ? (gethash st ht)
    1
    t
    ? (vector-push-extend #\a st)
    10
    ? st
    "          a"
    ? (sxhash st)
    70632966
    ? (gethash st ht)       ;;; This is puzzle 1
    1
    t
    ? (setf (gethash st ht) 2)
    2
    ? (gethash st ht)
    2
    t
    ? (setf (fill-pointer st) 10)
    10
    ? st
    "          "
    ? (gethash st ht)       ;;; puzzle 2
    2
    t
    ? (sxhash st)
    121419597

    At "puzzle 1", it appears that it is not the CONTENT of st that is being
    used to access the hash table but its identity.  The same is the case at
    "puzzle 2".  By contrast, sxhash is obviously sensitive to the current
    content of the string.  It appears that gethash somehow uses the identity
    (address) of an object on lookup before checking its contents.  Here is
    another illustrative sequence:


I don't know if SXHASH is CommonLisp.  I couldn't find it in CLtL2.  It appears
that MCL is treating the value of ST as a vector instead of a string.  Is it
STRINGP?  EQUAL (and presumably EQUAL hash tables) behave differently with
respect to strings verses other arrays.


    ? (defparameter st (make-array 10 :element-type 'character
				   :initial-element #\Space
				   :adjustable t :fill-pointer 10))
    st
    ? (setf (gethash (copy-seq st) ht) 3)
    3
    ? (gethash st ht)
    3
    t
    ? (vector-push-extend #\a st)
    10
    ? st
    "          a"
    ? (gethash st ht)
    nil
    nil
    ? (setf (gethash st ht) 99)
    99
    ? (gethash st ht)
    99
    t
    ? (gethash (copy-seq st) ht)
    99
    t
    ? (defparameter s1 (copy-seq st))
    s1
    ? (gethash s1 ht)       ;; It's here
    99
    t
    ? (vector-push-extend #\a st)
    11
    ? (gethash st ht)
    99
    t
    ? (gethash s1 ht)       ;; it's gone
    nil
    nil

    I understand that there are tricky semantic and implementation issues here.
    For example, if only the content of the string were used in the hash and
    if the implementation (reasonably) chose not to copy the string, then it
    would be possible to change the content of the string and thus make it
    impossible to find it in the hashtable, sort of as in the example just
    above.

    Can someone give me the definitive reading of how one is supposed to think
    about these issues?  And are the implementations in different Lisps
    consistent?



    --Pete Szolovits



    ------- End of Forwarded Message