[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Semantics of hash tables
- To: info-mcl@digitool.com
- Subject: Semantics of hash tables
- From: psz@mit.edu (Peter Szolovits)
- Date: Sun, 28 May 1995 21:16:12 -0700
- Sender: owner-info-mcl@digitool.com
I am confused about whether the MCL implementation of EQUAL hash tables
matches CLtL2, and (perhaps) about what the definition means. The question
is what should happen if one hashes a data item and then changes
(side-effects) it. In the particular instance I am working with (a
midnight hack), I use an adjustable, fill-pointered vector of characters to
build up strings, and I would then like to associate numbers with those
strings in a hash table. The code goes something like this:
? (defparameter ht (make-hash-table :test 'equal))
ht
? (defparameter st (make-array 10 :element-type 'character
:initial-element #\Space
:adjustable t :fill-pointer 10))
st
? st
" "
? (sxhash st)
121419597
? (setf (gethash st ht) 1)
1
? (gethash st ht)
1
t
? (vector-push-extend #\a st)
10
? st
" a"
? (sxhash st)
70632966
? (gethash st ht) ;;; This is puzzle 1
1
t
? (setf (gethash st ht) 2)
2
? (gethash st ht)
2
t
? (setf (fill-pointer st) 10)
10
? st
" "
? (gethash st ht) ;;; puzzle 2
2
t
? (sxhash st)
121419597
At "puzzle 1", it appears that it is not the CONTENT of st that is being
used to access the hash table but its identity. The same is the case at
"puzzle 2". By contrast, sxhash is obviously sensitive to the current
content of the string. It appears that gethash somehow uses the identity
(address) of an object on lookup before checking its contents. Here is
another illustrative sequence:
? (defparameter st (make-array 10 :element-type 'character
:initial-element #\Space
:adjustable t :fill-pointer 10))
st
? (setf (gethash (copy-seq st) ht) 3)
3
? (gethash st ht)
3
t
? (vector-push-extend #\a st)
10
? st
" a"
? (gethash st ht)
nil
nil
? (setf (gethash st ht) 99)
99
? (gethash st ht)
99
t
? (gethash (copy-seq st) ht)
99
t
? (defparameter s1 (copy-seq st))
s1
? (gethash s1 ht) ;; It's here
99
t
? (vector-push-extend #\a st)
11
? (gethash st ht)
99
t
? (gethash s1 ht) ;; it's gone
nil
nil
I understand that there are tricky semantic and implementation issues here.
For example, if only the content of the string were used in the hash and
if the implementation (reasonably) chose not to copy the string, then it
would be possible to change the content of the string and thus make it
impossible to find it in the hashtable, sort of as in the example just
above.
Can someone give me the definitive reading of how one is supposed to think
about these issues? And are the implementations in different Lisps
consistent?
--Pete Szolovits