[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: A few questions on PCL

  To: commonloops.pa@Xerox.COM
  Subject: A few questions on PCL
  Date: Fri, 09 Dec 88 15:13:29 -0500
  From: Mike Thome <mthome@VAX.BBN.COM>
  Message-Id: <881209-122639-6834@Xerox>
  I've got a few questions for the "community" relating to pcl:
  1) Cache mechanism - although I haven't yet looked at this part of the
  new version, I assume the new cache method ("mirror" lookup for 2nd
  chance before slowing WAY down) is used.  How does this method differ (in
  efficiency) from the fairly standard "try the next cache location"
  algorithm in normal hash table theory? 

Trying the next cache location tends to lead to chains of filled hash
locations as the table fills up.  This is because if location I is
filled than there is a good chance I+1 is filled too, because some
nearby slot has overflowed.  I believe i can find a paper on this
somewhere in the early CACM's.

I believe that the same argument holds for "mirror" lookup, which if i
remember Gregor's description,is that if location I is filled, look at
L - I, where L is the length of the table.  This is just a permutation
of the above idea.  However, Gregor adds a twist that if the mirror
position is full, rather than looking at a nearby position or the
mirror of a nearby position, search linearly from the start of the
table for a free spot.  Now this linear search is even worse than the
one above, but it only occurs in the unlikely(?)  event that the
previous 2 slots are full.

A better way might be to compute the second slot to try as a pseudo
random function of the key, which would tend to distinquish things
that have the same hash address.  

Symbolics simply grows their hash tables whenever the going gets
tough, and if you do this you can tolerate a fairly poor secondary
hashing scheme.  Of course, trading space for time on a paging machine
may not be the best thing either which is probably why Gregor never
grew the caches before.

Data, simulated or actual would clearly help here.