
Rehashing of big hash tables



A problem I have been having that is vaguely related to David Wilkins' problem
is with the rehashing of pathname hash tables.  Other sites running 7.2 might
have had similar problems, and I'm interested to hear whether anyone has
thought of any workarounds.

Our local file server has lots of files on it, perhaps 120,000.  In the
course of normal use, many pathnames get interned in its pathname hash table.
As more pathnames get used, the hash table automatically grows by a factor of
1.3.  Each time the pathname hash table grows, it locks out all use of itself
while it rehashes its entries into the new array.

The problem is that this rehashing process takes an awfully long time.  When
the hash table gets to have 50,000 entries or so, rehashing takes upwards of 20
minutes if the machine is page thrashing.  Since the pathname hash table is
locked while this rehashing takes place, the file server goes essentially out
to lunch for the duration, with the file server processes all in the
"Table Lock" process whostate.

If a rehashing fit occurs in the afternoon during our file server usage "rush
hour" and I'm not around, many users around the lab interpret this state as
"the file server being wedged" and decide that they have to reboot it to get
it working again.  The problem with their approach is that a day or two after
the reboot, the same problem will occur again.

I have a few comments about this behavior:

1. It would be neat if there were a flavor of hash table that did its
rehashing incrementally, rehashing a little bit of the table with each
reference to it, until the table was completely rehashed into its new array.
Pathname hash tables could be these kinds of hash tables.  I understand that
hash tables that have to be rehashed after GC couldn't be incrementally
rehashed, but pathname hash tables don't have this requirement.  Other large
applications that have a somewhat real-time requirement, but also would like
to use large hash tables, could use these incrementally-rehashing hash tables,
too.
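
To make point 1 concrete, here is a rough sketch of an incrementally-rehashing
table.  It is written in Python for illustration only, not Lisp Machine code,
and it is not how the existing pathname hash tables work.  The class name
IncrementalHashTable and the parameters growth, max_load and step are all
made up for this example.  Each get or put moves at most `step` buckets from
the old array to the new one, so no single reference has to wait for a full
rehash.

```python
class IncrementalHashTable:
    """Chained hash table that spreads each rehash over later operations.

    Hypothetical sketch: names and parameters are assumptions for illustration.
    """

    def __init__(self, size=8, growth=2.0, max_load=1.0, step=1):
        self.new = [[] for _ in range(size)]  # array currently being filled
        self.old = None                       # previous array, while migrating
        self.cursor = 0                       # next old bucket to migrate
        self.count = 0                        # number of entries stored
        self.growth, self.max_load, self.step = growth, max_load, step

    def _migrate_some(self):
        """Move at most `step` buckets from the old array into the new one."""
        if self.old is None:
            return
        for _ in range(self.step):
            if self.cursor >= len(self.old):
                break
            for key, value in self.old[self.cursor]:
                self.new[hash(key) % len(self.new)].append((key, value))
            self.old[self.cursor] = []
            self.cursor += 1
        if self.cursor >= len(self.old):
            self.old = None  # migration finished; drop the old array

    def _find(self, table, key):
        """Return (bucket, index of key in bucket or None)."""
        bucket = table[hash(key) % len(table)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                return bucket, i
        return bucket, None

    def get(self, key, default=None):
        self._migrate_some()
        # During a migration an entry may live in either array.
        for table in (self.new, self.old):
            if table is not None:
                bucket, i = self._find(table, key)
                if i is not None:
                    return bucket[i][1]
        return default

    def put(self, key, value):
        self._migrate_some()
        if self.old is not None:
            # Remove a not-yet-migrated copy so the key is stored only once.
            bucket, i = self._find(self.old, key)
            if i is not None:
                del bucket[i]
                self.count -= 1
        bucket, i = self._find(self.new, key)
        if i is None:
            bucket.append((key, value))
            self.count += 1
        else:
            bucket[i] = (key, value)
        # Start a new growth cycle only when no migration is in progress.
        if self.old is None and self.count > self.max_load * len(self.new):
            self.old, self.cursor = self.new, 0
            self.new = [[] for _ in range(int(len(self.old) * self.growth) + 1)]
```

A real version would still take the table lock around each operation, but
only for the time needed to move `step` buckets rather than for the whole
rehash.  The price is that lookups may have to probe two arrays while a
migration is under way.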

Easier to do:

2. I think pathname hash tables should grow by a larger factor than 1.3.  A
factor of at least 2 would be better.  I would pick 5.

3. It would be nice if one could specify in the namespace object for our file
server that it is a file server and that the pathname hash table for the
machine should be large at the start.
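
The effect of the growth factor in point 2 can be estimated with a few lines
of arithmetic.  The sketch below (Python, for illustration only) counts how
many full rehashes a table goes through on its way to holding 120,000
entries.  The starting size of 1000 is an assumption made up for the example,
since the real initial size is not stated here, and the function name
growth_cost is hypothetical.

```python
def growth_cost(target, factor, initial=1000):
    """Return (rehashes, entries_copied) needed to reach `target` capacity.

    `initial` is an assumed starting size, not the real system's value.
    Each rehash is modeled as copying one full table of the current size.
    """
    size, rehashes, copied = initial, 0, 0
    while size < target:
        copied += size             # every entry of the full table is moved
        size = int(size * factor)  # grow by the given factor
        rehashes += 1
    return rehashes, copied


for factor in (1.3, 2, 5):
    print(factor, growth_cost(120000, factor))
```

Under these assumptions a factor of 1.3 needs 19 rehashes, a factor of 2
needs 7, and a factor of 5 needs 3.  A larger initial size, as suggested in
point 3, removes the early rehashes entirely.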

Finally:

4. It's disappointing to find that LMFS file server and mailer support seems
to be getting worse with every release, not better.  For example, there has
been a bug in LMFS that I have been diligently reporting with every release
since release 6.0, but it has never been fixed (it should be easy to fix).
The only changes that seem to occur are the introduction of new bugs.  I know
that LMFS file server support isn't very critical to making sales of machines
nowadays, with the new embedded system strategy, but it saddens me that
Symbolics machines aren't going anywhere as servers.  I moved my home mailbox
to a Sun a month ago, because its mailer and filesystem are more reliable and
faster than the lispm's.