Re: Rehashing of big hash tables
Date: Wed, 6 Jul 88 17:54 EDT
From: cjl@WHEATIES.AI.MIT.EDU (Chris Lindblad)
...
The problem is that this rehashing process takes an awfully long time. When
the hash table gets to have 50,000 entries or so, rehashing takes upwards of
20 minutes if the machine is page thrashing. Since the pathname hash table is
locked while this rehashing takes place, the file server goes essentially out
to lunch for the duration, with the file server processes all sitting in the
"Table Lock" whostate.
If a rehashing fit occurs in the afternoon during our file server usage "rush
hour" and I'm not around, many users around the lab interpret this state as
"the file server being wedged" and decide that they have to reboot it to get
it working again. The problem with their approach is that a day or two after
the reboot, the same problem will occur again.
Interesting. I first recognized this problem just a few weeks ago. In
our analysis, it appeared that it was going to take about 2 hours to
rehash (but we had already gone beyond 50K entries). I reported
it to customer-reports at that time. (Since then it has bitten me again,
but at the 50K-entry point, so it wasn't too bad.)
I have a few comments about this behavior:
1. It would be neat if there were a flavor of hash table that did its
rehashing incrementally, rehashing a little bit of the table with each
reference to it, until the table was completely rehashed into its new array.
Pathname hash tables could be these kinds of hash tables. I understand that
hash tables that have to be rehashed after GC couldn't be incrementally
rehashed, but pathname hash tables don't have this requirement. Other large
applications that have a somewhat real-time requirement, but also would like
to use large hash tables, could use these incrementally-rehashing hash tables,
too.
My suggestion to customer-reports was that, during rehashing, the table
look up entries first in the old hash table and then in a secondary table
of entries added since rehashing began. When rehashing is done, the
secondary-table entries are added to the new hash table.
Easier to do:
2. I think pathname hash tables should grow faster than by a factor of 1.3;
at least 2 would be better. I would pick 5.
We should be able to do this ourselves. Just bash the parameters on the
hash table. (I haven't tried it).
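For what it's worth, here is how the growth factor looks in plain Common Lisp;
I don't know the right way to bash it on the already-existing Genera pathname
hash table, so this only illustrates the parameter at creation time, and the
variable name is made up:

  ;; A float :REHASH-SIZE is a growth factor: 2.0 doubles the table on each
  ;; rehash instead of growing it by 1.3x.
  (defvar *example-table*
    (make-hash-table :test 'equal :rehash-size 2.0))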
3. It would be nice if one could specify in the namespace object for our file
server that it is a file server and that the pathname hash table for the
machine should be large at the start.
I imagine that you could manually grow it as part of what you do at boot time.
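Something along these lines in a boot-time init form is the idea;
*PATHNAME-TABLE* is a made-up stand-in for whatever the real Genera table is,
and the size is a guess, not a recommendation:

  ;; Start the table large enough that the first rehash happens far beyond
  ;; the 50,000-entry point where the 20-minute stalls show up.
  (defvar *pathname-table*
    (make-hash-table :test 'equal :size 200000))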
Finally:
4. It's disappointing to find that LMFS file server and mailer support seems
to be getting worse with every release, not better. For example, there has
been a bug in LMFS that I have been diligently reporting with every release
since Release 6.0, but it has never been fixed (it should be easy to fix).
The only changes that seem to occur are the introduction of new bugs. I know
that LMFS file server support isn't very critical to making sales of machines
nowadays, with the new embedded system strategy, but it saddens me that
Symbolics machines aren't going anywhere as servers. I moved my home mailbox
to a Sun a month ago, because its mailer and filesystem are more reliable and
faster than the lispm's.
Well, I agree with some of this. My Suns still aren't properly using the
domain system (MX records and such). (I think the Symbolics is.) I actually
keep my inbox on a VAX, but just so I can conveniently read it from home.
On other machines, I've seen lots of problems (just as on Symbolics) with the
conversion to domains. It SEEMS that the mailer is now working pretty well
on the Symbolics.
Re LMFS: Yes, it is relatively slow. Too often I get bitten by ECC
errors (not only on the LMFS). But other than that, I haven't had too
many problems with the LMFS.