
Re: Rehashing of big hash tables



    Date: Wed, 6 Jul 88 17:54 EDT
    From: cjl@WHEATIES.AI.MIT.EDU (Chris Lindblad)

	...
    The problem is that this rehashing process takes an awfully long time.  When
    the hash table gets to have 50,000 entries or so, rehashing takes upwards of
    20 minutes if the machine is page thrashing.  Since the pathname hash table
    is locked while this rehashing takes place, the file server goes essentially
    out to lunch for the duration, with the file server processes all in the
    "Table Lock" process whostate.

    If a rehashing fit occurs in the afternoon during our file server usage "rush
    hour" and I'm not around, many users around the lab interpret this state as
    "the file server being wedged" and decide that they have to reboot it to get
    it working again.  The problem with their approach is that a day or two after
    the reboot, the same problem will occur again.

Interesting.  I first recognized this problem just a few weeks ago.  In
our analysis, it appeared that it was going to take about 2 hours to
rehash (but we had already gone beyond the 50K entries).  I reported
it to customer-reports at that time.  (Since then it has bitten me again,
but at the 50K entries point so it wasn't too bad.)

    I have a few comments about this behavior:

    1. It would be neat if there were a flavor of hash table that did its
    rehashing incrementally, rehashing a little bit of the table with each
    reference to it, until the table was completely rehashed into its new array.
    Pathname hash tables could be these kinds of hash tables.  I understand that
    hash tables that have to be rehashed after GC couldn't be incrementally
    rehashed, but pathname hash tables don't have this requirement.  Other large
    applications that have a somewhat real-time requirement, but also would like
    to use large hash tables, could use these incrementally-rehashing hash tables,
    too.
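
To make the incremental idea in point 1 concrete, here is a minimal sketch in
Python (not Lisp, and not the actual pathname table code).  The class name,
the `step` and `growth` parameters, and the chained-bucket layout are all
assumptions made for illustration.  Each `get` or `put` moves a few old
buckets into the new array, so no single reference pays for the whole rehash.

```python
class IncrementalHashTable:
    """Chained hash table that migrates a few buckets per operation while growing.

    Hypothetical illustration only; names and defaults are assumptions.
    """

    def __init__(self, capacity=8, growth=2.0, step=2):
        self.buckets = [[] for _ in range(capacity)]
        self.old = None      # previous bucket array while a rehash is in progress
        self.cursor = 0      # index of the next old bucket to migrate
        self.count = 0       # number of distinct keys stored
        self.growth = growth
        self.step = step     # old buckets migrated per get/put

    def _find(self, buckets, key):
        bucket = buckets[hash(key) % len(buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                return bucket, i
        return bucket, -1

    def _migrate(self):
        # Move at most `step` old buckets into the new array.
        if self.old is None:
            return
        end = min(self.cursor + self.step, len(self.old))
        for i in range(self.cursor, end):
            for k, v in self.old[i]:
                self.buckets[hash(k) % len(self.buckets)].append((k, v))
            self.old[i] = []
        self.cursor = end
        if self.cursor == len(self.old):
            self.old = None  # rehash finished

    def get(self, key, default=None):
        self._migrate()
        for buckets in (self.buckets, self.old):
            if buckets is not None:
                bucket, i = self._find(buckets, key)
                if i >= 0:
                    return bucket[i][1]
        return default

    def put(self, key, value):
        self._migrate()
        # A key still waiting in the old array is updated there, so it
        # never exists in both arrays at once.
        if self.old is not None:
            bucket, i = self._find(self.old, key)
            if i >= 0:
                bucket[i] = (key, value)
                return
        bucket, i = self._find(self.buckets, key)
        if i >= 0:
            bucket[i] = (key, value)
            return
        bucket.append((key, value))
        self.count += 1
        # Start a new incremental rehash once the table is full.
        if self.old is None and self.count > len(self.buckets):
            self.old = self.buckets
            self.buckets = [[] for _ in range(int(len(self.old) * self.growth))]
            self.cursor = 0
```

The cost is that every lookup during a rehash may probe two arrays, and both
arrays stay resident until the migration finishes.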

My suggestion to customer-reports was, during rehashing, to make the table
look up old entries in the old hash table and then in a secondary table
of new entries added since rehashing began.  When rehashing is done, the
secondary table entries are added to the new hash table.
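
A rough sketch of that scheme, in Python rather than Lisp and with every name
hypothetical: the old table keeps serving lookups, new entries go into a side
table, and the side table is merged into the new table when the rehash
completes.  Python's `dict` stands in for the real tables here, so the
"rehash" is simulated by copying entries in small batches.

```python
class SideTableRehash:
    """Old table stays readable while a new one is built in the background.

    Hypothetical illustration; the dicts stand in for real hash tables.
    """

    def __init__(self):
        self.main = {}        # table currently serving lookups
        self.side = None      # entries added since rehashing began
        self.building = None  # new table being filled
        self.pending = None   # iterator over the old table's entries

    def begin_rehash(self):
        self.side = {}
        self.building = {}
        self.pending = iter(list(self.main.items()))

    def rehash_some(self, n):
        """Copy up to n old entries; return True once the rehash is complete."""
        if self.pending is None:
            return True
        for _ in range(n):
            item = next(self.pending, None)
            if item is None:
                # Old entries are all copied: fold in the side table and swap.
                self.building.update(self.side)
                self.main = self.building
                self.side = self.building = self.pending = None
                return True
            self.building[item[0]] = item[1]
        return False

    def get(self, key, default=None):
        # Assumption: the side table is checked first so that a value stored
        # during the rehash wins.  For tables that only ever gain new keys,
        # the order of the two probes does not matter.
        if self.side is not None and key in self.side:
            return self.side[key]
        return self.main.get(key, default)

    def put(self, key, value):
        if self.side is not None:
            self.side[key] = value
        else:
            self.main[key] = value
```

In a real server `rehash_some` would presumably run in a background process,
with the table lock held only for the final merge and swap.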

    Easier to do:

    2. I think pathname hash tables should grow by a factor larger than 1.3; at
    least 2 would be better.  I would pick 5.

We should be able to do this ourselves.  Just bash the parameters on the
hash table.  (I haven't tried it).
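
To see what the growth factor buys, the small Python calculation below counts
how many rehashes a table goes through on its way to a given size.  The
starting size of 1,000 entries is an assumption picked for illustration (I
don't know the real initial size), and `rehash_count` is a made-up helper.

```python
import math


def rehash_count(initial, target, factor):
    """Number of grow-and-rehash steps needed to reach at least `target` slots."""
    size, rehashes = initial, 0
    while size < target:
        size = math.ceil(size * factor)  # table grows by `factor` each time
        rehashes += 1
    return rehashes


# Assumed starting size of 1,000; the 50,000 target is from the report above.
for factor in (1.3, 2, 5):
    print(factor, rehash_count(1000, 50000, factor))
```

Under that assumption, a factor of 1.3 means 15 rehashes before the table holds
50,000 entries, against 6 for a factor of 2 and 3 for a factor of 5.  A larger
factor trades memory for fewer (though individually larger) rehashes.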

    3. It would be nice if one could specify in the namespace object for our file
    server that it is a file server and that the pathname hash table for the
    machine should be large at the start.

I imagine that you could manually grow it as part of what you do at the boot.

    Finally:

    4. It's disappointing to find that LMFS file server and mailer support seems
    to be getting worse with every release, not better.  For example, there has
    been a bug in LMFS that I have been diligently reporting with every release
    since release 6.0, but has never been fixed (it should be easy to fix it).
    The only change that seems to occur is the introduction of new bugs.  I know
    that LMFS file server support isn't very critical to making sales of machines
    nowadays, with the new embedded system strategy, but it saddens me that
    Symbolics Machines aren't going anywhere as servers.  I moved my home mailbox
    to a sun a month ago, because its mailer and filesystem are more reliable and
    faster than the lispm's.

Well, I agree with some of this.  My suns still aren't properly using the
domain system (MX records and such).  (I think the Symbolics is.)  I actually
keep my inbox on a VAX but just so I can conveniently read it from home.
On other machines, I've seen lots of problems (just as on Symbolics) with the
conversion to domains.  It SEEMS that the mailer is now working pretty well
on the Symbolics.

Re LMFS:  Yes, it is relatively slow.  Too often I get bit with ECC
errors (not only on the LMFS).  But other than that, I haven't had too
many problems with the LMFS.