Re: btree's
Date: Mon, 6 Mar 89 16:05 CST
From: vaughan@MCC.COM (Paul Vaughan)
Subject: btree's
To: slug@Warbucks.AI.SRI.COM
cc: pixley%cadillac.cad.mcc.com@mcc.com
Message-Id: <19890306220518.2.VAUGHAN@PASSPORT.CAD.MCC.COM>
Reply-To: vaughan@MCC.COM
Postal-Address: BRC 4.11082
Business-Phone: 338-3639
We've got an application that makes heavy use of hash tables,
but the tables sometimes get very large (> 10000 entries). This reduces
performance dramatically because of the paging characteristics. So, we
are wondering:
	1) Would it behoove us to use btree tables rather than hash
tables?
	2) Does anyone have a btree implementation that they are willing
to share?
	3) Does anyone have any other suggestions for improving
performance?
The tables we make come in all sizes from 5 entries to 20000 entries,
but I'm not sure of the distribution. There are probably ways to
estimate the potential size of a table before building it, but we haven't
worried about it much. Right now we just rely on the growing table
mechanism that Symbolics provides. Does anyone know if Symbolics has
any plans to include sophisticated facilities (such as btrees) in the
(already extremely useful) generalized table facility?
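If we could estimate the size up front, something along these lines
(a rough sketch in standard Common Lisp; the ESTIMATE argument is just
a stand-in for whatever count we manage to compute) would let us
pre-size a table instead of relying on growth:

    ;; Rough sketch: pre-size the table from an estimate so it rarely
    ;; (or never) has to grow and copy itself.
    (defun make-presized-table (&key (test #'equal) (estimate 100))
      ;; :SIZE is only a hint about how many entries to expect; the
      ;; table can still grow past it if the estimate turns out low.
      (make-hash-table :test test :size estimate))

    ;; e.g. (make-presized-table :estimate 20000)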
-- Paul Vaughan
Can you give us more information about when your paging problems are
happening? A single hash lookup should produce only ~4 page hits.
A lot of paging happens when the hash table grows. If the hash table
is going to be big and has a small growth factor (the default is 1.3),
then you spend a lot of time copying the old hash table into the new
one, and a lot of needless GC'ing happens as well. For example, a hash
table of size 40,000 will grow 26 times under the default growth
factor. So, I'd recommend setting the growth factor to a larger
number, ~2, and creating large tables initially if possible.
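For instance, something like this (a rough sketch in standard Common
Lisp; a float :REHASH-SIZE is a growth factor, an integer is an
absolute increment):

    ;; Rough sketch: start the table large and make it double (rather
    ;; than grow by 1.3x) whenever it does fill up.
    (make-hash-table :size 40000        ; expected number of entries
                     :rehash-size 2.0)  ; growth factor of ~2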
k
References:
  btree's
    From: Paul Vaughan <vaughan@MCC.COM>