
Re: btree's



  Date: Mon, 6 Mar 89 16:05 CST
  From: vaughan@MCC.COM (Paul Vaughan)
  Subject: btree's
  To: slug@Warbucks.AI.SRI.COM
  cc: pixley%cadillac.cad.mcc.com@mcc.com
  Message-Id: <19890306220518.2.VAUGHAN@PASSPORT.CAD.MCC.COM>
  Reply-To: vaughan@MCC.COM
  Postal-Address: BRC 4.11082
  Business-Phone: 338-3639
  
  	We've got an application that makes heavy use of hash tables,
  but the tables sometimes get very large (> 10000 entries).  This reduces
  performance dramatically because of the paging characteristics.  So, we
  are wondering:
  	1) Would it behoove us to use btree tables rather than hash
  tables?
  	2) Does anyone have a btree implementation that they are willing
  to share?
  	3) Does anyone have any other suggestions for improving
  performance?
  
  	The tables we make come in all sizes, from 5 entries to 20000 entries,
  but I'm not sure of the distributions.  There are probably ways to
  estimate the potential size of a table before we build it, but we haven't
  worried about it much.  Right now we just rely on the growing-table
  mechanism that Symbolics provides.  Does anyone know if Symbolics has
  any plans to include sophisticated facilities (such as btrees) in the
  (already extremely useful) generalized table facility?
  
    -- Paul Vaughan

Can you give us more information about when your paging problems are
happening?  A single hash lookup should produce only ~4 page hits.

A lot of paging happens when the hash table grows.  If the hash table
is going to be big and has a small growth factor (the default is 1.3),
then you spend a lot of time copying the old hash table into the new
one.  A lot of needless GC'ing also happens.  For example, a hash
table that reaches 40,000 entries will grow 26 times with the default
growth factor.  So, I'd recommend setting the growth factor to a
larger number, around 2, and creating large tables initially if possible.
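
Concretely, here's a minimal sketch of what I mean, using the standard
MAKE-HASH-TABLE keywords :SIZE and :REHASH-SIZE.  The ~44-entry initial
size in the comments is just what the 26-rehash figure above implies,
not something I've measured on your system, and the variable names are
only for illustration:

  ;; Default: small initial table, 1.3 growth factor.  Reaching 40,000
  ;; entries from an initial size of ~44 takes about 26 rehashes, each
  ;; of which copies the entire table and leaves the old one for the GC.
  (defvar *slow-table* (make-hash-table))

  ;; Better, when you know the table will be big: create it at (or near)
  ;; its final size, and let it double if it does overflow.
  (defvar *big-table* (make-hash-table :size 40000 :rehash-size 2.0))

  ;; If the final size is unknown, a larger growth factor alone helps:
  ;; doubling cuts the ~26 copies down to roughly 10 for 40,000 entries.
  (defvar *growing-table* (make-hash-table :rehash-size 2.0))

With a growth factor of 2.0 the same table only gets copied about 10
times instead of 26, and presizing it avoids the copies altogether.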


k