[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Problem with generic function caches in EXCL (aka Allegro CL)



In performance tuning our CLOS-based window system, I noticed an
inordinate number of generic function cache misses.  In tracking this
down, I discovered that (almost) all caches appeared to be only 1/4
full and that the cache misses were being caused by collisions in the
cache.

I believe the source of this problem is as follows:

  The macros object-cache-no and generic-function-cache-offset compute a
  hash-key based on the address of the object's class and a mask.  For
  generic functions with one method and for a cache size of 32, this
  hash-key is essentially bits 2 thru 5 (where bit 1 is the low order
  bit) of the address of the object's class.

  The problem is that classes appear to be double-word aligned (actually
  double word aligned plus 1, since (excl::pointer-to-fixnum class)
  always returns a value that is 1 in mod 8).  Thus bits 2 and 3 are
  always the same, leaving only 4 unique hash-keys (determined by bits 4
  and 5) rather than the 16 intended.

This problem becomes less severe as the number of specialized arguments
goes up, but since 95%+ of our functions have a single specialized argument
we were getting only 1/4 use out of nearly all of our caches.

I solved this problem by rewriting object-cache-no to shift the address
three bits to the right as follows:

In excl-low.lisp --

(defmacro object-cache-no (object mask)
  `(logand (%ash (the fixnum (excl::pointer-to-fixnum ,object)) 3)
           (the fixnum ,mask)))

(defmacro %ash (fixnum count)
    ;; This is very dangerous and shouldn't be done.  But ...
    ;; EXCL ash appears to be just too slow to be of use.  But then maybe I
    ;; just don't know the right compiler declarations.
    `(comp::.primcall 'shift-right ,fixnum ,count))

This change got rid of nearly all of our cache misses and sped up the system
by about 10% (obviously something else is dominating the time!).

My guess is that the doubling of the cache size suggested by Jim Larus a
few months back is another solution to the same problem (i.e., decreasing
the number of collisions in the cache), but one that still wastes 3/4
of the cache space (for singly specalized functions).

CAVEAT:  I am running on a very old PCL (87/8/27) and on Allegro 3.0.1.
I checked the sources for the 88/8/2 version of PCL and as far as I can tell,
the object-cache-no/generic-function-cache-offset scheme hasn't changed 
(in essence) at all.  Hence I am assuming that this problem and its fix will
hold for the latest PCL, but I haven't tried it out myself.

-- Frank

---------------------------------------------------

Frank Halasz          
MCC Software Technology
3500 West Balcones Center Drive
Austin TX 78759.
[512]338-3648 

halasz@MCC.COM
or {harvard,ihnp4,nike,seismo}!ut-sally!im4u!milano!halasz