Notes on the efficiency of the new version of PCL (7/20/88 beta)
- To: Gregor.pa@Xerox.COM, CommonLoops.pa@Xerox.COM
- Subject: Notes on the efficiency of the new version of PCL (7/20/88 beta)
- From: larus%paris.Berkeley.EDU@ginger.Berkeley.EDU (James Larus)
- Date: Tue, 26 Jul 88 14:52:48 PDT
- Cc: franz!smh@ginger.Berkeley.EDU, franz!layer@ginger.Berkeley.EDU
- Redistributed: CommonLoops.pa
- Reply-to: larus@ginger.Berkeley.EDU
Gregor,
The new version of PCL runs about as fast as my old, hacked-up
version of PCL. However, a few simple changes increase
its speed by 50% or more. (All numbers are from test runs of
"Curare", a large program transformation system that I built on top of
PCL.) The measurements were made in Allegro CL 3.0beta running on a
Sun 3/75 with 16MB.
Some of the changes are Allegro-dependent and some should work
in any CL. However, even the Allegro optimizations are likely to be
helpful for other CLs.
1. Increase the size of the constant GENERIC-FUNCTION-CACHE-SIZE from
32 to 64 (methods.lisp). This decreased the execution time of Curare
by 20-33%. In one test, the number of calls on PCL::LOOKUP-METHOD-INTERNAL
(i.e., the cache-miss code) fell from 5224 to 1335. Interestingly,
most of these calls came from built-in methods, mainly INITIALIZE,
INITIALIZE-FROM-INIT-PLIST, INITIALIZE-FROM-DEFAULTS, etc., defined on
all classes.
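To illustrate why doubling a cache can cut misses so sharply, here is a toy sketch of a direct-mapped cache indexed by masking a key. It is not PCL's cache: COUNT-MISSES, *KEYS*, and the slot layout are invented for the example, and the real GENERIC-FUNCTION-CACHE-SIZE may well be measured in different units.

```lisp
;; Toy model, not PCL code: a direct-mapped cache whose slot is
;; (logand key (1- size)).  SIZE must be a power of two.
(defun count-misses (keys size)
  "Return the number of misses when looking up KEYS in order."
  (let ((cache (make-array size :initial-element nil))
        (mask (1- size))
        (misses 0))
    (dolist (key keys misses)
      (let ((slot (logand key mask)))
        (unless (eql (aref cache slot) key)
          (incf misses)
          (setf (aref cache slot) key))))))

;; Keys 3 and 35 share slot 3 in a 32-slot cache, so they keep evicting
;; each other.  In a 64-slot cache they land in slots 3 and 35.
(defparameter *keys* (loop repeat 100 append (list 3 35)))

(count-misses *keys* 32)   ; every one of the 200 lookups misses
(count-misses *keys* 64)   ; only the first lookup of each key misses
```

Two keys that alternate and collide turn every lookup into a miss. Once the larger mask separates them, nearly every lookup hits.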
2. The following changes (marked with JL) force a few more critical
operations to be open-coded.
; dcode.lisp:
(defmacro generic-function-cache-offset (mask &rest classes)
  (let ((cache-numbers (mapcar #'(lambda (class)
                                   `(the fixnum (object-cache-no ,class ,mask))) ; JL
                               classes)))
    (if (cdr cache-numbers)
        `(logand ,mask (logxor ,@cache-numbers))
        `(logand ,mask ,@cache-numbers))))

(defmacro generic-function-cache-entry (cache offset offset-from-offset)
  `(memory-block-ref ,cache (+ (the fixnum ,offset)                ; JL
                               (the fixnum ,offset-from-offset)))) ; JL
; low.lisp:
(defmacro cache-key-from-wrappers
    ((size words-per-entry &optional (op 'logxor)) &rest wrappers)
  (when (or (not (numberp size))
            (not (numberp words-per-entry)))
    (error "Using cache-key-from-wrappers improperly.~@
            The size and words-per-entry arguments must be unquoted numbers."))
  (when (not (member op '(nil logand logxor)))
    (error "Using cache-key-from-wrappers improperly.~@
            If supplied, the op argument must be an unquoted symbol, and~@
            one of LOGAND and LOGXOR."))
  ;; Convert the wrapper forms into forms which will fetch the wrapper's
  ;; cache number. That is what we really need to work with.
  (setq wrappers (mapcar #'(lambda (w) `(the fixnum (wrapper-cache-no ,w)))
                         wrappers)) ; JL
  (cond ((and (null (cdr wrappers)) (= size 2))
         (car wrappers))
        ((eq op 'logand)
         `(%logand ,(make-memory-block-mask size words-per-entry)
                   ,.wrappers))
        ((eq op 'logxor)
         `(%logand ,(make-memory-block-mask size words-per-entry)
                   (%logxor ,.wrappers)))))
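The pattern in the changes above, wrapping each operand in (the fixnum ...) so the compiler can open-code the arithmetic, can be sketched in isolation. FIXNUM-OFFSET+ and ENTRY-INDEX are hypothetical names made up for this illustration and do not appear in PCL. Whether a given compiler actually emits inline fixnum code also depends on its optimization settings.

```lisp
;; Hypothetical helper, not part of PCL: add two offsets while promising
;; the compiler that both operands are fixnums.
(defmacro fixnum-offset+ (offset delta)
  `(+ (the fixnum ,offset) (the fixnum ,delta)))

;; A caller compiled for speed.  With the declarations in place the
;; compiler is free to use a machine add rather than generic arithmetic.
(defun entry-index (offset delta)
  (declare (optimize (speed 3) (safety 0)))
  (fixnum-offset+ offset delta))

;; The macro only rewrites the form; the declarations travel with it.
(macroexpand-1 '(fixnum-offset+ base 2))
;; expands to (+ (the fixnum base) (the fixnum 2))
```

Because THE is a promise rather than a check, passing a non-fixnum here has undefined consequences at low safety settings.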
3. Finally, the following two macros (add them to excl-low.lisp) optimize
the trivial cases of a couple of operations. Yes, the compiler should
do this. Yes, the one-argument case does occur in practice (in fact, in
the discriminator code for methods with a single discriminated
argument).
(defmacro %logand (&rest args)
  (cond ((null args) `(logand))
        ((cdr args) `(logand .,args))
        (t (car args)))) ; JL

(defmacro %logxor (&rest args)
  (cond ((null args) `(logxor))
        ((cdr args) `(logxor .,args))
        (t (car args)))) ; JL
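A quick way to see what these macros buy is to look at their expansions. The sketch below restates %LOGAND so that it stands alone. It uses ,@ in place of the dotted ., splice, which gives the same expansion here. The symbol KEY is just a placeholder.

```lisp
(defmacro %logand (&rest args)
  (cond ((null args) `(logand))        ; no arguments: keep the identity call
        ((cdr args) `(logand ,@args))  ; two or more: an ordinary LOGAND
        (t (car args))))               ; exactly one: no call at all

(macroexpand-1 '(%logand 63 key))  ; expands to (logand 63 key)
(macroexpand-1 '(%logand key))     ; expands to key
(macroexpand-1 '(%logand))         ; expands to (logand), which evaluates to -1
```

In the one-argument case the operation disappears at macroexpansion time, so the result does not depend on the compiler simplifying a single-argument LOGAND.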
I don't have performance numbers on hand for the latter two
optimizations, but I remember that they gave on the order of a 20%
reduction in the execution time for Curare.
/Jim