Notes on the efficiency of the new version of PCL (7/20/88 beta)

To: Gregor.pa@Xerox.COM, CommonLoops.pa@Xerox.COM
Subject: Notes on the efficiency of the new version of PCL (7/20/88 beta)
From: larus%paris.Berkeley.EDU@ginger.Berkeley.EDU (James Larus)
Date: Tue, 26 Jul 88 14:52:48 PDT
Cc: franz!smh@ginger.Berkeley.EDU, franz!layer@ginger.Berkeley.EDU
Redistributed: CommonLoops.pa
Reply-to: larus@ginger.Berkeley.EDU

Gregor,

	The new version of PCL runs about as fast as my old, hacked-up
version of PCL.  However, a few simple changes increase its speed by
50% or more.  (All numbers are from test runs of "Curare", a large
program transformation system that I built on top of PCL.)  The
measurements were made in Allegro CL 3.0beta running on a Sun 3/75
with 16MB.

	Some of the changes are Allegro-dependent and some should work
in any CL.  However, even the Allegro optimizations are likely to be
helpful for other CLs.

1.  Increase the value of the constant GENERIC-FUNCTION-CACHE-SIZE from
32 to 64 (methods.lisp).  This decreased the execution time of Curare
by 20-33%.  In one test, the number of calls to PCL::LOOKUP-METHOD-INTERNAL
(i.e., the cache-miss code) fell from 5224 to 1335.  Interestingly,
most of these calls came from built-in methods, mainly INITIALIZE,
INITIALIZE-FROM-INIT-PLIST, INITIALIZE-FROM-DEFAULTS, etc., which are
defined on all classes.
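
To illustrate why doubling a cache can cut misses this sharply, here is a toy model. It is not PCL's actual cache: it assumes a direct-mapped table indexed by (logand mask key), in which two keys that share a slot keep evicting each other. The function name, the key stream, and the layout are all made up for illustration.

```lisp
;; Toy model only: a direct-mapped cache indexed by the low bits of KEY.
;; This is an assumption for illustration, not PCL's real cache layout.
(defun count-cache-misses (size keys rounds)
  "Return the number of misses when KEYS are looked up ROUNDS times
in a direct-mapped cache with SIZE slots (SIZE must be a power of two)."
  (let ((cache (make-array size :initial-element nil))
        (mask (1- size))
        (misses 0))
    (loop repeat rounds
          do (dolist (key keys)
               (let ((slot (logand mask key)))
                 ;; A miss: the slot is empty or holds a different key.
                 (unless (eql (aref cache slot) key)
                   (incf misses)
                   (setf (aref cache slot) key)))))
    misses))

;; 48 distinct integer keys, looked up round-robin.
(defparameter *toy-keys* (loop for k from 0 below 48 collect k))
```

With 32 slots, keys k and k+32 collide and evict each other on every round; with 64 slots each key has its own slot, so only the first round misses. Comparing (count-cache-misses 32 *toy-keys* 10) with (count-cache-misses 64 *toy-keys* 10) shows the difference.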


2.  The following changes (marked with JL) force a few more critical
operations to be open-coded.

; dcode.lisp:

(defmacro generic-function-cache-offset (mask &rest classes)
  (let ((cache-numbers (mapcar #'(lambda (class)
				   `(the fixnum (object-cache-no ,class ,mask))) ; JL
			       classes)))
    (if (cdr cache-numbers)
	`(logand ,mask (logxor ,@cache-numbers))
	`(logand ,mask ,@cache-numbers))))

(defmacro generic-function-cache-entry (cache offset offset-from-offset)
  `(memory-block-ref ,cache (+ (the fixnum ,offset) ; JL
			       (the fixnum ,offset-from-offset)))) ; JL
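
To see what the JL changes put into the compiled code, it can help to look at an expansion. The sketch below copies the shape of the first macro under a hypothetical name (TOY-CACHE-OFFSET) with a made-up accessor (TOY-CACHE-NO), so it stands alone. Whether a given compiler then open-codes the LOGAND and LOGXOR depends on the implementation and its optimization settings.

```lisp
;; Hypothetical stand-in with the same shape as the macro in the text.
;; TOY-CACHE-NO is a made-up accessor; it is never called here because
;; we only inspect the expansion.
(defmacro toy-cache-offset (mask &rest classes)
  (let ((cache-numbers (mapcar #'(lambda (class)
                                   `(the fixnum (toy-cache-no ,class ,mask)))
                               classes)))
    (if (cdr cache-numbers)
        `(logand ,mask (logxor ,@cache-numbers))
        `(logand ,mask ,@cache-numbers))))

(defparameter *two-class-expansion*
  (macroexpand-1 '(toy-cache-offset 31 a b)))
;; *two-class-expansion* is
;; (LOGAND 31 (LOGXOR (THE FIXNUM (TOY-CACHE-NO A 31))
;;                    (THE FIXNUM (TOY-CACHE-NO B 31))))
```

Each operand reaches the logical operators already declared to be a fixnum, which is the information a compiler needs before it can avoid generic integer arithmetic.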


; low.lisp:

(defmacro cache-key-from-wrappers
	  ((size words-per-entry &optional (op 'logxor)) &rest wrappers)
  (when (or (not (numberp size))
	    (not (numberp words-per-entry)))
    (error "Using cache-key-from-wrappers improperly.~@
            The size and words-per-entry arguments must be unquoted numbers."))
  (when (not (member op '(nil logand logxor)))
    (error "Using cache-key-from-wrappers improperly.~@
            If supplied, the op argument must be an unquoted symbol, and~@
            one of LOGAND and LOGXOR."))
  ;; Convert the wrapper forms into forms which will fetch the wrapper's
  ;; cache number.  That is what we really need to work with.
  (setq wrappers (mapcar #'(lambda (w) `(the fixnum (wrapper-cache-no ,w))) ; JL
			 wrappers))
  (cond ((and (null (cdr wrappers)) (= size 2))
	 (car wrappers))
	((eq op 'logand)
	 `(%logand ,(make-memory-block-mask size words-per-entry)
		   ,.wrappers))
	((eq op 'logxor)
	 `(%logand ,(make-memory-block-mask size words-per-entry)
		   (%logxor ,.wrappers)))))
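
As a rough picture of what the LOGXOR branch of this expansion computes, the function below does the same arithmetic at run time. It is a hypothetical helper, not part of PCL: the integers stand in for wrapper cache numbers, and the mask 31 is an assumed value, since the real mask comes from make-memory-block-mask.

```lisp
;; Hypothetical helper: computes at run time what the macro's LOGXOR
;; branch computes inline.  CACHE-NUMBERS stand in for the values that
;; WRAPPER-CACHE-NO would return.
(defun toy-cache-key (mask &rest cache-numbers)
  (declare (fixnum mask))
  ;; XOR the cache numbers together, then mask down to the table size.
  (logand mask (apply #'logxor cache-numbers)))

(toy-cache-key 31 5)     ; one wrapper: the masked number itself, 5
(toy-cache-key 31 37 6)  ; two wrappers: (logxor 37 6) is 35, masked to 3
```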


3.  Finally, the following two macros (to be added to excl-low.lisp)
optimize the trivial cases of a couple of operations.  Yes, the
compiler should do this.  Yes, the one-argument case does occur in
practice (in fact, in the discriminator code for methods with a single
discriminated argument).

(defmacro %logand (&rest args)
  (cond ((null args) `(logand))
	((cdr args) `(logand .,args))
	(t (car args))))		; JL

(defmacro %logxor (&rest args)
  (cond ((null args) `(logxor))
	((cdr args) `(logxor .,args))
	(t (car args))))		; JL
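
The three cases can be checked by expanding each one. The sketch below repeats the %LOGAND definition under a hypothetical name (TOY-LOGAND) so that it stands alone; the variable name is also made up.

```lisp
;; Same logic as %LOGAND in the text, under a hypothetical name so this
;; sketch is self-contained.
(defmacro toy-logand (&rest args)
  (cond ((null args) `(logand))         ; no arguments: leave it to LOGAND
        ((cdr args) `(logand ,@args))   ; two or more: an ordinary call
        (t (car args))))                ; exactly one: the argument itself

(defparameter *toy-logand-expansions*
  (list (macroexpand-1 '(toy-logand))
        (macroexpand-1 '(toy-logand x))
        (macroexpand-1 '(toy-logand x y))))
;; *toy-logand-expansions* is ((LOGAND) X (LOGAND X Y))
```

The one-argument expansion is safe because (logand x) returns x for any integer x, so dropping the call changes nothing but the cost.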


	I don't have performance numbers for the latter two
optimizations, but I recall that they reduced the execution time of
Curare by roughly 20%.

/Jim
