[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: a question about an efficiency tradeoff with flavors
Date: Wed, 18 Jul 90 17:31 EDT
From: Jeffrey Mark Siskind <Qobi@zermatt.lcs.mit.edu>
Subject: a question about an efficiency tradeoff with flavors
I have a question about an efficiency tradeoff with flavors.
Say I have a flavor F with a readable instance variable I.
(defflavor F (I) () (:readable-instance-variables I))
Now say I want to write a function G which takes an instance of
F and returns some function H of its instance variable I. I
have two choices:
(1) (defun G (F) (H (I F)))
(2) (defmethod (G F) () (H I))
Jeff Morril, Mike Thome, and others have been developing some simple
benchmarks for determining CLOS and Common LISP performance, that may
give some qualitative insite into your questions. Of course, it is
important to do your own timings. It turns out that cases (1) and (2)
should perform about the same, though there appears to be some
differences between the XL400, and 3600 which tend to favor (2) on
Ivorys for some reason. Before i quote the numbers, here are some
heuristics i use:
1. a lexical variable reference is much faster than a special
variable reference (> 5 times), and an instance variable reference
(>8 times). Sorry, i don't have numbers for free variables, but i
assume they are between lexical and specials.
2. Funcalling a function is ~3 times faster than funcalling a generic
function. However you must also count any time your function does
type checking. Some types are very expensive to check, even using
3. Funcalling a lexical variable is faster on Ivorys than funcalling
through the function cell. It doesn't matter too much on 3600's.
4. Flavors and CLOS are roughly comparable in performance. CLOS has
been considerably improved over 8.0 Beta (thanks).
5. Your cases (1) and (2) above, are roughly comparable.
6. Funcalling costs more as you add more arguments. &OPTIONAL
arguments cost more than required arguments.
7. &KEY arguments are EXTREMELY slow. Luckily, sometimes they are
optimized (see below).
8. Some functions you expect to be fast, like ABS, TRUNCATE, or
OPERATION-HANDLED-P are SUPRISINGLY slow.
Here are the raw performance numbers we have. I'll send you the
software separately. The numbers are operations/sec which is the
median of 5 separate timings of between 100,000 and 500,000
operations, done with SI:WITHOUT-INTERRUPTS. The number should be
good to 0.5% or better, through i do see some higher variability on
the XL 400 for some reason i don't understand.
90-7-19 Genera 8.0 Operations/sec
(* 2.1 2.1): 486069.7 375480.4
(+ 2.1 2.1): 486069.7 815526.0
funcall & (* 2.1 2.1): 206728.7 192778.2
lexical reference: 6344156.0 2326191.0
special reference: 1115296.8 614465.4
ivar reference: 722633.2 553854.9 (1)
(setf (aref array 5) t): 428508.8 293746.2
(funcall lexical-f x): 486069.7 207167.1
(f x): 364552.3 192778.2
(1) This is the time of a single ivar reference inside a method, not
the case you were concerned with.
90-7-19, Genera 8.0 Operations/sec
XL 400 3640
flavor:make-instance (2 slots): 8702.08 3621.52
flavor:symbol-value-in-instance: 41909.74 15362.60
1 method, 1 dispatch: 132528.48 84912.23
slot symbol in method (access): 98766.68 66798.85 (2)
slot symbol in method (modify): 98468.05 66890.32
slot accessor bar 92239.43 68532.55 (1)
slot accessor frob: 88002.16 68484.51 (3)
instance-flavor: 861552.0 293746.25
(1) This is like your case (1).
(2) This is like your case (2).
(3) BAR and FROB are two flavors, FROB inherits from BAR. I don't see
why the performance seems to be get worse for frob.
CLOS 90-7-19, Genera 8.0
1 default method: 161434.23 63998.43
1 dispatch, 1 method: 116698.51 71023.55
1 dispatch, 3 methods, instance: 106403.83 74774.23
1 dispatch, 3 methods, noninstance: 126194.78 68484.51
2 dispatch, 2 methods: 60041.79 16860.52
slot reader method: 98567.4 63089.24
with-slots (1 access): 76375.86 56716.59
with-slots (1 modify): 97194.58 54961.74
naked slot-value: 36547.95 26660.48
class-of instance: 117682.49 57755.97
class-of noninstance: 59565.90 24173.59
make-instance (2 slots): 64437.40 9690.92
make-instance (2 constant initargs): 24664.2 9501.48
make-instance (2 variable initargs): 45756.83 9706.32
Finally, here are some results on the cost of keywords. I wrote
functions with either N arguments, or N keywords and looked at their
SEC/CALL FOR KEYWORD AND NONKEYWORD ARGUMENTS
(REL 8.0 XL-400)
N ARGUMENTS KEYWORDS RATIO
1 0.303 1.78 5.87
2 0.318 2.83 8.90
3 0.344 4.07 11.83
4 0.370 5.66 15.3
So, having 4 keywords is like counting to 160 before each function
call. Keyword processing first calls SI:VALIDATE-KEYWORDS-INTERNAL to
make sure the keywords are valid. It then figures out where each
variable should go which is (O (expt N 2)). Even though optimizers
have been written for some common cases, SI:VALIDATE-KEYWORDS-INTERNAL
is called 30 - 150 time/sec, so a suprising amount of time may be
spent inside this function.