[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Fast SIGNUM equivalent
Date: Wed, 9 Dec 1992 19:12 EST
From: jbarnett@charming.nrtc.northrop.com
I have written a program that incorporates something like
(CASE (SIGNUM (- I1 I2))
(1 ....)
(0 ....)
(-1 ....))
where I1 and I2 are fairly small, positive integers. This form is being
executed a lot (hundreds of millions of times). Is there anyway to
speed up the test-branch part of this enough to justify whatever
pornography is necessary?
Jeff Barnett
I thought this might be better:
(proclaim '(inline fixnum-compare))
(defun fixnum-compare (x y)
(declare (fixnum x y))
(cond ((> x y) '>)
((= x y) '=)
(t '<)))
(case (fixnum-compare i1 i2)
(> ...)
(= ...)
(< ...))
This version is either slower or faster than your version (on a 3650),
depending on whether I1 is greater than or less than I2. You can see
where that dependency comes from: the order of the COND and CASE
clauses. If you know a priori which cases are more likely, you can
optimize your source code for it. The speed of SIGNUM itself is
also dependent on the sign of the argument; internally it contains
(COND ((PLUSP NUMBER) 1)
((MINUSP NUMBER) -1)
(T NUMBER)))
By the way, this reminds me of my favorite missing optimization in the
Symbolics compiler. A few releases ago they added optimization of CASE
when the keys are all small, non-negative integers: it uses a jump table
instead of a series of comparisons. However, this optimization would be
appropriate whenever the keys are any dense set of integers; just
subtract the value of the smallest one and use that as the jump table
index. I sent the patch for this to Symbolics several years ago;
however, jbarnett's original code still generates a series of tests.
Changing it to
(CASE (1+ (SIGNUM (- I1 I2)))
(2 ....)
(1 ....)
(0 ....))
results in a jump table.
However, my benchmark found that this generally was between the speeds
of the above two versions. I believe this is due to SIGNUM's
argument-dependency, and also because the jump table dispatch causes a
pipeline break, whereas the 3650 normally prefetches the instruction
after a conditional branch.
I did finally manage to get a version that is consistently as fast or
faster than any of the above.
(proclaim '(inline (fixnum signum)))
(defun fixnum-signum (x)
(declare (fixnum x))
(cond ((plusp x) 1)
((minuxp x) -1)
(t 0)))
(case (1+ (fixnum-signum (- i1 i2)))
(2 ...)
(1 ...)
(0 ...))
And the following is slightly faster than the above when (> i1 i2).
(case (fixnum-signum (- i1 i2))
(1 ...)
(0 ...)
(-1 ...))