[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Fast SIGNUM equivalent



    Date: Wed, 9 Dec 1992 19:12 EST
    From: jbarnett@charming.nrtc.northrop.com


    I have written a program that incorporates something like

	    (CASE (SIGNUM (- I1 I2))
	      (1 ....)
	      (0 ....)
	      (-1 ....))

    where I1 and I2 are fairly small, positive integers.  This form is being
    executed a lot (hundreds of millions of times).  Is there anyway to
    speed up the test-branch part of this enough to justify whatever
    pornography is necessary?

    Jeff Barnett

I thought this might be better:

(proclaim '(inline fixnum-compare))
(defun fixnum-compare (x y)
  (declare (fixnum x y))
  (cond ((> x y) '>)
        ((= x y) '=)
        (t '<)))

(case (fixnum-compare i1 i2)
  (> ...)
  (= ...)
  (< ...))

This version is either slower or faster than your version (on a 3650),
depending on whether I1 is greater than or less than I2.  You can see
where that dependency comes from: the order of the COND and CASE
clauses.  If you know a priori which cases are more likely, you can
optimize your source code for it.  The speed of SIGNUM itself is
also dependent on the sign of the argument; internally it contains

      (COND ((PLUSP NUMBER) 1)
	    ((MINUSP NUMBER) -1)
	    (T NUMBER)))

By the way, this reminds me of my favorite missing optimization in the
Symbolics compiler.  A few releases ago they added optimization of CASE
when the keys are all small, non-negative integers: it uses a jump table
instead of a series of comparisons.  However, this optimization would be
appropriate whenever the keys are any dense set of integers; just
subtract the value of the smallest one and use that as the jump table
index.  I sent the patch for this to Symbolics several years ago;
however, jbarnett's original code still generates a series of tests.
Changing it to

	    (CASE (1+ (SIGNUM (- I1 I2)))
	      (2 ....)
	      (1 ....)
	      (0 ....))

results in a jump table.

However, my benchmark found that this generally was between the speeds
of the above two versions.  I believe this is due to SIGNUM's
argument-dependency, and also because the jump table dispatch causes a
pipeline break, whereas the 3650 normally prefetches the instruction
after a conditional branch.

I did finally manage to get a version that is consistently as fast or
faster than any of the above.

(proclaim '(inline (fixnum signum)))
(defun fixnum-signum (x)
  (declare (fixnum x))
  (cond ((plusp x) 1)
        ((minuxp x) -1)
        (t 0)))

(case (1+ (fixnum-signum (- i1 i2)))
  (2 ...)
  (1 ...)
  (0 ...))

And the following is slightly faster than the above when (> i1 i2).

(case (fixnum-signum (- i1 i2))
  (1 ...)
  (0 ...)
  (-1 ...))