MCL's floating point performance
- To: hmadorf@eso.org (by way of alms@cambridge.apple.com (Andrew LM Shalit))
- Subject: MCL's floating point performance
- From: bill@cambridge.apple.com (Bill St. Clair)
- Date: Tue, 27 Oct 1992 12:03:56 -0600
- Cc: get@robotics.jpl.nasa.gov (Erann Gat), info-mcl
>two days from now I am up to speak on the prospects of parallel processing
>(for astronomical applications such as HST image restoration). The
>target architecture will be a Connection Machine 5 (CM5) which
>supports *Lisp. I want people to think hard about Lisp as a viable
>high-level language for scientific computing. So much for the background.
>
>While preparing the talk I made a comparison between Fortran 77,
>Fortran 90 and Common Lisp, and no surprise, CL downs the other two
>languages on almost all counts, though Fortran 90 came out
>surprisingly well.
>
>My audience will mainly consist of scientists, whose experience is in
>numerical computing. So I expect the question: "What about the
>performance?" In order to prepare myself I wrote a miniature benchmark
>computing the logarithm of n-factorial in both Fortran-77 and MCL.
>The code simply sums the logs of the individual terms and is appended below.
>
>The timing tests were extremely disappointing. Though I used all
>sorts of declarations in the end, the CL code was consistently slower
>by a factor of 5 to 6 across the SE, SE/30 and SUN Sparc 2 (+ Allegro
>CL) platforms used for benchmarking.
>
>I know that neither MCL nor Allegro CL is running on the CM5, but
>the consistently slower performance (two different Fortran compilers,
>two different CL compilers) of Lisp w.r.t. Fortran worries me.
>
>Any clue how to improve the performance? If the situation prevails, I
>will find it difficult to recommend CL for supercomputing
>applications.
MCL does in-line arithmetic only for values that are declared to
be fixnums. Inlining floating point operations is known technology,
but is not yet a part of MCL. I believe that CMU's Python compiler
(and probably a few other commercial compilers) will generate good
floating point code if supplied with the proper declarations. We will
likely add better floating point support to some future version of MCL,
though no one here is working on it at present.
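
For reference, here is a minimal sketch of what "the proper declarations"
might look like in portable Common Lisp. The function name
LOG-FACTORIAL-DECLARED is made up for illustration, and whether a given
compiler actually in-lines the float arithmetic is an assumption that
has to be checked per implementation; MCL, as noted above, will not.

```lisp
;; Portable Common Lisp sketch. LOG-FACTORIAL-DECLARED is a hypothetical
;; name; the declarations only *permit* a compiler to in-line the float
;; arithmetic, they do not guarantee it.
(defun log-factorial-declared (n)
  "Sum (log i) for i from 1 to N using declared double-float arithmetic."
  (declare (fixnum n)
           (optimize (speed 3) (safety 1)))
  (let ((sum 0d0)
        (x 1d0))
    (declare (double-float sum x))
    (dotimes (i n sum)
      ;; Declaring X strictly positive tells the compiler that LOG
      ;; returns a real double-float, never a complex number.
      (incf sum (log (the (double-float (0d0)) x)))
      (incf x 1d0))))
```

Keeping the counter as a double-float avoids a fixnum-to-float
conversion on every iteration, which is the same idea as the float
counter in the fpc version later in this message.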
Make sure you try out some floating point code on the CM5, or ask
someone at Thinking Machines about it, before assuming that it will
be as slow there as it is in MCL and Allegro CL.
There is one way to get faster floating point performance in MCL.
Erann Gat wrote a floating point compiler that will in-line single
floating point expressions and can be used in a way that minimizes
consing of floats.
I rewrote your log-factorial example using his package and got almost
a factor of 3 speedup (on a Mac IIci):
(defun log-factorial (n)
  "Calculates log of n-factorial non-recursively."
  (setq n (require-type n 'fixnum))
  (locally (declare (fixnum n))
    (do ((i 1 (1+ i))
         (result 0.0))
        ((> i n) result)
      (declare (fixnum i))
      (incf result (log i)))))

(fpc:define-fpc-destructive (fp-log-factorial-step! result sum x)
  (+ sum (log x)))

(fpc:define-fpc-destructive (fp-add-one! result x)
  (+ x 1.0))

(defun fast-log-factorial (n)
  (setq n (require-type n 'fixnum))
  (locally (declare (fixnum n))
    (let ((result (%copy-float 0.0))
          (counter (%copy-float 1.0)))
      (dotimes (i n result)
        (fp-log-factorial-step! result result counter)
        (fp-add-one! counter counter)))))

;; (without-interrupts (time (log-factorial 2000)))
;; (without-interrupts (time (fast-log-factorial 2000)))
#|
(LOG-FACTORIAL 2000) took 305 milliseconds (0.305 seconds) to run.
32000 bytes of memory allocated.
13206.52435051381
(FAST-LOG-FACTORIAL 2000) took 106 milliseconds (0.106 seconds) to run.
16 bytes of memory allocated.
13206.52435051381
|#
A few notes on Erann Gat's floating point compiler:
It works.
It is available for anonymous FTP from cambridge.apple.com in the
file "/pub/mcl2/contrib/fpc.lisp-v1.2a1".
It needs to be loaded before it can be compiled.
Nit: I find its inclusion of the following form to be antisocial (both
because it modifies parts of my world that are not its business and
because it takes a long time to evaluate):
  (do-symbols (s)
    (if (fboundp s)
        (eval `(defvar ,s ',(symbol-function s)))))
This form can be commented out with the addition of a few "#'"s in
front of function names.
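
To illustrate the difference, here is a small sketch in plain Common
Lisp. It assumes the form above exists so that function names can be
written bare wherever a function object is expected; that is my reading
of it, not something stated in fpc.lisp. SQUARE and APPLY-TWICE are
hypothetical names and are not part of the package.

```lisp
;; Hypothetical helpers for illustration only; not part of fpc.lisp.
(defun square (x)
  (* x x))

(defun apply-twice (fn x)
  "Call FN on X, then call FN again on the result."
  (funcall fn (funcall fn x)))

;; Works only if SQUARE has also been given a variable value holding
;; its function, which is what the DO-SYMBOLS form arranges globally:
;; (apply-twice square 3)

;; Standard form: #'SQUARE names the function directly, so no global
;; variable definitions are needed.
(apply-twice #'square 3)
```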