Fast floats



Hi,

I am having a bit of a problem speeding up floating-point computation. Recently
contributors to this group have demonstrated how to get runtime down rather
dramatically for fixnums.
How about floats? Things seem to be all right for short-floats:

(defun zaf (a)
  (declare (type (simple-array short-float (256 256)) a)
           (optimize (speed 3) (safety 0)))
  (dotimes (i 256)
    (dotimes (j 256)
      (setf (aref a i j) 1.0))))


(time (zaf (make-array (list 256 256)
                       :element-type 'short-float
                       :initial-element 0.0)))

With the declaration this takes ~350 ms (without it ~1175 ms; no garbage
collection interference) on this Centris 610.


But a simple addition makes things rather different (I stole the definition
of 1+& from the MCL-CD, but I'm not sure whether this mutant makes
sense):

(defmacro def-short-float-op (int-name reg-name
                              &optional (result-type 'short-float))
  `(defmacro ,int-name (&rest args)
     `(the ,',result-type
           (,',reg-name ,@(mapcar #'(lambda (arg) `(the short-float ,arg))
                                  args)))))

(def-short-float-op 1+& 1+)
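
For reference, here is a small sketch of what a 1+& form turns into, assuming
the macro above behaves as I read it. The variable name *expansion* is only
there for illustration. As far as I can tell, the expansion just wraps the
argument and the result in (the short-float ...) forms; it still calls the
ordinary 1+, so any speedup depends on the compiler acting on those
declarations.

```lisp
;; Same definition as above, repeated so this snippet stands on its own.
(defmacro def-short-float-op (int-name reg-name
                              &optional (result-type 'short-float))
  `(defmacro ,int-name (&rest args)
     `(the ,',result-type
           (,',reg-name ,@(mapcar #'(lambda (arg) `(the short-float ,arg))
                                  args)))))

(def-short-float-op 1+& 1+)

;; *expansion* is a hypothetical name, used only to hold the result.
;; MACROEXPAND-1 performs a single expansion step without evaluating anything.
(defparameter *expansion* (macroexpand-1 '(1+& (aref a i j))))

(print *expansion*)
;; prints (THE SHORT-FLOAT (1+ (THE SHORT-FLOAT (AREF A I J))))
```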

(defun zaf (a)
  (declare (type (simple-array short-float (256 256)) a)
           (optimize (speed 3) (safety 0)))
  (dotimes (i 256)
    (dotimes (j 256)
      (setf (aref a i j) (1+ (aref a i j))))))

Regardless of whether I use 1+ or the 'optimized' 1+&, zaf takes approximately
27 sec to run (same call as above).

(btw, the aref& suggested on the MCL-CD (file: mt-utils.lisp) doesn't
seem to be of any help either).

Furthermore, short-floats seem to do best; the situation gets desperate for
single-floats. Going back to plain assignment, the declaration no longer seems
to matter: ~24 sec with or without it.

(defun zaf (a)
  (declare (type (simple-array single-float (256 256)) a)
           (optimize (speed 3) (safety 0)))
  (dotimes (i 256)
    (dotimes (j 256)
      (setf (aref a i j) 1.0))))

(time (zaf (make-array (list 256 256)
                       :element-type 'single-float
                       :initial-element 0.0)))

Simple addition (changing the last line to
(setf (aref a i j) (1+ (aref a i j)))):

(defun zaf (a)
  (declare (type (simple-array single-float (256 256)) a)
           (optimize (speed 3) (safety 0)))
  (dotimes (i 256)
    (dotimes (j 256)
      (setf (aref a i j) (1+ (aref a i j))))))

takes 56 seconds; replacing 1+ with a 1+& equivalent doesn't help either.

How do I get a boost? Surely I have shown my ignorance well enough by
now. Is anyone willing to enlighten me?

Regards, Arnoud.

-----
Arnoud Verdwaald,
University of Nijmegen,
Dept. of Special Education (Orthopedagogiek),
Erasmusplein 1-16,
Postbus 9103,
6500 HD  NIJMEGEN,
THE NETHERLANDS (EUROPE).

e-mail:    arnoudv@ped.kun.nl
telephone: +31 80 612691
-----