[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: speed up for array ops
- To: bright@ENH.NIST.GOV
- Subject: Re: speed up for array ops
- From: bill@cambridge.apple.com (Bill St. Clair)
- Date: Fri, 4 Dec 1992 11:11:53 -0600
- Cc: info-macl@cambridge.apple.com, Repenning Alexander <ralex@tigger.cs.colorado.edu>
At 13:49 12/3/92 -0700, Repenning Alexander wrote:
>Some minor problems. Somebody pointed out that aref can be slow
>and svref should be used. However, this is not possible in your case
>as displaced arrays are not considered simple-vectors (WHY?).
A simple vector is one for which it makes the most sense to code
AREF in-line (given suitable declarations). AREF of a displaced
array needs to indirect to the displaced array and add the displacement
offset.
>Since you
>iterate over a large number of items its important to clean out some
>stuff in your dotimes. First, there is no need to access the array twice and
>to call abs twice. Secondly, you can do better than MIN, MAX using IF because
>the way things are initialized changes to amax and amin are exclusive:
>
>[...]
>
>On my MacII this reduced the time from about 11 secs to 4.2 secs. The no ABS
>versions took only 2.5 secs, i.e., ABS is expensive (about 13 us).
>(declare (optimize (speed 3) (safety 0))) had almost no influence, it would
>have with svref.
>
>I've tried a version using vector-pop which was disapointingly slow. What
>good are fill pointers (besided keeping track of the pointer) when VECTOR-POP
>is not any faster than AREF?
>
>I'm sure there are better solutions to this using some low-level MCL stuff,
>oh well..
>
> Alex Repenning
; Here's Alex's code, fixed to work with complex values.
; I also moved the OPTMIZE declaration inside so that the function
; would bother to check for the right number of args.
(defmethod climits ((ara array))
"Returns the maximum and minimum values of elements of an array."
(let* ((asiz (array-total-size ara))
(av (make-array asiz
:displaced-to ara
:element-type (array-element-type ara)))
(amax (abs (aref av 0))) ;; the array starts with 0 not 1!
(amin amax))
(dotimes (i asiz)
(declare (optimize (speed 3) (safety 0)))
(let ((value (abs (aref av i))))
(if (> value amax)
(setq amax value) ; no need to update amin
(when (< value amin) (setq amin value)))))
`((min ,amin)(max ,amax))))
; Some code to make test arrays
(defun make-complex-array (&rest dims)
(unless dims (setq dims '(256 256)))
(let ((array (make-array dims)))
(dotimes (i (array-total-size array))
(setf (row-major-aref array i)
(complex (random 1000.0) (random 1000.0))))
array))
(defun make-real-array (&rest dims)
(unless dims (setq dims '(256 256)))
(let ((array (make-array dims)))
(dotimes (i (array-total-size array))
(setf (row-major-aref array i) (random 1000.0)))
array))
(defparameter *ca* (make-complex-array))
(defparameter *a* (make-real-array))
; Here's a faster version that uses the internal array-data-and-offset
; function to get a likely simple-vector. It also declares the dotimes
; limit variable to be a fixnum so that the increment and compare of
; i with the limit will be inline fixnum operations.
(defmethod faster-climits ((ara array))
"Returns the maximum and minimum values of elements of an array."
(multiple-value-bind (av offset) (ccl::array-data-and-offset ara)
(let* ((asiz (array-total-size ara))
(amax (abs (aref av 0))) ;; the array starts with 0 not 1!
(amin amax)
(index offset))
(declare (fixnum asiz index)
(optimize (speed 3) (safety 0)))
(macrolet ((doit ()
`(dotimes (i asiz)
(let ((value (abs (aref av index))))
(incf index)
(if (> value amax)
(setq amax value) ; no need to update amin
(when (< value amin)
(setq amin value)))))))
(if (simple-vector-p av)
(locally (declare (simple-vector av))
(doit))
(doit))
`((min ,amin)(max ,amax))))))
; You get a little bit from doing the aref inline, but most of the time
; is being spent in ABS, >, and <. Note that ABS of a complex number
; conses a float. These were timed on a ci.
#|
? (without-interrupts
(time (climits *ca*)))
(CLIMITS *CA*) took 7503 milliseconds (7.503 seconds) to run.
Of that, 1 milliseconds (0.001 seconds) were spent in The Cooperative Multitasking Experience.
524400 bytes of memory allocated.
((MIN 2.789532446277684) (MAX 1411.582817229422))
? (without-interrupts
(time (faster-climits *ca*)))
(FASTER-CLIMITS *CA*) took 6149 milliseconds (6.149 seconds) to run.
524368 bytes of memory allocated.
((MIN 2.789532446277684) (MAX 1411.582817229422))
? (without-interrupts
(time (climits *a*)))
(CLIMITS *A*) took 5143 milliseconds (5.143 seconds) to run.
80 bytes of memory allocated.
((MIN 0.04440496367613027) (MAX 999.9853565015904))
? (without-interrupts
(time (faster-climits *a*)))
(FASTER-CLIMITS *A*) took 3830 milliseconds (3.830 seconds) to run.
48 bytes of memory allocated.
((MIN 0.04440496367613027) (MAX 999.9853565015904))
|#