cmu-lisp efficiency
I am trying to get maximal floating point performance. For the most
part, I am very pleased with cmu-lisp's performance but there are a
few problems that I would like to address.
For all of these benchmarks I used the following compiler settings:
(proclaim '(optimize (safety 0) (speed 3) (space 0)))
I am running the following version of CMU-LISP:
CMU Common Lisp 15b, running on aragorn.stanford.edu
Hemlock 3.5 (15b), Python 1.0(15b), target SPARCstation/Sun 4
These are rather advanced wishes but here we go.
(1) Why can't you allow structures to store non-descriptor floats?
Here is an example:
(defstruct point
  (x 0.0 :type single-float)
  (y 0.0 :type single-float))
(defun add-points (src1 src2 dst)
  (declare (type point src1 src2 dst))
  (setf (point-x dst) (+ (point-x src1) (point-x src2)))
  (setf (point-y dst) (+ (point-y src1) (point-y src2)))
  dst)
Here is a trace of the compilation:
* (compile-file "/pdp/bachrach/cmu-lisp/bench-marks/structs" :trace-file t)
Python version 1.0(15b), VM version SPARCstation/Sun 4 on 13 NOV 91 04:07:15 pm.
Compiling: /pdp/bachrach/cmu-lisp/bench-marks/structs.lisp 13 NOV 91 04:06:20 pm
Converted MAKE-POINT.
Compiling DEFSTRUCT POINT:
File: /pdp/bachrach/cmu-lisp/bench-marks/structs.lisp
In: DEFSTRUCT POINT
(DEFSTRUCT POINT
(X 0.0 :TYPE SINGLE-FLOAT)
(Y 0.0 :TYPE SINGLE-FLOAT))
--> LET
==>
(C::%FUNCALL #<LAMBDA #x7010C55
NAME= NIL
TYPE= #<NAMED-TYPE FUNCTION>
WHERE-FROM= :DEFINED
VARS= (#:X-DEFAULTING-TEMP
#:Y-DEFAULTING-TEMP)>
#:G17
#:G18)
Note: Doing MOVE-FROM-SINGLE (cost 13) to #:X-DEFAULTING-TEMP.
Note: Doing MOVE-FROM-SINGLE (cost 13) to #:Y-DEFAULTING-TEMP.
Converted ADD-POINTS.
Compiling DEFUN ADD-POINTS:
File: /pdp/bachrach/cmu-lisp/bench-marks/structs.lisp
In: DEFUN ADD-POINTS
(SETF (POINT-X DST) (+ (POINT-X SRC1) (POINT-X SRC2)))
--> LET* FUNCALL
==>
(C::%FUNCALL #'(SETF POINT-X) #:G1 #:G2)
Note: Doing MOVE-FROM-SINGLE (cost 13), for:
The second argument of STRUCTURE-SET.
(SETF (POINT-Y DST) (+ (POINT-Y SRC1) (POINT-Y SRC2)))
--> LET* FUNCALL
==>
(C::%FUNCALL #'(SETF POINT-Y) #:G3 #:G4)
Note: Doing MOVE-FROM-SINGLE (cost 13), for:
The second argument of STRUCTURE-SET.
Compiling Top-Level Form:
Compilation unit finished.
4 notes
/pdp/bachrach/cmu-lisp/bench-marks/structs.sparcf written.
Compilation finished in 0:00:20.
#p"/pdp/bachrach/cmu-lisp/bench-marks/structs.sparcf"
T
NIL
*
The problem is that the program conses every time it stores a value
into a floating-point struct slot. Naively, it seems that there must
be a way to store non-descriptor floats in structures, since the
structure information (i.e., the type of every slot) is available,
which permits computing the slot offsets and representations. What am
I missing?
The workaround that I have been using in the past is to store a
non-descriptor float vector of length one in place of the float, but
this is such a hack. It at least doubles the storage, decreases the
performance, and makes the code uglier.
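For concreteness, the length-one-vector workaround might look roughly
like the sketch below. The names float-cell, make-float-cell,
cell-point and add-cell-points are made up for illustration and do not
come from the benchmark file. The sketch assumes the implementation
specializes single-float vectors so that their elements are stored
unboxed.

```lisp
;; Hypothetical names throughout; a sketch of the workaround, not the
;; original benchmark code.
(deftype float-cell ()
  '(simple-array single-float (1)))

(defun make-float-cell (&optional (value 0.0f0))
  (declare (type single-float value))
  ;; One-element vector specialized to SINGLE-FLOAT.
  (make-array 1 :element-type 'single-float :initial-element value))

(defstruct cell-point
  (x (make-float-cell) :type float-cell)
  (y (make-float-cell) :type float-cell))

(defun add-cell-points (src1 src2 dst)
  (declare (type cell-point src1 src2 dst))
  ;; Each store writes into a vector DST already owns, so the slot
  ;; itself never receives a freshly allocated float object.
  (setf (aref (cell-point-x dst) 0)
        (+ (aref (cell-point-x src1) 0) (aref (cell-point-x src2) 0)))
  (setf (aref (cell-point-y dst) 0)
        (+ (aref (cell-point-y src1) 0) (aref (cell-point-y src2) 0)))
  dst)
```

Every access now goes through an extra indirection and every point
carries two extra vector headers, which is where the storage and
performance costs mentioned above come from.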
(2) The next issue has to do with passing floating-point arguments
and returning floating-point values. Even if you declare the argument
and return types of a function, calling it still conses. The
following is an example:
(proclaim '(ftype (function (double-float double-float) double-float) sum+2))
(defun sum+2 (x y)
  (declare (type double-float x y)
           (values double-float))
  (+ (+ x y) 2.0))
(defun sum*2+2 (x y)
  (declare (type double-float x y)
           (values double-float))
  (* 2.0 (sum+2 x y)))
In this example the function sum*2+2 knows the type information
about sum+2, but it still conses on call and return.
One fix for this is inlining functions. I see that cmu-lisp offers the
block compilation construct as well. Unfortunately, these do not
handle some important cases.
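As a sketch of the inlining fix, the example above could be rewritten
as follows. This assumes the compiler honors the inline request, which
it is free to ignore; sum*2+2 itself is still called normally by its
own callers.

```lisp
;; Sketch only: the same two functions, with SUM+2 proclaimed inline
;; before its definition so that later callers may expand it in place.
(declaim (inline sum+2))
(defun sum+2 (x y)
  (declare (type double-float x y))
  (+ (+ x y) 2.0d0))

(defun sum*2+2 (x y)
  (declare (type double-float x y))
  ;; If SUM+2 is expanded here, there is no call boundary for the
  ;; intermediate double-float to cross.
  (* 2.0d0 (sum+2 x y)))
```

For example, (sum*2+2 1.0d0 2.0d0) computes (* 2 (+ 1 2 2)) in double
precision.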
For instance, suppose you are using CLOS and want to access a
floating-point slot. In CLOS, accessors are generic functions, and
thus within your implementation you will have to cons every time you
access a floating-point slot. This is a major drag. (Furthermore, you
will have to cons twice to store a floating-point value: once for the
generic function call and once for the store into a structure, if
this is how pcl implements objects.)
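To make the CLOS case concrete, here is a minimal sketch with a
hypothetical class clos-point (not part of the benchmarks above). The
readers and writers are generic functions, so neither inline
proclamations nor block compilation apply to them directly.

```lisp
;; Hypothetical class and function names, for illustration only.
(defclass clos-point ()
  ((x :initarg :x :initform 0.0f0 :type single-float :accessor clos-point-x)
   (y :initarg :y :initform 0.0f0 :type single-float :accessor clos-point-y)))

(defun add-clos-points (src1 src2 dst)
  ;; Each CLOS-POINT-X / CLOS-POINT-Y call is a generic function call,
  ;; and each SETF goes through a generic writer.
  (setf (clos-point-x dst) (+ (clos-point-x src1) (clos-point-x src2)))
  (setf (clos-point-y dst) (+ (clos-point-y src1) (clos-point-y src2)))
  dst)
```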
There must be a solution to this problem.
-- jonathan bachrach (bachrach@psych.stanford.edu)
PS Is there a disassemble function for the sparc implementation or am
I just missing something?