cmu-lisp efficiency



I am trying to get maximal floating point performance.  For the most
part, I am very pleased with cmu-lisp's performance but there are a
few problems that I would like to address.

For all of these benchmarks I used the following compiler settings:

	(proclaim '(optimize (safety 0) (speed 3) (space 0)))

I am running the following version of CMU-LISP:

	CMU Common Lisp 15b, running on aragorn.stanford.edu
	Hemlock 3.5 (15b), Python 1.0(15b), target SPARCstation/Sun 4

These are rather advanced wishes but here we go.

(1) Why can't you allow structures to store non-descriptor floats?
Here is an example:

(defstruct point
  (x 0.0 :type single-float)
  (y 0.0 :type single-float))

(defun add-points (src1 src2 dst)
  (declare (type point src1 src2 dst))
  (setf (point-x dst) (+ (point-x src1) (point-x src2)))
  (setf (point-y dst) (+ (point-y src1) (point-y src2)))
  dst)

Here is a trace of the compilation:

* (compile-file "/pdp/bachrach/cmu-lisp/bench-marks/structs" :trace-file t)

Python version 1.0(15b), VM version SPARCstation/Sun 4 on 13 NOV 91 04:07:15 pm.
Compiling: /pdp/bachrach/cmu-lisp/bench-marks/structs.lisp 13 NOV 91 04:06:20 pm

Converted MAKE-POINT.
Compiling DEFSTRUCT POINT: 

File: /pdp/bachrach/cmu-lisp/bench-marks/structs.lisp

In: DEFSTRUCT POINT
  (DEFSTRUCT POINT
    (X 0.0 :TYPE SINGLE-FLOAT)
    (Y 0.0 :TYPE SINGLE-FLOAT))
--> LET 
==>
  (C::%FUNCALL #<LAMBDA #x7010C55
                        NAME= NIL
                        TYPE= #<NAMED-TYPE FUNCTION>
                        WHERE-FROM= :DEFINED
                        VARS= (#:X-DEFAULTING-TEMP
                               #:Y-DEFAULTING-TEMP)>
               #:G17
               #:G18)
Note: Doing MOVE-FROM-SINGLE (cost 13) to #:X-DEFAULTING-TEMP.
Note: Doing MOVE-FROM-SINGLE (cost 13) to #:Y-DEFAULTING-TEMP.

Converted ADD-POINTS.
Compiling DEFUN ADD-POINTS: 

File: /pdp/bachrach/cmu-lisp/bench-marks/structs.lisp

In: DEFUN ADD-POINTS
  (SETF (POINT-X DST) (+ (POINT-X SRC1) (POINT-X SRC2)))
--> LET* FUNCALL 
==>
  (C::%FUNCALL #'(SETF POINT-X) #:G1 #:G2)
Note: Doing MOVE-FROM-SINGLE (cost 13), for:
      The second argument of STRUCTURE-SET.

  (SETF (POINT-Y DST) (+ (POINT-Y SRC1) (POINT-Y SRC2)))
--> LET* FUNCALL 
==>
  (C::%FUNCALL #'(SETF POINT-Y) #:G3 #:G4)
Note: Doing MOVE-FROM-SINGLE (cost 13), for:
      The second argument of STRUCTURE-SET.

Compiling Top-Level Form: 

Compilation unit finished.
  4 notes


/pdp/bachrach/cmu-lisp/bench-marks/structs.sparcf written.
Compilation finished in 0:00:20.

#p"/pdp/bachrach/cmu-lisp/bench-marks/structs.sparcf"
T
NIL
* 

The problem is that the program conses every time it stores anything to
a floating point struct slot.  Naively, it seems that there must be a
way to store non-descriptor floating-points in structures since the
structure information is around (i.e., the type of all the slots)
permitting the computation of the slot offsets and types.  What am I
missing?  

The solution that I had been using in the past was to store a
non-descriptor float vector of length one instead of the float but
this is such a hack.  It at least doubles the storage, decreases the
performance, and makes the code uglier.
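
For concreteness, here is a minimal sketch of that one-element-vector
workaround.  The names VPOINT, %MAKE-VPOINT, MAKE-VPOINT and
ADD-VPOINTS are hypothetical, made up for this illustration, and the
sketch assumes the compiler open-codes AREF on a declared
(simple-array single-float (1)):

```lisp
;; Hypothetical workaround: each float slot holds a specialized
;; one-element vector, so stores go through AREF on raw float storage
;; rather than through a descriptor slot.
(defstruct (vpoint (:constructor %make-vpoint))
  (x (make-array 1 :element-type 'single-float :initial-element 0.0)
     :type (simple-array single-float (1)))
  (y (make-array 1 :element-type 'single-float :initial-element 0.0)
     :type (simple-array single-float (1))))

(defun make-vpoint (&optional (x 0.0) (y 0.0))
  (declare (type single-float x y))
  ;; The slot initforms run on every call, so each VPOINT gets fresh
  ;; vectors.
  (let ((p (%make-vpoint)))
    (setf (aref (vpoint-x p) 0) x
          (aref (vpoint-y p) 0) y)
    p))

(defun add-vpoints (src1 src2 dst)
  (declare (type vpoint src1 src2 dst))
  (setf (aref (vpoint-x dst) 0)
        (+ (aref (vpoint-x src1) 0) (aref (vpoint-x src2) 0)))
  (setf (aref (vpoint-y dst) 0)
        (+ (aref (vpoint-y src1) 0) (aref (vpoint-y src2) 0)))
  dst)
```

Every slot access now costs an extra indirection, and every point
carries two extra vector headers, which is where the storage and
performance penalties come from.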

(2) The next issue has to do with passing floating-point parameters
and returning floating-point values.  Even if you declare a function
to have particular arguments and return values, it still conses.  The
following is an example:

(proclaim '(type (function (double-float double-float) double-float) sum+2))

(defun sum+2 (x y)
  (declare (type double-float x y)
	   (values double-float))
  (+ (+ x y) 2.0))

(defun sum*2+2 (x y)
  (declare (type double-float x y)
	   (values double-float))
  (* 2.0 (sum+2 x y)))

In this example the function sum*2+2 knows the type information
about sum+2, but it still conses on call and return.  

One fix for this is inlining functions.  I see that cmu-lisp offers the
compilation block construct as well.  Unfortunately, these do not
handle some important cases.
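
As a rough sketch of the inlining approach, assuming the standard
INLINE proclamation is honored for these functions (the -INLINE names
are hypothetical, chosen to avoid clashing with the definitions
above):

```lisp
;; The INLINE proclamation has to come before the DEFUN so that the
;; compiler can save the definition for expansion at call sites.
(declaim (inline sum+2-inline))

(defun sum+2-inline (x y)
  (declare (type double-float x y))
  (+ (+ x y) 2.0d0))

(defun sum*2+2-inline (x y)
  (declare (type double-float x y))
  ;; With SUM+2-INLINE expanded here, no full call is made, so the
  ;; intermediate sum need not be passed or returned as a boxed float.
  (* 2.0d0 (sum+2-inline x y)))
```

This only removes the call inside SUM*2+2-INLINE; the value it returns
to its own caller still crosses an ordinary function boundary.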

For instance, consider accessing a floating-point slot in CLOS.  In
CLOS, accessors are generic functions, and thus within your
implementation you will have to cons every time you access a
floating-point slot.  This is a major drag.  (Furthermore, you will
have to cons twice to store a floating-point value: once for the
generic call and once for the store into a structure, if that is how
PCL implements objects.)
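
A minimal sketch of the CLOS case, written to parallel the POINT
example from (1).  The names CPOINT and ADD-CPOINTS are hypothetical,
and the sketch makes no assumption about how PCL lays out instances:

```lisp
(defclass cpoint ()
  ((x :initform 0.0 :type single-float :accessor cpoint-x)
   (y :initform 0.0 :type single-float :accessor cpoint-y)))

(defun add-cpoints (src1 src2 dst)
  ;; CPOINT-X, CPOINT-Y and their SETF counterparts are generic
  ;; functions, so every read and write here is a full function call
  ;; that passes or returns the float as an ordinary Lisp object.
  (setf (cpoint-x dst) (+ (cpoint-x src1) (cpoint-x src2)))
  (setf (cpoint-y dst) (+ (cpoint-y src1) (cpoint-y src2)))
  dst)
```

Neither inlining nor block compilation applies to these accessor calls
in any obvious way, since the method to run is chosen at run time.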

There must be a solution to this problem.

-- jonathan bachrach (bachrach@psych.stanford.edu)

PS Is there a disassemble function for the sparc implementation or am
I just missing something?