[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: small integers

Scott Fahlman writes:
  I think it would be a serious mistake for the Dylan language to include a
  required list of numeric optimizations at this early point (or any time
  very soon).  Until we all get some experience with this, it won't be clear
  what is easy, what is straightforward but perhaps too expensive for small
  implementations, and what is very hard.  At present, we don't yet have a
  complete system of types and declarations, or even full agreement on how to
  handle bignums/integer-overflow.

I agree that it is not easy to get optimized numerics done right within the
Common Lisp numerics system, either as a language designer or as a
programmer.  I have a group of experienced Common Lisp programmers here, and
many of them are still not confident in their ability to write numeric
operations that will be deemed acceptable for optimization by the compilers
we use.  This was such a problem that we had to write macros for fixnum and
double-float numeric operations that we knew would work, and so our
programmers now use those macros to get optimized arithmetic.

In view of this experience, I was very surprised to see the Common Lisp
numeric system adopted within Dylan wholesale.  I feel that if the Dylan
designers really intend to lure C and C++ programmers into Dylan, they should
either provide the means to write optimized numeric methods, or they should
fall back to a numeric system that is simple enough to be able to get
optimizations.  I still believe the C community will view Dylan as largely
irrelevant without fast arithmetic.  For these reasons I can't agree with
Scott that this work should be postponed.

Scott continues:

  What might be useful is for someone ... to group the optimizations that
  you want to see into a few coherent levels.  Then implementors and
  applications designers would have a convenient shorthand: "This code runs
  well only a in Dylan implementation with level 2 arithmetic

  It wouldn't matter too much whether these levels are blessed officially as
  part of the Dylan language or are just a users-group thing, as long as
  everyone who wants to play this implementation efficiency game agrees to a
  common set of conventions.

I agree that once sufficient type declaration means are provided within Dylan
to allow numeric optimizations, it could be left up to implementors to choose
whether or not to provide optimizations within their implementation.  I also
strongly agree that conventions are needed within the Dylan community to
describe classes of optimizations.  I would hope that these conventions are
blessed by the language designers to ensure widespread understanding of their
meaning.  To use the example of Common Lisp again, I feel that strong enough
definitions of numeric optimizations did not evolve.  Yet, while some
compilers (for example CMU Lisp) do amazing things with types, there are
still commercial implementations that do not support all of the following
optimizations in their numerics.  I still find myself in long-winded
discussions with Lisp vendors about what they mean by "in-line arithmetic."
It would be a shame for that to happen to Dylan.

The following lists some sets of optimizations I would be interested in

1) Compile-time method selection.

Given adequate calls to seal, calls to make-read-only, and type declarations
at a method call site, then a specific method is chosen and the call to it is
staticly compiled in, or even in-lined for simple methods such as slot value
access.  This optimizes away run-time type dispatching.  (This is what I
think Dylan designers are attempting around seal and make-read-only, though I
couldn't find it spelled out in detail within the Dylan manual.)

2) Unboxed intermediate arithmetic values.

Given that compile-time method selection succeeded, intermediate results of
nested fixnum and double-float operations are passed from one operation to
the next without boxing.  This optimizes away possibly expensive consing
operations, makes the expression itself faster and also reduces the load the
garbage collector imposes on an application.

Determining numeric type of intermediate values of double-float expressions
can be done without intermediate declaration since contagion between double-
floats always produces double-floats.  (Would any experts please verify this?
I'm fairly certain this always holds.)  For fixnums, it can be determined (by
a *sufficiently smart* compiler) that intermediate values don't exceed the
range of fixnums given tighter range restrictions on arguments to an
expression.  These tighter bounds can be inferred when using Dylan's
iteration over ranges, though an extention to the type system to declare
ranges on integer types would provide this through a broader means.  Even
given fixnum range-restricted type declarations, a means is needed to assert
types of intermediate values.  Without it, Dylan users will be stuffing
intermediate values back into fixnum type declared lexical variables to get
an implicit type declaration of the expression's return value.

3) Unboxed lexical variable contents.

Given local variables that are type declared as fixnum or double-float, the
memory for that location is allocated to hold the value in its unboxed form.
Uses of the variable value for compile-time method selected expressions with
unboxed intermediate values would use the unboxed value from this variable
directly.  Uses of the variable value for other purposes would cause a boxing
of the value before use.  Like #2, this optimizes away consing operations.

4) Unboxed method arguments and return values.

Given a user-defined (i.e.  not built-in) method that takes fixnum or
double-float arguments, and returns values that can be determined at compile
time to be fixnum or double-float, and given a compile-time selected call to
the method, then the fixnum arguments and return values for this call are
passed in their unboxed form.  Like #2 and #3, this optimizes away consing

Chris Fry writes:

  I agree that there is no contradiction here [between correct programs and
  fast programs]. In fact there's even a strong relationship between the two
  in that there's nothing slower than a broken program.

I don`t mean to jump all over you, Chris, but the sad fact is that there is a
flippant dismissal of performance issues among many high level language
proponents.  In my view, more than anything else, the reputation of Common
Lisp as fat and slow prevented its commercial success and fueled the flight
of expert systems products from Lisp to C, and numeric performance was the
example I heard most often from people arguing against Lisp.

It doesn't matter that Common Lisp implementations have now mainly got those
issues under control.  A bad reputation has become firmly entrenched in the
marketplace.  And it is still true that smart, well-meaning users of that
language have a hard time using it correctly.

If the Dylan designers mean what was said in the introduction to the manual
about targeting C, C++, and Pascal users, that compatibility with Lisps is
not a primary goal, and that they would "leave out anything that we do not
know how to implement efficiently in both space and time", then I think this
numerics issue qualifies as a problem area deserving of attention.

Well, I've blathered on more than I planned here.  I don't mean to tell the
Dylan designers how to do their jobs, and I don't diminish in any way the
difficulty of the task set before them.  I've just gone through a lot with
Common Lisp numerics and so have some strong opinions in this area.


Jim Allard                                        jra@gensym.com
Manager of Languages, Interpreters, & Compilers   (617) 547-2500
Gensym Corporation
125 CambridgePark Drive
Cambridge, MA  02140