[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: small integers



> Date: Fri, 09 Oct 92 03:22:39 -0400
> From: Scott_Fahlman@SEF-PMAX.SLISP.CS.CMU.EDU

Scott, this was a good analysis, although I think there are some important points 
you have overlooked.

First, we don't believe in Dylan as an island unto itself, the approach many Lisp 
implementations have taken in the past.  To be successful, many implementations of 
Dylan are going to have to integrate well with the system environment in which they 
find themselves.  I think this would be true even if we were aiming Dylan at the 
Lisp community, but it's all the more true when we say we're aiming Dylan at 
developers who do not now use dynamic languages.  This means both that Dylan 
programs need good access to the facilities that are out there in the system, and 
that it must be possible, efficient, convenient, and natural to write applications 
in a mixture of Dylan and other languages.  This attitude has many implications, 
but the one I want to focus on here is the implication for integers: it is 
impossible to play in the Macintosh world without 32-bit integers; saying that 28 
bits are enough for most programs is only true when you look at Lisp programs in 
isolation.  I imagine most other environments, such as most dialects of Unix, also 
depend on 32-bit integers.

This implies that either small-integers are 32 bits (which is hard to do in a 
run-time typed world; although we could imagine that if we can solve the problem 
for 32-bit single-floats we can use the same solution for 32-bit small-integers) or 
else we can forget about the dream of programs that use small-integers exclusively 
and don't have to include the bignum runtime.

That was my most important point, but I also have a few comments on excerpts from 
your message:

> This integer business isn't going to come out clean on all dimensions
> however we slice it.

Agreed.

> 1. The current Dylan approach ... you have
> to keep the bignum code around in the runtime system at all times, and that
> code is large and complex.  Second, no arithmetic operations on integers
> can be open-coded, since you have to allow for bignum operands and
> overflows everywhere.

This is not quite right.  The combination of the <small-integer> class (behavior 
same as Common Lisp's fixnum type; not the same as Rob MacLachlan's proposed 
<small-integer>), which should be made a standard part of the language in my 
opinion, the pervasiveness of parameter specializers in the preferred Dylan 
programming style, and type inference in the compiler should eliminate the need to 
allow for bignum operands in most places where that is possible and the programmer 
cares about efficiency.  It may still be necessary to check for overflow, but that 
can certainly be open-coded with small cost, especially when the overflow causes a 
type error rather than continuing the computation with bignums.  This is what 
Pascal (at least on the Macintosh) does, so it can't be too expensive, although 
there is a compiler option to turn off the overflow checking.

There are published trace-scheduling-like techniques that allow for continuing the 
computation with bignums without burdening the small-integer path, however I don't 
think we would want to use them (at least not by default) because they expend space 
to gain speed and optimizing space is usually more important.  It's not in the LFP 
'92 paper, but doesn't CMU's Common Lisp compiler do this?

Anyway my point here is that the current Dylan language with the addition of the 
<small-integer> class is not necessarily so bad.  Admittedly we have neither told 
programmers how to use it efficiently nor told implementors what they need to do to 
support that programming style.  But maybe we would be better off doing our duty on 
those things rather than hairing up the language.

> .... Note, for example, that the fact that a loop counter starts at
> 1000 and counts down to 0 does not allow the compiler to assume [small-integer]
> arithmetic unless it can also prove that no code within the body of the
> loop alters the counter variable.

This is one of several reasons why the preferred Dylan programming style avoids 
set! of local variables.  There was some controversy about allowing that operation 
in the language at all, but we took the conservative path and left it in, since 
programmers coming from a Fortran or C (or Common Lisp!) background are likely to 
use it a lot at first until they catch on to the new programming style, and it's 
always good to give people a gradual path into your world.  The built-in Dylan 
iteration operations do not use assignment internally, at least in our 
implementations.  By the way, and I'm sure this will thrill you, I was thinking of 
creating an assignment-free version of the Common Lisp LOOP facility for Dylan (but 
not putting it into the language standard, of course).

> 3. The opposite extreme is to throw bignums out altogether.

As I noted earlier, this requires small-integers to be 32 bits and I'm not sure we 
know how to do that efficiently.

One other point about multiple operators: there are actually four versions of + we 
have been talking about:

 - the "mathematically correct", overflow into bignums version found in Dylan today
 - the same unless the operands are small-integers, then overflow signals an error
 - operands must be small-integers, overflow signals an error
 - operands must be small-integers, overflow wraps around

By the way I don't like the version of the fourth one that also accepts operands 
that aren't small integers.  This seems to me to be a dangerous conflation.

The fourth is what is found in C (overloading for floating point aside) and also 
exists internally in every Common Lisp implementation that I have used enough to 
discover it, however it's never been standardized and the name is different in each 
implementation.  In Macintosh Common Lisp it is %i+, in Symbolics Common Lisp it is 
%32-bit-plus or %24-bit-plus depending on the machine, in Lucid Common Lisp it is 
+& although it seems to turn into + unless optimizing for speed.

I think it's consistent with object orientation for the fourth version of + to have 
a different name from the first version, since they really do different things 
(real addition versus modulo 2^N addition).  Maybe it's been a mistake for Lisp 
implementations to keep this operator an internal secret all these years, and we 
should add such operations to Dylan.  I'm not sure precisely what I think about 
this, but it's something to consider.  Of course it would be better not to hair up 
the language if compiler optimization can do the job.  Also, as Rob MacLachlan 
pointed out in a message today, compiler optimization is more general since the 
same source code can be compiled two different ways depending on what the compiler 
knows about the types involved.

My question of whether it's feasible to eliminate the overflow checking through 
compiler optimization most of the time hasn't been answered yet.  I already know 
it's feasible to eliminate the type checking, so only the overflow checking 
remains.

I still think bundling the overflow behavior into the operands is a dangerous idea.