
Accuracy of floating point computation in MCL



    Date: Thu, 28 Nov 1991 16:36 EST
    From: melewis@jeeves.waterloo.edu (Michael Lewis)

    There seems to be a pervasive and important error in floating point
    calculations in MCL.

There is a pervasive and important error in floating point
calculations.  Period.

This has nothing to do with MCL, nor with the Macintosh,
and has everything to do with the nature of floating point.

The phenomenon is called "roundoff error".  Floating point
is by definition APPROXIMATE arithmetic.  The biggest source
of your confusion is that there is no way to exactly represent
the number .1 in binary; you have to resort to "repeating binaries"
(in analogy to "repeating decimals").
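
You can see this right at the Lisp prompt.  The float written as
.1 is not the rational 1/10; it is the nearest binary fraction
that fits in the float's mantissa (which fraction that is depends
on the float format in effect):

    ? (= 1/10 .1)
    NIL                ; the float .1 is not exactly one tenth
    ? (= 1/10 (rational .1))
    NIL                ; RATIONAL returns the exact value stored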

Consider how you represent 1/3 in base 10, as 0.33333333333.....
1/5 and 1/10 have the same problem in base 2.  No matter how
many bits you allocate to the mantissa of the floating point
number, you will have some remaining error.  If you add up
this error enough times, it will get larger.
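
A small experiment shows the accumulation.  (The exact sum printed
will vary with the float format, but it will not be exactly 100.0.)

    ? (= 100.0 (let ((sum 0.0))
                 (dotimes (i 1000 sum)
                   (incf sum .1))))
    NIL                ; a thousand .1's don't add up to exactly 100.0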

    And my favourite is 

    ? (> 3/10 .3)
    T

This is, in fact, true.  0.3 may be the notation we normally
use for 3/10, but there is no way to represent that value
exactly as a floating point number.  What you actually get is
the nearest representable value, something like
0.29999999999999999.... truncated somewhere.

However, the rational number 3/10 is exact.  Rational
numbers do not suffer from this difficulty.
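
With rationals the same arithmetic is exact:

    ? (+ 1/10 1/10 1/10)
    3/10
    ? (= 3/10 (+ 1/10 1/10 1/10))
    T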

    Imagine the consequences!

Yes, it is important for you to imagine the consequences
when you write floating-point code.

    Note however that 
    (- 1/2 .5) yields 0.0

.5 (that is, 1/2) can be represented exactly in binary.

    as does
    (- 1/8 .125)

As can any power of 1/2 (and, more generally, any sum of such
powers that fits within the mantissa).
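
So differences like these come out exactly zero:

    ? (- 1/4 .25)
    0.0
    ? (- 1/16 .0625)
    0.0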

    As our application here depends *critically* on *accurate*
    numerical calculations we were wondering exactly how we could get MCL
    to "do the right thing"!

Well, you have a few different choices.

1)  You could avoid floating point calculations entirely, and use
    rationals.

2)  You could use floating point, but avoid exact comparisons,
    which are meaningless with floating point.  This means looking
    at your algorithm to see how much roundoff error could be
    accumulating, and checking to see if numbers are within that
    range, rather than exactly equal (see the sketch just after
    this list).  MCL's floating point numbers are pretty precise;
    very seldom do you need *THAT* much precision.
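
Here is a minimal sketch of the second approach.  The name
APPROXIMATELY-EQUAL and the default tolerance of 1e-6 are made up
for this example; the tolerance you really want has to come from
an analysis of how much roundoff your algorithm can accumulate.

    ;; Hypothetical helper -- "equal to within a chosen tolerance".
    ? (defun approximately-equal (x y &optional (tolerance 1e-6))
        (<= (abs (- x y)) tolerance))
    APPROXIMATELY-EQUAL
    ? (approximately-equal .3 3/10)
    T                  ; close enough, for this tolerance
    ? (= .3 3/10)
    NIL                ; but not exactly equal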

Option # 1 is slower, but exact.  If you're doing number theory,
for example, it will do what you want.

Option # 2 is faster, but inexact.  If you're doing long numerical
calculations, it is more appropriate.
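
To make the trade-off concrete, here is the same idea both ways,
reusing the hypothetical helper above (0.01 is, again, just an
example tolerance):

    ;; Option 2: accept a small error and test "close enough".
    ? (approximately-equal 100.0
                           (let ((sum 0.0))
                             (dotimes (i 1000 sum) (incf sum .1)))
                           0.01)
    T
    ;; Option 1: stay in rationals, and equality is exact.
    ? (= 100 (* 1000 1/10))
    T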

There are many textbooks which cover this topic in more detail.
It should be covered, at least in basic form, in almost any
introductory programming textbook, although often it is not.  If your algorithm
is complex, you might need considerable help in figuring out how
large your error bounds are, or how to make them smaller.  Check
out a textbook on numerical programming for more assistance on
these topics.