[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

*To*: Michael Lewis <melewis@jeeves.waterloo.edu>*Subject*: Accuracy of floating point computation in MCL*From*: Robert W. Kerns <rwk@crl.dec.com>*Date*: Fri, 29 Nov 1991 15:22-0500*Cc*: info-mcl@cambridge.apple.com*In-reply-to*: <1991Nov28.213622.21841@watdragon.waterloo.edu>

Date: Thu, 28 Nov 1991 16:36 EST From: melewis@jeeves.waterloo.edu (Michael Lewis) There seems to be a pervasive and important error in floating point calculations in MCL. There is a pervasive and important error in floating point calculations. Period. This has nothing to do with MCL, nor with the Macintosh, and has everything to do with the nature of floating point. The phenomonon is called "roundoff error". Floating point is by definition APPROXIMATE arithmetic. The biggest source of your confusion is that there is no way to exactly represent the number .1 in binary; you have to resort to "repeating binaries" (in analogy to "repeating decimals"). Consider how you represent 1/3 in base 10, as 0.33333333333..... 1/5 and 1/10 have the same problem in base 2. No matter how many bits you allocate to the mantissa of the floating point number, you will have some remaining error. If you add up this error enough times, it will get larger. And my favourite is ? (> 3/10 .3) T This is, in fact, true. 0.3 may be the representation we normally use for 3/10's, but in fact, there is no way to represent that as a floating point number. It's really 0.29999999999999999.... truncated somewhere. However, the rational number 3/10 is precise. Rational numbers do not suffer from this difficulty. Imagine the consequences! Yes, it is important for you to imagine the consequences when you write floating-point code. Note however that (- 1/2 .5) yields 0.0 .5 (1/2) can be exactly represented in binary. as does (- 1/8 .125) As can any power of 1/2. As our application here depends *critically* on *accurate* numerical calculations we were wondering exactly how we could get MCL to "do the right thing"! Well, you have a few different choices. 1) You could avoid floating point calculations entirely, and use rationals. 2) You could use floating point, but avoid exact comparisions, which are meaningless with floating point. This means looking at your algorithm to see how much roundoff error could be accumulating, and checking to see if numbers are within that range, rather than exactly equal. MCL's floating point numbers are pretty precise; very seldom do you need *THAT* much precision. Option # 1 is slower, but exact. If you're doing number theory, for example, it will do what you want. Option # 2 is faster, but inexact. If you're doing long numerical calculations, it is more appropriate. There are many textbooks which cover this topic in more detail. It certainly should be covered in almost any basic programming textbook in basic form, although often it is not. If your algorithm is complex, you might need considerable help in figuring out how large your error bounds are, or how to make them smaller. Check out a textbook on numerical programming for more assistance on these topics.

**References**:**Accuracy of floating point computation in MCL***From:*melewis@jeeves.waterloo.edu (Michael Lewis)

- Prev by Date:
**Accuracy of floating point computation in MCL** - Next by Date:
**Configuration Management System** - Previous by thread:
**Accuracy of floating point computation in MCL** - Next by thread:
**Re: Accuracy of floating point computation in MCL** - Index(es):