dotnet general : How to determine which of the generic decimal datatypes to use.



Jon Skeet
8/30/2003 8:45:20 AM

http://www.pobox.com/~skeet/csharp/floatingpoint.html gives some of the
details of floating point numbers, and some suggestions as to when to
use what. It's not exactly what you were after, but hopefully you'll
find it useful.

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet/

John Bentley
8/30/2003 5:24:41 PM
INTRO
The phrase "decimal number" is ambiguous in a programming context. It could
refer to the decimal datatype or to the related but separate concept of a
generic decimal number. "Decimal number" sometimes serves to distinguish base 10
numbers, e.g. "15", from base 2 numbers, e.g. "1111". At other times it serves
to distinguish a number with a fractional part from an integer. For the rest of
this post I shall use only "Generic Decimal Number" or "decimal datatype", for
clarity.

DEFINITIONS
Generic Decimal Number: a base 10 number with a fractional part represented with
digits. A Generic Decimal Number may be implemented with any of several
datatypes including the decimal datatype, a double, a single, a string.

Decimal Datatype: the .Net (or other programming language) decimal datatype.

ISSUES
When programming with generic decimal numbers there are a few key issues to
consider:

1 How to round a number.
2 How to determine which of the generic decimal datatypes to use: Single,
Double or Decimal.
3 How to determine whether two generic decimal numbers are equal.
4 How to work with fractions that cannot be represented accurately as generic
decimal numbers.

These are interrelated issues but for the moment I'm interested in 2.

Would you like to tell me the rules you use when deciding between the floating
point datatypes (Single and Double) and the Fixed Point/Scaled Integer datatype
(Decimal)? By all means address other related issues if it helps in answering
this question. Although I would be interested in answers that relate to .NET
specifically, I'm more interested in the general mathematical/computational
ideas that govern the choice.

Jon Skeet
8/31/2003 6:53:29 PM

Thanks - that's very kind of you.

Questions are more than welcome - they'll suggest ways I could expand
the article, for one thing :)

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet/

John Bentley
8/31/2003 9:57:47 PM

Jon, that is exactly the type of thing I was after. Yours is a well-written
article. Stand by for a day or two: I am digesting the issues you raise there
and will have some questions. Thanks for publishing it.

Jon Skeet
9/1/2003 9:05:35 AM

Or into another article :)

I personally think it's better to leave it without the last sentence,
or a modified one - a nonintegral can be *represented* (often
imprecisely) in base 10 or base 2. Numbers themselves don't
fundamentally have a base though - they're just numbers. Don't worry if
you don't see what I mean - it's a slightly philosophical distinction
to make, but the mathematician in me wants to make it :)

Yup.

<snip - adding 0.1 ten times>

Exactly.

<snip - adding 0.1/0.3 three times>

Hmm... that's interesting, as a third isn't represented exactly in
binary either. I think it's just a coincidence, to be honest. If you
print out the exact double of each stage, you get:

0.33333333333333337034076748750521801412105560302734375
0.6666666666666667406815349750104360282421112060546875
1

Note that the first two don't sum to 1, which the third number would
suggest.

Basically there have been various stages where accuracy has been lost,
but they've *happened* to cancel each other out, whereas they didn't
before.

Note that if you keep going (ie keep adding a third) you don't get to
2. The sequence is:

0.33333333333333337034076748750521801412105560302734375
0.6666666666666667406815349750104360282421112060546875
1
1.333333333333333481363069950020872056484222412109375
1.66666666666666696272613990004174411296844482421875
2.000000000000000444089209850062616169452667236328125
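
The sequence can be reproduced outside .NET as well. What follows is only a
minimal Python sketch, not the code used in this thread: Python's float is an
IEEE 754 double like System.Double, and Decimal(float) stands in for
DoubleConverter by printing the stored value exactly. The name running_sums is
made up for the illustration.

```python
from decimal import Decimal


def running_sums(step: float, count: int) -> list:
    """Return the running totals after adding `step` to 0.0 `count` times."""
    totals = []
    total = 0.0
    for _ in range(count):
        total += step
        totals.append(total)
    return totals


third = 0.1 / 0.3  # same expression as in the test being discussed
sums = running_sums(third, 6)

for value in sums:
    # Decimal(float) converts the double exactly, with no rounding.
    print(Decimal(value))

print(sums[2] == 1.0)  # the third total compares equal to 1
print(sums[5] == 2.0)  # the sixth total does not compare equal to 2
```

The third total lands on 1 only because the rounding errors of the individual
additions happen to cancel at that step.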

<snip adding a third as a decimal>

See above.

Actually, it does slightly - but only if you look closely. As I say
when introducing DoubleConverter, *every* double value can exactly be
represented in decimal, and clearly 1/3 can't - therefore 1/3 can't be
represented exactly in binary.
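
The argument can be checked with exact rational arithmetic. The sketch below
(Python; the helper names are hypothetical) relies on the standard fact that a
fraction in lowest terms has a finite expansion in a base only when every prime
factor of its denominator also divides the base: 2 for binary, 2 and 5 for
decimal.

```python
from fractions import Fraction


def has_only_prime_factors(n: int, primes: tuple) -> bool:
    """True if n can be reduced to 1 by dividing out the given primes."""
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1


def terminates_in_binary(value: Fraction) -> bool:
    return has_only_prime_factors(value.denominator, (2,))


def terminates_in_decimal(value: Fraction) -> bool:
    return has_only_prime_factors(value.denominator, (2, 5))


stored_tenth = Fraction(0.1)  # exact value of the double nearest to 0.1

# Every double is a finite binary fraction, so it is also a finite decimal.
print(terminates_in_binary(stored_tenth), terminates_in_decimal(stored_tenth))
# 1/3 has no finite decimal expansion, so it cannot have a finite binary one.
print(terminates_in_binary(Fraction(1, 3)), terminates_in_decimal(Fraction(1, 3)))
# 1/10 is finite in decimal but not in binary.
print(terminates_in_binary(Fraction(1, 10)), terminates_in_decimal(Fraction(1, 10)))
```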

<snip>

Yes. I'll definitely add something about this to the article.

Looks like it's time for the decimal datatype article then, doesn't it?
:)

If you have a look in MSDN you'll find more information, but basically a
decimal is 96 bits of integer information (the mantissa), 1 bit of sign, and
5 bits of exponent. Not every exponent value is used: the exponent goes from
0 to 28 and is always treated as negative, so the value is the mantissa
divided by 10 to the power of the exponent. To get big numbers, you use a
small exponent (e.g. 0) and a big mantissa. I'll go into more detail in the
article :)
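
Taking that description at face value, the value of a decimal can be sketched
as sign, 96-bit integer and a power-of-ten scale from 0 to 28. The Python below
only illustrates the arithmetic; decode_decimal is a hypothetical helper, not a
.NET API, and it does not model how the real type lays out its bits.

```python
from decimal import Decimal

MAX_MANTISSA = 2**96 - 1  # 96 bits of integer information
MAX_SCALE = 28            # exponent 0..28, always applied as a negative power of 10


def decode_decimal(negative: bool, mantissa: int, scale: int) -> Decimal:
    """Return (-1 if negative) * mantissa / 10**scale as an exact value."""
    if not 0 <= mantissa <= MAX_MANTISSA:
        raise ValueError("mantissa does not fit in 96 bits")
    if not 0 <= scale <= MAX_SCALE:
        raise ValueError("scale must be between 0 and 28")
    digits = tuple(int(ch) for ch in str(mantissa))
    # Building from a (sign, digits, exponent) tuple is exact, with no rounding.
    return Decimal((1 if negative else 0, digits, -scale))


print(decode_decimal(False, 1586, 2))          # 15.86
print(decode_decimal(False, MAX_MANTISSA, 0))  # small exponent, big mantissa
print(decode_decimal(True, 1, MAX_SCALE))      # smallest magnitude, negative
```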

<snip>

We could, but it would be inaccurate :) Fixed point is where the
exponent is always assumed to have the same value. For instance, you
could have a very simple decimal fixed point data type where the
exponent was always -2 - a stored value of 1586 would therefore just
represent 15.86.
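
A toy version of such a type might look like the following Python sketch. The
class name Fixed2 and its methods are invented for the example; the only thing
taken from the text is the fixed exponent of -2.

```python
from dataclasses import dataclass

SCALE = 100  # exponent fixed at -2: the stored integer counts hundredths


@dataclass(frozen=True)
class Fixed2:
    stored: int  # e.g. 1586 represents 15.86

    def __add__(self, other: "Fixed2") -> "Fixed2":
        # Addition is plain integer addition; the exponent never changes.
        return Fixed2(self.stored + other.stored)

    def __str__(self) -> str:
        sign = "-" if self.stored < 0 else ""
        whole, hundredths = divmod(abs(self.stored), SCALE)
        return f"{sign}{whole}.{hundredths:02d}"


price = Fixed2(1586)
print(price)               # 15.86
print(price + Fixed2(14))  # 16.00
```

Because the exponent is never stored, the range and the precision are both
fixed in advance, which is the difference from a floating type such as decimal.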

Where did you get the scaled term from? If I had more context I could
perhaps answer the question better :)

I'll modify the existing article and start on the decimal type one...
thanks for the feedback.

By the way, I've been trying out different bits of CSS, so if you go
back to the article and it looks strange, just hit refresh and
hopefully the new version of the CSS will load and all will be well. If
the problem doesn't go away, mail me and I'll check :)

--

John Bentley
9/1/2003 5:38:15 PM

I address these points to Jon Skeet but, of course, anybody may have some
worthy insights they wish to contribute.

You mention "There are points to note about decimal, but this article doesn't go
into them ..." and perhaps my questioning might expand your article in that
direction? :)

Having done a bit more reading, I have decided to speak of "nonintegrals"
rather than "Generic Decimal Numbers".

So my definition is now:

A nonintegral: any number with a fractional part represented with
digits. A nonintegral may be implemented with any of several
datatypes including the decimal datatype, a double, a single, or a string. We
can have a nonintegral in base 10 or base 2.

You've given me an important epiphany:

1/10 or 0.1 cannot be represented, exactly, in Base 2.

So, to repeat your story in my own words:

There are many fractions that can be represented exactly in Base 10 but CANNOT
in Base 2, like 0.1. A floating point datatype is displayed in Base 10 but
ultimately stores its information in Base 2. Therefore a floating point
datatype cannot represent 0.1 (and many other fractions) exactly.

Which is why when we run (we are biased toward different languages perhaps):

' Needs DoubleConverter from
' http://www.yoda.arachsys.com/csharp/DoubleConverter.cs
Sub NonintegralEqualityTest()
    Dim nonintegral As Double
    Dim i As Integer

    ' Add 0.1 ten times; mathematically the total is exactly 1.
    For i = 1 To 10
        nonintegral += 0.1
    Next i

    Debug.WriteLine("nonintegral: " & nonintegral)

    Debug.WriteLine("nonintegral Actual: " _
        & DoubleConverter.ToExactString(nonintegral))
    Debug.WriteLine("(nonintegral = 1): " & (nonintegral = 1))
End Sub

.... We get:

nonintegral: 1
nonintegral Actual: 0.99999999999999988897769753748434595763683319091796875
(nonintegral = 1): False
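
For readers without DoubleConverter to hand, the same experiment can be
sketched in Python, whose float is also an IEEE 754 double. Here exact_string
is a made-up stand-in for DoubleConverter.ToExactString; it relies on
Decimal(float) converting the stored value without rounding.

```python
from decimal import Decimal


def exact_string(value: float) -> str:
    """Exact decimal expansion of the double actually stored."""
    return str(Decimal(value))


total = 0.0
for _ in range(10):
    total += 0.1

print(repr(total))          # shortest string that round-trips to this double
print(exact_string(total))  # every digit of the stored value
print(total == 1.0)         # False
```

Note that the first line of the listing's output shows 1 only because the
default string conversion rounds for display; the stored value is the one on
the "Actual" line.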


However, if we test with a third in a slightly different procedure:
Sub NonintegralEqualityTestAThird()
    Dim nonintegral As Double
    Dim i As Integer

    ' Add (0.1 / 0.3) three times; mathematically the total is exactly 1.
    For i = 1 To 3
        nonintegral += 0.1 / 0.3
    Next i

    Debug.WriteLine("nonintegral: " & nonintegral)
    Debug.WriteLine("nonintegral Actual: " _
        & DoubleConverter.ToExactString(nonintegral))
    Debug.WriteLine("(nonintegral = 1): " & (nonintegral = 1))
End Sub

.... We get:
nonintegral: 1
nonintegral Actual: 1
(nonintegral = 1): True

Let's test with a third again, but with a Decimal datatype:
Sub NonintegralEqualityTestAThird()
    Dim nonintegral As Decimal
    Dim i As Integer

    For i = 1 To 3
        ' D is the literal type character for Decimal, NOT Double, in VB.NET
        nonintegral += 0.1D / 0.3D
    Next i

    Debug.WriteLine("nonintegral: " & nonintegral)
    ' ToExactString takes a Double, so the Decimal is presumably converted
    ' to Double before being printed (this assumes Option Strict Off).
    Debug.WriteLine("nonintegral Actual: " _
        & DoubleConverter.ToExactString(nonintegral))
    Debug.WriteLine("(nonintegral = 1): " & (nonintegral = 1))
End Sub

.... We Get:
nonintegral: 0.9999999999999999999999999999
nonintegral Actual: 1
(nonintegral = 1): False
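
The shape of this result can be reproduced with Python's decimal module as a
rough analogue. This is a sketch under one assumption: the precision is set to
28 significant digits to mirror the 28 digits in the output above; Python's
Decimal is not the same type as System.Decimal.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 28  # assumption: 28 significant digits, as in the output above
    third = Decimal("0.1") / Decimal("0.3")
    total = third + third + third

print(third)         # 0.3333333333333333333333333333
print(total)         # 0.9999999999999999999999999999
print(total == 1)    # False
# Converting to a double rounds to the nearest double, which is exactly 1.
print(float(total) == 1.0)  # True
```

The last line suggests why the "Actual" line reads 1: if the value is converted
to a double before printing, the 28 nines round to exactly 1.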

These tests seem to imply that while 1/3 cannot be represented in Base 10 it can
be represented in Base 2. Can you confirm that 1/3 can be represented exactly in
Base 2? Is this it: 0.010101011?
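
One way to explore the question is to generate binary digits by repeated
doubling with exact fractions. The Python sketch below uses a hypothetical
helper, binary_digits. It suggests that the expansion of 1/3 repeats "01"
without ever terminating, and that 0.010101011 is a nearby value rather than
1/3 itself.

```python
from fractions import Fraction


def binary_digits(value: Fraction, count: int) -> str:
    """First `count` binary digits after the point, for 0 <= value < 1."""
    digits = []
    for _ in range(count):
        value *= 2
        bit = int(value >= 1)  # next digit is 1 if doubling reached 1
        digits.append(str(bit))
        value -= bit
    return "0." + "".join(digits)


print(binary_digits(Fraction(1, 3), 12))   # 0.010101010101
print(binary_digits(Fraction(1, 10), 12))  # 0.000110011001
print(binary_digits(Fraction(3, 8), 12))   # 0.011000000000

# The candidate 0.010101011 (binary) is 171/512, close to 1/3 but not equal.
candidate = Fraction(int("010101011", 2), 2**9)
print(candidate, candidate == Fraction(1, 3))
```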

Whether 1/3 can be represented exactly in Base 2 or not contradicts nothing
you've said and is probably unimportant as:

"Whatever base you come up with, you'll have the same problem with some
numbers - and in particular, "irrational" numbers (numbers which can't be
represented as fractions) like the mathematical constants pi and e are always
going to give trouble."

Therefore we can have this rule for when working with nonintegrals:

Never use the equality operator to test nonintegrals for equality (whether the
datatype is floating point or fixed point, like Decimal). Instead use a custom
EqualEnough(x, y, tolerance) function (see below).

Private Const mDefaultTolerance As Single = 0.000001

' Returns: Whether two floating point numbers are close enough to be
'          deemed equal.
' Remarks: Never use the equality operator, =, to test for equality with
'          floating point datatypes.
' Params:
'   x, y
'       Floating point numbers in any order.
'   tolerance
'       The largest amount by which the numbers may differ and still
'       be considered equal, e.g. 0.001.
'
' Example:
'   If EqualEnough(d, 1) Then
'
' Created: 31 Aug 2003
' John Bentley johnny_bentley@yahoo.com.au
' +61 (0)40 912 4414
Overloads Function EqualEnough(ByVal x As Double, ByVal y As Double, _
        Optional ByVal tolerance As Double _
        = CDbl(mDefaultTolerance)) As Boolean

    Return (Math.Abs(x - y) <= tolerance)
End Function

Overloads Function EqualEnough(ByVal x As Single, ByVal y As Single, _
        Optional ByVal tolerance As Single _
        = mDefaultTolerance) As Boolean

    Return (Math.Abs(x - y) <= tolerance)
End Function

' The tolerance is a Decimal here (it was declared As Single) so that the
' comparison stays in Decimal arithmetic.
Overloads Function EqualEnough(ByVal x As Decimal, ByVal y As Decimal, _
        Optional ByVal tolerance As Decimal _
        = CDec(mDefaultTolerance)) As Boolean

    Return (Math.Abs(x - y) <= tolerance)
End Function
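
The same idea in Python, as a rough sketch rather than a recommendation:
equal_enough mirrors the absolute-difference test above (the name and default
are made up for the example), and math.isclose from the standard library is
shown for contrast because it uses a relative tolerance. A fixed absolute
tolerance depends on the magnitude of the numbers being compared.

```python
import math

DEFAULT_TOLERANCE = 1e-6


def equal_enough(x: float, y: float, tolerance: float = DEFAULT_TOLERANCE) -> bool:
    """Absolute-difference comparison, in the spirit of EqualEnough."""
    return abs(x - y) <= tolerance


total = 0.0
for _ in range(10):
    total += 0.1

print(total == 1.0)              # False
print(equal_enough(total, 1.0))  # True

# Tiny values: one is twice the other, yet they count as equal.
print(equal_enough(1e-9, 2e-9))  # True

# Huge values: a difference far below one part in a billion is rejected.
big = 1e12
print(equal_enough(big, big + 0.001))               # False
print(math.isclose(big, big + 0.001, rel_tol=1e-9))  # True
```

Which kind of tolerance is appropriate depends on the application; the sketch
only shows that the two behave differently at the extremes.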

I'm still pursuing the question of how to determine which of the nonintegral
datatypes to use: floating point datatypes versus fixed point (I know you say
that a decimal is really a floating point type). Your suggestion, if I can
oversimplify it, to use floating point for scientific apps and fixed point for
financial apps, is a helpful one. However, I'm trying to grasp the issue a
little more by understanding the nature of the Decimal datatype.

What is this beast, the Decimal datatype? Is it that while System.Double and
System.Single are stored in Base 2, a System.Decimal is stored, somehow, in
Base 10? This would seem, to my present naive understanding, impossible, as the
CPU ultimately works in machine code, that is, 0s and 1s.

For if we run
Sub NonintegralEqualityTest()
    ' Note this is now a Decimal rather than a Double
    Dim nonintegral As Decimal
    Dim i As Integer

    For i = 1 To 10
        ' D is the literal type character for Decimal in VB.NET
        nonintegral += 0.1D
    Next i

    Debug.WriteLine("nonintegral: " & nonintegral)
    Debug.WriteLine("nonintegral Actual: " _
        & DoubleConverter.ToExactString(nonintegral))
    Debug.WriteLine("(nonintegral = 1): " & (nonintegral = 1))
End Sub

We get:
nonintegral: 1
nonintegral Actual: 1
(nonintegral = 1): True
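
A possible way to picture why this works, sketched with Python's decimal
module (an analogue only, not System.Decimal): the value 0.1 is held as an
integer coefficient together with a power-of-ten exponent, so no binary
fraction is involved and ten additions are exact. The coefficient itself can
still be stored in ordinary binary.

```python
from decimal import Decimal

tenth = Decimal("0.1")
parts = tenth.as_tuple()
print(parts.digits, parts.exponent)  # coefficient 1, exponent -1

total = Decimal(0)
for _ in range(10):
    total += tenth

print(total)       # 1.0
print(total == 1)  # True
```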

This shows, perhaps, an essential difference between a floating point datatype
and a fixed point datatype (can we stick with "float" V "fixed" point as a