Feature request: Conversion from float retaining excess precision #438
The code that is doing the "rounding" per se is this: Lines 1498 to 1516 in 980c987
As you can see, it's removing excess precision after approximately 16 decimal places for an f64, because a 64-bit float can't guarantee more than that (as per IEEE-754). It then continues on to remove any trailing zeros here: Lines 1518 to 1530 in 980c987
Ultimately this is why it becomes 0.1 in the end: that is the approximate precision that can be considered represented, based upon the IEEE-754 standard.
I think what you're asking for is really a function which doesn't remove this excess precision, so that we ignore any float guarantees. My question, I guess, is: why? It'd be useful to understand the use case. Btw, the way this handles floats is very similar to the .NET approach and gives identical results in this case.
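Not the crate's actual code (that lives at the lines linked above), but a rough sketch of the idea as described: keep only the significant digits an f64 guarantees, then strip trailing zeros, so the long expansion of 0.1 collapses back to 0.1. The helper name is mine, and it truncates where the real code rounds.

```rust
// Rough sketch of the behavior described above (not rust_decimal's code):
// keep ~16 significant digits, then drop trailing zeros.
fn trim_to_significant(digits: &str, max_sig: usize) -> String {
    let mut out = String::new();
    let mut sig = 0;
    let mut seen_nonzero = false;
    for c in digits.chars() {
        if c == '.' {
            out.push(c);
            continue;
        }
        if c != '0' {
            seen_nonzero = true;
        }
        if seen_nonzero {
            sig += 1;
            if sig > max_sig {
                break; // truncate; the real implementation rounds instead
            }
        }
        out.push(c);
    }
    if out.contains('.') {
        while out.ends_with('0') {
            out.pop(); // strip trailing zeros from the fractional part
        }
        if out.ends_with('.') {
            out.pop();
        }
    }
    out
}

fn main() {
    let exact = "0.1000000000000000055511151231257827021181583404541015625";
    println!("{}", trim_to_significant(exact, 16)); // prints 0.1
}
```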
The use case is being able to closely approximate the number that the float actually represents. "Ignoring any float guarantees" sounds almost like we'd be doing something unsafe, which I believe is not the case here, unless there is reason to doubt the soundness of the algorithm Python (and others) use. Having said that, I must also say that I didn't know about the guarantees you're referring to, and it's definitely interesting that .NET gives identical results. Do you think Python is just mistaken about their float-to-Decimal conversion? The documentation of Python's decimal module describes the conversion from a float as lossless.
I understand that the Python documentation is in no way normative; I'm quoting it merely to illustrate how far they go in considering the rational number to be a form of intrinsic value of the float. I would find it puzzling for them to miss something as basic as a fundamental guarantee afforded by IEEE 754, or to flout it recklessly. Also, does the same apply to…?
I think the primary difference between the libraries is that one of the design goals of Rust Decimal, rightly or wrongly, is to follow IEEE-754 as much as possible - at least where it makes sense. Ideally it's not my place to say what's right or wrong here; it's much easier to provide customization so that you can extract the behavior that you need from the library - especially one as simple as this. Consequently, I think there are probably two things that need to be addressed here.
At a stretch, this could also be enabled/disabled with a feature flag - though I'm a little hesitant to do it that way unless absolutely required.
I tend toward the Python and Boost approach, for both consistency and practical reasons. (A .NET 5 comparison snippet followed.)
I think we are talking about philosophical differences here. It is clear that 64 bits are not enough to represent all possible real numbers, so any single f64 bit pattern stands in for infinitely many of them. Therefore, taking only the guaranteed number of digits is JUST AS CORRECT as making up a random stream of digits after that (of course, only those that do not round up) -- as far as IEEE floats are concerned, the representation scheme cannot distinguish between them. In other words, they all map to the exact same representation. So, technically speaking, they are all equally valid expansions of the same float, and it is a matter of taste which representation to choose.
But, you may also ask, why would using the raw IEEE bit patterns be useful or in any way more meaningful for any purpose? Might as well zero the non-guaranteed digits - at least then the numbers are simpler; both have similar distributions, and the IEEE representation is deterministic.
The decimal digits are an exact representation of the binary number stored in the f64.
The purpose is not to lose information. When converting from a narrower representation to a wider one, such as from f32 or f64 to Decimal, no information needs to be discarded:

```rust
use rust_decimal::prelude::*;

fn main() {
    let n1 = 0.1f32;
    let n2 = 0.1f64;
    let d1 = Decimal::from_f32(n1).unwrap();
    let d2 = Decimal::from_f64(n2).unwrap();
    println!("{} {} {}", d1, d2, n1 as f64 == n2); // 0.1 0.1 false
}
```

With the proposed change, the output would instead be 0.100000001490116119384765625 0.10000000000000000555111512313 false.
I don't understand. Any number representation staying within the given number of significant digits will round-trip for IEEE floats, so nothing that the float itself guarantees is lost. However, nothing is there to guarantee the digits beyond that. If you really want such a conversion, you can always do it yourself explicitly.
That's currently the case, but it doesn't have to be so, and it's not so with libraries that don't discard any of the information in the float. And that's just one example (which you requested) where the loss of data in the current float-to-decimal conversion is easily observable.
@schungx Yes, both Decimal values are different from the double 1.1000000000000001, and this is exactly what you get in Python:

```python
# Python 3.10
from decimal import Decimal
Decimal(1.1000000000000001) == Decimal("1.1")  # False
```
The number 1.1 cannot be represented exactly in binary floating point. The closest machine number is 1.1000000000000000888178419700125..., of which only around the first 16 significant digits are guaranteed. Python simply ignores the fact that the digits beyond that carry no guarantee. This crate, however, takes the opposite view that says anything after the 16th digit is not exact, and you most probably want to only know about at most 16 digits. Because... why do you use an f64 in the first place, then? Personally, I take the view that an IEEE 754 number is not exact by design. Therefore, it is never supposed to be treated exactly. Simply taking whatever bit-pattern of that number and then taking it as gospel is at best misleading, and at worst wrong; a misuse. Anything after the 16th digit in an IEEE 754 double is never supposed to be treated exactly, and there is absolutely nothing wrong (it is even more intuitive) with just zeroing them off. I'd say most programmers would expect something like:

```rust
let number: f64 = 1.1;
let decimal = Decimal::from(number);
```

Most people would intuitively expect the resulting Decimal to be exactly 1.1. And if you output it, you'd expect to see 1.1 rather than a long tail of digits. Because when I write 1.1, I mean 1.1.
For me this is the argument to give me the exact number, with all its unsightly digits. There's nothing inexact about those digits; they are mathematically derived from the definition of the float. If there is inexactness, it happened much earlier, such as when the token 1.1 was parsed into a float.
Indeed they would, but the same people would probably also hold other expectations that floating point cannot satisfy. If Python is just wrong here, then so is Java's BigDecimal(double) constructor.
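To make the "mathematically derived from the definition of the float" point concrete, here is a small self-contained sketch (mine, not from the thread) that recovers the exact value an f64 denotes straight from its bit pattern; it ignores the sign bit, NaN and infinity for brevity.

```rust
// Sketch: recover the exact value an f64 denotes from its IEEE-754 bits.
// Ignores the sign bit, NaN and infinity for brevity.
fn exact_parts(x: f64) -> (u64, i32) {
    let bits = x.to_bits();
    let exp_bits = ((bits >> 52) & 0x7ff) as i32;
    let frac = bits & ((1u64 << 52) - 1);
    if exp_bits == 0 {
        (frac, -1074) // subnormal: value = frac * 2^-1074
    } else {
        // normal: restore the implicit leading bit; value = mantissa * 2^(exp_bits - 1075)
        (frac | (1u64 << 52), exp_bits - 1075)
    }
}

fn main() {
    let (m, e) = exact_parts(0.1);
    // Prints "0.1f64 == 7205759403792794 * 2^-56", which reduces to
    // 3602879701896397/36028797018963968, the fraction quoted in this issue.
    println!("0.1f64 == {} * 2^{}", m, e);
}
```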
Would a feature flag to support this requirement work? This is a work-in-progress PR, but it could perhaps solve your requirements: with the feature flag enabled, the float conversions retain the excess precision.
Thanks for working on this! Support through a feature is certainly better than no support at all, but features have several downsides, e.g. they are hard to discover and document, and they are unavailable in some settings (famously the Playground, but there are other examples). Especially problematic are features that change the behavior of existing functionality rather than just introduce new functions. Those interact badly with workspace builds, where, as I understand it, a crate is built with the union of the features requested by the various dependent crates in the same workspace. (I can dig out the exact details if you're curious.) A pair of methods, one with the current behavior and one that retains the full precision of the float, would avoid these problems.
I tend to agree about the discoverability of feature flags vs introducing a new function. I've updated the PR, however, and have gone for a pair of new, fairly verbose method names. I could be convinced to shorten these if they feel too long.
One simple real-life example, since this is not a philosophical discussion; it has real-life consequences and errors. Which of the two expressions, y = 1 - x*x or z = (1-x) * (1+x), is more accurate at x = 1 - 1e-12? I compared the results in Python 3.10, in Python 3.10 limited to 25 decimal digits of precision, and in .NET 5.
Philosophy and personal preferences aside, this Decimal will not be useful for my purposes, as I cannot trust its results. Python's Decimal gives the correct answer even when crippled to 25 decimal digits of precision, which is significantly below .NET's Decimal precision, because it does not lose precision during the conversion. It's that simple.
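The Python and .NET outputs from that comparison are not reproduced here; as a rough, self-contained illustration of the cancellation this example probes (my sketch, plain f64 only, no Decimal involved):

```rust
// Illustration only: the catastrophic cancellation behind the example above.
fn main() {
    let x: f64 = 1.0 - 1e-12;
    let y = 1.0 - x * x;           // cancellation: the leading digits of x*x cancel against 1.0
    let z = (1.0 - x) * (1.0 + x); // 1.0 - x is computed exactly here, so z stays accurate
    // For the exact real number x = 1 - 1e-12, 1 - x^2 = 2e-12 - 1e-24.
    println!("y = {y:e}");
    println!("z = {z:e}");
}
```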
To be clear, I'm not looking at changing the default behavior of this library and will continue to align with the .NET library in this regard. Because we are converting from a base-2 representation into a base-10 representation, the additional precision may not be entirely accurate to what the "true" float represents either. Consequently, continuing to use a narrowing conversion based upon the calculated IEEE-754 decimal digits seems quite reasonable to cater for this inaccuracy. That being said: perhaps there are scenarios where you need a close approximation of the base-2 representation of the float. Given this, would the functions in #441 provide the additional functionality required? Are the results what you'd expect?
I'll check the results and report my findings in a comment to #441.
Currently, printing Decimal::from_f64(0.1).unwrap() outputs 0.1. At first that sounds like correct behavior and what one would want. But I would argue that it is in fact not the correct result, and that it should instead output 0.10000000000000000555111512313.
As we know, the token 0.1 is read by Rust into an f64, where it is represented by a floating-point number equal to the fraction 3602879701896397/36028797018963968. (This ratio can be obtained in Python using 0.1.as_integer_ratio(), or in Rust by printing BigRational::from_float(0.1).) While the resulting number can be displayed as 0.1 without loss of precision when re-read as a float, and while humans often think of it as 0.1, it is the fraction that is actually used in floating-point calculations.

This is why I would expect "0.1".parse::<Decimal>().unwrap() to differ from Decimal::from_f64(0.1).unwrap(). The former reads from decimal and is precisely 0.1, whereas the latter converts the result of the lossy conversion of the 0.1 token to floating point. The input of Decimal::from_f64(0.1) is the number corresponding to the above-mentioned fraction, which has an exact decimal representation (because its denominator is a power of 2): 0.1000000000000000055511151231257827021181583404541015625. Given the constraint of a mantissa smaller than 2**96, the closest rust_decimal::Decimal representation of this is 10000000000000000555111512313 * 10**-29, so I'd expect the Decimal to display as 0.10000000000000000555111512313.
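As an aside (not part of the original issue text), the fraction mentioned above can be printed directly in Rust; this sketch assumes the num-rational crate with its big-integer support:

```rust
// Sketch: recover the exact fraction behind 0.1f64, assuming the
// num-rational crate (BigRational = Ratio<BigInt>).
use num_rational::BigRational;

fn main() {
    let r = BigRational::from_float(0.1f64).expect("finite float");
    // Prints 3602879701896397/36028797018963968.
    println!("{}", r);
}
```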
The same applies to from_f32, where Decimal has enough precision to hold all the digits of 13421773/134217728, and should display 0.100000001490116119384765625.
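Those f32 digits can be double-checked with plain integer arithmetic, since m / 2^k = m * 5^k / 10^k; a small sketch of mine:

```rust
// Sketch: exact decimal expansion of 0.1f32 = 13421773 / 2^27,
// computed as 13421773 * 5^27 and printed with 27 fractional digits.
fn main() {
    let m: u128 = 13_421_773;
    let k: u32 = 27;
    let digits = m * 5u128.pow(k); // fits comfortably in u128
    // Prints 0.100000001490116119384765625
    println!("0.{:0>width$}", digits, width = k as usize);
}
```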
Python's decimal does the equivalent conversion, except its decimal is not limited to u128, so it can accommodate the whole float.

Looking at the implementation, it's not at all obvious that the current behavior is intended. The intention seems to be to convert the number with maximum possible precision. While I'm aware that some people might consider the current behavior to be a feature, I'd still ask to change it, or at least to provide a different method for converting floats to decimals while minimizing the difference from the original number.