-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Normative: Support BigInt in NumberFormat and toLocaleString #236
Conversation
This patch brings Intl.NumberFormat support to BigInt, and adds a BigInt.prototype.toLocaleString method based on it. The design here is to include overloading between BigInt and Number as arguments for the format and formatToParts methods based on ToNumeric. This means that, for example, string arguments are cast to Number, rather than BigInt. This design preserves compatibility and consistency with operators like unary - This definition permits options in the NumberFormat to force decimal places, e.g., 1n formatting as 1.00000 if the minimum fractional digits is 5. Alternative semantics would be to throw an exception in this case. For the algorithm text itself: the specification algorithms ToRawPrecision and ToRawFixed are now used for both Numbers and BigInts. Given the ECMAScript specification's use of implicit coercisions between Numbers and mathematical values, I believe that this is valid without any special changes; the phrasing may change in the future [1]. ICU4C-based implementations of ECMAScript can use LocalizedNumberFormatter::formatDecimal [2] or unum_formatDecimal [3] to implement the algorithms in this patch. [1] tc39/ecma262#1135 [2] http://icu-project.org/apiref/icu4c/classicu_1_1number_1_1LocalizedNumberFormatter.html#a29cd3d107b784496e19175ce0115f26f [3] http://icu-project.org/apiref/icu4c/unum_8h.html#a59870a322f012dc1b9d99cf8a7b708f1 Closes tc39#218
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM.
This patch was presented at the May 2018 TC39 meeting. No concerns were expressed, so I take it that it has consensus. Let's consider this change to be "at Stage 3" together with the rest of the BigInt proposal, ready for implementation and test262 tests. cc @jakobkummerow @caiolima @cxielarko |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have concerns about this, see comments.
As a side note, this is potentially going to be tricky to implement, because the BigInt object format is going to be specific to each implementation, so it'll be hard to share much if any of the functionality via ICU.
@@ -376,7 +376,7 @@ <h1>FormatNumberToParts( _numberFormat_, _x_ )</h1> | |||
<h1>ToRawPrecision( _x_, _minPrecision_, _maxPrecision_ )</h1> | |||
|
|||
<p> | |||
When the ToRawPrecision abstract operation is called with arguments _x_ (which must be a finite non-negative number), _minPrecision_, and _maxPrecision_ (both must be integers between 1 and 21), the following steps are taken: | |||
When the ToRawPrecision abstract operation is called with arguments _x_ (which must be a finite non-negative Number or BigInt), _minPrecision_, and _maxPrecision_ (both must be integers between 1 and 21), the following steps are taken: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minPrecision, and maxPrecision (both must be integers between 1 and 21)
doesn't really make sense for BigInts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minPrecision/maxPrecision make sense in some cases, like when using scientific or compact notation. For example, if you have a BigInt 88776655443322, you might want to display only 4 significant digits in compact notation, producing "88.78 Trillion".
@@ -410,7 +410,7 @@ <h1>ToRawPrecision( _x_, _minPrecision_, _maxPrecision_ )</h1> | |||
<h1>ToRawFixed( _x_, _minInteger_, _minFraction_, _maxFraction_ )</h1> | |||
|
|||
<p> | |||
When the ToRawFixed abstract operation is called with arguments _x_ (which must be a finite non-negative number), _minInteger_ (which must be an integer between 1 and 21), _minFraction_, and _maxFraction_ (which must be integers between 0 and 20), the following steps are taken: | |||
When the ToRawFixed abstract operation is called with arguments _x_ (which must be a finite non-negative Number or BigInt), _minInteger_ (which must be an integer between 1 and 21), _minFraction_, and _maxFraction_ (which must be integers between 0 and 20), the following steps are taken: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minInteger (which must be an integer between 1 and 21), minFraction, and maxFraction (which must be integers between 0 and 20)
doesn't really make sense for BigInts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about this being relevant when compact or scientific notation is used.
Another use case: maybe a library dealing with integer numbers from the back-end (such as protobuf int64s, etc.) makes all numbers, even small ones, represented by BigInt. So, it would not be unreasonable to have a case where you have a small BigInt, and you want to show decimals on it, like for currencies.
|
||
<p> | ||
The FormatNumberToString abstract operation is called with arguments _intlObject_ (which must be an object with [[MinimumSignificantDigits]], [[MaximumSignificantDigits]], [[MinimumIntegerDigits]], [[MinimumFractionDigits]], and [[MaximumFractionDigits]] internal slots), and _x_ (which must be a Number value), and returns _x_ as a string value with digits formatted according to the five formatting parameters. | ||
The FormatNumericToString abstract operation is called with arguments _intlObject_ (which must be an object with [[MinimumSignificantDigits]], [[MaximumSignificantDigits]], [[MinimumIntegerDigits]], [[MinimumFractionDigits]], and [[MaximumFractionDigits]] internal slots), and _x_ (which must be a Number or BigInt value), and returns _x_ as a string value with digits formatted according to the five formatting parameters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(which must be an object with [[MinimumSignificantDigits]], [[MaximumSignificantDigits]], [[MinimumIntegerDigits]], [[MinimumFractionDigits]], and [[MaximumFractionDigits]] internal slots)
doesn't really make sense for BigInts, and neither calling ToRawPrecision
nor ToRawFixed
below is appropriate for them. I think this method needs more significant changes than just an s/Number/Numeric/ renaming.
ICU accepts arbitrary-precision numbers via the decNumber APIs. As long as an implementation can get the Latin digits for the BigInt into a string it can run ICU with it. If BigInt has a toString() method, then that output might be sufficient in itself to run ICU. |
Does this mean the implementation will first turn the BigInt into a string form and then pass that string to icu::number::LocalizedNumberFormatter::formatDecimal() ? -- |
Correct
That should work. If there's a cheap way to convert the BigInt to scientific notation, that should also work. |
@jakobkummerow I can understand these concerns about sloppy writing and parts of this text being irrelevant for BigInt, but I don't understand this part:
I think you can handle this by converting the BigInt to a base-10 string and then calling ICU's functionality to format a string representing a number in a locale-specific way. Separately: We are discussing over in WebIDL-land whether it's appropriate to overload functions between BigInt and Number. @annevk has advocated for not permitting this overloading, which we could follow if we make a separate |
@littledan - so after you see the above comment from @jakobkummerow do you think you still need to revise this PR with some changes before you merge it in? or you think it should be landed AS IS? Just try to understand the implication from @jakobkummerow 's comments. Also, how likely do you think A) the BigInt and B) this ECMA changes to BigInt.prototype.toLocaleString will make the cut into ECMA262 2019 edition and ECMA402 2019 edition? Just also try to get a sense of priority in term of WHEN should I schedule time to implement it in v8. |
v8 prototype cl https://chromium-review.googlesource.com/c/v8/v8/+/1424021 |
@littledan - When do you think you can merge this PR into ECMA402? IF you can merge it within the next 10 calendar, I think I can
What I got from my v8 prototype this morning. |
Just want to add a note virtual UnicodeString& icu::NumberFormat::format ( StringPiece number, UnicodeString & appendTo, |
BigInt will not make it into ECMAScript 2019, but I don't think that should delay your implementation work on toLocaleString any more than BigInt itself. I have already presented this PR to TC39 with no strong concerns raised from the committee. The remaining open question is, should we overload format or make a separate formatBigInt? This should be a pretty superficial change, but I want to have a conversation involving both the TC39 and WebIDL communities to come to a common conclusion before the overloaded format method ships in browsers. I don't think we have any remaining questions about the toLocaleString method's semantics, though. |
ok. Help me out to understand the dependency below. I am still rather new here so maybe some of my assumptions are wrong.
|
I would not merge this PR or tests yet because of the unresolved issue. However, IMO it wouldn't be too soon to land a flagged implementation, as the unresolved issue is superficial. Happy to discuss this with more V8 folks as well. |
It seems better not to merge this until there's a consensus on the pattern to be used. I think the main reason I had against overloading was because normally only one type of number makes sense and accepting both would be a convenience that's better done explicitly (otherwise any number of things accepting Number today would also have to accept BigInt). This method seems to be somewhat exceptional in that it actually makes sense for both types. That seems reasonable, but it seems hard to convey that nuance and I'm afraid there'll be a number of specifications that would do inappropriate overloading if IDL allowed for it (as e.g. argued for some Wasm API and nearly accepted). |
OK, how should we get everyone together to discuss this question. Email thread? Call? TC39 meeting topic? I would be happy to stick with Anne's resolution, to take a strong stance to avoid likely confusion. |
I'd be happy to let TC39 deliberate over these points and come up with something for JavaScript/IDL. (To be clear, I will accept the outcome.) |
@jakobkummerow & @gsathya - with this, would you mind I add a flag to guard on my BigInt.prototype.toLocaleString implementation? I know Jakob earlier said this should not need a flag but I think Daniel's point probably make it more sense for us to have one. IS that ok? |
@FrankYFTang Sure, when the spec is still in flux then a (default-off) flag makes sense. |
I haven't been following this too closely so apologies if this is obvious,
Is this an observable change? If not, why is this blocking shipping in browsers? If yes, can someone give me a summary of the observable part of this? |
@gsathya If you want to ship BigInt.prototype.toLocaleString by itself, there's no observable change. If you want to ship this PR as a whole, there's also a change to Intl.NumberFormat.prototype.format, which overloads between Number and BigInt--that second part, I'd like to discuss in the upcoming TC39 meeting. |
@littledan should BigInt(-0).toLocaleString() format to "0" or "-0" ? why? |
Thanks for the code review from @jakobkummerow the v8 implementation sync with the current spec ( 040f809 ) is landed into v8 tree behind the flag --harmony-intl-bigint . You can also see the test I wrote here |
Based on decision on 2019 March TC39 meeting, ECMA402 will go with this PR and abandon the idea in #318. |
see tc39/proposal-intl-relative-time#106 also |
FYI- Chrome/v8 will ship this BigInt support in m76 |
This patch brings Intl.NumberFormat support to BigInt, and
adds a BigInt.prototype.toLocaleString method based on it.
The design here is to include overloading between BigInt and Number
as arguments for the format and formatToParts methods based on
ToNumeric. This means that, for example, string arguments are
cast to Number, rather than BigInt. This design preserves
compatibility and consistency with operators like unary -
This definition permits options in the NumberFormat to force
decimal places, e.g., 1n formatting as 1.00000 if the minimum
fractional digits is 5. Alternative semantics would be to
throw an exception in this case.
For the algorithm text itself: the specification algorithms
ToRawPrecision and ToRawFixed are now used for both Numbers
and BigInts. Given the ECMAScript specification's use of implicit
coercisions between Numbers and mathematical values, I believe
that this is valid without any special changes; the phrasing
may change in the future [1].
ICU4C-based implementations of ECMAScript can use
LocalizedNumberFormatter::formatDecimal [2] or
unum_formatDecimal [3] to implement the algorithms in this patch.
[1] tc39/ecma262#1135
[2] http://icu-project.org/apiref/icu4c/classicu_1_1number_1_1LocalizedNumberFormatter.html#a29cd3d107b784496e19175ce0115f26f
[3] http://icu-project.org/apiref/icu4c/unum_8h.html#a59870a322f012dc1b9d99cf8a7b708f1
Closes #218