Non breaking space and breaking space #288
Comments
Hi @pergardebrink, thanks for your clear description. As you have pointed out, Globalize deduces the grouping separator symbol from the CLDR content. Therefore, all it "knows" comes from that data set. If the

On Globalize, we make sure this will always be true:

    var sv = Globalize("sv");
    sv.parseNumber(sv.formatNumber(123456.78)) === 123456.78; // true

We don't have any specific rules or conditions in the parser code like "if grouping separator is 160, also try 32". As of now, I think the current behavior is correct. @scottgonzalez, @jzaefferer, @srl295 any ideas?

Anyway, if you want to allow users to input 32 (breaking space) as an alternative grouping separator, which I agree makes sense in your case, this could be used:

    // \x20 is the hex for 32 (space), \xa0 is the hex for 160 (no-break space).
    // The global regex replaces every space, not only the first one.
    var sanitizedInput = "123 456,78".replace( /\x20/g, "\xa0" );
    sv.parseNumber( sanitizedInput );

TR35 defines this: (link)
Although we don't fully implement these heuristics in Globalize (it doesn't parse the number string using all loaded grouping separators, only the locale's), note that even implementing them would not solve your problem, because no language defines 32 (breaking space) as a grouping separator.
Yes, I'll probably have to use some sort of sanitization as you suggest. My application will use the culture that the user specifies (from a list of all .NET supported cultures), so there are probably more cultures besides Swedish that specify non-breaking space as a grouping character. (The reason I started this issue was that version 0.1.1 did allow me to specify both non-breaking and breaking space, and I was curious whether that was a bug or not.) Thanks for the quick reply!
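As a rough way to see which cultures are affected, the grouping separator that a JavaScript runtime reports for a locale can be inspected with the standard `Intl.NumberFormat.prototype.formatToParts`. This is only a sketch: `groupSeparatorOf` is a hypothetical helper, not part of Globalize, and the result depends on the locale data bundled with the runtime, which may differ from the CLDR files loaded into Globalize or from .NET's culture data.

```javascript
// Hypothetical helper (not part of Globalize): returns the grouping
// separator used by the runtime's own locale data, or null if the
// formatted sample contains no group part.
function groupSeparatorOf(locale) {
  var parts = new Intl.NumberFormat(locale).formatToParts(1234567.89);
  var group = parts.filter(function (part) {
    return part.type === "group";
  })[0];
  return group ? group.value : null;
}

// Print the code point so that invisible characters can be told apart.
["en-US", "sv", "fr"].forEach(function (locale) {
  var sep = groupSeparatorOf(locale);
  var label = sep === null
    ? "none"
    : "U+" + sep.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
  console.log(locale, label);
});
```

Printing code points rather than the characters themselves avoids confusing U+0020 with U+00A0, which look identical on screen.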
Let's wait on input from the people cc'ed above. I'm open to suggestions.
I've been thinking and reading since yesterday, and I think that Globalize really should support both non-breaking and breaking space, even if the CLDR specifies non-breaking space as the grouping character. I think that most developers who are not familiar with cultures that use space as a grouping separator probably won't know this until they are hit by the first bug report from a Swedish or French end user (or any other that uses it). I've found some info on unicode.org suggesting that you should use a more "lenient parsing": if the grouping character is a non-breaking space, all whitespace characters should match.
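The lenient rule described above could be approximated as a pre-processing step before parsing. The following is a minimal sketch under stated assumptions, not Globalize's implementation: `normalizeGroupingSpaces` is a hypothetical helper, and it treats "whitespace" as the Unicode space separator category (`\p{Zs}`), which requires a runtime with ES2018 Unicode property escapes.

```javascript
// Hypothetical pre-processing step (not Globalize's implementation):
// if the locale's grouping separator is a space separator ([:Zs:]),
// map every [:Zs:] character in the input to that separator.
function normalizeGroupingSpaces(input, groupSeparator) {
  if (!/^\p{Zs}$/u.test(groupSeparator)) {
    return input; // e.g. "," or ".": leave the input untouched
  }
  return input.replace(/\p{Zs}/gu, groupSeparator);
}

var typed = "123 456,78"; // U+0020, as an end user would type it
var normalized = normalizeGroupingSpaces(typed, "\u00a0");
console.log(normalized === "123\u00a0456,78"); // true
```

The normalized string could then be handed to the locale's number parser. Locales whose grouping separator is not a space are left alone, so an input such as "1,234.5" passes through unchanged.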
Excellent. So, let's do it.
The documentation led me to the questions below. I have sent them to the CLDR mailing list and will update here as I get replies. If anyone knows the answers, please just let me know.
Where do I find a list of all format characters?
Where do I find a list of all [:Zs:] characters?
Where do I find a list of all [:Dash:] characters?
Except for the U+05F3 example, the other two cannot be found in http://www.unicode.org/repos/cldr-aux/json/25/supplemental/characterFallbacks.json. Are both among the "other apostrophe-like characters"? Where do I find a complete list of the apostrophe-like characters? Do mappings follow an algorithm, algebraic formula or lookup table? On http://unicode.org/reports/tr35/tr35-info.html#Supplemental_Character_Fallback_Data, there's:
Does it mean that when the character being looked up is not found, the above process should be followed? Where do I find the definition of
Where?
Where do I find more information about it?
Are both mappings (no-break space and half-width katakana) all it's about, or are there any other NFKC normalizations that should be done? Where do I find a complete list of what should be done? Do mappings follow an algorithm, algebraic formula or lookup table?
"NA f." is the currency symbol for ANG (Netherlands Antillean guilder, aka Netherlands Antilles florin according to Wikipedia). Following the above recommendation (to map
@rxaviers Ping me about this. I have some experience with NFC and NFKC, as well as JS implementations of these.
For the record, @arschmitz has worked with Unicode normalization in his arschmitz/jquery-pr project, where he used walling/unorm/.../unorm.js for NFC and the other normalization forms.
Also, I have received answers from the CLDR mailing list: https://gist.github.com/rxaviers/76762da0ea8d3335f263
The ES6

    // Comparing 160 (no-break space) with 32 (space):
    "\xa0" === " "; // false
    "\xa0".normalize("NFKC") === " ".normalize("NFKC"); // true

The problem is that
@rxaviers ah cool that es6
Closed in favor of the broader scope #292 (Loose Matching).
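For readers who want to try the NFKC comparison discussed earlier in the thread, here is a small sketch using the standard `String.prototype.normalize`. The helper name `nfkcEquals` is hypothetical. It also shows that NFKC rewrites more than spaces, which may or may not be desirable in a number parser.

```javascript
// Hypothetical helper: compares two strings after NFKC normalization.
function nfkcEquals(a, b) {
  return a.normalize("NFKC") === b.normalize("NFKC");
}

// NFKC folds several space-like characters into U+0020 ...
console.log(nfkcEquals("\u00a0", " ")); // true: no-break space
console.log(nfkcEquals("\u202f", " ")); // true: narrow no-break space

// ... but it also rewrites characters unrelated to spacing:
console.log(nfkcEquals("\uff76", "\u30ab")); // true: half-width katakana KA
console.log(nfkcEquals("\ufb01", "fi"));     // true: "fi" ligature
```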
have to monkeypatch activerecord, fixes globalizejs#288.
…id conflict, should solve issue in globalizejs#288.
- Correctly handles prefix and suffix literals; #353;
- Loose Matching: This implementation is now much closer to UTS#35 7.1.2 Loose Matching http://unicode.org/reports/tr35/#Loose_Matching and fixes all reported cases that are related to it, including #288;
- Regression: Drop scientific notation parsing support, which wasn't documented anyway and shall be implemented by #533.
Ref #292
Fixes #353
Fixes #46
Fixes #288
Fixes #443
Fixes #457
Fixes #492
Fixes #587
Fixes #644
I'm new to both version 0.1.1 and 1.0.0-alpha and have never used jQuery Globalize before, so I might have misunderstood something, but I have a small issue with version 1.0.0-alpha5 (as I plan to move there from 0.1.1 when it's stable).
In the previous version (0.1.1), if I run the following code:
But if I run the following code in 1.0.0-alpha5:
Since an end user probably (definitely) won't type the space as a non-breaking space, any conversion will fail. If I change the group property in numbers.json to be a breaking space instead, then of course the parse will work, but then the value provided from the server won't work, since I use .NET to format my number with the Swedish culture:
(C#)