Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default hour-cycle computation can resolve to non-preferred hour format #402

Closed
anba opened this issue Jan 21, 2020 · 24 comments
Closed

Default hour-cycle computation can resolve to non-preferred hour format #402

anba opened this issue Jan 21, 2020 · 24 comments
Labels
c: datetime Component: dates, times, timezones s: help wanted Status: help wanted; needs proposal champion Small Smaller change solvable in a Pull Request
Milestone

Comments

@anba
Copy link
Contributor

anba commented Jan 21, 2020

https://github.com/tc39/test262/blob/master/test/intl402/DateTimeFormat/prototype/resolvedOptions/hourCycle-default.js is currently failing in SpiderMonkey, revealing a potential spec issue.

Reduced test case:

function hourCycle(hour12) {
  return new Intl.DateTimeFormat("en", {hour: "numeric", hour12}).resolvedOptions().hourCycle;
}

print("default:", hourCycle(undefined));
print("hour12=true:", hourCycle(true));
print("hour12=false:", hourCycle(false));

Returns in SpiderMonkey:

default: h12
hour12=true: h12
hour12=false: h23

But in V8 (and per current spec):

default: h12
hour12=true: h12
hour12=false: h24

InitializeDateTimeFormat, step 30.d for hcDefault=h12, computes hc=hc12 for hour12=true resp. hc=hc24 for hour12=false.

But when we're consulting CLDR's supplementalData.xml for "US" [1], the preferred hour formats are either "h" or "H", which means either "hc12" or "hc23". So "hc24" is likely an unexpected result for US, because "k" isn't listed in the preferred hour formats for that region. (See [2] for "h", "H", and "k".)

[1] Adding likely subtags to the "en" locale results in "en-Latn-US".
[2] https://unicode.org/reports/tr35/tr35-dates.html#dfst-hour

@sffc sffc added c: datetime Component: dates, times, timezones s: help wanted Status: help wanted; needs proposal champion labels Mar 20, 2020
@anba
Copy link
Contributor Author

anba commented Mar 31, 2020

cc @zbraniecki

@mobz
Copy link

mobz commented May 1, 2020

@littledan
Copy link
Member

This issue seems somewhat important--it is a sort of user-observable "bug". Should we put it on the ES2021 milestone?

@mobz
Copy link

mobz commented May 3, 2020

I consider it a bug. The assumption in the spec is that h11 maps to h23 and h12 maps to h24, as a logical preference is wrong.

We can see from various bugs in date formatting packages that are following the spec that this is surprising.

hourCycle options map to keys in https://unicode.org/reports/tr35/tr35-dates.html#dfst-hour
K = h11 = 0-11
h = h12 = 1-12
H = h23 = 0-23
k = h24= 1-24
Fall back formatting preferences for different regions can be obtained http://cldr.unicode.org/
We can see (data below) that

  • all regions use either h12 or h23 as the default
  • all regions have h23 as an allowed value
  • zero regions have h24 as an allowed value
  • one region (JP) has h11 (K) as an allowed value

there is not a single region on the planet that provides h24 as an allowed format.

either;

  • hcDefault should only apply for hour12=true,
  • The value for hourCycle when hour12=false should be h23 in all cases
    or
  • the value for hcDefault should be determined based on the value of hour12
    or
  • hcDefault should be an ordered array of values and we should match the first one that matches hour12 setting

all these algorithms should evaluate to the same thing.
(formatting part b/B= am/AM etc)

grep "<hours.*allowed" ./cldr/common/supplemental/supplementalData.xml
<hours preferred="H" allowed="H" regions="AX BQ CP CZ DK FI ID IS ML NE RU SE SJ SK"/>
<hours preferred="H" allowed="H h" regions="001"/>
<hours preferred="H" allowed="H h hb hB" regions="AC AI BW BZ CC CK CX DG FK GB GG GI IE IM IO JE LT MK MN MS NF NG NR NU PN SH SX TA ZA"/>
<hours preferred="H" allowed="H h hB" regions="CF CM LU NP PF SC SM SN TF VA ca_ES fr_CA gl_ES it_CH it_IT"/>
<hours preferred="H" allowed="H h hB hb" regions="AR CL CR CU EA GT HN IC KG KM LK MA MX NI PY SV UY af_ZA es_BO es_BR es_EC es_ES es_GQ es_PE"/>
<hours preferred="H" allowed="H h K" regions="JP"/>
<hours preferred="H" allowed="H hb hB h" regions="AF LA"/>
<hours preferred="H" allowed="H hB" regions="AD AM AO AT AW BE BF BJ BL BR CG CI CV DE EE FR GA GF GN GP GW HR IL IT KZ MC MD MF MQ MZ NC NL PM PT RE RO SI SR ST TG TR WF YT"/>
<hours preferred="H" allowed="H hB h" regions="AZ BA BG CH GE LI ME RS UA UZ XK"/>
<hours preferred="H" allowed="H hB h hb" regions="BO EC ES GQ PE"/>
<hours preferred="H" allowed="H hB hb h" regions="LV TL zu_ZA"/>
<hours preferred="H" allowed="hB H" regions="CD IR"/>
<hours preferred="H" allowed="hB hb H h" regions="KE MM TZ UG"/>
<hours preferred="h" allowed="h H" regions="AS BT DJ ER GH IN LS PG PW SO TO VU WS"/>
<hours preferred="h" allowed="h H hb hB" regions="CY GR"/>
<hours preferred="h" allowed="h H hB" regions="AL TD"/>
<hours preferred="h" allowed="h H hB hb" regions="CO DO KP KR NA PA PR VE"/>
<hours preferred="h" allowed="h hb H hB" regions="AG AU BB BM BS CA DM FJ FM GD GM GU GY JM KI KN KY LC LR MH MP MW NZ SB SG SL SS SZ TC TT UM US VC VG VI ZM en_001"/>
<hours preferred="h" allowed="h hB H" regions="BD PK"/>
<hours preferred="h" allowed="h hB hb H" regions="AE BH DZ EG EH IQ JO KW LB LY MR OM PH PS QA SA SD SY TN YE ar_001"/>
<hours preferred="h" allowed="hb hB h H" regions="BN MY"/>
<hours preferred="h" allowed="hB h H" regions="hi_IN kn_IN ml_IN te_IN"/>
<hours preferred="h" allowed="hB h H hb" regions="KH"/>
<hours preferred="h" allowed="hB h hb H" regions="ta_IN"/>
<hours preferred="h" allowed="hB hb h H" regions="CN HK MO TW ET gu_IN mr_IN pa_IN"/>

@zbraniecki
Copy link
Member

So, if I'm correct, SpiderMonkey enforces h12/h23 while V8 enforces h12/h24 for hour12 true/false, right?
My preference would be to make the fallback depend on the locale's preferred value.

@anba
Copy link
Contributor Author

anba commented May 4, 2020

SpiderMonkey uses the following hour symbols in the skeleton:

  • No option present: "j"
  • hour12 = true: "h"
  • hour12 = false: "H"
  • hourCycle = h11: "h", plus modifying the resolved pattern to use the hour symbol "K".
  • hourCycle = h12: "h", plus modifying the resolved pattern to use the hour symbol "h".
  • hourCycle = h23: "H", plus modifying the resolved pattern to use the hour symbol "H".
  • hourCycle = h24: "H", plus modifying the resolved pattern to use the hour symbol "k".

(Input skeletons don't differentiate between "h" and "K" resp. "H" and "k".)

With that approach we don't strictly enforce h12/h23, which is visible for "ja":

js> print(new Intl.DateTimeFormat("en", {hour:"numeric", hour12:true, timeZone:"UTC"}).format(new Date("2020-01-01T12:00Z"))) 
12 PM
js> print(new Intl.DateTimeFormat("ja", {hour:"numeric", hour12:true, timeZone:"UTC"}).format(new Date("2020-01-01T12:00Z")))
午後0時

@ziyunfei
Copy link

ziyunfei commented May 9, 2020

I was bitten by this V8 change early this week https://bugs.chromium.org/p/chromium/issues/detail?id=1045791#c7.

Here was my quick fix :
image

@ray007
Copy link

ray007 commented May 11, 2020

What I'm missing with the hourCycle option is a 0-24.
When defining time-ranges, I might want an end-time of 0:55, but also 24:00 when it's midnight.

@littledan
Copy link
Member

littledan commented May 11, 2020

@ziyunfei This is exactly what { hourCycle: "h23" } is for. Your RegExp seems like it could be pretty unfortunate if the number 24 shows up somewhere else in the datetime!

@ray007 Can you say more about in which locales this is appropriate?

@ray007
Copy link

ray007 commented May 11, 2020

@ray007 Can you say more about in which locales this is appropriate?

I know that at least with german speaking locales having midnight as endtime with 24:00 is common.
So the time-range for a fullday event would be 00:00 - 24:00.

No idea about other locales.

@littledan
Copy link
Member

@ray007 In this case, it's hard for me to see what sort of setting for Intl.DateTimeFormat would permit both 0:00 and 24:00 to be generated with the same formatter...

@ray007
Copy link

ray007 commented May 11, 2020

@ray007 In this case, it's hard for me to see what sort of setting for Intl.DateTimeFormat would permit both 0:00 and 24:00 to be generated with the same formatter...

No need.
It's either 0:00 .. 23:59 (start time) or 0:01 .. 24:00 (end time). 2 formatters needed.

@mobz
Copy link

mobz commented May 13, 2020

@ray007 While there are cases of 24:00 (or later) being used as the end times of a duration this issue should concern it's self with resolving the existing discrepancy between new v8, the spec, and spidermonkey etc. mapping 24hour to hourCycle

A new proposal for a new hourCycle (or other) for handling this might be successful, though ecma402 tends to consume upstream standards and I note in particular changes in the latest version of ISO-8601

replacement of the term “midnight” with “beginning of day”, disallowing the value “24” for hour;

@mobz
Copy link

mobz commented May 13, 2020

@anba I think the correct way to fix this is to modify the standard to expect the spidermonkey (and old v8) behaviour. To me this matches most expectations and more closely matches cldr data. A more sophisticated algorithm would need to account for whether the AM/PM section is shown or not. @FrankYFTang The recent changes to Chromium - are they simply to match the spec? Do we have any additional information about why the spec was written that way.

@FrankYFTang
Copy link
Contributor

FrankYFTang commented May 13, 2020

I start to work on v8 after ECMA402 added the hourCycle algorithm and at that time v8 break a lot of test262 tests. The hourCycle algorithm in the ECMA402 spec is very complex and I am not sure WHY it was written in that way and I spent some effort to engineering it to match what the current spec (and therefore test262) stated in order to make v8 pass those tests. By default is hard to make ICU to behave like what the current ECMA402 states and maybe exactly why anba stated. I have not spend efforts to figure out the spec "make sense" or NOT and only ASSUME those before my involvement to ECMA402 already figure out it correctly. Now after I look at what Anba wrote I finally realize maybe the fault is on the spec but not the v8 code before my "fixing". I am totally fine if someone can figure out what the HourCycle algorithm should be and I will change v8 code to meet that. (It take me a long time to make the v8 switch to h24 on "en" but I am totally fine to change it.)

@FrankYFTang
Copy link
Contributor

@sffc I think this one is important to discuss in our next meeting.

@sffc sffc added s: discuss Status: TG2 must discuss to move forward and removed s: help wanted Status: help wanted; needs proposal champion labels May 13, 2020
@sffc sffc added this to the ES 2021 milestone May 21, 2020
@sffc sffc added s: help wanted Status: help wanted; needs proposal champion and removed s: discuss Status: TG2 must discuss to move forward labels May 21, 2020
@FrankYFTang
Copy link
Contributor

We discussed this in the ECMA402 2020-05-21 meeting and committee agree with @anba opinion about this issue and agree we should change the spec to the behavior what Mozilla currently implement. @anba Do you like to propose a spec change for that? Thanks

@FrankYFTang
Copy link
Contributor

We should comment / review on #436 to address this issue. I need to read through that PR. (not yet)

@sffc sffc removed the s: help wanted Status: help wanted; needs proposal champion label Jun 5, 2020
@sffc sffc added the Small Smaller change solvable in a Pull Request label Jun 5, 2020
@Constellation
Copy link
Member

FYI, we aligned JavaScriptCore behavior to SpiderMonkey's one described in #402 (comment) :)

ryanhaddad pushed a commit to WebKit/WebKit that referenced this issue Dec 22, 2020
…s used

https://bugs.webkit.org/show_bug.cgi?id=216521

Reviewed by Ross Kirsling.

JSTests:

JSTests/stress/intl-date-time-format-date-time-style-hour-cycle.js result is now matching against V8 and SpiderMonkey,
and this new behavior is more reasonable.

* stress/intl-date-time-format-date-time-style-hour-cycle.js:
(shouldBe.o.format):
(shouldBe):
* stress/intl-datetimeformat-hour-cycle.js: Added.
(shouldBe):
(throw.new.Error):
* test262/expectations.yaml:

Source/JavaScriptCore:

When specifying timeStyle in Intl.DateTimeFormat, we need to check that the generated format also follows to the hourCycle / hour12 options
specified in the constructor. Because dayPeriod can be included automatically, just replacing symbols after generating a pattern can dump strange result.
For example, the generated one is something like "02:12:47 PM Coordinated Universal Time". And we adjust the pattern to make it "14:12:47 PM Coordinated Universal Time"
when hourCycle H23 / H24 is specified. But this looks strange since dayPeriod "PM" should not exist when using H23 / H24.

In this patch, we revise our hour-cycle handling in Intl.DateTimeFormat. We align our behavior to SpiderMonkey's one[1] rather than the spec's one: when hour12 is specified,
we will just use 'H' or 'h' skeleton and do not enforce hour-cycle after generating pattern in hour12 case. If hour12 is not specified, then we use 'h' or 'H' skeleton
symbols based on hour-cycle, and later we modify the pattern based on hour-cycle. If both are not offered, we use 'j' which allows ICU to pick preferable one.
This is slightly different behavior to the spec (hcDefault etc.) but the spec's behavior can cause a bit surprising result[2,3], and SpiderMonkey like behavior will be
integrated into the spec eventually[4].

[1]: tc39/ecma402#402 (comment)
[2]: tc39/ecma402#402
[3]: https://bugs.chromium.org/p/chromium/issues/detail?id=1045791
[4]: tc39/ecma402#436

* runtime/IntlDateTimeFormat.cpp:
(JSC::IntlDateTimeFormat::setFormatsFromPattern):
(JSC::IntlDateTimeFormat::parseHourCycle):
(JSC::IntlDateTimeFormat::hourCycleFromPattern):
(JSC::IntlDateTimeFormat::replaceHourCycleInSkeleton):
(JSC::IntlDateTimeFormat::replaceHourCycleInPattern):
(JSC::IntlDateTimeFormat::initializeDateTimeFormat):
(JSC::IntlDateTimeFormat::hourCycleString):
(JSC::IntlDateTimeFormat::resolvedOptions const):
(JSC::IntlDateTimeFormat::createDateIntervalFormatIfNecessary):
* runtime/IntlDateTimeFormat.h:

Canonical link: https://commits.webkit.org/229387@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@267108 268f45cc-cd09-0410-ab3c-d52691b4dbfc
@AndyOGo
Copy link

AndyOGo commented Feb 25, 2021

I would really like to see this being resolved.
Is an help needed.
Since the issue is agreed upon, I think it's open for too long and affecting to many websites world wide.

@sffc sffc added s: help wanted Status: help wanted; needs proposal champion and removed s: in progress Status: the issue has an active proposal labels Feb 25, 2021
@sffc sffc modified the milestones: ES 2021, ES 2022 Mar 22, 2021
@devongovett
Copy link

This is also broken in other cases than just h24:

new Intl.DateTimeFormat("fr-CA", {hour: "numeric", hour12: true}).format(new Date(2020, 2, 3, 0))

Chrome returns 0 h a.m. (resolved hourCycle is h11). Firefox and Safari return 12 h a.m. (resolved hourCycle is h12).

@justingrant
Copy link
Contributor

@sffc @FrankYFTang is there a plan to fix this bug in the spec?

FWIW, it seems like anyone who's building a date/time-related library seems to run into this bug sooner or later. In Temporal we fixed it in tc39/proposal-temporal#599 (my first Temporal PR, BTW). Looks like @devongovett just ran into it too with his cool internationalized date/time picker library.

@sffc sffc modified the milestones: ES 2022, ES 2023 Jun 1, 2022
FrankYFTang added a commit that referenced this issue Mar 13, 2023
While hour12 is either true or false, let hourCycle to be either 'h12' or 'h23' but not 'h11' nor 'h24'.

The current logic is not reasonable. We see no region in the CLDR use h11 nor h24 hour cycle as default. 
While we set a hour12: true on a 24 hour system region (which use 0:00 - 23:59), we should set to h12 instead of h11

Do not perform the starnge logic of setting default hourCycle based on both hour12 and the defaultHourCycle while hour12 is true or false. Instead, only depends on hour12 to set it to h12 or h23. 

Address #402
ben-allen pushed a commit to ben-allen/ecma402 that referenced this issue Aug 1, 2023
While hour12 is either true or false, let hourCycle to be either 'h12' or 'h23' but not 'h11' nor 'h24'.

The current logic is not reasonable. We see no region in the CLDR use h11 nor h24 hour cycle as default. 
While we set a hour12: true on a 24 hour system region (which use 0:00 - 23:59), we should set to h12 instead of h11

Do not perform the starnge logic of setting default hourCycle based on both hour12 and the defaultHourCycle while hour12 is true or false. Instead, only depends on hour12 to set it to h12 or h23. 

Address tc39#402
@jufemaiz
Copy link

jufemaiz commented Aug 3, 2023

@here is there a desire to resolve this because I think there's some implicit assumptions baked into the specification that have not been (or at least, from the documentary evidence left behind, I can't see that they have been) appropriately tested for each locale expectation.

I imagine a critical change would, if the convention of h12/h23 as a standard pairing does not hold universally for instances of time (as distinct to ranges of time, which would necessarily require the use of something like h24 in some locations, which also then has loaded implicit impacts on not just the time portion but the date portion of a datetime too), then there is a justifiable change for a baseline locale preference for both 12 hour and 24 hour time formats that isn't based on implicit assumptions. This could still have a preferred option between 12 hour and 24 hour time representation, but also specify the implementation of each format explicitly rather than through implicit assumptions baked into a specification but never articulated and validated.

EDIT: of course #758

ryzokuken added a commit that referenced this issue Sep 27, 2023
While hour12 is either true or false, let hourCycle to be either 'h12' or 'h23' but not 'h11' nor 'h24'.

The current logic is not reasonable. We see no region in the CLDR use h11 nor h24 hour cycle as default. 
While we set a hour12: true on a 24 hour system region (which use 0:00 - 23:59), we should set to h12 instead of h11

Do not perform the starnge logic of setting default hourCycle based on both hour12 and the defaultHourCycle while hour12 is true or false. Instead, only depends on hour12 to set it to h12 or h23. 

Address #402

Co-authored-by: Shane F. Carr <[email protected]>
Co-authored-by: Ujjwal Sharma <[email protected]>
Co-authored-by: André Bargull <[email protected]>
ben-allen pushed a commit to ben-allen/ecma402 that referenced this issue Oct 12, 2023
While hour12 is either true or false, let hourCycle to be either 'h12' or 'h23' but not 'h11' nor 'h24'.

The current logic is not reasonable. We see no region in the CLDR use h11 nor h24 hour cycle as default. 
While we set a hour12: true on a 24 hour system region (which use 0:00 - 23:59), we should set to h12 instead of h11

Do not perform the starnge logic of setting default hourCycle based on both hour12 and the defaultHourCycle while hour12 is true or false. Instead, only depends on hour12 to set it to h12 or h23. 

Address tc39#402

Co-authored-by: Shane F. Carr <[email protected]>
Co-authored-by: Ujjwal Sharma <[email protected]>
Co-authored-by: André Bargull <[email protected]>
@anba
Copy link
Contributor Author

anba commented Jan 3, 2024

Fixed in #758.

@anba anba closed this as completed Jan 3, 2024
@sffc sffc moved this to Previously Discussed in ECMA-402 Meeting Topics Aug 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c: datetime Component: dates, times, timezones s: help wanted Status: help wanted; needs proposal champion Small Smaller change solvable in a Pull Request
Projects
Archived in project
Development

No branches or pull requests