More robust Date/Time format patterns parsing (#7826)
- Closes #7461 by introducing a `Date_Time_Formatter` type and making the parsing of date/time formats more robust and safer.
- The default ('simple') set of patterns is slightly simplified and made case-insensitive (except for `M/m` and `H/h`) to avoid the `YYYY` vs `yyyy` issues and make it less error-prone.
- `YYYY` now has the same meaning as `yyyy` in simple mode. The old meaning (week-based year) is moved to a _separate mode_, triggered by `Date_Time_Formatter.from_iso_week_date_pattern`.
- The full Java syntax, as well as a custom-built Java `DateTimeFormatter`, can also be used through `Date_Time_Formatter.from_java`.
- Text-based constants (e.g. `ISO_ZONED_DATE_TIME`) have now become methods on `Date_Time_Formatter` (e.g. `Date_Time_Formatter.iso_zoned_date_time`).
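
The `YYYY` vs `yyyy` issue mentioned above comes from Java's `DateTimeFormatter`, where `y` is the year-of-era and `Y` is the week-based year. The following is a minimal Java sketch of that underlying behavior, not code from this commit. The class and method names (`WeekYearDemo`, `calendarYear`, `weekBasedYear`) are hypothetical, and `Locale.US` is an arbitrary choice made only for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical demo class; it is not part of the Enso code base.
final class WeekYearDemo {
    // `yyyy` is the year-of-era, i.e. the ordinary calendar year.
    static String calendarYear(LocalDate date) {
        return DateTimeFormatter.ofPattern("yyyy-MM-dd", Locale.US).format(date);
    }

    // `YYYY` is the week-based year, which can differ from the calendar
    // year for dates close to 1 January.
    static String weekBasedYear(LocalDate date) {
        return DateTimeFormatter.ofPattern("YYYY-MM-dd", Locale.US).format(date);
    }

    public static void main(String[] args) {
        // 30 December 2019 already belongs to the first week of 2020.
        LocalDate date = LocalDate.of(2019, 12, 30);
        System.out.println(calendarYear(date));  // prints 2019-12-30
        System.out.println(weekBasedYear(date)); // prints 2020-12-30
    }
}
```

For most dates the two patterns agree, so the difference tends to surface only around the turn of the year. Treating `YYYY` like `yyyy` in simple mode avoids that surprise, while the week-based meaning stays available through the separate mode.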
radeusgd authored Sep 22, 2023
1 parent 3e615f3 commit 12c4f29
Showing 44 changed files with 2,011 additions and 724 deletions.
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -576,6 +576,8 @@
- [Added `Data.post` method to write to HTTP endpoints.][7700]
- [Added support for S3. Using `Input_Stream` more for reading.][7776]
- [Renamed `Decimal` to `Float`.][7807]
- [Implemented `Date_Time_Formatter` for more user-friendly date/time format
parsing.][7826]

[debug-shortcuts]:
https://github.com/enso-org/enso/blob/develop/app/gui/docs/product/shortcuts.md#debug
@@ -817,6 +819,7 @@
[7709]: https://github.com/enso-org/enso/pull/7709
[7776]: https://github.com/enso-org/enso/pull/7776
[7807]: https://github.com/enso-org/enso/pull/7807
[7826]: https://github.com/enso-org/enso/pull/7826

#### Enso Compiler

150 changes: 105 additions & 45 deletions distribution/lib/Standard/Base/0.0.0-dev/src/Data/Text/Extensions.enso
@@ -18,6 +18,7 @@ import project.Data.Text.Text_Sub_Range.Codepoint_Ranges
import project.Data.Text.Text_Sub_Range.Text_Sub_Range
import project.Data.Time.Date.Date
import project.Data.Time.Date_Time.Date_Time
import project.Data.Time.Date_Time_Formatter.Date_Time_Formatter
import project.Data.Time.Time_Of_Day.Time_Of_Day
import project.Data.Time.Time_Zone.Time_Zone
import project.Data.Vector.Vector
@@ -1473,17 +1474,10 @@ Text.parse_json self = Json.parse self

Converts text containing a date into a Date object.

Arguments:
- format: An optional format describing how to parse the text.

Returns a `Time_Error` if `self`` cannot be parsed using the provided
`format`.
This method will return a `Time_Error` if the provided time cannot be parsed.

? Format Syntax
A custom format string consists of one or more custom date and time format
specifiers. For example, "d MMM yyyy" will format "2011-12-03" as
"3 Dec 2011". See https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/time/format/DateTimeFormatter.html
for a complete format specification.
Arguments:
- format: The format to use for parsing the input text.

? Default Date Formatting
Unless you provide a custom format, the text must represent a valid date
@@ -1500,6 +1494,34 @@ Text.parse_json self = Json.parse self
- Two digits for the day-of-month. This is pre-padded by zero to ensure two
digits.

? Pattern Syntax
If the pattern is provided as `Text`, it is parsed using the format
described below. See `Date_Time_Formatter` for more options.
- y: Year. The number of pattern letters determines the minimum number of
digits.
- y: The year using any number of digits.
- yy: The year, using at most two digits. The default range is
1950-2049, but this can be changed by including the end year in
braces e.g. `yy{2099}`.
- yyyy: The year, using exactly four digits.
- M: Month of year. The number of pattern letters determines the format:
- M: Any number (1-12).
- MM: Month number with zero padding required (01-12).
- MMM: Short name of the month (Jan-Dec).
- MMMM: Full name of the month (January-December).
The month names depend on the selected locale.
- d: Day. The number of pattern letters determines the format:
- d: Any number (1-31).
- dd: Day number with zero padding required (01-31).
- ddd: Short name of the day of week (Mon-Sun).
- dddd: Full name of the day of week (Monday-Sunday).
The weekday names depend on the selected locale.
Both day of week and day of month may be included in a single pattern -
in such case the day of week is used as a sanity check.
- Q: Quarter of year.
If only year and quarter are provided in the pattern, when parsing a
date, the result will be the first day of that quarter.

> Example
Parse the date of 23rd December 2020.

@@ -1533,32 +1555,27 @@ Text.parse_json self = Json.parse self
date = "1999-1-1".parse_date "yyyy-MM-dd"
date.catch Time_Error (_->Date.new 2000 1 1)
@format make_date_format_selector
@locale Locale.default_widget
Text.parse_date : Text -> Locale -> Date ! Time_Error
Text.parse_date self format:Text="" locale:Locale=Locale.default = Date.parse self format locale
Text.parse_date : Date_Time_Formatter -> Date ! Time_Error
Text.parse_date self format:Date_Time_Formatter=Date_Time_Formatter.iso_date =
Date.parse self format

## ALIAS date_time from text
GROUP Conversions

Obtains an instance of `Date_Time` from a text such as
"2007-12-03T10:15:30+01:00 Europe/Paris".

This method will return a `Time_Error` if the provided time cannot be parsed.

Arguments:
- format: The format to use for parsing the input text.
- locale: The locale in which the format should be interpreted.

? Format Syntax
A custom format string consists of one or more custom date and time format
specifiers. For example, "d MMM yyyy" will format "2011-12-03" as
"3 Dec 2011". See https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/time/format/DateTimeFormatter.html
for a complete format specification.

? Default Date_Time Format
The text must represent a valid date-time as defined by the ISO-8601
format. (See https://en.wikipedia.org/wiki/ISO_8601.) If a time zone is
present, it must be in the ISO-8601 Extended Date/Time Format (EDTF).
(See https://en.wikipedia.org/wiki/ISO_8601#EDTF.) The time zone format
consists of:
Unless you provide a custom format, the text must represent a valid
date-time as defined by the ISO-8601 format (see https://en.wikipedia.org/wiki/ISO_8601).
If a time zone is present, it must be in the ISO-8601 Extended Date/Time
Format (EDTF) (see https://en.wikipedia.org/wiki/ISO_8601#EDTF). The time
zone format consists of:

- The ISO offset date time.
- If the zone ID is not available or is a zone offset then the format is
@@ -1568,8 +1585,45 @@ Text.parse_date self format:Text="" locale:Locale=Locale.default = Date.parse se
sensitive.
- A close square bracket ']'.

This method will return a `Time_Error` if the provided time cannot be parsed
using the above format.
? Pattern Syntax
If the pattern is provided as `Text`, it is parsed using the format
described below. See `Date_Time_Formatter` for more options.
- y: Year. The number of pattern letters determines the minimum number of
digits.
- y: The year using any number of digits.
- yy: The year, using at most two digits. The default range is
1950-2049, but this can be changed by including the end year in
braces e.g. `yy{2099}`.
- yyyy: The year, using exactly four digits.
- M: Month of year. The number of pattern letters determines the format:
- M: Any number (1-12).
- MM: Month number with zero padding required (01-12).
- MMM: Short name of the month (Jan-Dec).
- MMMM: Full name of the month (January-December).
The month names depend on the selected locale.
- d: Day. The number of pattern letters determines the format:
- d: Any number (1-31).
- dd: Day number with zero padding required (01-31).
- ddd: Short name of the day of week (Mon-Sun).
- dddd: Full name of the day of week (Monday-Sunday).
The weekday names depend on the selected locale.
Both day of week and day of month may be included in a single pattern -
in such case the day of week is used as a sanity check.
- Q: Quarter of year.
If only year and quarter are provided in the pattern, when parsing a
date, the result will be the first day of that quarter.
- H: 24h hour of day (0-23).
- h: 12h hour of day (0-12). The `a` pattern is needed to disambiguate
between AM and PM.
- m: Minute of hour.
- s: Second of minute.
- f: Fractional part of the second. The number of pattern letters
determines the number of digits. If one letter is used, any number of
digits will be accepted.
- a: AM/PM marker.
- T: If repeated 3 or less times - Time zone ID (e.g. Europe/Warsaw, Z,
-08:30), otherwise - Time zone name (e.g. Central European Time, CET).
- Z: Zone offset (e.g. +0000, -0830, +08:30:15).

> Example
Parse UTC time.
@@ -1621,31 +1675,24 @@ Text.parse_date self format:Text="" locale:Locale=Locale.default = Date.parse se
example_parse =
"06 of May 2020 at 04:30AM".parse_date_time "dd 'of' MMMM yyyy 'at' hh:mma"
@format make_date_time_format_selector
@locale Locale.default_widget
Text.parse_date_time : Text -> Locale -> Date_Time ! Time_Error
Text.parse_date_time self format:Text="" locale:Locale=Locale.default = Date_Time.parse self format locale
Text.parse_date_time : Date_Time_Formatter -> Date_Time ! Time_Error
Text.parse_date_time self format:Date_Time_Formatter=Date_Time_Formatter.default_enso_zoned_date_time =
Date_Time.parse self format

## ALIAS time_of_day from text, to_time_of_day
GROUP Conversions

Obtains an instance of `Time_Of_Day` from a text such as "10:15".

This method will return a `Time_Error` if the provided time cannot be parsed.

Arguments:
- format: The format to use for parsing the input text.
- locale: The locale in which the format should be interpreted.

Returns a `Time_Error` if the provided text cannot be parsed using the
default format.

? Format Syntax
A custom format string consists of one or more custom date and time format
specifiers. For example, "d MMM yyyy" will format "2011-12-03" as
"3 Dec 2011". See https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/time/format/DateTimeFormatter.html
for a complete format specification.

? Default Time Format
The text must represent a valid time and is parsed using the ISO-8601
extended local time format. The format consists of:
Unless you provide a custom format, the text must represent a valid time
and is parsed using the ISO-8601 extended local time format.
The format consists of:

- Two digits for the hour-of-day. This is pre-padded by zero to ensure two
digits.
@@ -1662,6 +1709,19 @@ Text.parse_date_time self format:Text="" locale:Locale=Locale.default = Date_Tim
- One to nine digits for the nano-of-second. As many digits will be output
as required.

? Pattern Syntax
If the pattern is provided as `Text`, it is parsed using the format
described below. See `Date_Time_Formatter` for more options.
- H: 24h hour of day (0-23).
- h: 12h hour of day (0-12). The `a` pattern is needed to disambiguate
between AM and PM.
- m: Minute of hour.
- s: Second of minute.
- f: Fractional part of the second. The number of pattern letters
determines the number of digits. If one letter is used, any number of
digits will be accepted.
- a: AM/PM marker.

> Example
Get the time 15:05:30.

@@ -1692,9 +1752,9 @@ Text.parse_date_time self format:Text="" locale:Locale=Locale.default = Date_Tim

example_parse = "4:30AM".parse_time_of_day "h:mma"
@format make_time_format_selector
@locale Locale.default_widget
Text.parse_time_of_day : Text -> Locale -> Time_Of_Day ! Time_Error
Text.parse_time_of_day self format:Text="" locale:Locale=Locale.default = Time_Of_Day.parse self format locale
Text.parse_time_of_day : Date_Time_Formatter -> Time_Of_Day ! Time_Error
Text.parse_time_of_day self format:Date_Time_Formatter=Date_Time_Formatter.iso_time =
Time_Of_Day.parse self format

## ALIAS time_zone from text, to_time_zone
GROUP Conversions
104 changes: 69 additions & 35 deletions distribution/lib/Standard/Base/0.0.0-dev/src/Data/Time/Date.enso
@@ -8,6 +8,7 @@ import project.Data.Text.Text
import project.Data.Time.Date_Period.Date_Period
import project.Data.Time.Date_Range.Date_Range
import project.Data.Time.Date_Time.Date_Time
import project.Data.Time.Date_Time_Formatter.Date_Time_Formatter
import project.Data.Time.Day_Of_Week.Day_Of_Week
import project.Data.Time.Day_Of_Week_From
import project.Data.Time.Duration.Duration
@@ -105,17 +106,11 @@ type Date

Arguments:
- text: The text to try and parse as a date.
- pattern: An optional pattern describing how to parse the text.
- locale: The locale in which the pattern should be interpreted.
- format: A pattern describing how to parse the text,
or a `Date_Time_Formatter`.

Returns a `Time_Error` if the provided `text` cannot be parsed using the
provided `pattern`.

? Pattern Syntax
A custom pattern string consists of one or more custom date and time
format specifiers. For example, "d MMM yyyy" will format "2011-12-03"
as "3 Dec 2011". See https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/time/format/DateTimeFormatter.html
for a complete format specification.
provided `format`.

? Default Date Formatting
Unless you provide a custom format, the text must represent a valid date
@@ -132,6 +127,34 @@ type Date
- Two digits for the day-of-month. This is pre-padded by zero to ensure two
digits.

? Pattern Syntax
If the pattern is provided as `Text`, it is parsed using the format
described below. See `Date_Time_Formatter` for more options.
- y: Year. The number of pattern letters determines the minimum number of
digits.
- y: The year using any number of digits.
- yy: The year, using at most two digits. The default range is
1950-2049, but this can be changed by including the end year in
braces e.g. `yy{2099}`.
- yyyy: The year, using exactly four digits.
- M: Month of year. The number of pattern letters determines the format:
- M: Any number (1-12).
- MM: Month number with zero padding required (01-12).
- MMM: Short name of the month (Jan-Dec).
- MMMM: Full name of the month (January-December).
The month names depend on the selected locale.
- d: Day. The number of pattern letters determines the format:
- d: Any number (1-31).
- dd: Day number with zero padding required (01-31).
- ddd: Short name of the day of week (Mon-Sun).
- dddd: Full name of the day of week (Monday-Sunday).
The weekday names depend on the selected locale.
Both day of week and day of month may be included in a single pattern -
in such case the day of week is used as a sanity check.
- Q: Quarter of year.
If only year and quarter are provided in the pattern, when parsing a
date, the result will be the first day of that quarter.

> Example
Parse the date of 23rd December 2020.

@@ -164,17 +187,10 @@ type Date
example_parse_err =
date = Date.parse "1999-1-1" "yyyy-MM-dd"
date.catch Time_Error (_->Date.new 2000 1 1)
@pattern make_date_format_selector
@locale Locale.default_widget
parse : Text -> Text -> Locale -> Date ! Time_Error
parse text:Text pattern:Text="" locale:Locale=Locale.default =
result = Panic.recover Any <|
formatter = if pattern.is_empty then Time_Utils.default_date_formatter else
Time_Utils.make_formatter pattern locale.java_locale
Time_Utils.parse_date text.trim formatter
result . map_error <| case _ of
err : JException -> Time_Error.Error err.getMessage
ex -> ex
@format make_date_format_selector
parse : Text -> Date_Time_Formatter -> Date ! Time_Error
parse text:Text format:Date_Time_Formatter=Date_Time_Formatter.iso_date =
format.parse_date text

## GROUP Metadata
Get the year field.
@@ -709,15 +725,36 @@ type Date
Format this date using the provided format specifier.

Arguments:
- pattern: The text specifying the format for formatting the date.
- locale: The locale in which the format should be interpreted.
(Defaults to Locale.default.)
- format: A pattern describing how to format the text,
or a `Date_Time_Formatter`.

? Pattern Syntax
A custom pattern string consists of one or more custom date and time
format specifiers. For example, "d MMM yyyy" will format "2011-12-03"
as "3 Dec 2011". See https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/time/format/DateTimeFormatter.html
for a complete format specification.
If the pattern is provided as `Text`, it is parsed using the format
described below. See `Date_Time_Formatter` for more options.
- y: Year. The number of pattern letters determines the minimum number of
digits.
- y: The year using any number of digits.
- yy: The year, using at most two digits. The default range is
1950-2049, but this can be changed by including the end year in
braces e.g. `yy{2099}`.
- yyyy: The year, using exactly four digits.
- M: Month of year. The number of pattern letters determines the format:
- M: Any number (1-12).
- MM: Month number with zero padding required (01-12).
- MMM: Short name of the month (Jan-Dec).
- MMMM: Full name of the month (January-December).
The month names depend on the selected locale.
- d: Day. The number of pattern letters determines the format:
- d: Any number (1-31).
- dd: Day number with zero padding required (01-31).
- ddd: Short name of the day of week (Mon-Sun).
- dddd: Full name of the day of week (Monday-Sunday).
The weekday names depend on the selected locale.
Both day of week and day of month may be included in a single pattern -
in such case the day of week is used as a sanity check.
- Q: Quarter of year.
If only year and quarter are provided in the pattern, when parsing a
date, the result will be the first day of that quarter.

> Example
Format "2020-06-02" as "2 Jun 2020"
@@ -749,14 +786,11 @@ type Date
> Example
Format "2020-06-21" with French locale as "21. juin 2020"

example_format = Date.new 2020 6 21 . format "d. MMMM yyyy" (Locale.new "fr")
@pattern (value-> make_date_format_selector value)
@locale Locale.default_widget
format : Text -> Locale -> Text
format self pattern:Text locale=Locale.default =
formatter = if pattern.is_empty then Time_Utils.default_date_formatter else
Time_Utils.make_formatter pattern locale.java_locale
Time_Utils.date_format self formatter
example_format = Date.new 2020 6 21 . format (Date_Time_Formatter.from "d. MMMM yyyy" (Locale.new "fr"))
@format (value-> make_date_format_selector value)
format : Date_Time_Formatter -> Text
format self format:Date_Time_Formatter=Date_Time_Formatter.iso_date =
format.format_date self

## PRIVATE
week_days_between start end =