-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pkg/ottl] Adding support for locale into our time parsing library #32977
Labels
Comments
michalpristas
added
enhancement
New feature or request
needs triage
New item requiring triage
labels
May 10, 2024
4 tasks
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
michalpristas
changed the title
[pkg/ottl] Adding support for locale into our time library
[pkg/ottl] Adding support for locale into our time parsing library
May 10, 2024
Hi there! may I get this issue assigned? I'm working on a lib for doing exactly this and could be submitting a PR soon. |
TylerHelmuth
pushed a commit
that referenced
this issue
Sep 6, 2024
…34353) **Description:** <Describe what has changed.> Added support for localized time parsing into the `coreinternal/timeutils` package. The [tracking issue](#32977) is a following up to #32140, and the added function (`ParseLocalizedStrptime`) can be used later to [add locale support](#32978) to the ottl `Time` function. As discussed in the related issues, the plan is to have a similar support as implemented by the library [monday](https://github.com/goodsign/monday), making it possible to parse non-english time strings into `time.Time` objects. Elastic has built a new OSS library for that same purpose ([lunes](https://github.com/elastic/lunes)), that considering the discussed requirements, seems to be a better option than `monday`. Both libraries are just wrapper to the `time.Parse` and `time.ParseInLocation` features. They work by translating the foreign language value to English before calling the standard parsing functions. The main `lunes` differences are: 1 - Performance is considerably better, ~13x faster for complete .Parse operations: ``` BenchmarkParseLunes-10 2707008 429.7 ns/op 220 B/op 5 allocs/op BenchmarkParseMonday-10 212086 5630 ns/op 3753 B/op 117 allocs/op BenchmarkParseInLocationLunes-10 2777323 429.4 ns/op 220 B/op 5 allocs/op BenchmarkParseInLocationMonday-10 210357 5596 ns/op 3754 B/op 117 allocs/op ``` Given `ParseLocalizedStrptime` uses `lunes.Translate` under the hood, its performance is similar to the existing `ParseStrptime`, adding an extra overhead of ~303 ns/op for translating the value before parsing: ``` BenchmarkTranslate-10 3591234 302.4 ns/op 220 B/op 5 allocs/op ``` ``` BenchmarkParseLocalizedStrptime-10 821572 1405 ns/op 429 B/op 9 allocs/op BenchmarkParseStrptime-10 1000000 1082 ns/op 186 B/op 6 allocs/op ``` 2 - Translations are based on the [CLDR](https://cldr.unicode.org/) project, and it does support 900+ locales (vs 45+), including locales in draft stage. Those lunes translations are [generated](https://github.com/elastic/lunes/blob/main/generator.go) from a CLDR core file, and does not require manually changes. 3 - Replicates all the relevant standard `time.format_test.go` test cases, parsing foreign language values with and without layout replacements in all available locales and supported layouts (https://github.com/elastic/lunes/blob/main/lunes_test.go#L154). 4 - It is actively maintained (hosted under elastic repo). **Link to tracking Issue:** #32977 **Testing:** - Added unit tests For now, the only way of manually testing this functionality is by invoking the `ParseLocalizedStrptime` function manually through tests. **Documentation:**
@michalpristas @TylerHelmuth, I think we should close this issue as it was done by #34353. Thanks! |
f7o
pushed a commit
to f7o/opentelemetry-collector-contrib
that referenced
this issue
Sep 12, 2024
…pen-telemetry#34353) **Description:** <Describe what has changed.> Added support for localized time parsing into the `coreinternal/timeutils` package. The [tracking issue](open-telemetry#32977) is a following up to open-telemetry#32140, and the added function (`ParseLocalizedStrptime`) can be used later to [add locale support](open-telemetry#32978) to the ottl `Time` function. As discussed in the related issues, the plan is to have a similar support as implemented by the library [monday](https://github.com/goodsign/monday), making it possible to parse non-english time strings into `time.Time` objects. Elastic has built a new OSS library for that same purpose ([lunes](https://github.com/elastic/lunes)), that considering the discussed requirements, seems to be a better option than `monday`. Both libraries are just wrapper to the `time.Parse` and `time.ParseInLocation` features. They work by translating the foreign language value to English before calling the standard parsing functions. The main `lunes` differences are: 1 - Performance is considerably better, ~13x faster for complete .Parse operations: ``` BenchmarkParseLunes-10 2707008 429.7 ns/op 220 B/op 5 allocs/op BenchmarkParseMonday-10 212086 5630 ns/op 3753 B/op 117 allocs/op BenchmarkParseInLocationLunes-10 2777323 429.4 ns/op 220 B/op 5 allocs/op BenchmarkParseInLocationMonday-10 210357 5596 ns/op 3754 B/op 117 allocs/op ``` Given `ParseLocalizedStrptime` uses `lunes.Translate` under the hood, its performance is similar to the existing `ParseStrptime`, adding an extra overhead of ~303 ns/op for translating the value before parsing: ``` BenchmarkTranslate-10 3591234 302.4 ns/op 220 B/op 5 allocs/op ``` ``` BenchmarkParseLocalizedStrptime-10 821572 1405 ns/op 429 B/op 9 allocs/op BenchmarkParseStrptime-10 1000000 1082 ns/op 186 B/op 6 allocs/op ``` 2 - Translations are based on the [CLDR](https://cldr.unicode.org/) project, and it does support 900+ locales (vs 45+), including locales in draft stage. Those lunes translations are [generated](https://github.com/elastic/lunes/blob/main/generator.go) from a CLDR core file, and does not require manually changes. 3 - Replicates all the relevant standard `time.format_test.go` test cases, parsing foreign language values with and without layout replacements in all available locales and supported layouts (https://github.com/elastic/lunes/blob/main/lunes_test.go#L154). 4 - It is actively maintained (hosted under elastic repo). **Link to tracking Issue:** open-telemetry#32977 **Testing:** - Added unit tests For now, the only way of manually testing this functionality is by invoking the `ParseLocalizedStrptime` function manually through tests. **Documentation:**
jriguera
pushed a commit
to springernature/opentelemetry-collector-contrib
that referenced
this issue
Oct 4, 2024
…pen-telemetry#34353) **Description:** <Describe what has changed.> Added support for localized time parsing into the `coreinternal/timeutils` package. The [tracking issue](open-telemetry#32977) is a following up to open-telemetry#32140, and the added function (`ParseLocalizedStrptime`) can be used later to [add locale support](open-telemetry#32978) to the ottl `Time` function. As discussed in the related issues, the plan is to have a similar support as implemented by the library [monday](https://github.com/goodsign/monday), making it possible to parse non-english time strings into `time.Time` objects. Elastic has built a new OSS library for that same purpose ([lunes](https://github.com/elastic/lunes)), that considering the discussed requirements, seems to be a better option than `monday`. Both libraries are just wrapper to the `time.Parse` and `time.ParseInLocation` features. They work by translating the foreign language value to English before calling the standard parsing functions. The main `lunes` differences are: 1 - Performance is considerably better, ~13x faster for complete .Parse operations: ``` BenchmarkParseLunes-10 2707008 429.7 ns/op 220 B/op 5 allocs/op BenchmarkParseMonday-10 212086 5630 ns/op 3753 B/op 117 allocs/op BenchmarkParseInLocationLunes-10 2777323 429.4 ns/op 220 B/op 5 allocs/op BenchmarkParseInLocationMonday-10 210357 5596 ns/op 3754 B/op 117 allocs/op ``` Given `ParseLocalizedStrptime` uses `lunes.Translate` under the hood, its performance is similar to the existing `ParseStrptime`, adding an extra overhead of ~303 ns/op for translating the value before parsing: ``` BenchmarkTranslate-10 3591234 302.4 ns/op 220 B/op 5 allocs/op ``` ``` BenchmarkParseLocalizedStrptime-10 821572 1405 ns/op 429 B/op 9 allocs/op BenchmarkParseStrptime-10 1000000 1082 ns/op 186 B/op 6 allocs/op ``` 2 - Translations are based on the [CLDR](https://cldr.unicode.org/) project, and it does support 900+ locales (vs 45+), including locales in draft stage. Those lunes translations are [generated](https://github.com/elastic/lunes/blob/main/generator.go) from a CLDR core file, and does not require manually changes. 3 - Replicates all the relevant standard `time.format_test.go` test cases, parsing foreign language values with and without layout replacements in all available locales and supported layouts (https://github.com/elastic/lunes/blob/main/lunes_test.go#L154). 4 - It is actively maintained (hosted under elastic repo). **Link to tracking Issue:** open-telemetry#32977 **Testing:** - Added unit tests For now, the only way of manually testing this functionality is by invoking the `ParseLocalizedStrptime` function manually through tests. **Documentation:**
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Component(s)
pkg/ottl
Is your feature request related to a problem? Please describe.
This is a followup to #32140
When timestamp is in non-english language normal golang parsing will fail.
example:
when the tool i'm observing is producing Italian logs in a form of
mercoledì set 4 2024
parsing with a layoutMonday Jan 2 2006
should yield4th September 2024
correctlyDescribe the solution you'd like
Similar support as implemented in https://github.com/ruang-guru/monday for our stanza time library
We could use CLDR data and generate mapping using
ca-generic.json
file from specified locale to English.then use golang standard library to parse translated English date
Describe alternatives you've considered
No response
Additional context
No response
The text was updated successfully, but these errors were encountered: