Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Date field does not support partial dates anymore #45284

Closed
ywelsch opened this issue Aug 7, 2019 · 3 comments · Fixed by #46814
Closed

Date field does not support partial dates anymore #45284

ywelsch opened this issue Aug 7, 2019 · 3 comments · Fixed by #46814
Labels
>bug :Core/Infra/Core Core issues without another label

Comments

@ywelsch
Copy link
Contributor

ywelsch commented Aug 7, 2019

On ES 6.x and below, date fields (i.e. fields of "type": "date") used to accept values of the form "2018-09" (i.e. just year and month) whereas this no longer works on ES 7.x (tested on 7.3.0 and master).

This breaks recoveries from 6.x to 7.x and indexing pipelines.

@ywelsch ywelsch added the :Core/Infra/Core Core issues without another label label Aug 7, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@rjernst
Copy link
Member

rjernst commented Aug 8, 2019

The default parser for date fields is strict_date_optional_time (and a fallback to epoch millis, but that is unimportant for this discussion). It looks like confusing terminology in Joda has caused us to be too strict in several places with our conversion to java time. It looks like myself and others who worked on the java time migration viewed the "strict" term here as correlated with the "optional" portion, meaning the above format would have a required date, but an optional time. But upon further inspection it appears strict has to do with validation of the fields of the date, eg allowing 13 as the month. Unfortunately the "dueling" tests that we have comparing java time to joda did not compare these partial dates to joda.

Java time's facility for allowing lenient parsing is the ResolverStyle, which can be LENIENT, SMART or STRICT. SMART is actually the default, which will try to carry over field values out of the expected range (for example, july 32 would be translated to aug 1). Switching the strict parsers we have to allow optional portions of dates, and changing the resolver style to strict should match the old joda behavior.

We should also be consistent about not using SMART, and maybe discuss whether we need the LENIENT resolved versions of date formatters at all? It is unclear how these would be used in Elasticsearch since in many places we necessarily convert date/times to epoch millis or nanos, and I don't think a lenient parsed field can be converted in this way without a custom field resolver.

/cc @spinscale @pgomulka

@pgomulka
Copy link
Contributor

pgomulka commented Sep 3, 2019

So we can fix 4 things here:

  1. To enable partial parsing of dates. So that date_optional_time can parse 2018, 2018-01, 2018-01-01
  2. To make the strict_* parsers validate the length of date parts. This would make date_optional_time parse 99999-01-01 but strict_date_optional_time would fail. This is the difference between StrictISODateTimeFormat and ISODateTimeFormat in Joda implementation. I would however skip this one, as we already relaxed this in ES v7, some people might have started to use it.
  3. Enable ResolverStyle.Strict for all parsers. This is how it worked in Joda (there was no option to change this). By default Java is using ResolverStyle.SMART which would allow parsing 2018-02-30 but Joda would throw an exception.
  4. Consider renaming our formatters. strict_date_optional_time makes it unclear what is strict here (in fact it is the strict length parsing, not the presence of full date part or resolver style - which is always strict). Maybe we could start with adding aliases for these unclear. Like strict_length_date_time ? Or any other ideas?

pgomulka added a commit that referenced this issue Sep 27, 2019
…#46654)

Currently DateMathParser with roundUp = true is relying on the DateFormatter build with combined optional sub parsers with defaulted fields (depending on the formatter). That means that for yyyy-MM-dd'T'HH:mm:ss||yyyy-MM-dd'T'HH:mm:ss.SSS
Java.time implementation expects optional parsers in order from most specific to least specific (reverse in the example above).
It is causing a problem because the first parsing succeeds but does not consume the full input. The second parser should be used.
We can work around this with keeping a list of RoundUpParsers and iterate over them choosing the one that parsed full input. The same approach we used for regular (non date math) in
relates #40100
The jdk is not considering this to be a bug https://bugs.openjdk.java.net/browse/JDK-8188771

Those below will expect this change first
relates #46242
relates #45284
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Sep 27, 2019
…elastic#46654)

Currently DateMathParser with roundUp = true is relying on the DateFormatter build with
combined optional sub parsers with defaulted fields (depending on the formatter).
That means that for yyyy-MM-dd'T'HH:mm:ss||yyyy-MM-dd'T'HH:mm:ss.SSS
Java.time implementation expects optional parsers in order from most specific to
least specific (reverse in the example above).
It is causing a problem because the first parsing succeeds but does not consume the full input.
The second parser should be used.
We can work around this with keeping a list of RoundUpParsers and iterate over them choosing
the one that parsed full input. The same approach we used for regular (non date math) in
relates elastic#40100
The jdk is not considering this to be a bug https://bugs.openjdk.java.net/browse/JDK-8188771

Those below will expect this change first
relates elastic#46242
relates elastic#45284
pgomulka added a commit that referenced this issue Sep 30, 2019
…654) (#47217)

Currently DateMathParser with roundUp = true is relying on the DateFormatter build with
combined optional sub parsers with defaulted fields (depending on the formatter).
That means that for yyyy-MM-dd'T'HH:mm:ss||yyyy-MM-dd'T'HH:mm:ss.SSS
Java.time implementation expects optional parsers in order from most specific to
least specific (reverse in the example above).
It is causing a problem because the first parsing succeeds but does not consume the full input.
The second parser should be used.
We can work around this with keeping a list of RoundUpParsers and iterate over them choosing
the one that parsed full input. The same approach we used for regular (non date math) in
relates #40100
The jdk is not considering this to be a bug https://bugs.openjdk.java.net/browse/JDK-8188771

Those below will expect this change first
relates #46242
relates #45284
backport #46654
pgomulka added a commit that referenced this issue Oct 10, 2019
Enable partial parsing of date part. This is making the behaviour in java.time implementation the same as with joda.
2018, 2018-01 and 2018-01-01 are all valid dates for date_optional_time or strict_date_optional_time
closes #45284
closes #47473
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Oct 10, 2019
Enable partial parsing of date part. This is making the behaviour in java.time implementation the same as with joda.
2018, 2018-01 and 2018-01-01 are all valid dates for date_optional_time or strict_date_optional_time
closes elastic#45284
closes elastic#47473
pgomulka added a commit that referenced this issue Oct 11, 2019
Enable partial parsing of date part. This is making the behaviour in java.time implementation the same as with joda.
2018, 2018-01 and 2018-01-01 are all valid dates for date_optional_time or strict_date_optional_time
closes #45284
closes #47473
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Dec 9, 2019
…654) (elastic#47217)

Currently DateMathParser with roundUp = true is relying on the DateFormatter build with
combined optional sub parsers with defaulted fields (depending on the formatter).
That means that for yyyy-MM-dd'T'HH:mm:ss||yyyy-MM-dd'T'HH:mm:ss.SSS
Java.time implementation expects optional parsers in order from most specific to
least specific (reverse in the example above).
It is causing a problem because the first parsing succeeds but does not consume the full input.
The second parser should be used.
We can work around this with keeping a list of RoundUpParsers and iterate over them choosing
the one that parsed full input. The same approach we used for regular (non date math) in
relates elastic#40100
The jdk is not considering this to be a bug https://bugs.openjdk.java.net/browse/JDK-8188771

Those below will expect this change first
relates elastic#46242
relates elastic#45284
backport elastic#46654
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Feb 7, 2020
…654) (elastic#47217)

Currently DateMathParser with roundUp = true is relying on the DateFormatter build with
combined optional sub parsers with defaulted fields (depending on the formatter).
That means that for yyyy-MM-dd'T'HH:mm:ss||yyyy-MM-dd'T'HH:mm:ss.SSS
Java.time implementation expects optional parsers in order from most specific to
least specific (reverse in the example above).
It is causing a problem because the first parsing succeeds but does not consume the full input.
The second parser should be used.
We can work around this with keeping a list of RoundUpParsers and iterate over them choosing
the one that parsed full input. The same approach we used for regular (non date math) in
relates elastic#40100
The jdk is not considering this to be a bug https://bugs.openjdk.java.net/browse/JDK-8188771

Those below will expect this change first
relates elastic#46242
relates elastic#45284
backport elastic#46654
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Core/Infra/Core Core issues without another label
Projects
None yet
5 participants