Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support negative epoch_millis timestamps #80208

Merged

Conversation

rkophs
Copy link

@rkophs rkophs commented Nov 2, 2021

note: Several issues have been opened & closed for this regression. Assuming this PR is accepted, I'm not sure whether it's best practice to create yet another issue, or re-open an existing issue to link in this PR as the contribution guidelines ask for. Will update the PR accordingly once I get feedback 🙂 .

Relevant issues:

Context

ES 7.0 introduced a regression by no longer supporting negative timestamps once the epoch parsing code was migrated to use DateTimeFormatters instead of DateFormatter implementations. This regression appears to have been introduced
when the DateTimeFormatter replaced DateFormatter for parsing/formatting and the issue at hand appears to live exclusively in the parsing/formatting phase. Following the discussion in 40983, it appears the issue at hand is added complexity in parsing/formatting epoch's in order to accommodate Java 8's date-time API representation of epoch timestamps as a tuple of: a signed epoch-seconds & a non-negative unsigned fractional time (sub second precision).

Indeed, Instant uses a epochSeconds + non-negative nanosOfSecond fractional part.

The alternative is to use an ISO 8601 string to represent negative epochs. That is not something we can do without several thousand lines of client code setup to handle the conversion at ingestion, queries & responses and is one of the single largest blockers to a Elasticsearch major version upgrade on our backend.

Changes

Enhance the formatting & parsing logic to support negative epoch timestamps:

  • Parsing: in the presence of a negative fractional timestamp, subtract the epochSeconds portion by 1second and increase the negative fraction by 1second (i.e. compute the absolute value of its complementary fraction) so that the fraction is always positive (.e.g. "-4700.900ms" -> -4s + -700.900ms -> -5s + 299.100ms -> -5s + 299100000ns)
  • Formatting: reverse operation to parsing. In the presence of a negative epochSeconds & non-zero fraction, Concat the results of: adding 1second to the epochSeconds, recompute complementary fraction of the fractional part.

Many of the relevant issues allude to an edge-case that makes this difficult to handle. If this PR does not handle that edge-case, please shed light on what that edge-case is 🙇 .

ES 7.0 introduced a regression by no longer supporting negative
timestamps once the epoch parsing code was migrated to use
DateTimeFormatters instead of DateFormatter implementations

This change adds back the support for negative epoch timestamps
@cla-checker-service
Copy link

cla-checker-service bot commented Nov 2, 2021

💚 CLA has been signed

@elasticsearchmachine elasticsearchmachine added v8.1.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Nov 2, 2021
@rkophs
Copy link
Author

rkophs commented Nov 2, 2021

^ The contributor agreement was signed an hour before the PR was made.

@adagios
Copy link

adagios commented Nov 9, 2021

I would very much like that something like this is merged. The decision to not support one of the most common date formats is in my opinion very ill advised.

The added interface code that ES now needs when receiving things from other systems, because it doesn't support one of the simplest standards is difficult to accept as a good decision.

Imagine if Java removed the parsing of timestamps...

@rkophs rkophs changed the title support negative epoch_timestamps support negative epoch_millis timestamps Nov 9, 2021
@jtibshirani jtibshirani added the :Core/Infra/Core Core issues without another label label Dec 1, 2021
@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Dec 1, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@rjernst
Copy link
Member

rjernst commented Jan 26, 2022

It has been many years so the details are a bit hazy for me. However, IIRC, the problem was not in the representation (which I believe I tried the same offsetting you do to make nanos of second always positive) but instead in Java’s format parsing for negative fractional seconds. I think it was something like -0.12345, the negative is lost by the time the nanos are handled. Can you add a test for this case?

@br3no
Copy link

br3no commented Feb 2, 2022

@rkophs there's an issue to address the same problem in OpenSearch here: opensearch-project/OpenSearch#1991. Would you like to submit your pull request to OpenSearch? Otherwise, would you mind if I port your fix to OpenSearch?

@mark-vieira mark-vieira added v8.2.0 and removed v8.1.0 labels Feb 2, 2022
@grcevski grcevski added the >bug label Feb 3, 2022
@rjernst
Copy link
Member

rjernst commented Feb 9, 2022

@elasticmachine update branch

@rjernst
Copy link
Member

rjernst commented Feb 9, 2022

@elasticmachine ok to test

@rjernst rjernst self-assigned this Feb 9, 2022
@rjernst rjernst self-requested a review February 9, 2022 16:24
Ryan Kophs added 2 commits February 19, 2022 20:50
Java DateTimeFormatter isn't able to parse "-0". Therefore,
the MILLIS portion of the parser fails for epochs such as
"-0.12345". We can overcome this by introducing an optional
TemporalField that represents whether the timestamp is
signed. It is only created during parsing in the presence
of a leading '-' char. Formatting will thus only support it
if we're doing with a INSTANT_SECONDS temporalAccessor that
happens to be negative.
@rkophs
Copy link
Author

rkophs commented Feb 20, 2022

@rjernst -- thanks for the insight. I added a test case for -0.12345. Indeed, you are correct. The DateTimeFormatter doesn't know how to parse "-0" for the millis portion of the timestamp.

I've added a bit of a "hack" to get that case to work by introducing a new TemporalField whose optional presence denotes that the timestamp is negative. This information is then able to be used by the MILLIS TemporalField to calculate the appropriate seconds/nanos parts during parsing. On the flipside, for formatting, the new TemporalField is only marked as supported for a TemporalAccessor that has a negative INSTANT_SECONDS value.

See my latest commit for details.

@rkophs rkophs force-pushed the support_negative_epoch_millis_timestamps branch from dd2edc3 to f225fcf Compare February 20, 2022 03:56
@rjernst
Copy link
Member

rjernst commented Feb 21, 2022

@elasticmachine update branch

@rjernst
Copy link
Member

rjernst commented Feb 23, 2022

@rkophs Could you please sign the CLA and I will review this more thoroughly?

@rkophs
Copy link
Author

rkophs commented Feb 28, 2022

@rjernst -- I've signed that CLA several times (using every one of my several different emails registered in Github). Does Elastic's CLA automation require the email to match what's tied to the commits, etc?

I'm happy to email Elastic a copy of my signed agreement(s).

edit Looks like the CLA check just passed. So I think we're good?

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rkophs This is a very clever approach, using a dummy field as a placeholder for negative values.

I left a couple minor comments, but the one thing I think must be addressed is that adding this support just for epoch_millis is going to make support for epoch timestamps inconsistent. Your approach should work for the epoch_second parser as well, which is in the same EpochTime class. Would you please adjust its implementation similarly?

fieldValues.put(ChronoField.INSTANT_SECONDS, seconds);
fieldValues.put(ChronoField.NANO_OF_SECOND, nanos);
// if there is already a milli of second, we need to overwrite it
if (fieldValues.containsKey(ChronoField.MILLI_OF_SECOND)) {
fieldValues.put(ChronoField.MILLI_OF_SECOND, nanos / 1_000_000);
}
if (fieldValues.containsKey(ChronoField.MICRO_OF_SECOND)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed?

Copy link
Author

@rkophs rkophs Mar 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not needed. I added it defensively since the same logic is happening above with MILLI_OF_SECOND, but I'm happy to remove it since it's not directly related to the change at hand.

long nanos = temporal.getLong(ChronoField.NANO_OF_SECOND);
if (nanos % 1_000_000 != 0) {
// Fractional negative timestamp.
// Add 1 ms towards positive infinity because the fraction leads
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adjustment is tricky, and I had to stare at it for a while to understand. I think the comment would be more helpful if it mentions rounding. Essentially there is some extra precision that we are going to throw away by converting this to millis, and this adjustment by 1 ensures we are always rounding towards the epoch. Is that right? If so, can you improve the comment a bit?

Copy link
Author

@rkophs rkophs Mar 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your understanding is correct. If we get into this branch, it means two things: 1) we're dealing with a negative epoch, 2) we're dealing with a fractional millisecond. Thus, the calculated millis invokes a precision loss that needs to be rounded towards the epoch for formatting (i.e. to a smaller magnitude for formatting, since Java date-time stores it as a negative at a higher magnitude by 1)

I'll update the comment to reflect.

@rdnm rdnm removed the team-discuss label Mar 2, 2022
@rkophs rkophs requested a review from rjernst March 21, 2022 22:02
@rjernst
Copy link
Member

rjernst commented Mar 23, 2022

@elasticmachine update branch

@rjernst
Copy link
Member

rjernst commented Mar 23, 2022

Thanks for the updates @rkophs. This looks good, I will wait for CI to complete and then merge.

@mark-vieira
Copy link
Contributor

@elasticsearchmachine generate changelog

1 similar comment
@mark-vieira
Copy link
Contributor

@elasticsearchmachine generate changelog

@rjernst rjernst merged commit c0799ec into elastic:master Mar 24, 2022
@rjernst
Copy link
Member

rjernst commented Mar 24, 2022

Thanks for your work here @rkophs! This is a clever approach.

amooncake added a commit to amooncake/elasticsearch that referenced this pull request Dec 16, 2022
Negative epoch timestamps are supported in 8.2.0 by pr elastic#80208
abdonpijpelink pushed a commit that referenced this pull request Dec 20, 2022
Negative epoch timestamps are supported in 8.2.0 by pr #80208
abdonpijpelink pushed a commit to abdonpijpelink/elasticsearch that referenced this pull request Dec 20, 2022
Negative epoch timestamps are supported in 8.2.0 by pr elastic#80208
elasticsearchmachine pushed a commit that referenced this pull request Dec 20, 2022
Negative epoch timestamps are supported in 8.2.0 by pr #80208

Co-authored-by: QY <[email protected]>
abdonpijpelink pushed a commit to abdonpijpelink/elasticsearch that referenced this pull request Dec 20, 2022
Negative epoch timestamps are supported in 8.2.0 by pr elastic#80208

(cherry picked from commit 7b17e1b)
abdonpijpelink added a commit that referenced this pull request Dec 21, 2022
Negative epoch timestamps are supported in 8.2.0 by pr #80208

(cherry picked from commit 7b17e1b)

Co-authored-by: QY <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Core/Infra/Core Core issues without another label external-contributor Pull request authored by a developer outside the Elasticsearch team Team:Core/Infra Meta label for core/infra team v8.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants