-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support negative epoch_millis timestamps #80208
support negative epoch_millis timestamps #80208
Conversation
ES 7.0 introduced a regression by no longer supporting negative timestamps once the epoch parsing code was migrated to use DateTimeFormatters instead of DateFormatter implementations This change adds back the support for negative epoch timestamps
💚 CLA has been signed |
^ The contributor agreement was signed an hour before the PR was made. |
I would very much like that something like this is merged. The decision to not support one of the most common date formats is in my opinion very ill advised. The added interface code that ES now needs when receiving things from other systems, because it doesn't support one of the simplest standards is difficult to accept as a good decision. Imagine if Java removed the parsing of timestamps... |
Pinging @elastic/es-core-infra (Team:Core/Infra) |
It has been many years so the details are a bit hazy for me. However, IIRC, the problem was not in the representation (which I believe I tried the same offsetting you do to make nanos of second always positive) but instead in Java’s format parsing for negative fractional seconds. I think it was something like -0.12345, the negative is lost by the time the nanos are handled. Can you add a test for this case? |
@rkophs there's an issue to address the same problem in OpenSearch here: opensearch-project/OpenSearch#1991. Would you like to submit your pull request to OpenSearch? Otherwise, would you mind if I port your fix to OpenSearch? |
@elasticmachine update branch |
@elasticmachine ok to test |
Java DateTimeFormatter isn't able to parse "-0". Therefore, the MILLIS portion of the parser fails for epochs such as "-0.12345". We can overcome this by introducing an optional TemporalField that represents whether the timestamp is signed. It is only created during parsing in the presence of a leading '-' char. Formatting will thus only support it if we're doing with a INSTANT_SECONDS temporalAccessor that happens to be negative.
…thub.com/rkophs/elasticsearch into support_negative_epoch_millis_timestamps
@rjernst -- thanks for the insight. I added a test case for I've added a bit of a "hack" to get that case to work by introducing a new See my latest commit for details. |
dd2edc3
to
f225fcf
Compare
@elasticmachine update branch |
@rjernst -- I've signed that CLA several times (using every one of my several different emails registered in Github). Does Elastic's CLA automation require the email to match what's tied to the commits, etc? I'm happy to email Elastic a copy of my signed agreement(s). edit Looks like the CLA check just passed. So I think we're good? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rkophs This is a very clever approach, using a dummy field as a placeholder for negative values.
I left a couple minor comments, but the one thing I think must be addressed is that adding this support just for epoch_millis is going to make support for epoch timestamps inconsistent. Your approach should work for the epoch_second
parser as well, which is in the same EpochTime class. Would you please adjust its implementation similarly?
fieldValues.put(ChronoField.INSTANT_SECONDS, seconds); | ||
fieldValues.put(ChronoField.NANO_OF_SECOND, nanos); | ||
// if there is already a milli of second, we need to overwrite it | ||
if (fieldValues.containsKey(ChronoField.MILLI_OF_SECOND)) { | ||
fieldValues.put(ChronoField.MILLI_OF_SECOND, nanos / 1_000_000); | ||
} | ||
if (fieldValues.containsKey(ChronoField.MICRO_OF_SECOND)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not needed. I added it defensively since the same logic is happening above with MILLI_OF_SECOND
, but I'm happy to remove it since it's not directly related to the change at hand.
long nanos = temporal.getLong(ChronoField.NANO_OF_SECOND); | ||
if (nanos % 1_000_000 != 0) { | ||
// Fractional negative timestamp. | ||
// Add 1 ms towards positive infinity because the fraction leads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This adjustment is tricky, and I had to stare at it for a while to understand. I think the comment would be more helpful if it mentions rounding. Essentially there is some extra precision that we are going to throw away by converting this to millis, and this adjustment by 1 ensures we are always rounding towards the epoch. Is that right? If so, can you improve the comment a bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your understanding is correct. If we get into this branch, it means two things: 1) we're dealing with a negative epoch, 2) we're dealing with a fractional millisecond. Thus, the calculated millis
invokes a precision loss that needs to be rounded towards the epoch for formatting (i.e. to a smaller magnitude for formatting, since Java date-time stores it as a negative at a higher magnitude by 1)
I'll update the comment to reflect.
@elasticmachine update branch |
Thanks for the updates @rkophs. This looks good, I will wait for CI to complete and then merge. |
@elasticsearchmachine generate changelog |
1 similar comment
@elasticsearchmachine generate changelog |
Thanks for your work here @rkophs! This is a clever approach. |
Negative epoch timestamps are supported in 8.2.0 by pr elastic#80208
Negative epoch timestamps are supported in 8.2.0 by pr #80208
Negative epoch timestamps are supported in 8.2.0 by pr elastic#80208
Negative epoch timestamps are supported in 8.2.0 by pr #80208 Co-authored-by: QY <[email protected]>
Negative epoch timestamps are supported in 8.2.0 by pr elastic#80208 (cherry picked from commit 7b17e1b)
Negative epoch timestamps are supported in 8.2.0 by pr #80208 (cherry picked from commit 7b17e1b) Co-authored-by: QY <[email protected]>
note: Several issues have been opened & closed for this regression. Assuming this PR is accepted, I'm not sure whether it's best practice to create yet another issue, or re-open an existing issue to link in this PR as the contribution guidelines ask for. Will update the PR accordingly once I get feedback 🙂 .
Relevant issues:
Context
ES 7.0 introduced a regression by no longer supporting negative timestamps once the epoch parsing code was migrated to use DateTimeFormatters instead of DateFormatter implementations. This regression appears to have been introduced
when the DateTimeFormatter replaced DateFormatter for parsing/formatting and the issue at hand appears to live exclusively in the parsing/formatting phase. Following the discussion in 40983, it appears the issue at hand is added complexity in parsing/formatting epoch's in order to accommodate Java 8's date-time API representation of epoch timestamps as a tuple of: a signed epoch-seconds & a non-negative unsigned fractional time (sub second precision).
Indeed,
Instant
uses a epochSeconds + non-negative nanosOfSecond fractional part.The alternative is to use an
ISO 8601
string to represent negative epochs. That is not something we can do without several thousand lines of client code setup to handle the conversion at ingestion, queries & responses and is one of the single largest blockers to a Elasticsearch major version upgrade on our backend.Changes
Enhance the formatting & parsing logic to support negative epoch timestamps:
"-4700.900ms"
->-4s + -700.900ms
->-5s + 299.100ms
->-5s + 299100000ns
)Many of the relevant issues allude to an edge-case that makes this difficult to handle. If this PR does not handle that edge-case, please shed light on what that edge-case is 🙇 .