-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add back support for negative epoch values for time fields in Elasticsearch #40983
Comments
To add to what Hanoch posted, there is a lot of historical data that one might want to store. One tiny example could be:
As an example at a much larger scale, an organization could very well be interested in extending a dataset like GDELT farther into the past. |
To clarify, you're not prevented from indexing historical data, only that we did not build support for negative timestamps into
I could be misreading the source material, but it doesn't appear to me that GDELT uses |
@jasontedor for existed negative epoch_mills, will the reindex API automatically do the conversion when migrating from 6.x to 7.x? |
Pinging @elastic/es-search |
To clarify - the support for for negative timestamps in the As I mentioned in the issue description -
|
In addition to merely storing it, we are interested in the performance of date handling.
What the GDELT project itself does it immaterial to using it as in illustration of something an organization may want to do - and with performant date handling. |
It's material in so far as this issue is about negative timestamps and |
If we are to switch our implementation for newly created indices at Elasticsearch v7.x to use the ISO 8601 string format for date field values, what is your recommendation on how to define the date field format in the index (template) mapping?
We are thinking to remove the date format from the index (template) mapping to allow it to default to the default "strict_date_optional_time||epoch_millis" value, but looking for some kind of confirmation that that approach is the correct and recommended one from both performance as well as functionality (support ISO 8601 string date values as well as positive epoch_millis long date values) point of view. Is the 8601 string date format as performing as the epoch_millis long date format, from an index & query/search time point of view? |
Based on your later comment, it seems you are planning to sometimes pass epoch millis, and sometimes pass ISO 8601 dates? If that is the case, and epoch millis would be used for most dates (all dates after 1970), you might consider still setting the format, but reversing the order so that
We already recommend doing your own performance tests, as each use case is different. Switching the ordering of formats as suggested above could help with performance depending on the workload.
Correct, the |
Some clarifications: Clarification 1
If we are to always only pass ISO 8601 dates (since we are not sure if the data might or might not have dates before 1970), what would be your recommendation:
Clarification 2For Indices we know we don't need to support date values before 1970, which format is more performing - |
I can't give any concrete recommendation based on performance, as I have not done any performance tests of these date formats. It is hard to say whether either would be any faster than the other in practice. It would depend heavily on things like ingestion rate, document size, number of mapped fields, etc. This is why we recommend doing your own performance tests. As for whether to set one or many formats, that completely depends on if you will use the multiple formats. And for the order, I would put the most commonly used first. |
Pinging @elastic/es-core-infra |
This issue has also hit us. We usually store
Given that the dates are stored as longs internally anyway, it seems really silly to have to do all this hacking on the outside for older dates. A workaround could be to simply always use strings, but this would add a overhead of constantly printing and parsing ISO datetime strings that is simply not needed. Is there any chance we could get this regression addressed? |
What is the status of this issue in the 7.3.0 release? |
@hanoch There is no change in the 7.3.0 release. |
@hanoch @Jamalarm We really appreciate the feedback on date usage; it is always welcome. Early feedback especially helps us prevent surprising changes that slip through without proper communication. In this case, we probably should have done a better job before 7.0 identifying this as a potential pain point for upgrade, and we are sorry for that. I realized in reading back through this issue I had never given a proper explanation of what caused the change in the first place. The underlying issue is a difference in how Joda and Java 8 represent instants. Joda internally represented these as milliseconds, while java represents epoch seconds and fractions separately, where the fraction is always positive. Due to how Java 8 date parsers work, these values are parsed independently, so we have no (easy) way of keeping track of negative fractional time (less than one second before epoch) because the negative is lost from the 0 seconds. When originally working on the migration to java 8 time, I had an idea for how to work around this, but it would require several hundred lines of code and many more for tests as well. While we understand that such an edge case an be frustrating, we also think negative epoch values in general are uncommon. Since ISO date times (and other formats) still work to represent dates both before and after epoch, we don't think the overhead of that additional code is worth the maintenance complexity. Thus, we do not plan to make any change to this behavior. Given that, I hope you don't mind that I close this issue. |
@rjernst So according to your explanation above - the only limitation of using epoch_millis values are for values that represent milliseconds of the last second before the epoch (the last second of the year 1970 UTC). These values are blocked from being indexed, but all other positive and negative epoch_millis values are allowed. Please confirm. |
Sorry for the confusion @hanoch. All negative values are disallowed, because of this edge case in negative handling. |
I got hit by this. We have epoch millis in our source data, representing points in time before 1970. Nowhere in https://www.elastic.co/guide/en/elasticsearch/reference/current/date.html is it mentioned that negative epoch millis throws errors :-( could maybe documentation be improved? |
This commit adds a reminder to date type documentation that negative epoch times are not supported. relates elastic#40983
This commit adds a reminder to date type documentation that negative epoch times are not supported. relates #40983
This commit adds a reminder to date type documentation that negative epoch times are not supported. relates #40983
This commit adds a reminder to date type documentation that negative epoch times are not supported. relates #40983
I would just like to suggest that epoch values are likely the most common representation of UTC dates (Java System.currentTimeMillis(), JavaScript Date.now()), and dates before 1970 are also quite common. It seems non-obvious that you have to choose a different representation for dates before 1970. |
Testing with Elasticsearch v7.0.0-rc2, we can't seem to be able to write negative date epoch long values anymore using the
epoch_millis
date field format. This means that any date value before 1/1/1970 is a problem.We are seeing the following error message:
...failed to parse date field [-12321] with format [epoch_millis]...
Using the following GA versions of Elasticsearch we were and are able to write documents with negative date epoch long values using the
epoch_millis
date field format - v1.6.x, v1.7.x, v2.3.2, v5.5.0, v5.6.5, v6.4.2, v6.6.1, v6.6.2, v6.7.0, v6.7.1.We are aware of the workaround to switch to use the ISO8601 string representation of the date values and set the field value to use that date field format rather than the
epoch_millis
format, but we are concerned about the need to change our code-bases and products from using the epoch long values to the ISO8601 string values because of:Relates to:
Logs:
Error Message:
Insert operation failed. Error: "WrappedArray([UpsertFieldValue:null]:Map(type -> mapper_parsing_exception, reason -> failed to parse field [IntervalStart] of type [date] in document with id 'XTYw5WkBP1AXtllM8BvD', caused_by -> Map(type -> illegal_argument_exception, reason -> failed to parse date field [-12321] with format [epoch_millis], caused_by -> Map(type -> date_time_parse_exception, reason -> Failed to parse with all enclosed parsers))),[UpsertFieldValue:null]:Map(type -> mapper_parsing_exception, reason -> failed to parse field [IntervalStart] of type [date] in document with id 'XjYw5WkBP1AXtllM8BvD', caused_by -> Map(type -> illegal_argument_exception, reason -> failed to parse date field [-5718] with format [epoch_millis], caused_by -> Map(type -> date_time_parse_exception, reason -> Failed to parse with all enclosed parsers))))".
Stacktrace:
com.esri.rf.HttpException: Insert operation failed. Error: "WrappedArray([UpsertFieldValue:null]:Map(type -> mapper_parsing_exception, reason -> failed to parse field [IntervalStart] of type [date] in document with id 'XTYw5WkBP1AXtllM8BvD', caused_by -> Map(type -> illegal_argument_exception, reason -> failed to parse date field [-12321] with format [epoch_millis], caused_by -> Map(type -> date_time_parse_exception, reason -> Failed to parse with all enclosed parsers))),[UpsertFieldValue:null]:Map(type -> mapper_parsing_exception, reason -> failed to parse field [IntervalStart] of type [date] in document with id 'XjYw5WkBP1AXtllM8BvD', caused_by -> Map(type -> illegal_argument_exception, reason -> failed to parse date field [-5718] with format [epoch_millis], caused_by -> Map(type -> date_time_parse_exception, reason -> Failed to parse with all enclosed parsers))))". at com.esri.bds.dialect.elasticsearch.editor.MultiPointEditorDialectTester.addFeatures(EditorDIalectTester.scala:49)
Standard Output:
2019-04-04T06:53:03,470 | INFO | c.e.b.d.e.e.MultiPointEditorDialectTester:342 | before test 2019-04-04T06:53:03,703 | ERROR | c.e.a.b.e.HttpBDSClient$:2167 | Data source "Wind_Forecast" failed to create or update new documents. Failed to execute the bulk request. Error: {"took":15,"errors":true,"items":[{"index":{"_index":"esri-ds-data_5da50ee9-9c9b-4352-93a0-919cf645d577_es-6-4-2_20190403","_type":"_doc","_id":"WjYw5WkBP1AXtllM8BvD","_version":1,"result":"created","_shards":{"total":1,"successful":1,"failed":0},"_seq_no":141,"_primary_term":1,"status":201}},{"index":{"_index":"esri-ds-data_5da50ee9-9c9b-4352-93a0-919cf645d577_es-6-4-2_20190403","_type":"_doc","_id":"WzYw5WkBP1AXtllM8BvD","_version":1,"result":"created","_shards":{"total":1,"successful":1,"failed":0},"_seq_no":142,"_primary_term":1,"status":201}},{"index":{"_index":"esri-ds-data_5da50ee9-9c9b-4352-93a0-919cf645d577_es-6-4-2_20190403","_type":"_doc","_id":"XDYw5WkBP1AXtllM8BvD","_version":1,"result":"created","_shards":{"total":1,"successful":1,"failed":0},"_seq_no":143,"_primary_term":1,"status":201}},{"index":{"_index":"esri-ds-data_5da50ee9-9c9b-4352-93a0-919cf645d577_es-6-4-2_20190403","_type":"_doc","_id":"XTYw5WkBP1AXtllM8BvD","status":400,"error":{"type":"mapper_parsing_exception","reason":"failed to parse field [IntervalStart] of type [date] in document with id 'XTYw5WkBP1AXtllM8BvD'","caused_by":{"type":"illegal_argument_exception","reason":"failed to parse date field [-12321] with format [epoch_millis]","caused_by":{"type":"date_time_parse_exception","reason":"Failed to parse with all enclosed parsers"}}}}},{"index":{"_index":"esri-ds-data_5da50ee9-9c9b-4352-93a0-919cf645d577_es-6-4-2_20190403","_type":"_doc","_id":"XjYw5WkBP1AXtllM8BvD","status":400,"error":{"type":"mapper_parsing_exception","reason":"failed to parse field [IntervalStart] of type [date] in document with id 'XjYw5WkBP1AXtllM8BvD'","caused_by":{"type":"illegal_argument_exception","reason":"failed to parse date field [-5718] with format [epoch_millis]","caused_by":{"type":"date_time_parse_exception","reason":"Failed to parse with all enclosed parsers"}}}}}]} 2019-04-04T06:53:03,705 | INFO | c.e.b.d.e.e.MultiPointEditorDialectTester:371 | after test Standard Error REPRODUCE WITH: gradlew null -Dtests.seed=FEC706F81CEFF67D -Dtests.class=com.esri.bds.dialect.elasticsearch.editor.MultiPointEditorDialectTester -Dtests.method="deleteFeaturesByExtent" -Dtests.security.manager=false -Dtests.locale=sr-RS -Dtests.timezone=Asia/Yakutsk
The text was updated successfully, but these errors were encountered: