64 bit number/integer support in Kibana #40183

Open
timroes opened this issue Jul 2, 2019 · 11 comments
Labels
  • impact:high: Addressing this issue will have a high level of impact on the quality/strength of our product.
  • loe:x-large: Extra Large Level of Effort
  • Meta
  • Team:DataDiscovery: Discover, search (e.g. data plugin and KQL), data views, saved searches. For ES|QL, use Team:ES|QL.
  • v8.9.0

Comments

@timroes
Contributor

timroes commented Jul 2, 2019

The problem

JavaScript's Number type (in contrast to Java, which Elasticsearch uses) cannot represent 64-bit integer values. A Number in JavaScript is a 64-bit IEEE 754 floating point value with a 52-bit mantissa, which gives 53 bits of precision once the implicit leading bit is counted. Integers that need more than 53 bits can be subject to rounding errors.

You can see that in the following screenshot: when entering a number that doesn't fit into 53 bits, JavaScript cannot represent it correctly anymore. So even though I wrote two different numbers (1st and 3rd line), they are the same number for JavaScript.

screenshot-20190702-181742

That means integers above 9007199254740991 (Number.MAX_SAFE_INTEGER) or below -9007199254740991 (Number.MIN_SAFE_INTEGER) can be represented with rounding errors in Kibana, even if Elasticsearch handles and returns them correctly.
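As a minimal sketch of that limit, the snippet below parses integers from text (the way they arrive in a response) and checks them against the safe range. The helper name `isPreciselyRepresentable` is a hypothetical one introduced here for illustration only.

```typescript
// Integers are parsed from text here, the same way they arrive in a JSON response.
const a = Number("9007199254740992"); // 2^53
const b = Number("9007199254740993"); // 2^53 + 1, not representable as a Number

console.log(a === b); // true: both texts map to the same double
console.log(Number.isSafeInteger(Number.MAX_SAFE_INTEGER)); // true
console.log(Number.isSafeInteger(a)); // false

// Hypothetical guard: true only if the decimal text survives a round trip
// through Number without losing digits.
function isPreciselyRepresentable(text: string): boolean {
  const parsed = Number(text);
  return Number.isSafeInteger(parsed) && String(parsed) === text.trim();
}

console.log(isPreciselyRepresentable("9007199254740991")); // true
console.log(isPreciselyRepresentable("9007199254740993")); // false
```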

This problem is intrinsic to JavaScript, so be careful when debugging it: even the Chrome Dev Tools round those values in the "Preview" tab of a response, and only the "Response" tab shows the result as it was really returned from the server. This is only true for JSON responses, though. For APIs returning ndjson (newline-delimited JSON), such as our internal search endpoints, values will be rounded in both the "Preview" and the "Response" tab.
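The difference between the raw response text and a parsed view of it can be sketched as follows. The response body is a made-up single-field document, using the value from the Discover example in this issue; it is not an actual Kibana or Elasticsearch payload.

```typescript
// Raw body as a server might send it (illustrative payload, not a real response).
const rawBody = '{"value": 9174563637958049791}';

// Untouched text: every digit is still there.
const rawHasAllDigits = rawBody.includes("9174563637958049791");
console.log(rawHasAllDigits); // true

// Once parsed, the value is a double and the low digits are lost,
// so serializing it again cannot bring them back.
const parsed: { value: number } = JSON.parse(rawBody);
const reSerialized = JSON.stringify(parsed);
console.log(reSerialized.includes("9174563637958049791")); // false
```

Anything that displays the parsed object rather than the original text will therefore show the rounded value.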

devtools

Effect in Kibana

Thus fields in Elasticsearch that can contain numbers outside that safe JavaScript range (e.g. long, date_nanos, unsigned_long) are prone to that rounding error in several parts of Kibana. This section summarizes what might not work in Kibana and how Kibana behaves in each area.

Dev Tools > Console

✔️ works (with minor limitations)

You will see no rounding errors in the Console's requests or responses from 6.5 onwards, since the response is never "interpreted" as an object with potential numbers; Kibana simply outputs the byte stream from the response into that panel. (#23685 introduced this)

In 6.4 and earlier versions of Kibana, the same rounding error problem will occur in console as in other parts of Kibana.

Limitation: Performing a code reindent in the Console is impacted by this issue, because the reindent logic parses the JSON content. The request after reindentation might therefore contain rounded values instead of what the user entered. See #101391
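One way to avoid that kind of rounding is to reindent on the text level, so numeric lexemes are copied verbatim and never become a Number. The following is only a sketch of that idea, not the Console's actual reindent logic: `reindentJsonText` is a hypothetical helper, it assumes the input is already valid JSON, and it does not format empty objects or arrays nicely.

```typescript
// Hypothetical helper: re-indents JSON text by walking its characters,
// so numeric lexemes are copied verbatim instead of passing through Number.
// Assumes the input is already valid JSON; it performs no validation.
function reindentJsonText(text: string, indentUnit = "  "): string {
  let out = "";
  let depth = 0;
  let inString = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      out += ch;
      if (ch === "\\") {
        out += text[++i]; // copy the escaped character untouched
      } else if (ch === '"') {
        inString = false;
      }
      continue;
    }
    if (ch === '"') {
      inString = true;
      out += ch;
    } else if (ch === "{" || ch === "[") {
      depth++;
      out += ch + "\n" + indentUnit.repeat(depth);
    } else if (ch === "}" || ch === "]") {
      depth--;
      out += "\n" + indentUnit.repeat(depth) + ch;
    } else if (ch === ",") {
      out += ",\n" + indentUnit.repeat(depth);
    } else if (ch === ":") {
      out += ": ";
    } else if (!/\s/.test(ch)) {
      out += ch; // digits, signs, exponents, true/false/null
    }
  }
  return out;
}

const request = '{"f":4565354787218997248,"tags":["a","b:c"]}';
console.log(reindentJsonText(request));
```

In contrast, `JSON.stringify(JSON.parse(request), null, 2)` would produce similar indentation but with a rounded value for `f`.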

Discover

several issues

When looking at documents containing a value outside that safe range, Discover shows the value with this rounding error, so it does not necessarily represent the real value in Elasticsearch. This is true for the primary Discover table as well as for the JSON tab (including values inside _source). The table and the JSON tab round differently, though, so the two displayed values might differ from each other while both being rounded. See the following example, where the value "9174563637958049791" was indexed into a long field in Elasticsearch:

Discover-Elastic

Sorting on a field containing those values will still work as expected (even though the values are not shown correctly), since we let Elasticsearch do the sorting, which doesn't have that issue.

A field of type date_nanos will actually work and also show the correct nanosecond value, because we don't treat them as numbers, but use Elasticsearch's capabilities to return them as date strings, which we can then print and never lose precision by touching that field as a number.
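A rough sketch of why the string form is safe: the fractional seconds stay text, and only the millisecond-precision part is handed to `Date`. The helper `splitNanoTimestamp` and the sample timestamp are assumptions for illustration, as is the exact string format (an ISO 8601 UTC timestamp with up to nine fractional digits).

```typescript
// Hypothetical helper: keeps the full fractional-second digits as text and
// only hands the millisecond-precision part to Date.
// Assumes an ISO 8601 UTC timestamp such as "2019-07-02T16:17:42.123456789Z".
function splitNanoTimestamp(iso: string): { date: Date; fraction: string } {
  const match = /^(.*T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z$/.exec(iso);
  if (!match) throw new Error(`Unsupported timestamp: ${iso}`);
  const fraction = (match[2] || "").padEnd(9, "0"); // always 9 digits
  const date = new Date(`${match[1]}.${fraction.slice(0, 3)}Z`);
  return { date, fraction };
}

const ts = splitNanoTimestamp("2019-07-02T16:17:42.123456789Z");
console.log(ts.date.toISOString()); // millisecond precision only
console.log(ts.fraction); // "123456789", all nine digits kept as text

// The same instant as epoch nanoseconds does not fit into a Number safely.
console.log(Number.isSafeInteger(Number("1562084262123456789"))); // false
```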

Prior to 7.13, or when the advanced setting discover:readFieldsFromSource is switched on, you might even see these rounding issues in Discover on fields of type keyword or text if the values were indexed as numbers. This is because we print the value returned in _source, which represents the value as it was indexed: if it wasn't put into quotes as a string while indexing, it won't be returned as a string either. See #38398 for an example.

Visualizations

several issues

Visualizing date_nanos is safe, since Elasticsearch treats them as milliseconds for aggregations, which are inside the safe range for numbers in JavaScript.

For all fields really containing values outside that safe integer range in JavaScript it actually depends a bit on how they are used in visualizations.

If you are creating buckets using such a field, each document will be in the correct bucket (since that's handled by Elasticsearch), but you might see a couple of weird effects:

  • Buckets that have the same key due to the rounding error would be merged into one bucket in some visualizations (this is actually a rather rare case, and you most likely won't have such small buckets that the rounding error actually gives them the same key).
  • For all those values outside the safe range, you will see the rounded values on axes and tooltips.
  • If you visualize such a value e.g. on the y-axis, the values drawn inside the charts are the rounded ones, not the correct ones returned from Elasticsearch. You can see this in the following example chart. I indexed 5 documents, and all of them have a different value in the value field, but 3 of those documents round to the same number and 2 of them to the other same number, so the chart looks as follows, even though all the values were continuously decreasing from document to document:
    screenshot-20190702-190816
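The bucket-merging effect can be sketched with a few made-up bucket keys just above 2^53. The keys and the Map-based grouping are illustrative assumptions, not Kibana's actual visualization code.

```typescript
// Five distinct bucket keys as they could appear in a response body (made-up values).
const keysFromResponse = [
  "9007199254740992",
  "9007199254740993",
  "9007199254740994",
  "9007199254740995",
  "9007199254740996",
];

// Grouping by the parsed Number merges keys that round to the same double.
const mergedBuckets = new Map<number, number>();
for (const key of keysFromResponse) {
  const asNumber = Number(key);
  mergedBuckets.set(asNumber, (mergedBuckets.get(asNumber) || 0) + 1);
}

// Grouping by the original text keeps all five buckets apart.
const exactBuckets = new Map<string, number>();
for (const key of keysFromResponse) {
  exactBuckets.set(key, (exactBuckets.get(key) || 0) + 1);
}

console.log(mergedBuckets.size); // 3
console.log(exactBuckets.size); // 5
```

Above 2^53 only every second integer is representable, so the odd keys collapse onto an even neighbour.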

KQL

✔️ Everything works from 7.12 onwards

Prior to 7.12, KQL queries were prone to rounding errors. If you queried for large integers outside that safe range in KQL, you were affected by the rounding issue. E.g. if you were looking for documents where number > 9007199254740993, you'd also retrieve documents that had a value of 9007199254740993, since the value in the query was rounded to 9007199254740992.
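The effect of a rounded query literal can be sketched by comparing exact document values (held as BigInt) against the rounded and the exact threshold. The document values and variable names are made up for illustration, and the filtering here only mimics a range comparison; it is not how KQL or Elasticsearch evaluates queries.

```typescript
// The literal as the user typed it in the query bar.
const queryLiteral = "9007199254740993";

// Passing the literal through Number rounds it before the query is sent.
const roundedThreshold = BigInt(Number(queryLiteral));
// Keeping it as text (or BigInt) preserves the intended threshold.
const exactThreshold = BigInt(queryLiteral);

// Made-up document values, compared exactly as a 64-bit store would.
const docValues = ["9007199254740992", "9007199254740993", "9007199254740994"].map((v) => BigInt(v));

const matchesWithRounding = docValues.filter((v) => v > roundedThreshold);
const matchesExact = docValues.filter((v) => v > exactThreshold);

console.log(String(roundedThreshold)); // "9007199254740992"
console.log(matchesWithRounding.map(String)); // two documents match
console.log(matchesExact.map(String)); // only one document should match
```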

Inspector

several issues

The Inspector in Kibana, which can be used to investigate requests and responses, is affected by that issue: responses shown there might already contain rounded values and not fully reflect what Elasticsearch actually returned.

The future

There actually is a stage 4 proposal (the final stage, meaning it has been accepted into the standard) for JavaScript (and implemented in TypeScript since 3.2) that adds a BigInt datatype to JavaScript (https://github.com/tc39/proposal-bigint). With that we could potentially fix the problem, though it would still take a lot of effort: we would need a custom JSON parser and could not let the browser touch any response, since parsing a numeric value as a plain Number loses that precision. We would then need to make sure that everything from the first custom parsing into BigInt through to the visualizations is aware that numeric values could also be BigInt. This will be a huge undertaking once JavaScript supports that datatype.
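To give an idea of what such a custom parser could look like, here is a minimal recursive-descent sketch. It is not an existing Kibana or browser API: `parseJsonWithBigInt` and the `BigJson` type are hypothetical names, the parser assumes well-formed input, and its error handling is deliberately minimal.

```typescript
type BigJson = null | boolean | number | bigint | string | BigJson[] | { [key: string]: BigJson };

// Hypothetical parser sketch: behaves like JSON.parse, except that integer
// literals outside the safe range become BigInt instead of a rounded Number.
function parseJsonWithBigInt(text: string): BigJson {
  let pos = 0;

  const skipWhitespace = (): void => {
    while (pos < text.length && /\s/.test(text[pos])) pos++;
  };

  const parseString = (): string => {
    const start = pos;
    pos++; // opening quote
    while (text[pos] !== '"') {
      if (pos >= text.length) throw new SyntaxError("Unterminated string");
      pos += text[pos] === "\\" ? 2 : 1; // skip escaped characters as a pair
    }
    pos++; // closing quote
    // Strings carry no precision risk, so escape handling is delegated to JSON.parse.
    return JSON.parse(text.slice(start, pos)) as string;
  };

  const parseNumber = (): number | bigint => {
    const match = /^-?\d+(\.\d+)?([eE][+-]?\d+)?/.exec(text.slice(pos));
    if (!match) throw new SyntaxError(`Unexpected token at position ${pos}`);
    pos += match[0].length;
    const isInteger = match[1] === undefined && match[2] === undefined;
    const asNumber = Number(match[0]);
    return isInteger && !Number.isSafeInteger(asNumber) ? BigInt(match[0]) : asNumber;
  };

  const parseValue = (): BigJson => {
    skipWhitespace();
    const ch = text[pos];
    let value: BigJson;
    if (ch === "{") {
      const obj: { [key: string]: BigJson } = {};
      pos++;
      skipWhitespace();
      while (text[pos] !== "}") {
        const key = parseString();
        skipWhitespace();
        pos++; // ':'
        obj[key] = parseValue();
        if (text[pos] === ",") { pos++; skipWhitespace(); }
      }
      pos++; // '}'
      value = obj;
    } else if (ch === "[") {
      const arr: BigJson[] = [];
      pos++;
      skipWhitespace();
      while (text[pos] !== "]") {
        if (pos >= text.length) throw new SyntaxError("Unterminated array");
        arr.push(parseValue());
        if (text[pos] === ",") pos++;
      }
      pos++; // ']'
      value = arr;
    } else if (ch === '"') {
      value = parseString();
    } else if (text.startsWith("true", pos)) { pos += 4; value = true; }
    else if (text.startsWith("false", pos)) { pos += 5; value = false; }
    else if (text.startsWith("null", pos)) { pos += 4; value = null; }
    else value = parseNumber();
    skipWhitespace();
    return value;
  };

  const result = parseValue();
  if (pos !== text.length) throw new SyntaxError(`Unexpected trailing input at position ${pos}`);
  return result;
}

const doc = parseJsonWithBigInt('{"f": 4565354787218997248, "small": 42, "tags": ["a", true, null]}');
console.log(doc);
```

Note that the result can no longer be passed to `JSON.stringify` as-is, because it throws a TypeError on BigInt values. That is one example of why every consumer downstream of the parser would need to be aware of BigInt.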

@vigneshshanmugam
Member

Have you considered using alternatives like https://github.com/GoogleChromeLabs/jsbi right now to fix the issue? Migrating to BigInt could then be done simply by using a Babel plugin.

@LeeDr

LeeDr commented Oct 19, 2020

Moved the content of this comment to the respective issue.

@vikmik

vikmik commented Nov 23, 2021

Note: this can cause some confusion, as viewing the same document as JSON vs. table shows different results:
1637679095_1551_23112021_365x506
1637679053_1550_23112021_1867x402

(see how the values are rounded in the JSON view)

@timroes
Contributor Author

timroes commented Nov 24, 2021

@vikmik I've updated the description about Discover above, to make it more explicit that both values can be rounded, the one in the table AND the one in the JSON (including _source). Keep in mind that the values that you're seeing in the table might not be the correct ones either. They simply round differently than in the JSON view.

@joshdover
Contributor

@thomasneirynck Could we get an update on this issue? We are exploring better supporting nanosecond timestamps in Beats/Agent and need to be sure at least the basics in Kibana work. Specifically, Discover, Lens, and TSVB. It's unclear from the issue description what issues still remain.

@thomasneirynck
Contributor

@joshdover There has been no movement on this.

Could you give an indication on urgency/importance?

A few clarifications around "support" of big numbers:

  • Is this mainly about display? (A nano-timestamp should be displayed correctly, e.g. in Discover, in a metric chart, ...)
  • Or is it about full integration with the entire functionality? (I.e. Lens formulas should work correctly.)

@alisonelizabeth
Contributor

alisonelizabeth commented Aug 21, 2023

Dev Tools > Console
✔️ works (with minor limitations)

You will see no rounding errors in the Console's requests or responses from 6.5 onwards, since the response is never "interpreted" as an object with potential numbers; Kibana simply outputs the byte stream from the response into that panel. (#23685 introduced this)

While the issue description indicates the problem does not exist in Console in 6.5+, this no longer appears to be the case. The issue is reproducible on 8.8.

Steps to reproduce:

POST index1/_doc
{
  "f" : 4565354787218997248
}

GET index1/_search

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "index1",
        "_id": "5KEr_okBl89TvRjx_b7V",
        "_score": 1,
        "_source": {
          "f": 4565354787218997000
        }
      }
    ]
  }
}

Update: I confirmed the regression was introduced in v8.3.

@toddferg
Copy link

Can confirm it's occurring in 8.9.0 as well.

botelastic (bot) added the needs-team label (Issues missing a team label) on Sep 18, 2023
toddferg added the bug label (Fixes for quality problems that affect the customer experience) on Sep 18, 2023
kertal removed the bug and needs-team labels on Sep 26, 2023
botelastic (bot) added the needs-team label on Sep 26, 2023
thomasneirynck added the Team:DataDiscovery label on Sep 26, 2023
@elasticmachine
Contributor

Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)

botelastic (bot) removed the needs-team label on Sep 26, 2023
davismcphee added the loe:x-large and impact:high labels on Sep 26, 2023
@marwan-at-work

👋🏼 Is there an update here? this was an unfortunate surprise 😔

@thomasneirynck
Contributor

@marwan-at-work no update on this at this point.
