BigInteger/BigDecimal support #5683
Conversation
Conflicts: src/test/java/org/elasticsearch/common/xcontent/builder/XContentBuilderTests.java
FYI, there is some discussion on https://issues.apache.org/jira/browse/LUCENE-5596 about adding this range support to types wider than 64 bits.
Quick update: most of this change is good and would be a good start for supporting big integers/decimals in the future. I added the
"these types should probably be forbidden in the numeric metrics aggregations, otherwise we would either need to use big decimals there which would kill performance, or the information loss would make results unusable" I agree that numeric metrics aggregation must never use BigInteger/BigDecimal types. A thought is to add a special aggregation type, like "monetary/financial aggregation", where performance is less important with regard to exactness/correctness of numeric results, and BigDecimal is not converted to double/float. "should they be specified as strings or numbers in the _source document? (would there be compatibility issues with some languages/json parsers/json generators with numbers?)" The Jackson library maps it to "JSON Type number" http://wiki.fasterxml.com/JacksonDataBinding |
My concern here is more with other languages, e.g. Javascript can't support bigints/decimals, and we'll find lots of similar issues. It may be OK to accept them as numbers, as long as we also support coercing from strings. That way users of languages without support can still use them.
The problem with Javascript is that it has poor support for numbers; even 64-bit ints fail (and I think ES/Lucene has supported 64-bit longs for a while now). BigInteger/BigDecimal can be added as an extension, at least in Node.js: https://www.npmjs.org/package/json-bignum
👍 much awaited.
I think https://issues.apache.org/jira/browse/LUCENE-6697 (just released in Lucene 5.3.0) is a compelling way to allow fast range filters on BigInteger/BigDecimal values. Values for the field must be indexed as a SortedSetDocValuesField (with the BigInteger/BigDecimal value converted to a byte[]) and the field must use the RangeTreeDocValuesFormat. Then use the NumericRangeTreeQuery at search time.

Some care must be taken with the byte[] encoding so that sort order is preserved: I think this means the BigInteger field must have a max allowed value (set once up front in the mapping), maybe the BigDecimal field must have the same up-front scale across all values (?), and the sign bit needs to be flipped like we do for NumericField. But I think it should work well, and from my limited perf testing on the original issue, the resulting index is smaller and filters are faster than NumericField/RangeQuery.

One caveat: because this code is very new, it lives in the sandbox for now, and there's no guarantee of back-compat for the file format it writes. But then, the file format is also ridiculously simple ...
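To make the encoding point concrete, here is a rough sketch (my own, not code from Lucene) of a fixed-width, sign-flipped BigInteger encoding whose unsigned byte order matches numeric order; the class name and the up-front width parameter are illustrative assumptions:

```java
import java.math.BigInteger;
import java.util.Arrays;

public class SortableBigIntegerEncoding {

    /**
     * Encode a BigInteger into a fixed-width byte[] whose unsigned lexicographic
     * order matches numeric order. The width (i.e. the max allowed magnitude)
     * must be fixed up front across all values of the field.
     */
    static byte[] encode(BigInteger value, int numBytes) {
        byte[] twosComplement = value.toByteArray(); // big-endian two's complement, minimal length
        if (twosComplement.length > numBytes) {
            throw new IllegalArgumentException("value needs more than " + numBytes + " bytes");
        }
        byte[] encoded = new byte[numBytes];
        // sign-extend to the fixed width
        byte pad = value.signum() < 0 ? (byte) 0xFF : (byte) 0x00;
        Arrays.fill(encoded, 0, numBytes - twosComplement.length, pad);
        System.arraycopy(twosComplement, 0, encoded,
                         numBytes - twosComplement.length, twosComplement.length);
        // flip the sign bit so negative values sort before positive ones in unsigned order
        encoded[0] ^= 0x80;
        return encoded;
    }

    public static void main(String[] args) {
        byte[] a = encode(new BigInteger("-170141183460469231731687303715884105728"), 16); // -2^127
        byte[] b = encode(BigInteger.ZERO, 16);
        byte[] c = encode(new BigInteger("170141183460469231731687303715884105727"), 16);  // 2^127 - 1
        // unsigned lexicographic comparison of a, b, c now matches their numeric order
        System.out.println(Arrays.toString(a));
        System.out.println(Arrays.toString(b));
        System.out.println(Arrays.toString(c));
    }
}
```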
Big integers are also interesting for cryptographic applications.
@jprante Does the above fix support range and filter queries too? Any idea when Elasticsearch is going to add BigDecimal/BigInteger support officially?
From what I can see, BigDecimal/BigInteger support is implemented in Lucene 5.3, which will appear in Elasticsearch 2.x (not 2.0).
Hey, I have applied the fix mentioned in this post, however when I index or fetch data it is getting rounded off. I am using the REST API calls. Am I doing anything wrong here? Here are my mappings { Data: { Get Result: {
@SKumarMN the patch is only 50% of the required work. It only means that BigInteger/BigDecimal is accepted as JSON input. The default is to downgrade the accepted values to double/float wherever possible; otherwise the change would not be compatible with existing ES applications. REST actions would have to be changed to prefer BigInteger/BigDecimal.
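To illustrate what that downgrade means in practice, here is a small standalone example (not part of the patch) of the precision loss from .longValue()/.doubleValue() on out-of-range values:

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class DowngradeLoss {
    public static void main(String[] args) {
        BigInteger big = new BigInteger("18446744073709551616");   // 2^64, outside long range
        System.out.println(big.longValue());                       // prints 0 (only the low 64 bits survive)

        BigDecimal precise = new BigDecimal("0.1000000000000000000000000001");
        System.out.println(precise.doubleValue());                 // prints 0.1, trailing digits are lost
    }
}
```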
This code is in the Lucene sandbox only. We need to wait until it graduates to core before we can start using it. |
I'm working on graduating this to Lucene's core ... here's the first step: https://issues.apache.org/jira/browse/LUCENE-6825 |
w00t! |
Hi, I have used the fix https://github.com//pull/5758 in my 1.4.4 code to support BigInteger by changing the IPv6 mapper. Search and range queries work fine. Our application needs support for BigDecimal too. Could you please provide me pointers on how I can implement BigDecimal support with range functionality as well?
Closing in favour of #17006 |
For XContentBuilder/XContentParser and document mapping, this will add support for the "big" numeric types BigInteger/BigDecimal.

BigInteger/BigDecimal support for XContentBuilder/XContentParser is implemented by using the existing Jackson support for the "big" numeric types. A new method `losslessDecimals()` switches the XContentParser into recognizing BigInteger/BigDecimal in precedence over the primitive numeric types, for better convenience when using the Java API to parse document sources with BigInteger/BigDecimal field values.

For the document mapping, new core types `biginteger` and `bigdecimal` are introduced. With a new flag `lossless_numeric_detection`, the precedence of BigInteger/BigDecimal over primitive numeric types can be controlled in the mapping. When set to `true`, new dynamic numeric fields are assigned to the "big" numeric types first. The default is `false`, where primitive numeric types still take precedence.

Caveat: BigInteger/BigDecimal support is only meant for search and indexing/storing. The "big" numeric types are degraded to their `.longValue()` and `.doubleValue()` components when they are used in NumericRangeQuery and related contexts, so it is not recommended to use values larger than Long.MAX_VALUE or Double.MAX_VALUE in analytical queries like facets and aggregations; strange cut-offs or underflows/overflows may occur.