Using empty string in range query 'gt' returns no documents #63386
Comments
Pinging @elastic/es-search (:Search/Search)
This is indeed odd and seems a bit counterintuitive to me. At least for the lower bound of a text or keyword field, I would assume every value to be larger than the empty string. The problem with this edge case is that the empty string should be considered an open lower bound that matches everything. This works as expected for an "inclusive" lower bound (i.e. "gte") but not for an "exclusive" one (i.e. "gt"), because it's not completely clear which term to exclude. Internally we use Lucene's TermRangeQuery, which does the right thing for "null" values: it implicitly sets the inclusion flag to "true" in this case (see https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/TermRangeQuery.java#L52 and https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/TermRangeQuery.java#L81).
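To make the expected semantics concrete, here is a minimal Python sketch of a term range check. It is a toy model, not Lucene's implementation: the function name in_term_range and its parameters are made up for illustration, and it simply compares UTF-8 bytes, treating None as an open bound. Under this plain comparison, both "gt" and "gte" with an empty lower bound should match the term "5" from the reproduction.

```python
from typing import Optional


def in_term_range(
    term: str,
    lower: Optional[str],
    upper: Optional[str],
    include_lower: bool,
    include_upper: bool,
) -> bool:
    """Toy model of a term range check (hypothetical helper, not Lucene code).

    Terms are compared by their UTF-8 bytes; a bound of None is open.
    """
    t = term.encode("utf-8")
    if lower is not None:
        lo = lower.encode("utf-8")
        # Reject terms below the bound, or equal to it when it is exclusive.
        if t < lo or (t == lo and not include_lower):
            return False
    if upper is not None:
        hi = upper.encode("utf-8")
        # Reject terms above the bound, or equal to it when it is exclusive.
        if t > hi or (t == hi and not include_upper):
            return False
    return True


# Terms indexed for field "A" in the reproduction from this issue.
terms = ["5"]

# "gt": "" -> exclusive empty lower bound, no upper bound.
gt_hits = [t for t in terms if in_term_range(t, "", None, False, True)]
# "gte": "" -> inclusive empty lower bound, no upper bound.
gte_hits = [t for t in terms if in_term_range(t, "", None, True, True)]
```

In this model the two queries differ only for a term that is itself the empty string, which the exclusive bound rejects and the inclusive bound accepts.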
Currently, when searching with an empty string as the lower bound of a range query on text-based fields, we return all documents when 'gte' is used (including the lower bound) but no documents when 'gt' is used. This might seem counterintuitive since every value should be greater than the empty string. This PR fixes this special edge case by implicitly setting the "lower" include flag in this case before constructing the TermRangeQuery. Closes elastic#63386
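The approach described in this commit message can be sketched roughly as follows. This is only an illustration in Python, assuming the normalization happens on the (bound, include flag) pair before the query is built; the helper name normalize_lower_bound is hypothetical and does not correspond to any actual Elasticsearch or Lucene method.

```python
from typing import Optional, Tuple


def normalize_lower_bound(
    lower: Optional[str], include_lower: bool
) -> Tuple[Optional[str], bool]:
    """Hypothetical helper: treat a missing or empty lower bound as inclusive.

    A None or empty-string lower bound is meant to match everything, so the
    include flag is forced to True; any other bound is passed through as-is.
    """
    if lower is None or lower == "":
        return lower, True
    return lower, include_lower


# "gt": "" is rewritten so that it behaves like "gte": "".
empty_exclusive = normalize_lower_bound("", False)
# A non-empty exclusive bound keeps its exclusive flag.
regular_exclusive = normalize_lower_bound("a", False)
```

With this normalization, 'gt' and 'gte' on an empty string end up building the same range, which is the behaviour the commit message describes.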
Currently, when searching with an empty string as the lower bound of a range query on text-based fields, we return all documents when 'gte' is used (including the lower bound) but no documents when 'gt' is used. This might seem counterintuitive since every value should be greater than the empty string. The bug has been fixed in Lucene, and this PR adds a test to ensure we now observe the fixed behaviour on searches. Closes #63386
Should be fixed in 7.11.
I created a new index and then added two documents to it.
PUT test_index
PUT /test_index/_doc/1 { "A" : "5" }
PUT /test_index/_doc/2 { "B" : "5" }
After that, I searched that index for every document with A greater than the empty string,
GET test_index/_search { "query": { "range": { "A": { "gt": "" } } }, "size": 2000 }
and received no documents.
{ "took" : 0, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 0, "relation" : "eq" }, "max_score" : null, "hits" : [ ] } }
On the other hand, when I searched for every document with A greater than or equal to the empty string,
GET test_index/_search { "query": { "range": { "A": { "gte": "" } } }, "size": 2000 }
I received one document.
{ "took" : 0, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 1, "relation" : "eq" }, "max_score" : 1.0, "hits" : [ { "_index" : "test_index", "_type" : "_doc", "_id" : "1", "_score" : 1.0, "_source" : { "A" : "5" } } ] } }
Why does the greater-than comparison behave differently and not return a document as well?