[6.8] Backport: Search optimisation - add canMatch early aborts for queries on "_index" field #80005

mdaudali · 2021-10-28T11:05:22Z

Hey!

Backports PR #48681 for issue #48473, As per the maintenance plan hopefully it should be acceptable to backport this to 6.8!

Context:
When two or more indices share the same field name, with different, non-coercible field types, searches across these indices fail when searching on the shared field name, even when scoped to a single index via an "_index" term query. This happens frequently when documents have a common id field name (e.g. "id"), but with different types.

A more concrete example:

curl -X PUT http://localhost:9200/index_with_keyword_field -H 'Content-Type: application/json' -d '
{
  "mappings": {
    "_doc": {
      "dynamic": "strict",
      "properties": {
        "id": {
          "type": "long"
        },
        "test": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}'

curl -X PUT http://localhost:9200/index_with_long_field -H 'Content-Type: application/json' -d '
{
  "mappings": {
    "_doc": {
      "dynamic": "strict",
      "properties": {
        "id": {
          "type": "long"
        },
        "test": {
          "type": "long"
        }
      }
    }
  }
}'

curl -X POST http://localhost:9200/index_with_keyword_field/_doc/1 -H 'Content-Type: application/json' -d '
{
  "id": 1,
  "test": "mystring"
}'

curl -X POST http://localhost:9200/index_with_long_field/_doc/1 -H 'Content-Type: application/json' -d '
{
  "id": 1,
  "test": 1
}'

# Fails as expected on the index_with_long_field, given that can't coerce string -> long
curl -X GET http://localhost:9200/index_with_keyword_field,index_with_long_field/_search -H 'Content-Type: application/json' -d '
{
  "query": {
    "term": {
      "test": "mystring"
    }
  }
}'

# Fails on the index_with_long_field shard (therefore only returning partial results [the matched results from index_with_keyword_field]), even though though we're searching "mystring" against only the index with a keyword field, and 1 only against the index_with_long_field. 
# This works correctly on 7.6+ due to the PR this PR backports.
curl -X GET http://localhost:9200/index_with_keyword_field,index_with_long_field/_search -H 'Content-Type: application/json' -d '
{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              {
                "term": {
                  "test": "mystring"
                }
              },
              {
                "term": {
                  "_index": "index_with_keyword_field"
                }
              }
            ]
          }
        },
        {
          "bool": {
            "must": [
              {
                "term": {
                  "test": 1
                }
              },
              {
                "term": {
                  "_index": "index_with_long_field"
                }
              }
            ]
          }
        }
      ]
    }
  }
}'

Why not move to ES7?: The migration from ES6 -> ES7 in progress, but we would like to backport the fix as we still have clusters on ES6.8

Thanks! Let me know if there's any further changes I should make, or any additional clarifying information I can provide :)

(Apologies if there's a separate SOP for request backports. I have checked previous issues and the forum for prior examples for backporting, and did not find any.)

indexMatcher doesn't exist in the 6.8 branch.

jtibshirani · 2021-10-29T01:02:22Z

Hello @mdaudali, we only backport bugfixes to the previous major version 6.8, since we only perform patch releases (which aren't supposed to include enhancements and new features). So unfortunately this change isn't eligible for backporting.

Would you be able to refine the index pattern you're searching against to only include indices where the fields have compatible types? This is the usual approach we encourage.

mdaudali · 2021-11-03T11:49:42Z

Hey @jtibshirani! Thanks for the quick response.

Regarding the bugfix backport policy. Totally makes sense! In this case however, I believe this to be a bugfix despite the original PR being proposed as an optimisation. The original behaviour causes an unexpected exception to occur for a certain class of queries - namely, queries that search on shared fieldnames but are singly-scoped to a particular index with a bool-must filter.

Regarding splitting the query up apriori: Understood. However, a paging query that should span across N documents would need to be internally split and sent to ES as up to N different paging queries, and the set of results would need to be held in memory and sorted, before manually creating pages to return back to the user. Overall, that can end up being quite a fiddly workaround!

Hope this makes sense, happy to clarify anything :)

Thanks!

jtibshirani · 2021-11-03T17:36:15Z

@mdaudali got it, I understand better why you consider this to be a 'bugfix' as well. I will check with the rest of the team to see what they think. Could you explain more about your use case? I didn't quite follow the description "However, a paging query that should span across N documents would need to be internally split and sent to ES as up to N different paging queries..."

Some other important context: as our work ramps up for the 8.0 release, we can't be certain there will be more 6.8 patch releases. So even if we do decide to backport this, you might not be able to pick it up.

elasticmachine · 2021-11-03T17:36:46Z

Pinging @elastic/es-search (Team:Search)

mdaudali · 2021-11-08T15:20:26Z

Thanks @jtibshirani!

Apologies, I meant N different indices. In the ideal state (which this PR fixes), I can perform a paging query across index A, B, C, D in a single query, and request and return say the top 100 results in each page.
However, this is currently not possible in the case described in the original description. Suppose A, B had a field name X of type int, and C, D of type string. Searching for OR(AND(X=5, index={A, B}), AND(X="my_str", index={C, D})) fails on the shards for indices A and B, returning only results from C, D

The suggested solution of running separate queries on indices (A, B) and (C, D) would require loading a number of pages from both sets, storing the results and sorting them appropriately, then provide a page back to the client representing the top 100 matches from A, B, C, D as originally requested. An intermediary would then also need to cache any additional entries that have already been loaded from ES to be returned when a client requests the next page - essentially re-implementing ES paging functionality.

This can be quite a complicated workaround even with merging just two paging calls together - and gets increasingly so when running more concurrent paging queries (when searching over more indices)

Thanks!

cla-checker-service · 2021-11-08T21:28:32Z

💚 CLA has been signed

jimczi · 2021-11-09T16:48:31Z

As Julie explained, it is very unlikely that we release a 6.8 patch version any soon. Although I wonder if your problem can be solved differently. The term query is not lenient, if you provide an input that is not parseable by the field type, an error is thrown. If you require leniency, you could switch to the match query and set the lenient flag to true. In such case your query would get the correct behavior even on indices that cannot parse the input correctly.

Would that be enough to solve your problem ?

The backport only fixes queries that reference the actual index name, since the indexMatcher for aliases was not backported.

mdaudali · 2021-11-11T21:55:37Z

Hey @jimczi,

Thanks for the response and great suggestion!

Sadly, I don't believe the match filter will solve the problem at hand. To expand - a match query, as I understand it, allows us to search for a single value on a field (+ fuzzy matching). A query across many different terms would require a disjunction of match queries across all terms.

It's not uncommon to receive >10,000 individual terms in a single query. This, in turn, means we hit the 1024 bool max clause count limit very quickly. On the other hand, a terms query has a default term count limit of 65536, which is more than sufficient.

An assumption here is that the bool max clause count limit is significantly lower since additional clauses are more resource-expensive/intensive than the equivalent number of terms in a terms query. If this isn't the case (and therefore a bool OR'd match query is as expensive as terms given the same number of terms), then bumping the limit would be sufficient.

As an aside, thanks for the context on the 6.8.X release. It's an accepted risk if we're not able to find a better solution through existing ES 6.8 primitives, but naturally no expectation on the ES team for a release to be cut if this gets merged.

Thanks!

mdaudali · 2021-11-30T18:14:03Z

Hey @jtibshirani, @jimczi ,

Just wanted to follow up here, are there any other alternative solutions we should try?

Thank you!!

jtibshirani · 2021-12-06T18:47:30Z

@mdaudali after considering the options, I don't think we should backport this to 6.8. We really try to minimize changes on 6.8 in the rare case we need to quickly issue a patch release (usually for a security issue). Although the changes in the PR look small, to do a "proper fix" we'd need to backport the changes around SearchIndexNameMatcher too, which are more complex.

I'm sorry this didn't result in the outcome you were hoping for. I hope your migration to 7.x goes smoothly, and in the meantime find some way to move forward (perhaps by adjusting the mapping to avoid fields with the same name but with conflicting types?)

mdaudali added 2 commits October 27, 2021 11:19

feat: cherry pick from elastic#48681

a4d9ce4

fix: use string comparison methods instead of indexMatcher

4baaa2f

indexMatcher doesn't exist in the 6.8 branch.

elasticsearchmachine added v6.8.21 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Oct 28, 2021

jtibshirani added :Search/Search Search-related issues that do not fall into other categories team-discuss labels Nov 3, 2021

elasticmachine added the Team:Search Meta label for search team label Nov 3, 2021

fix: Remove tests that skip indices based on aliases

644e142

The backport only fixes queries that reference the actual index name, since the indexMatcher for aliases was not backported.

mdaudali force-pushed the mdaudali/skipIndices branch from bac1e99 to 644e142 Compare November 9, 2021 17:03

jtibshirani removed the team-discuss label Nov 11, 2021

jtibshirani closed this Dec 7, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[6.8] Backport: Search optimisation - add canMatch early aborts for queries on "_index" field #80005

[6.8] Backport: Search optimisation - add canMatch early aborts for queries on "_index" field #80005

mdaudali commented Oct 28, 2021

jtibshirani commented Oct 29, 2021

mdaudali commented Nov 3, 2021

jtibshirani commented Nov 3, 2021

elasticmachine commented Nov 3, 2021

mdaudali commented Nov 8, 2021

cla-checker-service bot commented Nov 8, 2021 •

edited

Loading

jimczi commented Nov 9, 2021

mdaudali commented Nov 11, 2021 •

edited

Loading

mdaudali commented Nov 30, 2021

jtibshirani commented Dec 6, 2021

[6.8] Backport: Search optimisation - add canMatch early aborts for queries on "_index" field #80005

[6.8] Backport: Search optimisation - add canMatch early aborts for queries on "_index" field #80005

Conversation

mdaudali commented Oct 28, 2021

jtibshirani commented Oct 29, 2021

mdaudali commented Nov 3, 2021

jtibshirani commented Nov 3, 2021

elasticmachine commented Nov 3, 2021

mdaudali commented Nov 8, 2021

cla-checker-service bot commented Nov 8, 2021 • edited Loading

jimczi commented Nov 9, 2021

mdaudali commented Nov 11, 2021 • edited Loading

mdaudali commented Nov 30, 2021

jtibshirani commented Dec 6, 2021

cla-checker-service bot commented Nov 8, 2021 •

edited

Loading

mdaudali commented Nov 11, 2021 •

edited

Loading