Exclude doc-values-only fields from '*' field expansion queries #82710

Closed
romseygeek opened this issue Jan 18, 2022 · 5 comments
Labels
>enhancement :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team won't fix

Comments

@romseygeek
Contributor

#82602 and #82409 allow searching on fields with doc-values but no index. These fields would previously have been excluded from queries against all fields as they would throw an exception when queried, but instead they will now effectively do a full-table scan. We should add another check to QueryParserHelper.resolveMappingFields() to exclude these fields from all-field queries as well.
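The proposed check can be illustrated with a minimal sketch. It does not reproduce the actual `QueryParserHelper.resolveMappingFields()` code: `MappedField`, `expandAllFields`, and the `excludeDocValuesOnly` switch are hypothetical names invented for this example, and the mapping model is deliberately simplified to two booleans per field.

```java
import java.util.ArrayList;
import java.util.List;

final class FieldExpansionSketch {

    /** Hypothetical, simplified stand-in for a mapped field; not an Elasticsearch type. */
    static final class MappedField {
        final String name;
        final boolean indexed;
        final boolean hasDocValues;

        MappedField(String name, boolean indexed, boolean hasDocValues) {
            this.name = name;
            this.indexed = indexed;
            this.hasDocValues = hasDocValues;
        }

        /** Queryable only by scanning doc values, which is the slow path discussed in this issue. */
        boolean isDocValuesOnly() {
            return !indexed && hasDocValues;
        }

        /** Assumption: a field is searchable if it has an index or doc values. */
        boolean isSearchable() {
            return indexed || hasDocValues;
        }
    }

    /**
     * Expands '*' to field names. When excludeDocValuesOnly is true, fields that
     * could only be searched through doc values are skipped, as this issue proposes.
     */
    static List<String> expandAllFields(List<MappedField> mapping, boolean excludeDocValuesOnly) {
        List<String> resolved = new ArrayList<>();
        for (MappedField field : mapping) {
            if (!field.isSearchable()) {
                continue; // neither indexed nor doc values: cannot be queried at all
            }
            if (excludeDocValuesOnly && field.isDocValuesOnly()) {
                continue; // the additional check proposed here
            }
            resolved.add(field.name);
        }
        return resolved;
    }
}
```

With the switch off, a doc-values-only field is included in the expansion and queried by scanning; with it on, the expansion contains only indexed fields.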

@romseygeek romseygeek added >enhancement :Search/Search Search-related issues that do not fall into other categories labels Jan 18, 2022
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Jan 18, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@javanna
Member

javanna commented Jun 29, 2022

I believe that this is not only a problem around expanding *. If users have default_field set, even without wildcard patterns but explicitly referring to fields that have index set to false, they would previously get skipped while they are now queried. On one hand these fields are referred to explicitly hence we are just doing what the user asked, on the other hand they were skipped before and those queries will be much slower now. Also, the default field could be set in the index settings and the users sending queries may not be aware or even have the rights to change it.

@javanna
Member

javanna commented Jun 30, 2022

> If users have default_field set, even without wildcard patterns but explicitly referring to fields that have index set to false, they would previously get skipped while they are now queried.

That is the case only when the lenient flag is set to true, while its default is false when the fields are set explicitly. On the other hand the lenient flag is set to true whenever * is used to search all fields, and the default field when not specified anywhere is exactly *.
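The interaction between the default field and the lenient flag described above can be sketched as follows. This is an illustration of the rules as stated in this comment, not the actual Elasticsearch code: the class and method names are hypothetical, and `default_field` is simplified to a single string with `null` meaning "not set".

```java
final class LenientDefaultsSketch {

    static final String ALL_FIELDS = "*";

    /**
     * Resolves the default field: the query-level value wins, then the
     * index-level setting, and '*' applies when neither is specified.
     */
    static String effectiveDefaultField(String queryLevel, String indexLevel) {
        if (queryLevel != null) {
            return queryLevel;
        }
        if (indexLevel != null) {
            return indexLevel;
        }
        return ALL_FIELDS;
    }

    /**
     * Lenient is forced to true when searching all fields via '*'.
     * For explicitly named fields it defaults to false unless requested.
     */
    static boolean effectiveLenient(String field, Boolean requestedLenient) {
        if (ALL_FIELDS.equals(field)) {
            return true;
        }
        return requestedLenient != null && requestedLenient;
    }
}
```

Under these assumptions, a query that names no fields and has no `default_field` anywhere resolves to `*` and is therefore lenient, which is the case in which unindexed fields used to be skipped silently.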

@ywelsch
Contributor

ywelsch commented Jun 30, 2022

A different way to think about this is that doc-value-only fields are fields where users are willing to trade some query performance for storage savings. In this context, you could argue that switching "index:false" on a field (e.g. in a template) should not lead to a different behavior on the query side, which is what this is proposing. While "index:false" was previously not equivalent from a query perspective, it has been since the introduction of doc-value-only fields, and I would rather preserve query behavior (i.e. not break future users) than slightly extend the query semantics we had prior to doc-value-only fields.

@javanna
Member

javanna commented Jul 4, 2022

We have discussed this with the team and concluded that we are not going to exclude doc_value only fields from the existing expansion logic. We have embraced slower queries over time and it feels wrong that * may expand to a subset of the fields only (the faster ones). The user is asking to query all fields, hence all searchable fields will be queried, and runtime fields have also been included since they were introduced. If users are not happy with the performance of their query, they should restrict the fields that get queried within the query or by setting default_field either at the query level or at the index settings level.

Also, doc_value only queries are currently rejected whenever search.allow_expensive_queries is set to false, same as what happens for queries against runtime fields.
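The rejection behavior mentioned above can be pictured as a simple guard. This is only a sketch under stated assumptions: the class, method names, exception type, and message are hypothetical and do not mirror the real Elasticsearch implementation; only the setting name `search.allow_expensive_queries` comes from the discussion.

```java
final class ExpensiveQueryGuardSketch {

    /**
     * Returns true when a query against the field may run: indexed fields are
     * always allowed, unindexed (doc-values-only) fields only when expensive
     * queries are permitted.
     */
    static boolean isAllowed(boolean indexed, boolean allowExpensiveQueries) {
        return indexed || allowExpensiveQueries;
    }

    /** Throws instead of silently scanning when the query would be rejected. */
    static void checkFieldQueryAllowed(String fieldName, boolean indexed, boolean allowExpensiveQueries) {
        if (!isAllowed(indexed, allowExpensiveQueries)) {
            // Hypothetical message, written for this example only.
            throw new IllegalStateException(
                "field [" + fieldName + "] is not indexed and search.allow_expensive_queries is false");
        }
    }
}
```

A cluster that sets `search.allow_expensive_queries` to `false` therefore already has a way to avoid the doc-values scans, without changing what `*` expands to.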


4 participants