Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESQL: look to pushdown case insensitive functions #118304

Closed
costin opened this issue Dec 10, 2024 · 1 comment · Fixed by #118870
Closed

ESQL: look to pushdown case insensitive functions #118304

costin opened this issue Dec 10, 2024 · 1 comment · Fixed by #118870
Assignees
Labels
:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@costin
Copy link
Member

costin commented Dec 10, 2024

Description

Until #103599 gets resolved (one way or another), the common "pattern" for case insensitive filters is to use to_upper/to_lower against a given value, in order to preserve the filtering performance (and not convert it into a full scan):

...
| WHERE  TO_UPPER(string_a) == "String"	// this will never match since the value contains lower case chars
| WHERE  TO_LOWER(string_b) == "abc"      // rewritten as a case-insensitive term query
...

One way to achieve this is to identify the to_lower/to_upper == "value" pattern and replace it with InsensitiveEquals operator which can then be:

  • either pushed down
  • translated into a dedicated operator
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Dec 10, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@bpintea bpintea self-assigned this Dec 16, 2024
elasticsearchmachine pushed a commit that referenced this issue Dec 23, 2024
This adds an optimization rule to rewrite TO_UPPER/TO_LOWER comparisons
against a string into an InsensitiveEquals comparison. The rewrite can
also result right away into a TRUE/FALSE, in case the string doesn't
match the caseness of the function.

This also allows later pushing down the predicate to lucene as a
case-insensitive term-query.

Fixes #118304.
bpintea added a commit to bpintea/elasticsearch that referenced this issue Dec 23, 2024
This adds an optimization rule to rewrite TO_UPPER/TO_LOWER comparisons
against a string into an InsensitiveEquals comparison. The rewrite can
also result right away into a TRUE/FALSE, in case the string doesn't
match the caseness of the function.

This also allows later pushing down the predicate to lucene as a
case-insensitive term-query.

Fixes elastic#118304.
elasticsearchmachine pushed a commit that referenced this issue Dec 23, 2024
* ESQL: Rewrite TO_UPPER/TO_LOWER comparisons (#118870)

This adds an optimization rule to rewrite TO_UPPER/TO_LOWER comparisons
against a string into an InsensitiveEquals comparison. The rewrite can
also result right away into a TRUE/FALSE, in case the string doesn't
match the caseness of the function.

This also allows later pushing down the predicate to lucene as a
case-insensitive term-query.

Fixes #118304.

* Disable `TO_UPPER(null)`-tests prior to 8.17 (#119213)

TO_UPPER/TO_LOWER resolution incorrectly returned child's type (that
could also be `null`, type `NULL`), instead of KEYWORD/TEXT. So a test
like `TO_UPPER(null) == "..."` fails on type mismatch. This was fixed
collaterally by #114334 (8.17.0)

Also, correct some of the tests skipping (that had however no impact,
due to testing range).

(cherry picked from commit edb3818)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants