sql: fix trigram span generation for similarity filters
This commit fixes a bug in the generation of trigram inverted index spans for similarity filters. The bug could cause rows to be incorrectly filtered out of results. Previously, we did not generate padded trigrams when building the inverted spans for similarity filters. Now we generate padded trigrams, which corrects the bug.

For example, for a filter such as `col % 'aab'`, we would previously generate the single trigram `'aab'` and the corresponding span `['aab'-'aab']`. This span does not contain all indexed trigrams of values that are similar to `'aab'`. As an example, it covers none of the trigrams of `'aaaaaa'`, which are `{'  a',' aa','aa ','aaa'}`.

Now, for the same expression `col % 'aab'`, we generate the padded trigrams `{'  a',' aa','aab','ab '}` and the corresponding spans `['  a'-'  a'], [' aa'-' aa'], ['aab'-'aab'], ['ab '-'ab ']`, which contain some of the trigrams of `'aaaaaa'` (namely `'  a'` and `' aa'`).

Fixes #89609

Release note (bug fix): A bug has been fixed that caused incorrect results for queries with string similarity filters (e.g., `col % 'abc'`) on tables with trigram indexes. This bug is only present in 22.2 pre-release versions up to and including v22.2.0-beta.3.
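The difference between the old and new span generation can be reproduced outside the database. The following is a minimal Python sketch, not the code changed in this commit: it assumes pg_trgm-style padding (two leading spaces and one trailing space per word) and handles ASCII alphanumerics only. The function names `padded_trigrams` and `unpadded_trigrams` are hypothetical and exist only for this illustration.

```python
import re


def padded_trigrams(s):
    """Return the sorted, de-duplicated padded trigrams of s.

    Assumption: pg_trgm-style padding, where each word is prefixed with
    two spaces and suffixed with one space before the 3-character
    sliding window is applied. Only ASCII letters and digits are
    treated as word characters in this sketch.
    """
    trigrams = set()
    for word in re.findall(r"[a-z0-9]+", s.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            trigrams.add(padded[i:i + 3])
    return sorted(trigrams)


def unpadded_trigrams(s):
    """Return the trigrams of s without padding (the old, buggy behavior)."""
    return sorted({s[i:i + 3] for i in range(len(s) - 2)})


# Trigrams stored in the index for the value 'aaaaaa'.
indexed = padded_trigrams("aaaaaa")

# Old behavior: a single unpadded trigram, so no span overlaps the index.
old_query = unpadded_trigrams("aab")
old_overlap = sorted(set(old_query) & set(indexed))

# New behavior: padded trigrams, two of which also occur in 'aaaaaa'.
new_query = padded_trigrams("aab")
new_overlap = sorted(set(new_query) & set(indexed))

print("indexed:    ", indexed)
print("old overlap:", old_overlap)
print("new overlap:", new_overlap)
```

With the unpadded trigram the scan over `['aab'-'aab']` never reaches any index entry for `'aaaaaa'`, so the row is dropped before the similarity threshold is ever evaluated. With padding, the spans for `'  a'` and `' aa'` reach the row, and the similarity filter can then decide whether it qualifies.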
Showing 5 changed files with 120 additions and 46 deletions.