-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
LUCENE-9099: Correctly handle repeats in ORDERED and UNORDERED interv…
…als (#1097) If you have repeating intervals in an ordered or unordered interval source, you currently get somewhat confusing behaviour: * `ORDERED(a, a, b)` will return an extra interval over just a b if it first matches a a b, meaning that you can get incorrect results if used in a `CONTAINING` filter - `CONTAINING(ORDERED(x, y), ORDERED(a, a, b))` will match on the document `a x a b y` * `UNORDERED(a, a)` will match on documents that just containg a single a. This commit adds a RepeatingIntervalsSource that correctly handles repeats within ordered and unordered sources. It also changes the way that gaps are calculated within ordered and unordered sources, by using a new width() method on IntervalIterator. The default implementation just returns end() - start() + 1, but RepeatingIntervalsSource instead returns the sum of the widths of its child iterators. This preserves maxgaps filtering on ordered and unordered sources that contain repeats. In order to correctly handle matches in this scenario, IntervalsSource#matches now always returns an explicit IntervalsMatchesIterator rather than a plain MatchesIterator, which adds gaps() and width() methods so that submatches can be combined in the same way that subiterators are. Extra checks have been added to checkIntervals() to ensure that the same intervals are returned by both iterator and matches, and a fix to DisjunctionIntervalIterator#matches() is also included - DisjunctionIntervalIterator minimizes its intervals, while MatchesUtils.disjunction does not, so there was a discrepancy between the two methods.
- Loading branch information
1 parent
3246b26
commit aa916ba
Showing
23 changed files
with
747 additions
and
120 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.