-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LUCENE-8949: Allow LeafFieldComparators to publish Feature Values #831
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @atris; could you add some tests for the new API? Could you also describe use cases we are aiming for? Finally, could you also warn in the javadocs for the new method that it may be a somewhat costly API, e.g. for a SORTED
doc values field.
lucene/core/src/java/org/apache/lucene/search/LeafFieldComparator.java
Outdated
Show resolved
Hide resolved
lucene/core/src/java/org/apache/lucene/search/MultiLeafFieldComparator.java
Outdated
Show resolved
Hide resolved
Thanks @mikemccand. I have added tests for the same and updated the documentation. The use case of the feature is that it allows cross segment comparison of values. One use case is when working with concurrent searches, currently, there is no way to globally identify the minimum/maximum of all. This inhibits cool things, like taking impacts to sorted indices for early termination. |
I appreciate how you are trying to break a large change into smaller changes, but for this one I'd like to see how you would leverage this new API to implement the optimizations that you mentioned? |
Thanks for your comments, @jpountz I abstracted this specific change out because of the reason you mentioned, and because I believe that this change independently allows us some new functionality, such as comparison by values across multiple segments, something that we do not allow today. Does the second functionality seem a useful enough thing to support in itself? |
I can imagine it be useful, but on the other hand this increases the API surface area of comparators while we have been talking about simplifying them, and like Mike pointed out this is a bit trappy as this sounds like a harmless method call, but this is actually costly for string or geo-distance comparators. So I'd like to better understand what benefits it might bring before making the call of adding this new API. |
Hi @jpountz , RE: this PR, I think it is a prerequisite for improvements like https://issues.apache.org/jira/browse/LUCENE-8988 and shared PQ based early termination. I was wondering if we could merge this PR and mark the API as experimental, along with a warning that it could be costly for some specific iterators. WDYT, please? |
…Set File (apache#831) Co-authored-by: Steffen Moldenhauer <[email protected]>
…Set File (#831) Co-authored-by: Steffen Moldenhauer <[email protected]>
No description provided.