[FEATURE] inner_hits in nested neural query should return all the chunks #2113
Comments
@yuye-aws Inner hits are not supported in hybrid query. There is a feature request for this (opensearch-project/neural-search#718), but at the moment there is no path forward.
I'm not using hybrid query, just a plain neural query.
Are both features not supported due to the same blocking issue? |
Sorry, my bad. Neural query is different; I'm not sure why nested doesn't work. In the neural code we delegate execution to the knn query, so you may want to check how it's done in knn. An easy test would be to try whether a plain knn query supports the "nested" clause.
Already tried in my fifth step. |
In step 5 you do have a neural query. I mean the knn query, something like in the following example but with nested:
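For illustration, a minimal sketch of what such a knn query wrapped in a nested clause could look like; the index name, nested path `passages`, and vector field `passages.embedding` are placeholders, not values from this thread:

```json
# illustrative sketch only; index and field names are placeholders
POST /my-index/_search
{
  "query": {
    "nested": {
      "path": "passages",
      "score_mode": "max",
      "query": {
        "knn": {
          "passages.embedding": {
            "vector": [0.1, 0.2, 0.3],
            "k": 10
          }
        }
      },
      "inner_hits": {}
    }
  }
}
```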
@yuye-aws I found this change in knn #1182; the essence of it is: in case of nested documents we need to return only the one that gave the max score and drop the others. It became the new default behavior, replacing the old one where all nested docs (meaning inner hits) are returned. The neural query inherits this from knn.
This does not make sense, because the score_mode can also be avg, where we expect to see all the scores. |
Shall we make a PR to the knn repo? After all, the nested k-NN query also needs the avg score mode.
Replied in #1743 (comment). Also, resolving this issue can help resolve a user issue: opensearch-project/ml-commons#2612. I was considering implementing a new search response processor to retrieve the most relevant chunks, but it is unfortunately blocked by the current issue: opensearch-project/ml-commons#2612 (comment)
Would love this!
What is the bug?
I am using the text_chunking and text_embedding processors to ingest documents into an index. The text_chunking search example works well, but inner_hits only returns a single element from the chunked string list. It does not matter whether I set the score_mode to max or avg.
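A nested neural query of roughly this shape illustrates the setup; the index name, nested path `passage_chunk_embedding`, and model_id are placeholders, not values taken from the original report:

```json
# illustrative sketch only; field names and model_id are placeholders
GET /testindex/_search
{
  "query": {
    "nested": {
      "path": "passage_chunk_embedding",
      "score_mode": "max",
      "query": {
        "neural": {
          "passage_chunk_embedding.knn": {
            "query_text": "document",
            "model_id": "<model_id>",
            "k": 10
          }
        }
      },
      "inner_hits": {}
    }
  }
}
```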
How can one reproduce the bug?
What is the expected behavior?
The inner_hits should return matching score and offset of all the retrieved documents.
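As an illustration of the expected shape (not actual output; field names, offsets, and scores below are placeholders), the inner_hits section would contain one entry per matching chunk, each with its own _score and nested offset:

```json
"inner_hits": {
  "passage_chunk_embedding": {
    "hits": {
      "hits": [
        { "_nested": { "field": "passage_chunk_embedding", "offset": 0 }, "_score": 0.72 },
        { "_nested": { "field": "passage_chunk_embedding", "offset": 1 }, "_score": 0.65 },
        { "_nested": { "field": "passage_chunk_embedding", "offset": 2 }, "_score": 0.41 }
      ]
    }
  }
}
```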
What is your host/environment?
Mac OS
Do you have any screenshots?
If applicable, add screenshots to help explain your problem.
Do you have any additional context?
Add any other context about the problem.