Add rerank phase on coordinating node after reduce #60946
Comments
Pinging @elastic/es-search (:Search/Ranking)
@demonatic Indeed, currently there is no way to do rescoring on a coordinating node. You can check whether a pinned query could be of help to you: if you can provide the desired doc IDs in advance, ES ensures they will be top-ranked in your result list regardless of the query.
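For reference, a minimal sketch of what such a pinned query could look like, assuming a hypothetical `friends` index and document IDs (the `pinned` query puts the listed IDs on top of the organic matches):

```python
import requests

# Hypothetical index name and document IDs; the "pinned" query forces the
# listed IDs to the top of the hit list, ahead of the organic matches.
body = {
    "query": {
        "pinned": {
            "ids": ["friend-42", "friend-7", "friend-13"],
            "organic": {"match": {"bio": "hiking photography"}},
        }
    }
}

resp = requests.get("http://localhost:9200/friends/_search", json=body)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_id"], hit["_score"])
```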
@mayya-sharipova I'm afraid the pinned query won't help in my case, since we can't fully decide on the data node which documents should be put on top: each data node's top-3 recommendation items may still lose globally on the coordinating node, and if they lose they must downgrade their score back to the level of the normal, non-recommendation items and be sorted with them by text relevance. They might therefore end up below a normal document in the final result, so they can't simply be pinned.
This is a promising idea that might also help with this problem #28521 (comment), which blocks a solution for #27243 (support field collapsing + rescore).
Reranking at the coordinator level can be implemented on the client side imo. Or one can also write a small plugin that wraps the _search rest action. Whatever we provide in Elasticsearch would be different from the current rescorer, which works on a per-shard basis: there we can apply queries and scripts because we have access to the data. When moved to the coordinator level, the rescorer can only work on the content of the top hits, so if you require complex logic it is easier to implement client-side.
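To illustrate that suggestion, here is a minimal client-side sketch (the helper name `search_and_rerank`, the index name, and the endpoint URL are all hypothetical): fetch the global top hits over the REST API and hand them to whatever rerank callable you like.

```python
import requests

def search_and_rerank(index, query_body, rerank, size=100,
                      es_url="http://localhost:9200"):
    """Fetch the global top `size` hits and rerank them client-side.

    `rerank` is any callable taking the list of hit dicts and returning the
    reordered list; this mimics a coordinator-level rerank phase without
    changing Elasticsearch itself.
    """
    body = dict(query_body, size=size)
    resp = requests.get(f"{es_url}/{index}/_search", json=body)
    resp.raise_for_status()
    return rerank(resp.json()["hits"]["hits"])
```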
We currently run a friend search on a search engine developed in-house by our company, and we now want to try migrating it to Elasticsearch. However, we have hit a problem that seems impossible to solve without modifying Elasticsearch's source code.
The basic idea of our friend search is simple. We have some external key-value pairs mapping friend IDs to intimacy values, which we can pass to Elasticsearch. We first run a query and get the matching documents with their text-relevance scores in descending order. Then, using the external key-value pairs, we select the 3 friends with the highest intimacy from the top-N matching documents as recommendation items and sort those 3 by their original text-relevance score; all other, non-recommendation documents are also sorted by their original text-relevance score and placed behind the 3 recommendation items (see the sketch below for the intended ordering).
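A sketch of that ordering rule as a plain function (all names hypothetical; `intimacy` stands in for the external key-value pairs, `hits` for the top-N documents returned by the query):

```python
def friend_rerank(hits, intimacy, k=3):
    """Reorder the top-N hits: the k matching friends with the highest
    intimacy come first (sorted among themselves by their original
    text-relevance _score); everything else follows, also sorted by _score."""
    friends = [h for h in hits if h["_id"] in intimacy]
    top_k = sorted(friends, key=lambda h: intimacy[h["_id"]], reverse=True)[:k]
    recommended = sorted(top_k, key=lambda h: h["_score"], reverse=True)
    rec_ids = {h["_id"] for h in recommended}
    rest = sorted((h for h in hits if h["_id"] not in rec_ids),
                  key=lambda h: h["_score"], reverse=True)
    return recommended + rest
```

Combined with a client-side fetch of the top hits (like the `search_and_rerank` sketch above), this ordering could be reproduced entirely outside Elasticsearch.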
We have developed several plugins for our own search engine; I'll describe the basic logic in Elasticsearch terms to explain why I think Elasticsearch can't achieve this. After the documents are sorted by text relevance on the data node, we use a rescore script to lift the scores of the 3 documents within an N-sized window_size that have the highest intimacy: if the original text-relevance score is xxx, we lift those 3 docs' _score to 100000+xxx, while the scores of all non-recommended docs remain unchanged. After rescoring, each shard's result is: the 3 docs with the highest intimacy sit on top, ordered by text relevance in descending order, and the others also rank by text relevance in descending order.

On the coordinating node, since each shard (suppose we have 4 shards) has lifted 3 docs, 3×4−3 = 9 candidate recommendation docs will lose after the merge and need to have their _score downgraded and be resorted by their original text relevance. We have implemented this logic in our own engine, but after reading Elasticsearch's source code it turns out that Elasticsearch only runs query scripts on data nodes, and there is no way to interfere with the coordinating node's reduce and merge process (resultConsumer.reduce()) to downgrade the scores of the losing recommendation candidates. So adding a rerank phase on the coordinating node (just as our current search engine does) would be nice; a sketch of what that phase would do follows.
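For completeness, a sketch of what such a coordinator-side rerank phase would do with the lifted scores described above (the 100000 offset and the per-shard lift of 3 docs are taken from the description; the function and variable names are hypothetical):

```python
LIFT = 100000.0  # the constant offset the per-shard rescore script adds

def coordinator_rerank(merged_hits, k=3):
    """After the merge, up to 3 * num_shards candidates carry the lift; only
    the global top-k keep it, the rest fall back to plain text relevance and
    are resorted together with the normal documents."""
    lifted = sorted((h for h in merged_hits if h["_score"] >= LIFT),
                    key=lambda h: h["_score"], reverse=True)
    normal = [h for h in merged_hits if h["_score"] < LIFT]
    winners, losers = lifted[:k], lifted[k:]
    for h in losers:
        h["_score"] -= LIFT  # downgrade back to the original text-relevance score
    rest = sorted(losers + normal, key=lambda h: h["_score"], reverse=True)
    return winners + rest
```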