Add rerank phase in coordinate node after reduced #60946

Closed
demonatic opened this issue Aug 11, 2020 · 5 comments
Labels
>enhancement · :Search Relevance/Ranking (Scoring, rescoring, rank evaluation.) · Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch)

Comments

@demonatic

We currently run a friend search on a search engine developed in-house at our company, and we now want to try migrating it to Elasticsearch. However, we have run into a problem that seems impossible to solve without modifying Elasticsearch's source code.

The basic idea of our friend search is simple. We have external key-value pairs, each mapping a friend ID to an intimacy value, that we can pass to Elasticsearch. We first run the query and get the matching documents with their text relevance scores in descending order. Then, using the external key-value pairs, we select the 3 friends with the highest intimacy from the top-N matching documents as recommendation items and sort those 3 items by their text relevance score. The remaining non-recommendation documents are also sorted by their text relevance score and placed after the 3 recommendation items.
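To make the intended ordering concrete, here is a minimal sketch of the ranking described above, applied to a single, already-merged hit list. The function name, the `(id, score)` tuple shape, and the rule that documents without an intimacy value are never candidates are assumptions made for illustration, not part of any existing engine or of Elasticsearch.

```python
from typing import Dict, List, Tuple

Hit = Tuple[str, float]  # (document id, text relevance score); hypothetical shape


def rank_with_recommendations(
    hits: List[Hit], intimacy: Dict[str, float], window: int, k: int = 3
) -> List[Hit]:
    """Put the k highest-intimacy hits from the top-`window` matches first."""
    # Order all matches by text relevance, best first.
    by_relevance = sorted(hits, key=lambda h: h[1], reverse=True)
    # Assumption: only top-`window` matches with a known intimacy value are candidates.
    candidates = [h for h in by_relevance[:window] if h[0] in intimacy]
    # Pick the k candidates with the highest intimacy.
    chosen = sorted(candidates, key=lambda h: intimacy[h[0]], reverse=True)[:k]
    chosen_ids = {doc_id for doc_id, _ in chosen}
    # Recommendations keep relevance order among themselves, then all other hits follow.
    recommended = [h for h in by_relevance if h[0] in chosen_ids]
    others = [h for h in by_relevance if h[0] not in chosen_ids]
    return recommended + others
```

The difficulty discussed in the rest of this issue is that this selection is only correct when it sees the globally merged top-N, not each shard's local top-N.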

We have developed some plugins for our own search engine. I'll describe their basic logic in terms of the Elasticsearch equivalents to explain why I think Elasticsearch can't achieve this. After the documents are sorted by text relevance on the data node, we use a rescore script to lift the score of the 3 documents with the highest intimacy within a window_size of N. If the original text relevance score is xxx, we lift those 3 documents' _score to 100000xxx, and the scores of the other, non-recommended documents stay the same. After rescoring, the shard result is therefore: the 3 documents with the highest intimacy come first, in descending order of text relevance, followed by the others, also in descending order of text relevance.

On the coordinating node, since each shard (suppose we have 4 shards) has lifted 3 documents, 3x4-3 = 9 candidate recommendation documents will lose after the merge; their _score needs to be downgraded and they need to be re-sorted by their original text relevance. We have implemented this logic in our own engine, but after reading the Elasticsearch source code we found that Elasticsearch seems to run all kinds of query scripts only on the data nodes, and there is no way to hook into the coordinating node's reduce and merge process (resultConsumer.reduce()) to downgrade the scores of the losing recommendation candidates. So adding a rerank phase on the coordinating node (as our current search engine does) would be nice.
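The two steps can be simulated outside any engine. The sketch below is only an illustration of the idea: `BOOST`, `shard_rescore`, and `coordinator_rerank` are hypothetical names, the boost is modeled as an added offset, and it assumes the coordinating side can see the same intimacy map and that text relevance scores always stay below `BOOST`. It is not Elasticsearch code and does not use any Elasticsearch API.

```python
from typing import Dict, List, Tuple

Hit = Tuple[str, float]  # (document id, score); hypothetical shape
BOOST = 100000.0  # hypothetical offset, assumed larger than any text relevance score


def shard_rescore(hits: List[Hit], intimacy: Dict[str, float], window: int, k: int = 3) -> List[Hit]:
    """Data-node step: lift the k highest-intimacy hits inside the window."""
    ranked = sorted(hits, key=lambda h: h[1], reverse=True)
    candidates = [h for h in ranked[:window] if h[0] in intimacy]
    by_intimacy = sorted(candidates, key=lambda h: intimacy[h[0]], reverse=True)
    lifted = {doc_id for doc_id, _ in by_intimacy[:k]}
    rescored = [(d, s + BOOST if d in lifted else s) for d, s in ranked]
    return sorted(rescored, key=lambda h: h[1], reverse=True)


def coordinator_rerank(shard_results: List[List[Hit]], intimacy: Dict[str, float], k: int = 3) -> List[Hit]:
    """Coordinating-node step: keep k lifted hits globally and demote the rest."""
    merged = [h for shard in shard_results for h in shard]
    lifted = [h for h in merged if h[1] >= BOOST]
    by_intimacy = sorted(lifted, key=lambda h: intimacy[h[0]], reverse=True)
    winners = {doc_id for doc_id, _ in by_intimacy[:k]}
    # Losing candidates get their original text relevance score back.
    # (A real implementation would likely carry the original score separately
    # instead of subtracting, to avoid floating-point rounding.)
    final = [(d, s - BOOST if s >= BOOST and d not in winners else s) for d, s in merged]
    return sorted(final, key=lambda h: h[1], reverse=True)
```

With 2 shards and k = 2, each shard lifts 2 documents, so 2x2-2 = 2 lifted documents are demoted again by `coordinator_rerank` and fall back into plain relevance order.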

@demonatic added the >enhancement and needs:triage (Requires assignment of a team area label) labels Aug 11, 2020
@danielmitterdorfer added the :Search Relevance/Ranking (Scoring, rescoring, rank evaluation.) and team-discuss labels Aug 11, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-search (:Search/Ranking)

@elasticmachine added the Team:Search (Meta label for search team) label Aug 11, 2020
@danielmitterdorfer removed the needs:triage (Requires assignment of a team area label) label Aug 11, 2020
@mayya-sharipova
Contributor

@demonatic Indeed, currently there is no way to do rescoring on a coordinating node.

You can check whether a pinned query could help you. If you can provide the desired document IDs in advance, ES ensures that they will be ranked at the top of your result list regardless of the query.
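For reference, a pinned query request body could be built roughly as follows. This is a sketch only: the helper name, the document IDs, and the `name` field in the organic query are made up for illustration, while `pinned`, `ids`, and `organic` are the query's own parameter names.

```python
from typing import Any, Dict, List


def pinned_search_body(pinned_ids: List[str], organic_query: Dict[str, Any]) -> Dict[str, Any]:
    """Build a _search request body that pins `pinned_ids` above the organic results."""
    return {
        "query": {
            "pinned": {
                "ids": pinned_ids,         # documents promoted to the top, in this order
                "organic": organic_query,  # regular query that ranks everything else
            }
        }
    }


# Hypothetical IDs and field name, for illustration only.
body = pinned_search_body(["42", "7", "13"], {"match": {"name": "alice"}})
```

Note that the pinned IDs have to be known before the request is sent.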

@demonatic
Author

@mayya-sharipova I'm afraid the pinned query won't help in my case. We can't fully decide on the data node which documents should be put on top, because each shard's top-3 recommendation items may still lose globally on the coordinating node. If they lose, their score must be downgraded to the level of normal non-recommendation items and they must be sorted together with those items by text relevance, so they might end up below a normal item in the final result and therefore can't be pinned.

@Morikko

Morikko commented Aug 13, 2020

This is a promising idea that might also help with the problem described in #28521 (comment), which blocks a solution for #27243 (support field collapsing + rescore).

@jimczi
Contributor

jimczi commented Dec 4, 2020

Reranking at the coordinator level can be implemented on the client side, imo. Alternatively, one can write a small plugin that wraps the _search REST action. Whatever we provide in Elasticsearch would be different from the current rescorer, which currently works on a per-shard basis: there, we can apply queries and scripts because we have access to the data. When moved to the coordinator level, however, the rescorer can only work on the content of the top hits, so that's easier to implement client-side if you require complex logic.
I am going to close this issue because the benefits are not clear and we don't plan to work on a new rescorer in the near future.
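A client-side version could look something like the sketch below. It assumes a plain relevance-sorted search was run with `size` set to the desired window N, and that the client holds the external intimacy map; `rerank_response` is a hypothetical helper name, and only the standard `hits.hits`, `_id`, and `_score` response fields are relied on.

```python
from typing import Any, Dict


def rerank_response(response: Dict[str, Any], intimacy: Dict[str, float], k: int = 3) -> Dict[str, Any]:
    """Reorder the hits of an already-merged search response on the client."""
    hits = sorted(response["hits"]["hits"], key=lambda h: h["_score"], reverse=True)
    # Assumption: only hits with a known intimacy value can become recommendations.
    candidates = [h for h in hits if h["_id"] in intimacy]
    top = sorted(candidates, key=lambda h: intimacy[h["_id"]], reverse=True)[:k]
    top_ids = {h["_id"] for h in top}
    # Recommendations first, then everything else; both groups stay in relevance order.
    reordered = [h for h in hits if h["_id"] in top_ids] + [h for h in hits if h["_id"] not in top_ids]
    # Return a copy so the original response is left untouched.
    return {**response, "hits": {**response["hits"], "hits": reordered}}
```

Because the response is already the globally merged top-N, no score lifting or demotion is needed here; the original `_score` values are left unchanged.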

@jimczi closed this as completed Dec 4, 2020
@javanna added the Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch) label and removed the Team:Search (Meta label for search team) label Jul 12, 2024

7 participants