Add rerank phase on coordinating node after reduce #60946
Comments
Pinging @elastic/es-search (:Search/Ranking)
@demonatic Indeed, currently there is no way to do rescoring on a coordinating node. You can check whether a pinned query could be of help to you: if you can provide the desired doc IDs in advance, ES ensures they will be top-ranked in your result list regardless of the query.
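For reference, a minimal sketch of what such a pinned query could look like, assuming a hypothetical `friends` index and document IDs (the `pinned` query puts the listed IDs on top of the organic matches):

```python
import requests

# Hypothetical index name and document IDs; the "pinned" query forces the
# listed IDs to the top of the hit list, ahead of the organic matches.
body = {
    "query": {
        "pinned": {
            "ids": ["friend-42", "friend-7", "friend-13"],
            "organic": {"match": {"bio": "hiking photography"}},
        }
    }
}

resp = requests.get("http://localhost:9200/friends/_search", json=body)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_id"], hit["_score"])
```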
@mayya-sharipova I'm afraid the pinned query won't help in my case, since we can't fully decide on the data node which documents should be put on top: each data node's top-3 recommendation items may still lose globally on the coordinating node, and if they lose they must downgrade their score back to the level of the normal, non-recommendation items and be sorted with them by text relevance. They might therefore end up below a normal document in the final result, so they can't simply be pinned.
This is a promising idea that might also help with this problem #28521 (comment), which blocks a solution for #27243 (support field collapsing + rescore).
Reranking at the coordinator level can be implemented on the client side imo. Or one can also write a small plugin that wraps the _search rest action. Whatever we provide in Elasticsearch would be different from the current rescorer, which works on a per-shard basis: there we can apply queries and scripts because we have access to the data. When moved to the coordinator level, the rescorer can only work on the content of the top hits, so if you require complex logic it is easier to implement client-side.
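To illustrate that suggestion, here is a minimal client-side sketch (the helper name `search_and_rerank`, the index name, and the endpoint URL are all hypothetical): fetch the global top hits over the REST API and hand them to whatever rerank callable you like.

```python
import requests

def search_and_rerank(index, query_body, rerank, size=100,
                      es_url="http://localhost:9200"):
    """Fetch the global top `size` hits and rerank them client-side.

    `rerank` is any callable taking the list of hit dicts and returning the
    reordered list; this mimics a coordinator-level rerank phase without
    changing Elasticsearch itself.
    """
    body = dict(query_body, size=size)
    resp = requests.get(f"{es_url}/{index}/_search", json=body)
    resp.raise_for_status()
    return rerank(resp.json()["hits"]["hits"])
```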
We currently run a friend search on a search engine developed in-house by our company, and we now want to try migrating it to Elasticsearch. However, we have hit a problem that seems impossible to solve without modifying Elasticsearch's source code.
The basic idea of our friend search is simple. We have some external key-value pairs mapping friend IDs to intimacy values, which we can pass to Elasticsearch. We first run a query and get the matching documents with their text-relevance scores in descending order. Then, using the external key-value pairs, we select the 3 friends with the highest intimacy from the top-N matching documents as recommendation items and sort those 3 by their original text-relevance score; all other, non-recommendation documents are also sorted by their original text-relevance score and placed behind the 3 recommendation items (see the sketch below for the intended ordering).
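A sketch of that ordering rule as a plain function (all names hypothetical; `intimacy` stands in for the external key-value pairs, `hits` for the top-N documents returned by the query):

```python
def friend_rerank(hits, intimacy, k=3):
    """Reorder the top-N hits: the k matching friends with the highest
    intimacy come first (sorted among themselves by their original
    text-relevance _score); everything else follows, also sorted by _score."""
    friends = [h for h in hits if h["_id"] in intimacy]
    top_k = sorted(friends, key=lambda h: intimacy[h["_id"]], reverse=True)[:k]
    recommended = sorted(top_k, key=lambda h: h["_score"], reverse=True)
    rec_ids = {h["_id"] for h in recommended}
    rest = sorted((h for h in hits if h["_id"] not in rec_ids),
                  key=lambda h: h["_score"], reverse=True)
    return recommended + rest
```

Combined with a client-side fetch of the top hits (like the `search_and_rerank` sketch above), this ordering could be reproduced entirely outside Elasticsearch.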
We have developed several plugins for our own search engine; I'll describe the basic logic in Elasticsearch terms to explain why I think Elasticsearch can't achieve this. After the documents are sorted by text relevance on the data node, we use a rescore script to lift the scores of the 3 documents within an N-sized window_size that have the highest intimacy: if the original text-relevance score is xxx, we lift those 3 docs' _score to 100000+xxx, while the scores of all non-recommended docs remain unchanged. After rescoring, each shard's result is: the 3 docs with the highest intimacy sit on top, ordered by text relevance in descending order, and the others also rank by text relevance in descending order.

On the coordinating node, since each shard (suppose we have 4 shards) has lifted 3 docs, 3×4−3 = 9 candidate recommendation docs will lose after the merge and need to have their _score downgraded and be resorted by their original text relevance. We have implemented this logic in our own engine, but after reading Elasticsearch's source code it turns out that Elasticsearch only runs query scripts on data nodes, and there is no way to interfere with the coordinating node's reduce and merge process (resultConsumer.reduce()) to downgrade the scores of the losing recommendation candidates. So adding a rerank phase on the coordinating node (just as our current search engine does) would be nice; a sketch of what that phase would do follows.
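For completeness, a sketch of what such a coordinator-side rerank phase would do with the lifted scores described above (the 100000 offset and the per-shard lift of 3 docs are taken from the description; the function and variable names are hypothetical):

```python
LIFT = 100000.0  # the constant offset the per-shard rescore script adds

def coordinator_rerank(merged_hits, k=3):
    """After the merge, up to 3 * num_shards candidates carry the lift; only
    the global top-k keep it, the rest fall back to plain text relevance and
    are resorted together with the normal documents."""
    lifted = sorted((h for h in merged_hits if h["_score"] >= LIFT),
                    key=lambda h: h["_score"], reverse=True)
    normal = [h for h in merged_hits if h["_score"] < LIFT]
    winners, losers = lifted[:k], lifted[k:]
    for h in losers:
        h["_score"] -= LIFT  # downgrade back to the original text-relevance score
    rest = sorted(losers + normal, key=lambda h: h["_score"], reverse=True)
    return winners + rest
```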