[Meta] ReRankProcessor enhancement: ReRank by Field #926

brianf-aws · 2024-10-07T20:38:03Z

Is your feature request related to a problem?

Currently, if a user has access to a field in their documents or in a search response, we would like them to be able to re-rank by that field. The ReRank processor currently supports ML_OpenSearch as a way to re-rank. We would like to give users a way to perform a second-level re-ranking by adding a ByFieldRerankProcessor.

What solution would you like?

The rescoring logic (i.e., updating the _score field to reflect a new score) is already provided by the RescoringReRankProcessor. All that is required is to implement ByFieldRerankProcessor so that it uses the scores provided by the document or by a previous search response.

Ideally, the interface would look like this (my implementation is in the Neural Search repo):

{
    "response_processors": [
        {
            "rerank": {
                "by_field": {
                    "target_field": "ml_score",
                    "remove_target_field": true, ## Default: false
                    "keep_previous_score": true ## Default: false
                }
            }
        }
    ]
}
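The `##` annotations in the request body above are explanatory and would not parse as JSON. As a minimal sketch, the same body can be built programmatically so that the defaults are explicit. The helper name `by_field_pipeline` is hypothetical and not part of any OpenSearch client; only the key names are taken from the proposal.

```python
import json


def by_field_pipeline(target_field, remove_target_field=False, keep_previous_score=False):
    """Build the proposed search pipeline body as a plain dict.

    Hypothetical helper for illustration; both boolean options default to
    False, matching the defaults noted in the proposal.
    """
    return {
        "response_processors": [
            {
                "rerank": {
                    "by_field": {
                        "target_field": target_field,
                        "remove_target_field": remove_target_field,
                        "keep_previous_score": keep_previous_score,
                    }
                }
            }
        ]
    }


body = by_field_pipeline("ml_score", remove_target_field=True, keep_previous_score=True)
# Serialize to comment-free JSON suitable for a request body.
print(json.dumps(body, indent=4))
```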

During discussion it was pointed out that silently overwriting the previous scores may not match what users expect, so keeping them should be available as an option. We therefore added a field called keep_previous_score. We also want to offer an option (remove_target_field) to delete the field used for the re-ranking, since after rescoring it is redundant data.
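To make the intended behavior of the two options concrete, here is a minimal sketch in Python that operates on plain dicts shaped like search hits. It is not the Neural Search implementation. The function name `rerank_by_field` and the `previous_score` key used to store the old score are assumptions made for illustration, and the sketch assumes the target field is a top-level key in `_source`.

```python
from copy import deepcopy


def rerank_by_field(hits, target_field, remove_target_field=False, keep_previous_score=False):
    """Re-score hits from _source[target_field] and sort by the new score, descending.

    Illustrative only: `previous_score` is an assumed key name for the old score.
    The input list is not modified.
    """
    reranked = []
    for hit in hits:
        new_hit = deepcopy(hit)
        source = new_hit["_source"]
        if target_field not in source:
            raise KeyError(f"hit {new_hit.get('_id')!r} has no field {target_field!r}")
        new_score = float(source[target_field])
        if keep_previous_score:
            # Preserve the score assigned by the original query.
            source["previous_score"] = new_hit["_score"]
        if remove_target_field:
            # The value now lives in _score, so the field is redundant.
            del source[target_field]
        new_hit["_score"] = new_score
        reranked.append(new_hit)
    reranked.sort(key=lambda h: h["_score"], reverse=True)
    return reranked


hits = [
    {"_id": "a", "_score": 1.0, "_source": {"title": "first", "ml_score": 0.2}},
    {"_id": "b", "_score": 0.5, "_source": {"title": "second", "ml_score": 0.9}},
]
result = rerank_by_field(hits, "ml_score", remove_target_field=True, keep_previous_score=True)
for hit in result:
    print(hit["_id"], hit["_score"], hit["_source"])
```

With both options left at their defaults, the sketch only overwrites `_score` and reorders the hits, leaving `_source` untouched.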

What alternatives have you considered?

Creating a separate response processor in OpenSearch core that replaces the scores and re-sorts the response. This was initially proposed in OpenSearch core as opensearch-project/OpenSearch#15631, but after offline discussion we decided that this functionality could instead be added to a processor that already does re-ranking, in the Neural Search repo.

Do you have any additional context?

This functionality was brought up as a necessity for enhancing the ML Inference Processor in the ML-Commons codebase.

brianf-aws added enhancement untriaged labels Oct 7, 2024

ylwu-amzn added v2.18.0 Roadmap:Vector Database/GenAI Project-wide roadmap label labels Oct 7, 2024

opensearch-infra bot added this to OpenSearch Roadmap Oct 7, 2024

github-project-automation bot moved this to New in OpenSearch Roadmap Oct 7, 2024

ylwu-amzn removed the untriaged label Oct 7, 2024

ylwu-amzn assigned brianf-aws Oct 7, 2024

brianf-aws mentioned this issue Oct 14, 2024

ByFieldRerank Processor (ReRankProcessor enhancement) #932

Merged

5 tasks

martin-gaievski closed this as completed in #932 Oct 22, 2024
