[FEATURE] Add options to configure minimal score and single match score for normalization processor for Hybrid search #299
Comments
@martin-gaievski, could you elaborate on why setting a minimum value is beneficial? You mentioned that 'As part of the normalization, scores are adjusted to be in the interval [0, .. 1.0]. That means a matching document with the minimal score will receive a score of 0.0.' Would there be any issues if a document with the minimal score is assigned 0.0 after normalization? Regarding
Yes, it's an issue: in most scenarios a score of 0.0 means the document is not a match, so we may drop a valid search hit.
This currently makes sense for min/max normalization, and yes, we do return 1.0 in the case of a single matching doc. My idea was to keep the default behavior the same, but override it if you provide a value for "single_match_score".
Got it. I think it's a good idea to provide a feature that allows users to override those values, but it would be great if we could also offer the best possible default values.
I love this feature request. As a workaround, I'm using the following kludge, which may be brittle. I'm running a hybrid (HYB) query with lexical (LEX) and neural sub-queries.
This trims off both the undesired LEX and neural results. Is it too much of a hack, though? I plan to recheck it with each release (I'm on AWS OSS v2.17).
Is your feature request related to a problem?
There is no way to set a minimum score for a hit in the final result list of a Hybrid query, and no way to set the score that is returned when there is only one match across all sub-queries for the min-max normalization technique. This may affect the relevance of the results.
What solution would you like?
For min score: set the minimal score as part of the configuration for the normalization processor. A possible request could look like:
For a single match score: set the single match score as a parameter of the normalization technique. Techniques that do not support this parameter will ignore it. A possible request could look like:
What alternatives have you considered?
With the current implementation, additional post-processing is required.
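One form such client-side post-processing could take is sketched below. It assumes the search response has already been parsed into a Python dict with the usual `hits.hits[]._score` layout; the function name `filter_hits_by_score` and the `threshold` argument are hypothetical and not part of any plugin API.

```python
from typing import Any, Dict, List


def filter_hits_by_score(response: Dict[str, Any], threshold: float) -> List[Dict[str, Any]]:
    """Keep only hits whose `_score` is at least `threshold`.

    Hypothetical helper for illustration; `response` is assumed to be a
    parsed search response with hits under response["hits"]["hits"].
    """
    hits = response.get("hits", {}).get("hits", [])
    # A missing or null `_score` is treated as 0.0.
    return [hit for hit in hits if (hit.get("_score") or 0.0) >= threshold]
```

Note that a threshold applied this way cannot tell a genuine non-match apart from a valid hit that was normalized down to 0.0, which is part of why a built-in option is being requested.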
Do you have any additional context?
As part of normalization, scores are adjusted to lie in the interval [0.0, 1.0]. That means the matching doc with the minimal score will receive a score of 0.0.
For the min-max normalization technique, if there is only one matching document there will be a single score, X, so min = max = X. As per the formula, that leads to a "division by zero" case, as the score is calculated as
(X - min)/(max - min)
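To make both edge cases concrete, here is a minimal Python sketch of min-max normalization with the two proposed knobs. It is not the plugin's implementation: the parameter names `min_score` and `single_match_score` mirror the options proposed in this issue, and treating `min_score` as a floor on the normalized value is one possible interpretation, assumed here for illustration.

```python
from typing import List


def min_max_normalize(
    scores: List[float],
    min_score: float = 0.0,
    single_match_score: float = 1.0,
) -> List[float]:
    """Min-max normalize `scores`, with hypothetical overrides.

    Assumptions (not an existing API): `min_score` is a floor applied to
    each normalized score, and `single_match_score` is returned whenever
    max == min, which includes the single-document case.
    """
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # (X - min) / (max - min) would divide by zero here.
        return [single_match_score] * len(scores)
    # Standard min-max formula, then raise anything below the floor.
    return [max(min_score, (s - lo) / (hi - lo)) for s in scores]
```

With the defaults, `min_max_normalize([1.0, 2.0, 3.0])` returns `[0.0, 0.5, 1.0]`, so the lowest-scoring match ends up at 0.0. Passing `min_score=0.001` returns `[0.001, 0.5, 1.0]` instead, and `min_max_normalize([5.0], single_match_score=0.5)` returns `[0.5]` rather than failing on the zero denominator.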