[Bug]: [hybrid_search] The rerank effect needs more improvements when setting different metric type for different vector field using "WeightedRanker" reranker #31368
Labels
kind/bug
Issues or changes related to a bug
kind/improvement
Changes related to improvements, such as unit tests and code refactoring
stale
Indicates no updates for 30 days
triage/accepted
Indicates an issue or PR is ready to be actively worked on.
Comments
binbinlv
added
kind/bug
Issues or changes related to a bug
needs-triage
Indicates an issue or PR lacks a `triage/foo` label and requires one.
labels
Mar 18, 2024
yanliang567
changed the title
[Bug]: [hybrid_search] The rerank effect may be bad when setting different metric type for different vector field using "WeightedRanker" reranker
[Bug]: [hybrid_search] The rerank effect need more improvements when setting different metric type for different vector field using "WeightedRanker" reranker
Mar 18, 2024
yanliang567
added
triage/accepted
Indicates an issue or PR is ready to be actively worked on.
kind/improvement
Changes related to improvements, such as unit tests and code refactoring
and removed
needs-triage
Indicates an issue or PR lacks a `triage/foo` label and requires one.
labels
Mar 18, 2024
/unassign
binbinlv
changed the title
[Bug]: [hybrid_search] The rerank effect need more improvements when setting different metric type for different vector field using "WeightedRanker" reranker
[Bug]: [hybrid_search] The rerank effect needs more improvements when setting different metric type for different vector field using "WeightedRanker" reranker
Mar 18, 2024
working on it
sre-ci-robot
pushed a commit
that referenced
this issue
Apr 9, 2024
issue: #25639 #31368 pr :#32020 Signed-off-by: zhenshan.cao <[email protected]>
sre-ci-robot
pushed a commit
that referenced
this issue
Apr 9, 2024
issue: #25639 #31368 Signed-off-by: zhenshan.cao <[email protected]>
This was referenced Apr 15, 2024
sre-ci-robot
pushed a commit
that referenced
this issue
Apr 16, 2024
issue: #31368 pr: #32289 Signed-off-by: binbin lv <[email protected]>
yellow-shine
pushed a commit
to yellow-shine/milvus
that referenced
this issue
Apr 18, 2024
issue: milvus-io#31368 pr: milvus-io#32289 Signed-off-by: binbin lv <[email protected]>
sre-ci-robot
pushed a commit
that referenced
this issue
Apr 29, 2024
issue: #31368 Signed-off-by: binbin lv <[email protected]>
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Is there an existing issue for this?
Environment
Current Behavior
Reranking results can be poor when different vector fields use different metric types with the "WeightedRanker" reranker.
For example, when float vector field A uses metric type "COSINE" and float vector field B uses "L2", the hybrid search result is currently sorted according to the metric type of the first vector field in the schema.
This ordering does not reflect similarity after reranking, because for "COSINE" a larger value means more similar, while for "L2" a smaller value means more similar.
In addition, a weighted sum of two values with very different ranges is not very meaningful: "COSINE" ranges over "-1 ~ 1", while "L2" distance ranges over "0 ~ +∞".
Expected Behavior
A better WeightedRanker algorithm design that reflects the real similarity across metric types.
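One possible direction, sketched below in plain Python, is to map each raw score to a common [0, 1] scale where larger always means more similar, and only then take the weighted sum. This is an illustrative sketch, not Milvus's actual implementation: the function names `normalize_score` and `weighted_rerank`, the arctangent mapping for "L2", and the dict-based hit format are all assumptions made for this example.

```python
import math


def normalize_score(score: float, metric: str) -> float:
    """Map a raw score to [0, 1] so that larger always means more similar.

    The mappings here are assumptions chosen for illustration.
    """
    metric = metric.upper()
    if metric == "COSINE":
        # Cosine similarity lies in [-1, 1]; shift and scale to [0, 1].
        return (1.0 + score) / 2.0
    if metric == "L2":
        # L2 distance lies in [0, +inf); atan squashes it to [0, pi/2),
        # and the subtraction flips it so that distance 0 maps to 1.
        return 1.0 - 2.0 * math.atan(score) / math.pi
    raise ValueError(f"unsupported metric type: {metric}")


def weighted_rerank(hits_per_field, metrics, weights):
    """Combine per-field hits ({id: raw_score}) into one ranked list.

    Returns (id, combined_score) pairs, best match first.
    """
    combined = {}
    for hits, metric, weight in zip(hits_per_field, metrics, weights):
        for doc_id, raw in hits.items():
            normalized = normalize_score(raw, metric)
            combined[doc_id] = combined.get(doc_id, 0.0) + weight * normalized
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical hits: field A uses COSINE, field B uses L2.
field_a = {1: 0.9, 2: 0.1}
field_b = {1: 0.0, 2: 5.0}
ranked = weighted_rerank([field_a, field_b], ["COSINE", "L2"], [0.5, 0.5])
```

With both fields normalized this way, the combined result can always be sorted in descending order, regardless of which vector field comes first in the schema.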
Steps To Reproduce
Milvus Log
No response
Anything else?
No response