Update document scores based on ranker node #2048
Let's add a test case for this new feature, at least to check that documents have scores between 0 and 1 in both cases (logits_dim == 1 and logits_dim > 1).
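A minimal sketch of what such a test could look like, assuming a sigmoid is applied when there is a single logit and a softmax when there are several. The helper `normalized_score` and the choice of the last label as the "relevant" class are hypothetical and are not taken from the Haystack code base.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    # Numerically stable sigmoid for positive and negative inputs.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def softmax(logits: List[float]) -> List[float]:
    # Subtract the maximum before exponentiating to avoid overflow.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_score(logits: List[float]) -> float:
    """Hypothetical helper: map raw ranker logits to a score in [0, 1]."""
    if len(logits) == 1:
        # logits_dim == 1: a single relevance logit, squashed with sigmoid.
        return sigmoid(logits[0])
    # logits_dim > 1: ASSUMPTION that the last label is the "relevant" class.
    return softmax(logits)[-1]


def test_scores_are_between_0_and_1() -> None:
    single_logit_cases = [[-8.0], [0.0], [5.5]]
    multi_logit_cases = [[2.0, -1.0], [0.1, 0.3, 4.0]]
    for logits in single_logit_cases + multi_logit_cases:
        score = normalized_score(logits)
        assert 0.0 <= score <= 1.0
```

The real test would call the ranker itself rather than a standalone helper; the point here is only the range check for both logit shapes.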
Hey @julian-risch, I thought about this a little more and I came up with a few issues:
And I think that it will actually break things if we apply sigmoid for multiple labels. Thinking further about this, I'd still like to change this and add a What do you think?
Hi @mathislucka thanks for bringing this up again.
@mathislucka I remember we had a brief discussion on this. Are you still working on that? What's the status?
@mathislucka could you please check the status here? Thank you. 🙂
Hey @julian-risch and @tstadel, I'm not sure about the status. For things like GPL we will need the raw score. For an end user, a sigmoid activation will be the most comprehensible. I'd like to keep the current code as the default behaviour but allow the user to pass in a custom callable that is called to transform the score. The problem is that, as @tstadel mentioned, passing in a Callable does not work well with loading nodes from YAML. So I'm not sure what to do here. Any ideas?
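To make the proposal concrete, here is a rough sketch of "raw scores by default, optional callable to transform them". The function name `rank_with_scores` and the parameter `score_transform` are hypothetical and only illustrate the idea; they are not the API that ended up in Haystack.

```python
import math
from typing import Callable, List, Optional, Tuple


def rank_with_scores(
    doc_ids: List[str],
    raw_scores: List[float],
    score_transform: Optional[Callable[[float], float]] = None,
) -> List[Tuple[str, float]]:
    """Sort documents by raw score and optionally transform the reported score."""
    # Sorting always uses the raw score, so the ranking itself never changes.
    ranked = sorted(zip(doc_ids, raw_scores), key=lambda pair: pair[1], reverse=True)
    if score_transform is None:
        # Default: return the raw scores (what e.g. GPL would need).
        return ranked
    # Otherwise report the transformed score, e.g. a sigmoid for end users.
    return [(doc_id, score_transform(score)) for doc_id, score in ranked]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


raw = rank_with_scores(["a", "b"], [0.0, 2.0])
friendly = rank_with_scores(["a", "b"], [0.0, 2.0], score_transform=sigmoid)
```

A plain Python callable like this cannot be written into a YAML pipeline definition directly, which is the limitation mentioned above.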
@mathislucka @julian-risch how about having just a single activation function
@mathislucka What do you think of the suggestion by Thomas? Would you have time to work on implementing that or should I do that?
I like the idea and I could implement it. I won't get to it today though. So if you want to merge this earlier I'm glad if you could take over.
@mathislucka it's not urgent. no need to tackle it today. We were just unsure in our sprint planning whether you can continue working on the issue or whether somebody from the core team needs to take over. Looking forward to your implementation of the idea then and let me know if you need any support. 👍
Just a quick question for @tstadel: When you say infer from
@mathislucka
# Conflicts: # haystack/nodes/ranker/sentence_transformers.py
I think it might work that way, but I am not sure how to fix this mypy issue.
LGTM! 👍 I fixed the mypy issues and slightly adapted the tests. Let's wait for them to run through and if all goes well I'll merge afterward.
* ranker should return scores for later usage
* fix wrong tuple order
* adjust ranker scores; add tests
* Update Documentation & Code Style
* fix mypy
* Update Documentation & Code Style
* fix mypy
* Update Documentation & Code Style
* relax ranker test tolerance
* update ranker test score

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Julian Risch <[email protected]>
Proposed changes:
This adds a score to Documents coming from a Ranker.
Cross-encoders produce scores, and since a Document already has a score property, that property can safely be used to store them.
The score can be useful to show in a UI or for other purposes.
As nothing in the sorting code is changed, this should be a non-breaking change.
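As an illustration of why this is non-breaking, the sketch below keeps the sort exactly as it would be without scores and only writes the similarity value onto each document afterwards. The `Doc` class and `rerank` function are simplified stand-ins invented for this example, not Haystack's `Document` or ranker classes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Doc:
    # Simplified stand-in for a document with an optional score field.
    content: str
    score: Optional[float] = None


def rerank(docs: List[Doc], similarity_scores: List[float]) -> List[Doc]:
    """Sort docs by cross-encoder score (descending) and store the score on each doc."""
    # Pair each score with its document; the score comes first in the tuple.
    ranked = sorted(zip(similarity_scores, docs), key=lambda pair: pair[0], reverse=True)
    for score, doc in ranked:
        # The only new behaviour: keep the score for later use, e.g. in a UI.
        doc.score = score
    return [doc for _, doc in ranked]


docs = [Doc("a"), Doc("b"), Doc("c")]
reranked = rerank(docs, [0.1, 0.9, 0.5])
```

Callers that only look at the order of the returned list see the same result as before; callers that want the score can now read it from each document.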
Status (please check what you already did):
closes #2706