-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Speed up reader tests #3476
Conversation
d0a6c50
to
f0f17c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just left one question out of interest.
assert metrics_top_2["Retriever"]["mrr"] == 0.5 | ||
assert metrics_top_2["Retriever"]["map"] == 0.5 | ||
assert metrics_top_2["Retriever"]["recall_multi_hit"] == 0.5 | ||
assert metrics_top_2["Retriever"]["recall_single_hit"] == 0.5 | ||
assert metrics_top_2["Retriever"]["precision"] == 0.1 | ||
assert metrics_top_2["Retriever"]["ndcg"] == 0.5 | ||
|
||
metrics_top_3 = eval_result.calculate_metrics(simulated_top_k_reader=3, document_scope="document_id") | ||
metrics_top_5 = eval_result.calculate_metrics(simulated_top_k_reader=5, document_scope="document_id") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any specific reason why you changed simulated_top_k_reader
here to 5?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. The metrics only changed and became better when changing it to 5 here. Keeping it at 3 would have required the metrics to be changed to match the ones when using the simulated_top_k_reader
of 1 and 2.
Related Issues
Proposed Changes:
Changed the reader fixture in
conftest.py
to use a smaller reader (500mb to ~165mb) model that still produces sensible results. This will help to speed up tests and slightly reduce the disk space usage for downloading models.How did you test it?
Updated the existing unit tests to work with the new smaller reader. The only real changes needed to happen when comparing metrics of the reader performance. They changed slightly which is to be expected.
Notes for the reviewer
Checklist