Feature/msmarco psg #117
This pull request introduces 3 alerts when merging 649804e into bf50423 - view on LGTM.com new alerts:
…/capreolus into feature/msmarco_psg
… mode for done file
… to finish before using the ckpt
… evaluator and msmarco benchmark accordingly, so that a benchmark can specify for itself whether it wants to include train_qids in the dev set for non-neural-net algorithms, where no training is needed
…s in qrels should be considered (2) update all the places that called eval_runs and _eval_runs, ensuring the input qrels are filtered (3) change the API of trainer.train, removing qrels and relevance_level and instead passing an eval_fn(runs) to train(), which handles the evaluation logic completely
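The API change described in the commit message above (filter qrels up front, then hand the trainer a single `eval_fn(runs)`) can be sketched roughly as follows. This is only an illustration and not the code in this PR: `filter_qrels`, `make_eval_fn`, the stand-in `train` loop, and the MRR metric are all hypothetical names and choices made for the example.

```python
from typing import Callable, Dict, Iterable, List

Qrels = Dict[str, Dict[str, int]]   # qid -> docid -> relevance label
Runs = Dict[str, Dict[str, float]]  # qid -> docid -> score


def filter_qrels(qrels: Qrels, qids: Iterable[str]) -> Qrels:
    """Keep only the judgments for the given query ids (hypothetical helper)."""
    wanted = set(qids)
    return {qid: docs for qid, docs in qrels.items() if qid in wanted}


def make_eval_fn(qrels: Qrels, relevance_level: int = 1, k: int = 10) -> Callable[[Runs], Dict[str, float]]:
    """Bind qrels and relevance_level once, so the trainer only sees eval_fn(runs)."""

    def eval_fn(runs: Runs) -> Dict[str, float]:
        rr_sum, n = 0.0, 0
        for qid, judged in qrels.items():
            if qid not in runs:
                continue  # only score queries present in both qrels and runs
            ranked = sorted(runs[qid], key=runs[qid].get, reverse=True)[:k]
            rr = 0.0
            for rank, docid in enumerate(ranked, start=1):
                if judged.get(docid, 0) >= relevance_level:
                    rr = 1.0 / rank
                    break
            rr_sum += rr
            n += 1
        return {f"mrr@{k}": rr_sum / n if n else 0.0}

    return eval_fn


def train(n_iters: int, predict: Callable[[int], Runs], eval_fn: Callable[[Runs], Dict[str, float]]) -> List[Dict[str, float]]:
    """Stand-in training loop: it never touches qrels or relevance_level directly."""
    return [eval_fn(predict(it)) for it in range(n_iters)]
```

With this shape, the caller decides which qids belong in the dev qrels (for example, whether train_qids are included) before the trainer ever runs, and the trainer stays agnostic about how evaluation is done.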
This pull request introduces 2 alerts when merging 7664df9 into 18e31b7 - view on LGTM.com new alerts:
a67740d to eaf3395
This pull request introduces 1 alert when merging 569276e into 1767d5a - view on LGTM.com new alerts:
This pull request introduces 1 alert when merging 1bbf0f2 into 1767d5a - view on LGTM.com new alerts:
looks good! I left some minor questions/comments
Creation Date : 06/12/2018
Last Modified : 1/21/2019
Authors : Daniel Campos <[email protected]>, Rutger van Haasteren <[email protected]>
"""
Actually, do we just keep the headings as they are?
This pull request introduces 1 alert when merging bbe134e into 1767d5a - view on LGTM.com new alerts:
Just sending this PR to better track the progress :) Don't worry about it now
Running MS MARCO passage while only reranking the top100 data now looks right.
Confusing stuff to solve
sampler.generate_example and sampler.get_preds_in_trec_format seem to align with each other, and the dev records are prepared in this run, so it's not because of stale cache data. Still checking what's happening here. (Again, this does not happen for the reranking top100 case.)
Features/Support to add
(maybe less urgent) handle the msmarco downloading using allenai/ir_datasets
Sidenote (about the running time of some operations)
1.1 pytrec_eval: 40 sec
1.2 (trec_eval: 4 secs)
Searcher.load_trec_run()
6.1 (3k iterations) < 2 hours
6.2 (30k iterations) 11~12 hours
7.1 (top100): several hours
7.2 (top1k) > 1.5 days
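For context on the run-loading step mentioned in the timing notes, here is a minimal sketch of parsing a TREC-format run file (`qid Q0 docid rank score tag` per line) into a nested dict. It is not the `load_trec_run` implementation from this repository; the function name `parse_trec_run` and the choice to skip malformed lines are assumptions made for the example.

```python
import io
from collections import defaultdict
from typing import Dict, Iterable


def parse_trec_run(lines: Iterable[str]) -> Dict[str, Dict[str, float]]:
    """Parse TREC run lines into qid -> docid -> score (hypothetical helper)."""
    runs = defaultdict(dict)
    for line in lines:
        parts = line.split()
        if len(parts) != 6:
            continue  # assumption: silently skip blank or malformed lines
        qid, _q0, docid, _rank, score, _tag = parts
        runs[qid][docid] = float(score)
    return dict(runs)


# Usage: any iterable of lines works, including an open file handle.
example = io.StringIO(
    "q1 Q0 d1 1 12.5 bm25\n"
    "q1 Q0 d2 2 11.0 bm25\n"
    "q2 Q0 d7 1 9.25 bm25\n"
)
runs = parse_trec_run(example)
```

For a top1k run over many queries, a single pass like this reads every line once, so its cost grows linearly with the size of the run file.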