transfer to msmarco document dataset #22

Open
Berlin-98 opened this issue Dec 20, 2022 · 0 comments
Hi~
I am using this repo to run experiments on the MS MARCO document dataset, but I am a little confused about the differences between the Condenser, tevatron, and coCondenser repos. I am following the "coCondenser MS-MARCO Passage Retrieval" guide and trying to swap the data for the MS MARCO document dataset and the checkpoint for Condenser.

If I only want to reproduce the results of the coCondenser paper, I just need to encode and then run Index Search. Is that right?

If I want to switch to the MS MARCO document data and the Condenser checkpoint, do I need to follow fine-tuning stages one and two? That is: first fine-tune a checkpoint and save it to retriever_model_s1/, then use that trained checkpoint to mine hard negatives, then use the hard negatives to fine-tune the model further and save it to retriever_model_s2, and finally run search on the dev set. Is that right?
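
To make the two scenarios concrete, here is a minimal sketch of the step ordering as I understand it. The stage labels and the `build_plan` helper are hypothetical names used only for illustration; they are not commands or APIs from any of the repos. Only the directory names retriever_model_s1/ and retriever_model_s2 come from the guide, and "existing fine-tuned checkpoint" is my assumption about what the reproduce-only path starts from.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    name: str    # hypothetical label, not a repo command
    reads: str   # what the step consumes
    writes: str  # what the step produces


def build_plan(finetune: bool) -> List[Stage]:
    """Return the ordered steps for the two scenarios in the question."""
    plan: List[Stage] = []
    if finetune:
        # New data and/or a new starting checkpoint: two-stage fine-tuning.
        plan += [
            Stage("finetune_stage1", "Condenser checkpoint", "retriever_model_s1/"),
            Stage("mine_hard_negatives", "retriever_model_s1/", "hard negatives"),
            Stage("finetune_stage2", "hard negatives", "retriever_model_s2"),
        ]
        model = "retriever_model_s2"
    else:
        # Reproduce-only path: assumes a fine-tuned checkpoint already exists.
        model = "existing fine-tuned checkpoint"
    # Both scenarios end with encoding and index search.
    plan += [
        Stage("encode", model, "embeddings"),
        Stage("index_search", "embeddings", "dev set rankings"),
    ]
    return plan


for stage in build_plan(finetune=True):
    print(f"{stage.name}: {stage.reads} -> {stage.writes}")
```

Calling `build_plan(finetune=False)` gives only the last two steps, which is the shorter path I am asking about in the first question.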
