reproducing your results on MS MARCO #3

Open
Narabzad opened this issue Sep 23, 2021 · 8 comments

@Narabzad

Hi,

Thank you for your great work!
I would like to replicate your results on the MS MARCO passage collection and I have a question about the Luyu/co-condenser-marco model. Is this the final model you used to retrieve passages, or do I need to fine-tune it on MS MARCO relevant query/passage pairs?
Could you provide a bit more detail on how I should use your dense toolkit with this model?
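
To make my question more concrete, this is roughly how I load the checkpoint at the moment with plain Transformers. The [CLS] pooling and inner-product scoring below are only my guess at how the model is meant to be used, not something taken from your code:

import torch
from transformers import AutoTokenizer, AutoModel

# Load the published checkpoint from the Hugging Face hub
tokenizer = AutoTokenizer.from_pretrained("Luyu/co-condenser-marco")
model = AutoModel.from_pretrained("Luyu/co-condenser-marco")
model.eval()

query = "what is a dense retriever"
passage = "Dense retrievers encode queries and passages into vectors and score them by inner product."

with torch.no_grad():
    # Take the [CLS] hidden state as the embedding (my assumption)
    q = model(**tokenizer(query, return_tensors="pt")).last_hidden_state[:, 0]
    p = model(**tokenizer(passage, return_tensors="pt")).last_hidden_state[:, 0]

print((q * p).sum().item())  # inner-product relevance score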

Thank you in advance!

@luyug
Owner

luyug commented Sep 24, 2021

Hello,

Please take a look at the coCondenser fine-tuning tutorial. It should answer most of your questions.

We can leave this issue open for now in case you run into other problems.

@Narabzad
Author

Thank you for the great tutorial!
One issue I found: --passage_reps corpus/corpus/'*.pt' should be --passage_reps encoding/corpus/'*.pt' in the index-search step here: https://github.com/texttron/tevatron/tree/main/examples/coCondenser-marco#index-search

@luyug
Owner

luyug commented Oct 1, 2021

Thanks for catching that!

@Narabzad
Author

Hi,

I was able to replicate the MRR@10 you reported in the paper (0.38), but I was wondering what explains the difference between that number and the one reported on the leaderboard (0.44). How do I replicate the leaderboard result? Is it measured on a different set?

@shunyuzh

Hi, @luyug

Thanks for your awesome work.
I have a similar question about NQ. Is it possible to give more details on reproducing the NQ results in the paper (MRR@5 = 84.3), along the lines of the detailed MS MARCO tutorial?

Or, if that will take some time, could you tell me whether your SOTA model on NQ is trained with mined hard negatives only, or with both BM25 hard negatives and mined hard negatives, as in the DPR GitHub repo?

Thanks.

@Yuan0320

Yuan0320 commented Nov 4, 2022

Hi @luyug,

Thanks for your great work! I am also confused about the difference between the reported result and the leaderboard number (0.38 vs. 0.44). Is there any update on this?

@cadurosar

Also interested. From what I remember, the main difference is that a reranker is applied on top of the retriever. Would it be possible to get the checkpoint of the reranker?

@caiyinqiong

caiyinqiong commented Apr 7, 2023

Hi,
Thank you for your great work!
I ran into some issues while trying to reproduce the results on the MARCO passage collection. I have followed the aforementioned tutorial, but still cannot resolve them (the problem seems to be in the hard negative mining step).

First, I ran Fine-tuning Stage 1 with:

CUDA_VISIBLE_DEVICES=3 python -m tevatron.driver.train \
  --output_dir model_msmarco_s1 \
  --model_name_or_path ../data/co-condenser-marco \
  --save_steps 20000 \
  --train_dir ../data/msmarco-passage/train_dir \
  --data_cache_dir ../data/msmarco-passage-train-cache \
  --fp16 \
  --dataloader_num_workers 2 \
  --per_device_train_batch_size 8 \
  --train_n_passages 8 \
  --learning_rate 5e-6 \
  --q_max_len 16 \
  --p_max_len 128 \
  --num_train_epochs 3 \
  --logging_steps 500

and got MRR@10 = 0.3596, R@1000 = 0.9771 (your reported results are MRR@10 = 0.357, R@1000 = 0.978).
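
For reference, I computed these metrics with a small script along the following lines. The run and qrels file formats here reflect my own setup (tab-separated "qid pid score" rankings and MS MARCO-style qrels), so please read it only as a sketch:

# Sketch: compute MRR@10 and R@1000 from a ranking file and qrels.
from collections import defaultdict

def load_qrels(path):
    # Assumed MS MARCO-style qrels lines: "qid 0 pid label"
    rel = defaultdict(set)
    with open(path) as f:
        for line in f:
            qid, _, pid, label = line.split()
            if int(label) > 0:
                rel[qid].add(pid)
    return rel

def load_run(path):
    # Assumed ranking lines: "qid pid score"; sort per query by descending score
    run = defaultdict(list)
    with open(path) as f:
        for line in f:
            qid, pid, score = line.split()
            run[qid].append((float(score), pid))
    return {q: [pid for _, pid in sorted(cands, reverse=True)] for q, cands in run.items()}

def evaluate(run, rel):
    mrr = recall = 0.0
    for qid, pids in run.items():
        # MRR@10: reciprocal rank of the first relevant passage within the top 10
        mrr += next((1.0 / (r + 1) for r, pid in enumerate(pids[:10]) if pid in rel[qid]), 0.0)
        # R@1000: fraction of this query's relevant passages found in the top 1000
        hits = sum(1 for pid in pids[:1000] if pid in rel[qid])
        recall += hits / max(len(rel[qid]), 1)
    return mrr / len(run), recall / len(run)

rel = load_qrels("qrels.dev.small.tsv")  # hypothetical path
run = load_run("dev.rank.tsv")           # hypothetical path
mrr10, r1000 = evaluate(run, rel)
print(f"MRR@10={mrr10:.4f}  R@1000={r1000:.4f}")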

Then, I mined hard negatives by randomly sampling 30 negatives from the top-200 retrieval results of model_msmarco_s1, using a modified scripts/hn_mining.py (following the parameters in build_train_hn.py), as sketched below.
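
To be concrete about that step, this is essentially what my modified script does. It is only a sketch of the sampling logic; the file format and paths are from my setup, not the original scripts/hn_mining.py:

# Sketch of my hard-negative mining: for each training query, randomly sample
# 30 negatives from the top-200 passages retrieved by model_msmarco_s1,
# skipping passages that are known positives for that query.
import random
from collections import defaultdict

DEPTH = 200        # consider only the top-200 retrieved passages per query
N_NEGATIVES = 30   # sample 30 hard negatives per query

def mine_hard_negatives(run_path, qrels_path, seed=42):
    random.seed(seed)

    positives = defaultdict(set)
    with open(qrels_path) as f:            # assumed qrels lines: "qid 0 pid label"
        for line in f:
            qid, _, pid, label = line.split()
            if int(label) > 0:
                positives[qid].add(pid)

    retrieved = defaultdict(list)
    with open(run_path) as f:              # assumed ranking lines: "qid pid score",
        for line in f:                     # already sorted by score within each query
            qid, pid, _ = line.split()
            if len(retrieved[qid]) < DEPTH:
                retrieved[qid].append(pid)

    hard_negatives = {}
    for qid, pids in retrieved.items():
        candidates = [pid for pid in pids if pid not in positives[qid]]
        hard_negatives[qid] = random.sample(candidates, min(N_NEGATIVES, len(candidates)))
    return hard_negatives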

Next, I ran Fine-tuning Stage 2 with:

CUDA_VISIBLE_DEVICES=3 python -m tevatron.driver.train \
  --output_dir model_msmarco_s2 \
  --model_name_or_path ../data/co-condenser-marco \
  --save_steps 20000 \
  --train_dir ../data/msmarco-passage/tain_dir_hn_dr_cocondenser200 \
  --data_cache_dir ../data/msmarco-passage-tain_hn_dr_cocondenser200-cache \
  --fp16 \
  --dataloader_num_workers 2 \
  --per_device_train_batch_size 8 \
  --train_n_passages 8 \
  --learning_rate 5e-6 \
  --q_max_len 16 \
  --p_max_len 128 \
  --num_train_epochs 2 \
  --logging_steps 500

and got MRR@10 = 0.3657, R@1000 = 0.9761 (your reported results are MRR@10 = 0.382, R@1000 = 0.984).

There are several points I would like to confirm:

  1. Is the training data for Fine-tuning Stage 2 only the mined hard negatives, i.e. not concatenated with the BM25 negatives?
  2. Are the initial parameters taken from co-condenser-marco, rather than from the model_msmarco_s1 checkpoint?
  3. What are the settings of per_device_train_batch_size, train_n_passages, learning_rate, and num_train_epochs in Fine-tuning Stage 2?

Thank you in advance!
