Run files of Vanilla BERT checkpoints do not match test folds in data/robust #21

Open
krasserm opened this issue Apr 9, 2020 · 2 comments

Comments

krasserm commented Apr 9, 2020

First of all, thanks a lot for your interesting work on CEDR and for the code in this repository.

I downloaded the Vanilla BERT and CEDR-KNRM checkpoints from #18 and checked the query ids in the .run files contained in the downloaded archive. The sets of query ids in cedrknrm-robust-f[1-5].run match those in data/robust/f[1-5].test.run, but the sets of query ids in vbert-robust-f[1-5].run do not (e.g. the set of query ids in vbert-robust-f1.run differs from the set in data/robust/f1.test.run, and also from the set in cedrknrm-robust-f1.run).
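A check of this kind can be sketched in a few lines of Python. This is only an illustration, not the script used for the report: it assumes the .run files are whitespace-separated, TREC-style run files with the query id in the first column, and the helper names `read_query_ids` and `compare_runs` are hypothetical.

```python
def read_query_ids(run_path):
    """Collect the distinct query ids from a run file.

    Assumption: whitespace-separated columns, query id in the first column.
    """
    qids = set()
    with open(run_path, encoding="utf-8") as f:
        for line in f:
            cols = line.split()
            if cols:  # skip blank lines
                qids.add(cols[0])
    return qids


def compare_runs(path_a, path_b):
    """Return (shared, only_in_a, only_in_b) as sets of query ids."""
    a = read_query_ids(path_a)
    b = read_query_ids(path_b)
    return a & b, a - b, b - a
```

For example, `compare_runs("vbert-robust-f1.run", "data/robust/f1.test.run")` returns the shared ids and the ids unique to each file; two files cover the same fold only when both difference sets are empty.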

Why are the folds for Vanilla BERT and CEDR-KNRM different? On which folds have the Vanilla BERT checkpoints been trained/validated? Given that the test folds of the Vanilla BERT and CEDR-KNRM checkpoints are different, I assume that the provided Vanilla BERT checkpoints have not been used as initial weights for obtaining the provided CEDR-KNRM checkpoints. Is this assumption correct? If so, which Vanilla BERT checkpoints have been used to initialize CEDR-KNRM training? Would you mind sharing these checkpoints too?

I'm currently investigating issues with reproducing the results published in the paper. More on that in a separate ticket.

krasserm commented Apr 9, 2020

To be more precise regarding

e.g. the set of query ids in vbert-robust-f1.run is different from the set of query ids in data/robust/f1.test.run, and also cedrknrm-robust-f1.run

the number of common query ids in vbert-robust-f1.run and data/robust/f{x}.test.run for x = 1..5 is:

  • x = 1: 7
  • x = 2: 13
  • x = 3: 12
  • x = 4: 7
  • x = 5: 11
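The per-fold counts above sum to 50, and the shared ids are spread across all five folds rather than concentrated in one. A minimal sketch of how such counts could be computed follows; it assumes whitespace-separated run files with the query id in the first column, and `query_ids` and `fold_overlaps` are hypothetical helper names, not part of the repository.

```python
def query_ids(run_path):
    """Distinct query ids in a run file (assumed to be the first column)."""
    with open(run_path, encoding="utf-8") as f:
        return {line.split()[0] for line in f if line.strip()}


def fold_overlaps(run_path, fold_paths):
    """Map each fold label to the number of query ids it shares with run_path."""
    run_qids = query_ids(run_path)
    return {
        label: len(run_qids & query_ids(path))
        for label, path in fold_paths.items()
    }
```

A call such as `fold_overlaps("vbert-robust-f1.run", {x: f"data/robust/f{x}.test.run" for x in range(1, 6)})` yields one count per fold. If the two fold assignments agreed, one fold would share all of its query ids with the run file and the others none.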

@seanmacavaney
Contributor

Hi Marin,

Thanks for pointing out this inconsistency! I suspect that it can be explained by a mismatch between the original code used for running the experiments (which is reflected in the vbert-robust-f[1-5].run files) and the simplified example we released here. Specifically, I think the problem may lie in the code that exported the data/robust/f{x}.test.run files from the original source. But I'll need to spend some time digging into exactly what happened.

- sean
