Run files of Vanilla BERT checkpoints do not match test folds in data/robust #21

krasserm · 2020-04-09T07:02:29Z

First of all, thanks a lot for your interesting work on CEDR and for the code in this repository.

I downloaded the Vanilla BERT and CEDR-KNRM checkpoints from #18 and checked the query ids in the .run files contained in the downloaded archive. While the sets of query ids in cedrknrm-robust-f[1-5].run match those in data/robust/f[1-5].test.run, the sets of query ids in vbert-robust-f[1-5].run do not match those in data/robust/f[1-5].test.run (e.g. the set of query ids in vbert-robust-f1.run is different from the set of query ids in data/robust/f1.test.run, and also cedrknrm-robust-f1.run).

Why are the folds for Vanilla BERT and CEDR-KNRM different? On which folds have the Vanilla BERT checkpoints been trained/validated? Given that the test folds of the Vanilla BERT and CEDR-KNRM checkpoints are different, I assume that the provided Vanilla BERT checkpoints have not been used as initial weights for obtaining the provided CEDR-KNRM checkpoints. Is this assumption correct? If yes, which Vanilla BERT checkpoints have been used to initialize CEDR-KNRM training? Do you mind sharing these checkpoints too?

I'm currently investigating issues reproducing the results published in the paper. More on that in a separate ticket ...


krasserm · 2020-04-09T07:19:29Z

To be more precise regarding

"e.g. the set of query ids in vbert-robust-f1.run is different from the set of query ids in data/robust/f1.test.run, and also cedrknrm-robust-f1.run"

the number of common query ids in vbert-robust-f1.run and data/robust/f{x}.test.run for x = 1..5 is:

x = 1: 7
x = 2: 13
x = 3: 12
x = 4: 7
x = 5: 11
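Overlap counts like the ones above can be reproduced with a short script. The following is a minimal sketch, not the script actually used in this thread. It assumes the .run files follow the usual whitespace-separated TREC run layout, where the first column of each line is the query id. The function names `read_query_ids` and `fold_overlaps` are hypothetical, and the file locations depend on where the downloaded archive was unpacked.

```python
from pathlib import Path


def read_query_ids(run_path):
    """Return the set of query ids found in a run file.

    Assumes one result per line with the query id in the first
    whitespace-separated column (the usual TREC run layout).
    """
    qids = set()
    with open(run_path, encoding="utf-8") as f:
        for line in f:
            cols = line.split()
            if cols:  # skip blank lines
                qids.add(cols[0])
    return qids


def fold_overlaps(run_path, fold_dir, folds=range(1, 6)):
    """Count the query ids that run_path shares with each f{x}.test.run in fold_dir."""
    run_qids = read_query_ids(run_path)
    return {
        x: len(run_qids & read_query_ids(Path(fold_dir) / f"f{x}.test.run"))
        for x in folds
    }
```

Calling `fold_overlaps("vbert-robust-f1.run", "data/robust")` returns a dict that maps each x in 1..5 to the number of shared query ids. If a run file matched a test fold exactly, its count for that fold would equal the size of both query id sets and the counts for the other folds would be 0, provided the folds are disjoint.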

seanmacavaney · 2020-04-09T13:09:57Z

Hi Marin,

Thanks for pointing out this inconsistency! I suspect that it can be explained by a mismatch between the original code used for running the experiments (which the vbert-robust-f{x}.run files reflect) and the simplified example we released here. Specifically, I'm thinking it may have been a problem with the code that exported the data/robust/f{x}.test.run files from the original source. But I'll need to spend some time digging into exactly what happened.

sean

krasserm mentioned this issue Apr 10, 2020

Difficulties to reproduce results on Robust 04 #22

