Update MS MARCO V2 regressions to add RM3 and Rocchio in missing cases
lintool authored Sep 5, 2022
1 parent 5b8e292 commit 7a18f14
Showing 49 changed files with 1,058 additions and 808 deletions.
34 changes: 23 additions & 11 deletions docs/regressions-dl21-doc-d2q-t5.md
@@ -62,6 +62,13 @@ target/appassembler/bin/SearchCollection \
-topicreader TsvInt \
-output runs/run.msmarco-v2-doc-d2q-t5.bm25-default+rm3.topics.dl21.txt \
-hits 1000 -bm25 -rm3 &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.msmarco-v2-doc-d2q-t5/ \
-topics src/main/resources/topics-and-qrels/topics.dl21.txt \
-topicreader TsvInt \
-output runs/run.msmarco-v2-doc-d2q-t5.bm25-default+rocchio.topics.dl21.txt \
-hits 1000 -bm25 -rocchio &
```
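Both feedback runs expand the query from the top-ranked documents. The Rocchio step can be sketched as follows, using toy term vectors and a hypothetical helper (an illustrative sketch of the textbook update, not Anserini's implementation):

```python
# Rocchio pseudo-relevance feedback sketch: blend the original query
# vector with the centroid of the top-k retrieved ("feedback") documents.
# q' = alpha * q + beta * mean(feedback docs); term vectors are dicts.

def rocchio_expand(query_vec, feedback_vecs, alpha=1.0, beta=0.75):
    """Return the expanded query vector as a {term: weight} dict."""
    expanded = {term: alpha * w for term, w in query_vec.items()}
    n = len(feedback_vecs)
    for doc in feedback_vecs:
        for term, w in doc.items():
            expanded[term] = expanded.get(term, 0.0) + beta * w / n
    return expanded

q = {"hybrid": 1.0, "car": 1.0}                       # toy query vector
docs = [{"hybrid": 2.0, "battery": 1.0},              # toy feedback docs
        {"car": 1.0, "battery": 3.0}]
print(rocchio_expand(q, docs))
```

RM3 follows the same pattern but interpolates a relevance-model term distribution instead of a vector centroid.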

Evaluation can be performed using `trec_eval`:
@@ -76,23 +83,28 @@ tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-
tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-d2q-t5.bm25-default+rm3.topics.dl21.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-d2q-t5.bm25-default+rm3.topics.dl21.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m recip_rank -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-d2q-t5.bm25-default+rm3.topics.dl21.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-d2q-t5.bm25-default+rocchio.topics.dl21.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-d2q-t5.bm25-default+rocchio.topics.dl21.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-d2q-t5.bm25-default+rocchio.topics.dl21.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m recip_rank -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-d2q-t5.bm25-default+rocchio.topics.dl21.txt
```
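The `-M 100 -m recip_rank` invocation above reports MRR@100: the run is truncated to 100 hits per topic before taking the reciprocal rank of the first relevant document. A minimal sketch of that metric (illustrative only, not `trec_eval` itself; assumes any positive qrels grade counts as relevant):

```python
# Mean reciprocal rank with a rank cutoff, as in trec_eval's
# "-M 100 -m recip_rank". Topics with no relevant doc in the top k
# contribute 0 to the mean.

def mrr_at_k(run, qrels, k=100):
    """run: {qid: ranked [docid]}, qrels: {qid: {docid: grade}}."""
    total = 0.0
    for qid, ranking in run.items():
        rel = qrels.get(qid, {})
        for rank, docid in enumerate(ranking[:k], start=1):
            if rel.get(docid, 0) > 0:
                total += 1.0 / rank
                break
    return total / len(run)

run = {"1": ["d3", "d7", "d1"], "2": ["d9", "d2"]}    # toy rankings
qrels = {"1": {"d7": 1}, "2": {"d2": 2}}              # toy judgments
print(mrr_at_k(run, qrels))  # (1/2 + 1/2) / 2 = 0.5
```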

## Effectiveness

With the above commands, you should be able to reproduce the following results:

| **MAP@100** | **BM25 (default)**| **+RM3** | **+Rocchio**|
|:-------------------------------------------------------------------------------------------------------------|-----------|-----------|-----------|
| [DL21 (Doc)](https://microsoft.github.io/msmarco/TREC-Deep-Learning) | 0.2387 | 0.2608 | 0.2610 |
| **MRR@100** | **BM25 (default)**| **+RM3** | **+Rocchio**|
| [DL21 (Doc)](https://microsoft.github.io/msmarco/TREC-Deep-Learning) | 0.8866 | 0.8342 | 0.8459 |
| **nDCG@10** | **BM25 (default)**| **+RM3** | **+Rocchio**|
| [DL21 (Doc)](https://microsoft.github.io/msmarco/TREC-Deep-Learning) | 0.5792 | 0.5392 | 0.5509 |
| **R@100** | **BM25 (default)**| **+RM3** | **+Rocchio**|
| [DL21 (Doc)](https://microsoft.github.io/msmarco/TREC-Deep-Learning) | 0.3443 | 0.3580 | 0.3616 |
| **R@1000** | **BM25 (default)**| **+RM3** | **+Rocchio**|
| [DL21 (Doc)](https://microsoft.github.io/msmarco/TREC-Deep-Learning) | 0.7066 | 0.7572 | 0.7583 |

Some of these regressions correspond to official TREC 2021 Deep Learning Track "baseline" submissions:

34 changes: 23 additions & 11 deletions docs/regressions-dl21-doc-segmented-d2q-t5.md
@@ -62,6 +62,13 @@ target/appassembler/bin/SearchCollection \
-topicreader TsvInt \
-output runs/run.msmarco-v2-doc-segmented-d2q-t5.bm25-default+rm3.topics.dl21.txt \
-hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 -bm25 -rm3 &
target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.msmarco-v2-doc-segmented-d2q-t5/ \
-topics src/main/resources/topics-and-qrels/topics.dl21.txt \
-topicreader TsvInt \
-output runs/run.msmarco-v2-doc-segmented-d2q-t5.bm25-default+rocchio.topics.dl21.txt \
-hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 -bm25 -rocchio &
```
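The `-selectMaxPassage` options implement MaxP-style aggregation: 10000 segment hits are retrieved, segment ids are split on the `#` delimiter to recover document ids, and each document is scored by its best segment, keeping the top 1000. A sketch under that reading (a hypothetical helper, not Anserini's code):

```python
# MaxP aggregation sketch: collapse ranked segment hits ("docid#segment")
# to document hits, scoring each document by its highest-scoring segment.

def select_max_passage(segment_hits, delimiter="#", k=1000):
    """segment_hits: ranked [(segment_id, score)]; returns top-k (docid, score)."""
    best = {}
    for seg_id, score in segment_hits:
        docid = seg_id.split(delimiter, 1)[0]
        if docid not in best or score > best[docid]:
            best[docid] = score
    return sorted(best.items(), key=lambda x: -x[1])[:k]

hits = [("D1#0", 9.2), ("D2#3", 8.7), ("D1#5", 8.1)]  # toy segment run
print(select_max_passage(hits, k=2))  # [('D1', 9.2), ('D2', 8.7)]
```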

Evaluation can be performed using `trec_eval`:
@@ -76,23 +83,28 @@ tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-
tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-d2q-t5.bm25-default+rm3.topics.dl21.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-d2q-t5.bm25-default+rm3.topics.dl21.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m recip_rank -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-d2q-t5.bm25-default+rm3.topics.dl21.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-d2q-t5.bm25-default+rocchio.topics.dl21.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-d2q-t5.bm25-default+rocchio.topics.dl21.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-d2q-t5.bm25-default+rocchio.topics.dl21.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m recip_rank -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-d2q-t5.bm25-default+rocchio.topics.dl21.txt
```
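The `-m recall.100` and `-m recall.1000` invocations compute recall at fixed cutoffs. A minimal sketch (illustrative only; assumes any qrels grade above zero counts as relevant, which is `trec_eval`'s default):

```python
# Recall@k sketch, as in trec_eval's "-m recall.1000": per topic, the
# fraction of relevant documents retrieved in the top k, macro-averaged.

def recall_at_k(run, qrels, k=1000):
    """run: {qid: ranked [docid]}, qrels: {qid: {docid: grade}}."""
    total = 0.0
    for qid, ranking in run.items():
        relevant = {d for d, g in qrels.get(qid, {}).items() if g > 0}
        if not relevant:
            continue
        total += len(set(ranking[:k]) & relevant) / len(relevant)
    return total / len(run)

run = {"1": ["d1", "d2", "d3"], "2": ["d9"]}           # toy rankings
qrels = {"1": {"d2": 1, "d5": 2}, "2": {"d9": 3}}      # toy judgments
print(recall_at_k(run, qrels, k=3))  # (1/2 + 1/1) / 2 = 0.75
```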

## Effectiveness

With the above commands, you should be able to reproduce the following results:

| **MAP@100** | **BM25 (default)**| **+RM3** | **+Rocchio**|
|:-------------------------------------------------------------------------------------------------------------|-----------|-----------|-----------|
| [DL21 (Doc)](https://microsoft.github.io/msmarco/TREC-Deep-Learning) | 0.2683 | 0.3192 | 0.3218 |
| **MRR@100** | **BM25 (default)**| **+RM3** | **+Rocchio**|
| [DL21 (Doc)](https://microsoft.github.io/msmarco/TREC-Deep-Learning) | 0.9454 | 0.8960 | 0.9049 |
| **nDCG@10** | **BM25 (default)**| **+RM3** | **+Rocchio**|
| [DL21 (Doc)](https://microsoft.github.io/msmarco/TREC-Deep-Learning) | 0.6289 | 0.6555 | 0.6462 |
| **R@100** | **BM25 (default)**| **+RM3** | **+Rocchio**|
| [DL21 (Doc)](https://microsoft.github.io/msmarco/TREC-Deep-Learning) | 0.3656 | 0.4119 | 0.4172 |
| **R@1000** | **BM25 (default)**| **+RM3** | **+Rocchio**|
| [DL21 (Doc)](https://microsoft.github.io/msmarco/TREC-Deep-Learning) | 0.7202 | 0.7941 | 0.7969 |

Some of these regressions correspond to official TREC 2021 Deep Learning Track "baseline" submissions:

48 changes: 36 additions & 12 deletions docs/regressions-dl21-doc-segmented-unicoil-0shot-v2.md
@@ -71,7 +71,7 @@ target/appassembler/bin/IndexCollection \
-input /path/to/msmarco-v2-doc-segmented-unicoil-0shot-v2 \
-index indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-0shot-v2/ \
-generator DefaultLuceneDocumentGenerator \
-threads 18 -impact -pretokenized -storeDocvectors \
>& logs/log.msmarco-v2-doc-segmented-unicoil-0shot-v2 &
```

@@ -97,6 +97,20 @@ target/appassembler/bin/SearchCollection \
-topicreader TsvInt \
-output runs/run.msmarco-v2-doc-segmented-unicoil-0shot-v2.unicoil-0shot.topics.dl21.unicoil.0shot.txt \
-hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 -impact -pretokenized &

target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-0shot-v2/ \
-topics src/main/resources/topics-and-qrels/topics.dl21.unicoil.0shot.tsv.gz \
-topicreader TsvInt \
-output runs/run.msmarco-v2-doc-segmented-unicoil-0shot-v2.unicoil-0shot+rm3.topics.dl21.unicoil.0shot.txt \
-hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 -impact -pretokenized -rm3 &

target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-0shot-v2/ \
-topics src/main/resources/topics-and-qrels/topics.dl21.unicoil.0shot.tsv.gz \
-topicreader TsvInt \
-output runs/run.msmarco-v2-doc-segmented-unicoil-0shot-v2.unicoil-0shot+rocchio.topics.dl21.unicoil.0shot.txt \
-hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 -impact -pretokenized -rocchio &
```
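With `-impact -pretokenized`, ranking reduces to summing products of quantized query and document term weights, with no document-length normalization. A sketch with made-up integer impacts (illustrative only, not Anserini's implementation):

```python
# Impact-based scoring sketch for a pretokenized uniCOIL-style query:
# score(q, d) = sum over shared terms of query_impact * doc_impact.

def impact_score(query_weights, doc_weights):
    """Dot product of sparse {term: weight} vectors over shared terms."""
    return sum(w * doc_weights.get(term, 0) for term, w in query_weights.items())

query = {"hybrid": 3, "car": 2}                        # toy quantized impacts
docs = {"d1": {"hybrid": 5, "engine": 1},
        "d2": {"car": 4, "hybrid": 1}}
scores = {d: impact_score(query, v) for d, v in docs.items()}
print(sorted(scores.items(), key=lambda x: -x[1]))  # [('d1', 15), ('d2', 11)]
```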

Evaluation can be performed using `trec_eval`:
@@ -106,23 +120,33 @@ tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-
tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-unicoil-0shot-v2.unicoil-0shot.topics.dl21.unicoil.0shot.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-unicoil-0shot-v2.unicoil-0shot.topics.dl21.unicoil.0shot.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m recip_rank -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-unicoil-0shot-v2.unicoil-0shot.topics.dl21.unicoil.0shot.txt

tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-unicoil-0shot-v2.unicoil-0shot+rm3.topics.dl21.unicoil.0shot.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-unicoil-0shot-v2.unicoil-0shot+rm3.topics.dl21.unicoil.0shot.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-unicoil-0shot-v2.unicoil-0shot+rm3.topics.dl21.unicoil.0shot.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m recip_rank -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-unicoil-0shot-v2.unicoil-0shot+rm3.topics.dl21.unicoil.0shot.txt

tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m map src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-unicoil-0shot-v2.unicoil-0shot+rocchio.topics.dl21.unicoil.0shot.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-unicoil-0shot-v2.unicoil-0shot+rocchio.topics.dl21.unicoil.0shot.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-unicoil-0shot-v2.unicoil-0shot+rocchio.topics.dl21.unicoil.0shot.txt
tools/eval/trec_eval.9.0.4/trec_eval -c -M 100 -m recip_rank -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl21-doc.txt runs/run.msmarco-v2-doc-segmented-unicoil-0shot-v2.unicoil-0shot+rocchio.topics.dl21.unicoil.0shot.txt
```

## Effectiveness

With the above commands, you should be able to reproduce the following results:

| **MAP@100** | **uniCOIL (with doc2query-T5) zero-shot**| **+RM3** | **+Rocchio**|
|:-------------------------------------------------------------------------------------------------------------|-----------|-----------|-----------|
| [DL21 (Doc)](https://microsoft.github.io/msmarco/TREC-Deep-Learning) | 0.2718 | 0.3297 | 0.3434 |
| **MRR@100** | **uniCOIL (with doc2query-T5) zero-shot**| **+RM3** | **+Rocchio**|
| [DL21 (Doc)](https://microsoft.github.io/msmarco/TREC-Deep-Learning) | 0.9684 | 0.9357 | 0.9649 |
| **nDCG@10** | **uniCOIL (with doc2query-T5) zero-shot**| **+RM3** | **+Rocchio**|
| [DL21 (Doc)](https://microsoft.github.io/msmarco/TREC-Deep-Learning) | 0.6783 | 0.6979 | 0.7061 |
| **R@100** | **uniCOIL (with doc2query-T5) zero-shot**| **+RM3** | **+Rocchio**|
| [DL21 (Doc)](https://microsoft.github.io/msmarco/TREC-Deep-Learning) | 0.3700 | 0.4237 | 0.4374 |
| **R@1000** | **uniCOIL (with doc2query-T5) zero-shot**| **+RM3** | **+Rocchio**|
| [DL21 (Doc)](https://microsoft.github.io/msmarco/TREC-Deep-Learning) | 0.7069 | 0.7608 | 0.7809 |

This run roughly corresponds to run `p_unicoil0` submitted to the TREC 2021 Deep Learning Track under the "baseline" group.
The difference is that here we are using pre-encoded queries, whereas the official submission performed query encoding on the fly.