
Commit

Update experiments-msmarco-doc.md (#1434)
Minor tweaks: make naming convention consistent with the MS MARCO passage guide.
lintool authored Dec 3, 2020
1 parent 9bf92fa commit 3cdedcc
Showing 1 changed file with 5 additions and 6 deletions.
11 changes: 5 additions & 6 deletions docs/experiments-msmarco-doc.md
@@ -23,10 +23,9 @@ There's no need to uncompress the file, as Anserini can directly index gzipped f
Build the index with the following command:

```
-nohup sh target/appassembler/bin/IndexCollection -collection CleanTrecCollection \
- -generator DefaultLuceneDocumentGenerator -threads 1 -input collections/msmarco-doc \
- -index indexes/msmarco-doc/lucene-index.msmarco-doc.pos+docvectors+rawdocs \
- -storePositions -storeDocvectors -storeRaw >& logs/log.msmarco-doc.pos+docvectors+rawdocs &
+sh target/appassembler/bin/IndexCollection -threads 1 -collection CleanTrecCollection \
+ -generator DefaultLuceneDocumentGenerator -input collections/msmarco-doc \
+ -index indexes/msmarco-doc/lucene-index-msmarco -storePositions -storeDocvectors -storeRaw
```

On a modern desktop with an SSD, indexing takes around 40 minutes.
@@ -40,7 +39,7 @@ The dev queries are already stored in our repo:

```
target/appassembler/bin/SearchCollection -topicreader TsvInt \
- -index indexes/msmarco-doc/lucene-index.msmarco-doc.pos+docvectors+rawdocs \
+ -index indexes/msmarco-doc/lucene-index-msmarco \
-topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt \
-output runs/run.msmarco-doc.dev.bm25.txt -bm25
```
@@ -93,7 +92,7 @@ To perform a run with these parameters, issue the following command:

```
target/appassembler/bin/SearchCollection -topicreader TsvString \
- -index indexes/msmarco-doc/lucene-index.msmarco-doc.pos+docvectors+rawdocs \
+ -index indexes/msmarco-doc/lucene-index-msmarco \
-topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt \
-output runs/run.msmarco-doc.dev.bm25.tuned.txt -bm25 -bm25.k1 3.44 -bm25.b 0.87
```
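
Since this change renames the index path in three separate commands, one way to keep them from drifting apart again is to hold the path in a single shell variable. The following is only a minimal sketch, not part of the commit: the `INDEX` and `TOPICS` variables and the `search_cmd` helper are hypothetical names, and the flags are copied unchanged from the tuned BM25 command in the guide. It prints the command rather than running it, so it can be inspected without a built Anserini checkout.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): keep the renamed index path
# in one place so the indexing and search commands stay consistent.
INDEX=indexes/msmarco-doc/lucene-index-msmarco
TOPICS=src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt

# Print the tuned BM25 search command; flags are copied from the guide.
search_cmd() {
  printf '%s ' target/appassembler/bin/SearchCollection -topicreader TsvString \
    -index "$INDEX" -topics "$TOPICS" \
    -output runs/run.msmarco-doc.dev.bm25.tuned.txt \
    -bm25 -bm25.k1 3.44 -bm25.b 0.87
  printf '\n'
}

search_cmd
```

Changing `INDEX` once would then update every command built from it, which is the same consistency this commit restores by hand.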
