diff --git a/README.md b/README.md
index 8b7be2c46d..0b56c7d70c 100644
--- a/README.md
+++ b/README.md
@@ -54,16 +54,16 @@ For the most part, these runs are based on [_default_ parameter settings](https:
   + Bag-of-words models: [baselines](docs/regressions-msmarco-passage.md), [doc2query](docs/regressions-msmarco-passage-doc2query.md), [doc2query-T5](docs/regressions-msmarco-passage-docTTTTTquery.md)
   + Sparse learned models: [DeepImpact](docs/regressions-msmarco-passage-deepimpact.md), [uniCOIL with doc2query-T5](docs/regressions-msmarco-passage-unicoil.md), [uniCOIL with TILDE](docs/regressions-msmarco-passage-unicoil-tilde-expansion.md), [SPLADEv2](docs/regressions-msmarco-passage-distill-splade-max.md)
 + Regressions for MS MARCO (V1) Document Ranking:
-  + Per doc method: [baselines](docs/regressions-msmarco-doc.md), [doc2query-T5](docs/regressions-msmarco-doc-docTTTTTquery-per-doc.md)
-  + Per passage method: [baselines](docs/regressions-msmarco-doc-per-passage.md) ([v2](docs/regressions-msmarco-doc-per-passage-v2.md), [v3](docs/regressions-msmarco-doc-per-passage-v3.md))[*](docs/experiments-msmarco-doc-doc2query-details.md), [doc2query-T5](docs/regressions-msmarco-doc-docTTTTTquery-per-passage.md) ([v3](docs/regressions-msmarco-doc-docTTTTTquery-per-passage-v3.md))[*](docs/experiments-msmarco-doc-doc2query-details.md)
+  + Complete doc[*](docs/experiments-msmarco-doc-doc2query-details.md): [baselines](docs/regressions-msmarco-doc.md), [doc2query-T5](docs/regressions-msmarco-doc-docTTTTTquery.md)
+  + Segmented doc[*](docs/experiments-msmarco-doc-doc2query-details.md): [baselines](docs/regressions-msmarco-doc-segmented.md), [doc2query-T5](docs/regressions-msmarco-doc-segmented-docTTTTTquery.md)
 + Regressions for TREC 2019 Deep Learning Track:
   + Passage ranking: [baselines](docs/regressions-dl19-passage.md), [doc2query-T5](docs/regressions-dl19-passage-docTTTTTquery.md)
-  + Document ranking, per doc method: [baselines](docs/regressions-dl19-doc.md), [doc2query-T5](docs/regressions-dl19-doc-docTTTTTquery-per-doc.md)
-  + Document ranking, per passage method: [baselines](docs/regressions-dl19-doc-per-passage.md), [doc2query-T5](docs/regressions-dl19-doc-docTTTTTquery-per-passage.md)
+  + Document ranking, complete doc[*](docs/experiments-msmarco-doc-doc2query-details.md): [baselines](docs/regressions-dl19-doc.md), [doc2query-T5](docs/regressions-dl19-doc-docTTTTTquery.md)
+  + Document ranking, segmented doc[*](docs/experiments-msmarco-doc-doc2query-details.md): [baselines](docs/regressions-dl19-doc-segmented.md), [doc2query-T5](docs/regressions-dl19-doc-segmented-docTTTTTquery.md)
 + Regressions for TREC 2020 Deep Learning Track:
   + Passage ranking: [baselines](docs/regressions-dl20-passage.md), [doc2query-T5](docs/regressions-dl20-passage-docTTTTTquery.md)
-  + Document ranking, per doc method: [baselines](docs/regressions-dl20-doc.md), [doc2query-T5](docs/regressions-dl20-doc-docTTTTTquery-per-doc.md)
-  + Document ranking, per passage method: [baselines](docs/regressions-dl20-doc-per-passage.md), [doc2query-T5](docs/regressions-dl20-doc-docTTTTTquery-per-passage.md)
+  + Document ranking, complete doc[*](docs/experiments-msmarco-doc-doc2query-details.md): [baselines](docs/regressions-dl20-doc.md), [doc2query-T5](docs/regressions-dl20-doc-docTTTTTquery.md)
+  + Document ranking, segmented doc[*](docs/experiments-msmarco-doc-doc2query-details.md): [baselines](docs/regressions-dl20-doc-segmented.md), [doc2query-T5](docs/regressions-dl20-doc-segmented-docTTTTTquery.md)
 + Regressions for MS MARCO (V2) Passage Ranking:
   + Bag-of-words models: [baselines](docs/regressions-msmarco-v2-passage.md), [on augmented corpus](docs/regressions-msmarco-v2-passage-augmented.md)
   + Sparse learned models: [uniCOIL noexp zero-shot](docs/regressions-msmarco-v2-passage-unicoil-noexp-0shot.md)
diff --git a/docs/experiments-msmarco-doc-doc2query-details.md b/docs/experiments-msmarco-doc-doc2query-details.md
index c39bb63129..02c0989a44 100644
--- a/docs/experiments-msmarco-doc-doc2query-details.md
+++ b/docs/experiments-msmarco-doc-doc2query-details.md
@@ -1,4 +1,4 @@
-# Anserini: Reproducibility Notes for MS MARCO V1 Doc Ranking
+# Anserini: Reproducibility Notes for MS MARCO V1
 
 <blockquote class="twitter-tweet"><p lang="en" dir="ltr">Reproducibility is hard.</p>&mdash; Jimmy Lin (@lintool) <a href="https://twitter.com/lintool/status/1458853999298465796?ref_src=twsrc%5Etfw">November 11, 2021</a></blockquote>
 
@@ -22,8 +22,22 @@ This was for dense retrieval experiments, as we were not aware of the doc2query-
 It is very likely, but we cannot know for sure, that this was the same segmentation that generated the original doc2query-T5 expansions.
 Fortunately, Xueguang was able to save a copy of this segmented corpus.
 
-So, now we have:
+---
 
-+ `doc-per-passage-v2`: materialized corpus with 20,545,677 segments.
-+ `doc-per-passage-v3`: same as above, except with URL. Note that bag-of-words search over this variant yields higher effectiveness than above, but for input to an encoder, you probably don't want to include the URL.
-+ `doc-docTTTTTquery-per-doc-v3`: `doc-per-passage-v3`, but with the doc2query-T5 expansions added in. 
+In January 2022, we completely refactored the doc2query-T5 expansion data for the MS MARCO (V1) corpora.
+They are now available as Hugging Face Datasets:
+
++ [`msmarco_v1_passage_doc2query-t5_expansions`](https://huggingface.co/datasets/castorini/msmarco_v1_passage_doc2query-t5_expansions): passage expansions
++ [`msmarco_v1_doc_doc2query-t5_expansions`](https://huggingface.co/datasets/castorini/msmarco_v1_doc_doc2query-t5_expansions): document expansions
++ [`msmarco_v1_doc_segmented_doc2query-t5_expansions`](https://huggingface.co/datasets/castorini/msmarco_v1_doc_segmented_doc2query-t5_expansions): document segment expansions
+
+So now we have the following new regressions:
+
++ `msmarco-doc`: document corpus in Anserini's jsonl format with 3,213,835 documents. Each contains URL, title, body, delimited by newlines.
++ `msmarco-doc-docTTTTTquery`: same as above, but with docTTTTTquery expansions, delimited by another newline.
++ `msmarco-doc-segmented`: segmented document corpus in Anserini's jsonl format with 20,545,677 segments. Each contains URL, title, segment, delimited by newlines.
++ `msmarco-doc-segmented-docTTTTTquery`: same as above, but with docTTTTTquery expansions, delimited by another newline.
+
+These new versions yield end-to-end scores that differ slightly from those of the earlier versions, so if numbers reported in a paper do not exactly match, this may be the reason.
+
+*TODO:* Circle back and add links to scripts once everything has been verified and checked in.
diff --git a/docs/regressions-backgroundlinking18.md b/docs/regressions-backgroundlinking18.md
index 67c1aa8001..8ee485daef 100644
--- a/docs/regressions-backgroundlinking18.md
+++ b/docs/regressions-backgroundlinking18.md
@@ -12,7 +12,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection WashingtonPostCollection \
   -input /path/to/wapo.v2 \
-  -index indexes/lucene-index.wapo.v2 \
+  -index indexes/lucene-index.wapo.v2/ \
   -generator WashingtonPostGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.wapo.v2 &
@@ -34,19 +34,19 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wapo.v2 \
+  -index indexes/lucene-index.wapo.v2/ \
   -topics src/main/resources/topics-and-qrels/topics.backgroundlinking18.txt -topicreader BackgroundLinking \
   -output runs/run.wapo.v2.bm25.topics.backgroundlinking18.txt \
   -backgroundlinking -backgroundlinking.k 100 -bm25 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wapo.v2 \
+  -index indexes/lucene-index.wapo.v2/ \
   -topics src/main/resources/topics-and-qrels/topics.backgroundlinking18.txt -topicreader BackgroundLinking \
   -output runs/run.wapo.v2.bm25+rm3.topics.backgroundlinking18.txt \
   -backgroundlinking -backgroundlinking.k 100 -bm25 -rm3 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wapo.v2 \
+  -index indexes/lucene-index.wapo.v2/ \
   -topics src/main/resources/topics-and-qrels/topics.backgroundlinking18.txt -topicreader BackgroundLinking \
   -output runs/run.wapo.v2.bm25+rm3+df.topics.backgroundlinking18.txt \
   -backgroundlinking -backgroundlinking.datefilter -backgroundlinking.k 100 -bm25 -rm3 -hits 100 &
diff --git a/docs/regressions-backgroundlinking19.md b/docs/regressions-backgroundlinking19.md
index dbeec9cd53..c57965ea71 100644
--- a/docs/regressions-backgroundlinking19.md
+++ b/docs/regressions-backgroundlinking19.md
@@ -12,7 +12,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection WashingtonPostCollection \
   -input /path/to/wapo.v2 \
-  -index indexes/lucene-index.wapo.v2 \
+  -index indexes/lucene-index.wapo.v2/ \
   -generator WashingtonPostGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.wapo.v2 &
@@ -34,19 +34,19 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wapo.v2 \
+  -index indexes/lucene-index.wapo.v2/ \
   -topics src/main/resources/topics-and-qrels/topics.backgroundlinking19.txt -topicreader BackgroundLinking \
   -output runs/run.wapo.v2.bm25.topics.backgroundlinking19.txt \
   -backgroundlinking -backgroundlinking.k 100 -bm25 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wapo.v2 \
+  -index indexes/lucene-index.wapo.v2/ \
   -topics src/main/resources/topics-and-qrels/topics.backgroundlinking19.txt -topicreader BackgroundLinking \
   -output runs/run.wapo.v2.bm25+rm3.topics.backgroundlinking19.txt \
   -backgroundlinking -backgroundlinking.k 100 -bm25 -rm3 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wapo.v2 \
+  -index indexes/lucene-index.wapo.v2/ \
   -topics src/main/resources/topics-and-qrels/topics.backgroundlinking19.txt -topicreader BackgroundLinking \
   -output runs/run.wapo.v2.bm25+rm3+df.topics.backgroundlinking19.txt \
   -backgroundlinking -backgroundlinking.datefilter -backgroundlinking.k 100 -bm25 -rm3 -hits 100 &
diff --git a/docs/regressions-backgroundlinking20.md b/docs/regressions-backgroundlinking20.md
index 9905891d01..b2fcec96c3 100644
--- a/docs/regressions-backgroundlinking20.md
+++ b/docs/regressions-backgroundlinking20.md
@@ -12,7 +12,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection WashingtonPostCollection \
   -input /path/to/wapo.v3 \
-  -index indexes/lucene-index.wapo.v3 \
+  -index indexes/lucene-index.wapo.v3/ \
   -generator WashingtonPostGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.wapo.v3 &
@@ -34,19 +34,19 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wapo.v3 \
+  -index indexes/lucene-index.wapo.v3/ \
   -topics src/main/resources/topics-and-qrels/topics.backgroundlinking20.txt -topicreader BackgroundLinking \
   -output runs/run.wapo.v3.bm25.topics.backgroundlinking20.txt \
   -backgroundlinking -backgroundlinking.k 100 -bm25 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wapo.v3 \
+  -index indexes/lucene-index.wapo.v3/ \
   -topics src/main/resources/topics-and-qrels/topics.backgroundlinking20.txt -topicreader BackgroundLinking \
   -output runs/run.wapo.v3.bm25+rm3.topics.backgroundlinking20.txt \
   -backgroundlinking -backgroundlinking.k 100 -bm25 -rm3 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wapo.v3 \
+  -index indexes/lucene-index.wapo.v3/ \
   -topics src/main/resources/topics-and-qrels/topics.backgroundlinking20.txt -topicreader BackgroundLinking \
   -output runs/run.wapo.v3.bm25+rm3+df.topics.backgroundlinking20.txt \
   -backgroundlinking -backgroundlinking.datefilter -backgroundlinking.k 100 -bm25 -rm3 -hits 100 &
diff --git a/docs/regressions-car17v1.5.md b/docs/regressions-car17v1.5.md
index 2049086e4d..16684b2f77 100644
--- a/docs/regressions-car17v1.5.md
+++ b/docs/regressions-car17v1.5.md
@@ -12,7 +12,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection CarCollection \
   -input /path/to/car-paragraphCorpus.v1.5 \
-  -index indexes/lucene-index.car-paragraphCorpus.v1.5 \
+  -index indexes/lucene-index.car-paragraphCorpus.v1.5/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.car-paragraphCorpus.v1.5 &
@@ -35,37 +35,37 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v1.5 \
+  -index indexes/lucene-index.car-paragraphCorpus.v1.5/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v1.5.bm25.topics.car17v1.5.benchmarkY1test.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v1.5 \
+  -index indexes/lucene-index.car-paragraphCorpus.v1.5/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v1.5.bm25+rm3.topics.car17v1.5.benchmarkY1test.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v1.5 \
+  -index indexes/lucene-index.car-paragraphCorpus.v1.5/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v1.5.bm25+ax.topics.car17v1.5.benchmarkY1test.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v1.5 \
+  -index indexes/lucene-index.car-paragraphCorpus.v1.5/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v1.5.ql.topics.car17v1.5.benchmarkY1test.txt \
   -qld &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v1.5 \
+  -index indexes/lucene-index.car-paragraphCorpus.v1.5/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v1.5.ql+rm3.topics.car17v1.5.benchmarkY1test.txt \
   -qld -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v1.5 \
+  -index indexes/lucene-index.car-paragraphCorpus.v1.5/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v1.5.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v1.5.ql+ax.topics.car17v1.5.benchmarkY1test.txt \
   -qld -axiom -axiom.deterministic -rerankCutoff 20 &
diff --git a/docs/regressions-car17v2.0-doc2query.md b/docs/regressions-car17v2.0-doc2query.md
index 0dd87a2e3e..21889cde9f 100644
--- a/docs/regressions-car17v2.0-doc2query.md
+++ b/docs/regressions-car17v2.0-doc2query.md
@@ -18,7 +18,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
   -input /path/to/car-paragraphCorpus.v2.0-doc2query \
-  -index indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query \
+  -index indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 30 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.car-paragraphCorpus.v2.0-doc2query &
@@ -41,37 +41,37 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query \
+  -index indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v2.0-doc2query.bm25.topics.car17v2.0.benchmarkY1test.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query \
+  -index indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v2.0-doc2query.bm25+rm3.topics.car17v2.0.benchmarkY1test.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query \
+  -index indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v2.0-doc2query.bm25+ax.topics.car17v2.0.benchmarkY1test.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query \
+  -index indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v2.0-doc2query.ql.topics.car17v2.0.benchmarkY1test.txt \
   -qld &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query \
+  -index indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v2.0-doc2query.ql+rm3.topics.car17v2.0.benchmarkY1test.txt \
   -qld -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query \
+  -index indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v2.0-doc2query.ql+ax.topics.car17v2.0.benchmarkY1test.txt \
   -qld -axiom -axiom.deterministic -rerankCutoff 20 &
diff --git a/docs/regressions-car17v2.0.md b/docs/regressions-car17v2.0.md
index 4ece511e3a..176f6e307f 100644
--- a/docs/regressions-car17v2.0.md
+++ b/docs/regressions-car17v2.0.md
@@ -12,7 +12,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection CarCollection \
   -input /path/to/car-paragraphCorpus.v2.0 \
-  -index indexes/lucene-index.car-paragraphCorpus.v2.0 \
+  -index indexes/lucene-index.car-paragraphCorpus.v2.0/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.car-paragraphCorpus.v2.0 &
@@ -35,37 +35,37 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v2.0 \
+  -index indexes/lucene-index.car-paragraphCorpus.v2.0/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v2.0.bm25.topics.car17v2.0.benchmarkY1test.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v2.0 \
+  -index indexes/lucene-index.car-paragraphCorpus.v2.0/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v2.0.bm25+rm3.topics.car17v2.0.benchmarkY1test.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v2.0 \
+  -index indexes/lucene-index.car-paragraphCorpus.v2.0/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v2.0.bm25+ax.topics.car17v2.0.benchmarkY1test.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v2.0 \
+  -index indexes/lucene-index.car-paragraphCorpus.v2.0/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v2.0.ql.topics.car17v2.0.benchmarkY1test.txt \
   -qld &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v2.0 \
+  -index indexes/lucene-index.car-paragraphCorpus.v2.0/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v2.0.ql+rm3.topics.car17v2.0.benchmarkY1test.txt \
   -qld -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.car-paragraphCorpus.v2.0 \
+  -index indexes/lucene-index.car-paragraphCorpus.v2.0/ \
   -topics src/main/resources/topics-and-qrels/topics.car17v2.0.benchmarkY1test.txt -topicreader Car \
   -output runs/run.car-paragraphCorpus.v2.0.ql+ax.topics.car17v2.0.benchmarkY1test.txt \
   -qld -axiom -axiom.deterministic -rerankCutoff 20 &
diff --git a/docs/regressions-clef06-fr.md b/docs/regressions-clef06-fr.md
index 988de66e5a..e50ee2cd56 100644
--- a/docs/regressions-clef06-fr.md
+++ b/docs/regressions-clef06-fr.md
@@ -14,7 +14,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
   -input /path/to/clef06-fr \
-  -index indexes/lucene-index.clef06-fr \
+  -index indexes/lucene-index.clef06-fr/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 16 -storePositions -storeDocvectors -storeRaw -language fr \
   >& logs/log.clef06-fr &
@@ -37,7 +37,7 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.clef06-fr \
+  -index indexes/lucene-index.clef06-fr/ \
   -topics src/main/resources/topics-and-qrels/topics.clef06fr.mono.fr.txt -topicreader TsvString \
   -output runs/run.clef06-fr.bm25.topics.clef06fr.mono.fr.txt \
   -bm25 -language fr &
diff --git a/docs/regressions-core17.md b/docs/regressions-core17.md
index 364e516bb2..39fb774772 100644
--- a/docs/regressions-core17.md
+++ b/docs/regressions-core17.md
@@ -12,7 +12,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection NewYorkTimesCollection \
   -input /path/to/nyt \
-  -index indexes/lucene-index.nyt \
+  -index indexes/lucene-index.nyt/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 16 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.nyt &
@@ -34,37 +34,37 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.nyt \
+  -index indexes/lucene-index.nyt/ \
   -topics src/main/resources/topics-and-qrels/topics.core17.txt -topicreader Trec \
   -output runs/run.nyt.bm25.topics.core17.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.nyt \
+  -index indexes/lucene-index.nyt/ \
   -topics src/main/resources/topics-and-qrels/topics.core17.txt -topicreader Trec \
   -output runs/run.nyt.bm25+rm3.topics.core17.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.nyt \
+  -index indexes/lucene-index.nyt/ \
   -topics src/main/resources/topics-and-qrels/topics.core17.txt -topicreader Trec \
   -output runs/run.nyt.bm25+ax.topics.core17.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.nyt \
+  -index indexes/lucene-index.nyt/ \
   -topics src/main/resources/topics-and-qrels/topics.core17.txt -topicreader Trec \
   -output runs/run.nyt.ql.topics.core17.txt \
   -qld &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.nyt \
+  -index indexes/lucene-index.nyt/ \
   -topics src/main/resources/topics-and-qrels/topics.core17.txt -topicreader Trec \
   -output runs/run.nyt.ql+rm3.topics.core17.txt \
   -qld -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.nyt \
+  -index indexes/lucene-index.nyt/ \
   -topics src/main/resources/topics-and-qrels/topics.core17.txt -topicreader Trec \
   -output runs/run.nyt.ql+ax.topics.core17.txt \
   -qld -axiom -axiom.deterministic -rerankCutoff 20 &
diff --git a/docs/regressions-core18.md b/docs/regressions-core18.md
index 01bdf605a5..29b08d8eae 100644
--- a/docs/regressions-core18.md
+++ b/docs/regressions-core18.md
@@ -12,7 +12,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection WashingtonPostCollection \
   -input /path/to/wapo.v2 \
-  -index indexes/lucene-index.wapo.v2 \
+  -index indexes/lucene-index.wapo.v2/ \
   -generator WashingtonPostGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.wapo.v2 &
@@ -34,37 +34,37 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wapo.v2 \
+  -index indexes/lucene-index.wapo.v2/ \
   -topics src/main/resources/topics-and-qrels/topics.core18.txt -topicreader Trec \
   -output runs/run.wapo.v2.bm25.topics.core18.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wapo.v2 \
+  -index indexes/lucene-index.wapo.v2/ \
   -topics src/main/resources/topics-and-qrels/topics.core18.txt -topicreader Trec \
   -output runs/run.wapo.v2.bm25+rm3.topics.core18.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wapo.v2 \
+  -index indexes/lucene-index.wapo.v2/ \
   -topics src/main/resources/topics-and-qrels/topics.core18.txt -topicreader Trec \
   -output runs/run.wapo.v2.bm25+ax.topics.core18.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wapo.v2 \
+  -index indexes/lucene-index.wapo.v2/ \
   -topics src/main/resources/topics-and-qrels/topics.core18.txt -topicreader Trec \
   -output runs/run.wapo.v2.ql.topics.core18.txt \
   -qld &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wapo.v2 \
+  -index indexes/lucene-index.wapo.v2/ \
   -topics src/main/resources/topics-and-qrels/topics.core18.txt -topicreader Trec \
   -output runs/run.wapo.v2.ql+rm3.topics.core18.txt \
   -qld -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wapo.v2 \
+  -index indexes/lucene-index.wapo.v2/ \
   -topics src/main/resources/topics-and-qrels/topics.core18.txt -topicreader Trec \
   -output runs/run.wapo.v2.ql+ax.topics.core18.txt \
   -qld -axiom -axiom.deterministic -rerankCutoff 20 &
diff --git a/docs/regressions-cw09b.md b/docs/regressions-cw09b.md
index 070574c23c..44bb5f5b1f 100644
--- a/docs/regressions-cw09b.md
+++ b/docs/regressions-cw09b.md
@@ -12,7 +12,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection ClueWeb09Collection \
   -input /path/to/cw09b \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 44 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.cw09b &
@@ -39,97 +39,97 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.51-100.txt -topicreader Webxml \
   -output runs/run.cw09b.bm25.topics.web.51-100.txt \
   -bm25 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.101-150.txt -topicreader Webxml \
   -output runs/run.cw09b.bm25.topics.web.101-150.txt \
   -bm25 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.151-200.txt -topicreader Webxml \
   -output runs/run.cw09b.bm25.topics.web.151-200.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.51-100.txt -topicreader Webxml \
   -output runs/run.cw09b.bm25+rm3.topics.web.51-100.txt \
   -bm25 -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.101-150.txt -topicreader Webxml \
   -output runs/run.cw09b.bm25+rm3.topics.web.101-150.txt \
   -bm25 -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.151-200.txt -topicreader Webxml \
   -output runs/run.cw09b.bm25+rm3.topics.web.151-200.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.51-100.txt -topicreader Webxml \
   -output runs/run.cw09b.bm25+ax.topics.web.51-100.txt \
   -bm25 -axiom -axiom.deterministic -axiom.beta 0.1 -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.101-150.txt -topicreader Webxml \
   -output runs/run.cw09b.bm25+ax.topics.web.101-150.txt \
   -bm25 -axiom -axiom.deterministic -axiom.beta 0.1 -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.151-200.txt -topicreader Webxml \
   -output runs/run.cw09b.bm25+ax.topics.web.151-200.txt \
   -bm25 -axiom -axiom.deterministic -axiom.beta 0.1 -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.51-100.txt -topicreader Webxml \
   -output runs/run.cw09b.ql.topics.web.51-100.txt \
   -qld &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.101-150.txt -topicreader Webxml \
   -output runs/run.cw09b.ql.topics.web.101-150.txt \
   -qld &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.151-200.txt -topicreader Webxml \
   -output runs/run.cw09b.ql.topics.web.151-200.txt \
   -qld &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.51-100.txt -topicreader Webxml \
   -output runs/run.cw09b.ql+rm3.topics.web.51-100.txt \
   -qld -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.101-150.txt -topicreader Webxml \
   -output runs/run.cw09b.ql+rm3.topics.web.101-150.txt \
   -qld -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.151-200.txt -topicreader Webxml \
   -output runs/run.cw09b.ql+rm3.topics.web.151-200.txt \
   -qld -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.51-100.txt -topicreader Webxml \
   -output runs/run.cw09b.ql+ax.topics.web.51-100.txt \
   -qld -axiom -axiom.deterministic -axiom.beta 0.1 -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.101-150.txt -topicreader Webxml \
   -output runs/run.cw09b.ql+ax.topics.web.101-150.txt \
   -qld -axiom -axiom.deterministic -axiom.beta 0.1 -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw09b \
+  -index indexes/lucene-index.cw09b/ \
   -topics src/main/resources/topics-and-qrels/topics.web.151-200.txt -topicreader Webxml \
   -output runs/run.cw09b.ql+ax.topics.web.151-200.txt \
   -qld -axiom -axiom.deterministic -axiom.beta 0.1 -rerankCutoff 20 &
diff --git a/docs/regressions-cw12.md b/docs/regressions-cw12.md
index 1497c025af..be46e58f81 100644
--- a/docs/regressions-cw12.md
+++ b/docs/regressions-cw12.md
@@ -12,7 +12,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection ClueWeb12Collection \
   -input /path/to/cw12 \
-  -index indexes/lucene-index.cw12 \
+  -index indexes/lucene-index.cw12/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 44 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.cw12 &
@@ -35,45 +35,45 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12 \
+  -index indexes/lucene-index.cw12/ \
   -topics src/main/resources/topics-and-qrels/topics.web.201-250.txt -topicreader Webxml \
   -output runs/run.cw12.bm25.topics.web.201-250.txt \
   -bm25 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12 \
+  -index indexes/lucene-index.cw12/ \
   -topics src/main/resources/topics-and-qrels/topics.web.251-300.txt -topicreader Webxml \
   -output runs/run.cw12.bm25.topics.web.251-300.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12 \
+  -index indexes/lucene-index.cw12/ \
   -topics src/main/resources/topics-and-qrels/topics.web.201-250.txt -topicreader Webxml \
   -output runs/run.cw12.bm25+rm3.topics.web.201-250.txt \
   -bm25 -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12 \
+  -index indexes/lucene-index.cw12/ \
   -topics src/main/resources/topics-and-qrels/topics.web.251-300.txt -topicreader Webxml \
   -output runs/run.cw12.bm25+rm3.topics.web.251-300.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12 \
+  -index indexes/lucene-index.cw12/ \
   -topics src/main/resources/topics-and-qrels/topics.web.201-250.txt -topicreader Webxml \
   -output runs/run.cw12.ql.topics.web.201-250.txt \
   -qld &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12 \
+  -index indexes/lucene-index.cw12/ \
   -topics src/main/resources/topics-and-qrels/topics.web.251-300.txt -topicreader Webxml \
   -output runs/run.cw12.ql.topics.web.251-300.txt \
   -qld &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12 \
+  -index indexes/lucene-index.cw12/ \
   -topics src/main/resources/topics-and-qrels/topics.web.201-250.txt -topicreader Webxml \
   -output runs/run.cw12.ql+rm3.topics.web.201-250.txt \
   -qld -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12 \
+  -index indexes/lucene-index.cw12/ \
   -topics src/main/resources/topics-and-qrels/topics.web.251-300.txt -topicreader Webxml \
   -output runs/run.cw12.ql+rm3.topics.web.251-300.txt \
   -qld -rm3 &
diff --git a/docs/regressions-cw12b13.md b/docs/regressions-cw12b13.md
index eef15ab17f..a00d7b1550 100644
--- a/docs/regressions-cw12b13.md
+++ b/docs/regressions-cw12b13.md
@@ -12,7 +12,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection ClueWeb12Collection \
   -input /path/to/cw12b13 \
-  -index indexes/lucene-index.cw12b13 \
+  -index indexes/lucene-index.cw12b13/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 44 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.cw12b13 &
@@ -35,67 +35,67 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12b13 \
+  -index indexes/lucene-index.cw12b13/ \
   -topics src/main/resources/topics-and-qrels/topics.web.201-250.txt -topicreader Webxml \
   -output runs/run.cw12b13.bm25.topics.web.201-250.txt \
   -bm25 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12b13 \
+  -index indexes/lucene-index.cw12b13/ \
   -topics src/main/resources/topics-and-qrels/topics.web.251-300.txt -topicreader Webxml \
   -output runs/run.cw12b13.bm25.topics.web.251-300.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12b13 \
+  -index indexes/lucene-index.cw12b13/ \
   -topics src/main/resources/topics-and-qrels/topics.web.201-250.txt -topicreader Webxml \
   -output runs/run.cw12b13.bm25+rm3.topics.web.201-250.txt \
   -bm25 -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12b13 \
+  -index indexes/lucene-index.cw12b13/ \
   -topics src/main/resources/topics-and-qrels/topics.web.251-300.txt -topicreader Webxml \
   -output runs/run.cw12b13.bm25+rm3.topics.web.251-300.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12b13 \
+  -index indexes/lucene-index.cw12b13/ \
   -topics src/main/resources/topics-and-qrels/topics.web.201-250.txt -topicreader Webxml \
   -output runs/run.cw12b13.bm25+ax.topics.web.201-250.txt \
   -bm25 -axiom -axiom.deterministic -axiom.beta 0.1 -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12b13 \
+  -index indexes/lucene-index.cw12b13/ \
   -topics src/main/resources/topics-and-qrels/topics.web.251-300.txt -topicreader Webxml \
   -output runs/run.cw12b13.bm25+ax.topics.web.251-300.txt \
   -bm25 -axiom -axiom.deterministic -axiom.beta 0.1 -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12b13 \
+  -index indexes/lucene-index.cw12b13/ \
   -topics src/main/resources/topics-and-qrels/topics.web.201-250.txt -topicreader Webxml \
   -output runs/run.cw12b13.ql.topics.web.201-250.txt \
   -qld &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12b13 \
+  -index indexes/lucene-index.cw12b13/ \
   -topics src/main/resources/topics-and-qrels/topics.web.251-300.txt -topicreader Webxml \
   -output runs/run.cw12b13.ql.topics.web.251-300.txt \
   -qld &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12b13 \
+  -index indexes/lucene-index.cw12b13/ \
   -topics src/main/resources/topics-and-qrels/topics.web.201-250.txt -topicreader Webxml \
   -output runs/run.cw12b13.ql+rm3.topics.web.201-250.txt \
   -qld -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12b13 \
+  -index indexes/lucene-index.cw12b13/ \
   -topics src/main/resources/topics-and-qrels/topics.web.251-300.txt -topicreader Webxml \
   -output runs/run.cw12b13.ql+rm3.topics.web.251-300.txt \
   -qld -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12b13 \
+  -index indexes/lucene-index.cw12b13/ \
   -topics src/main/resources/topics-and-qrels/topics.web.201-250.txt -topicreader Webxml \
   -output runs/run.cw12b13.ql+ax.topics.web.201-250.txt \
   -qld -axiom -axiom.deterministic -axiom.beta 0.1 -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.cw12b13 \
+  -index indexes/lucene-index.cw12b13/ \
   -topics src/main/resources/topics-and-qrels/topics.web.251-300.txt -topicreader Webxml \
   -output runs/run.cw12b13.ql+ax.topics.web.251-300.txt \
   -qld -axiom -axiom.deterministic -axiom.beta 0.1 -rerankCutoff 20 &
diff --git a/docs/regressions-disk12.md b/docs/regressions-disk12.md
index 4a3daa9bb3..7bcd7a61d1 100644
--- a/docs/regressions-disk12.md
+++ b/docs/regressions-disk12.md
@@ -12,7 +12,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection TrecCollection \
   -input /path/to/disk12 \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 16 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.disk12 &
@@ -37,97 +37,97 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.51-100.txt -topicreader Trec \
   -output runs/run.disk12.bm25.topics.adhoc.51-100.txt \
   -bm25 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.101-150.txt -topicreader Trec \
   -output runs/run.disk12.bm25.topics.adhoc.101-150.txt \
   -bm25 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.151-200.txt -topicreader Trec \
   -output runs/run.disk12.bm25.topics.adhoc.151-200.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.51-100.txt -topicreader Trec \
   -output runs/run.disk12.bm25+rm3.topics.adhoc.51-100.txt \
   -bm25 -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.101-150.txt -topicreader Trec \
   -output runs/run.disk12.bm25+rm3.topics.adhoc.101-150.txt \
   -bm25 -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.151-200.txt -topicreader Trec \
   -output runs/run.disk12.bm25+rm3.topics.adhoc.151-200.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.51-100.txt -topicreader Trec \
   -output runs/run.disk12.bm25+ax.topics.adhoc.51-100.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.101-150.txt -topicreader Trec \
   -output runs/run.disk12.bm25+ax.topics.adhoc.101-150.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.151-200.txt -topicreader Trec \
   -output runs/run.disk12.bm25+ax.topics.adhoc.151-200.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.51-100.txt -topicreader Trec \
   -output runs/run.disk12.ql.topics.adhoc.51-100.txt \
   -qld &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.101-150.txt -topicreader Trec \
   -output runs/run.disk12.ql.topics.adhoc.101-150.txt \
   -qld &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.151-200.txt -topicreader Trec \
   -output runs/run.disk12.ql.topics.adhoc.151-200.txt \
   -qld &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.51-100.txt -topicreader Trec \
   -output runs/run.disk12.ql+rm3.topics.adhoc.51-100.txt \
   -qld -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.101-150.txt -topicreader Trec \
   -output runs/run.disk12.ql+rm3.topics.adhoc.101-150.txt \
   -qld -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.151-200.txt -topicreader Trec \
   -output runs/run.disk12.ql+rm3.topics.adhoc.151-200.txt \
   -qld -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.51-100.txt -topicreader Trec \
   -output runs/run.disk12.ql+ax.topics.adhoc.51-100.txt \
   -qld -axiom -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.101-150.txt -topicreader Trec \
   -output runs/run.disk12.ql+ax.topics.adhoc.101-150.txt \
   -qld -axiom -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk12 \
+  -index indexes/lucene-index.disk12/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.151-200.txt -topicreader Trec \
   -output runs/run.disk12.ql+ax.topics.adhoc.151-200.txt \
   -qld -axiom -axiom.deterministic -rerankCutoff 20 &
diff --git a/docs/regressions-disk45.md b/docs/regressions-disk45.md
index d4d34013a4..a0bc17c295 100644
--- a/docs/regressions-disk45.md
+++ b/docs/regressions-disk45.md
@@ -12,7 +12,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection TrecCollection \
   -input /path/to/disk45 \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 16 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.disk45 &
@@ -36,97 +36,97 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.351-400.txt -topicreader Trec \
   -output runs/run.disk45.bm25.topics.adhoc.351-400.txt \
   -bm25 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.401-450.txt -topicreader Trec \
   -output runs/run.disk45.bm25.topics.adhoc.401-450.txt \
   -bm25 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.robust04.txt -topicreader Trec \
   -output runs/run.disk45.bm25.topics.robust04.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.351-400.txt -topicreader Trec \
   -output runs/run.disk45.bm25+rm3.topics.adhoc.351-400.txt \
   -bm25 -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.401-450.txt -topicreader Trec \
   -output runs/run.disk45.bm25+rm3.topics.adhoc.401-450.txt \
   -bm25 -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.robust04.txt -topicreader Trec \
   -output runs/run.disk45.bm25+rm3.topics.robust04.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.351-400.txt -topicreader Trec \
   -output runs/run.disk45.bm25+ax.topics.adhoc.351-400.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.401-450.txt -topicreader Trec \
   -output runs/run.disk45.bm25+ax.topics.adhoc.401-450.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.robust04.txt -topicreader Trec \
   -output runs/run.disk45.bm25+ax.topics.robust04.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.351-400.txt -topicreader Trec \
   -output runs/run.disk45.ql.topics.adhoc.351-400.txt \
   -qld &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.401-450.txt -topicreader Trec \
   -output runs/run.disk45.ql.topics.adhoc.401-450.txt \
   -qld &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.robust04.txt -topicreader Trec \
   -output runs/run.disk45.ql.topics.robust04.txt \
   -qld &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.351-400.txt -topicreader Trec \
   -output runs/run.disk45.ql+rm3.topics.adhoc.351-400.txt \
   -qld -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.401-450.txt -topicreader Trec \
   -output runs/run.disk45.ql+rm3.topics.adhoc.401-450.txt \
   -qld -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.robust04.txt -topicreader Trec \
   -output runs/run.disk45.ql+rm3.topics.robust04.txt \
   -qld -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.351-400.txt -topicreader Trec \
   -output runs/run.disk45.ql+ax.topics.adhoc.351-400.txt \
   -qld -axiom -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.401-450.txt -topicreader Trec \
   -output runs/run.disk45.ql+ax.topics.adhoc.401-450.txt \
   -qld -axiom -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.disk45 \
+  -index indexes/lucene-index.disk45/ \
   -topics src/main/resources/topics-and-qrels/topics.robust04.txt -topicreader Trec \
   -output runs/run.disk45.ql+ax.topics.robust04.txt \
   -qld -axiom -axiom.deterministic -rerankCutoff 20 &
diff --git a/docs/regressions-dl19-doc-docTTTTTquery-per-doc.md b/docs/regressions-dl19-doc-docTTTTTquery.md
similarity index 69%
rename from docs/regressions-dl19-doc-docTTTTTquery-per-doc.md
rename to docs/regressions-dl19-doc-docTTTTTquery.md
index 8ff81c8743..704e4309b9 100644
--- a/docs/regressions-dl19-doc-docTTTTTquery-per-doc.md
+++ b/docs/regressions-dl19-doc-docTTTTTquery.md
@@ -1,4 +1,4 @@
-# Anserini: Regressions for [DL19 (Doc)](https://trec.nist.gov/data/deep2019.html) w/ per-doc docTTTTTquery
+# Anserini: Regressions for [DL19 (Doc)](https://trec.nist.gov/data/deep2019.html) w/ docTTTTTquery
 
 This page describes experiments, integrated into Anserini's regression testing framework, for the TREC 2019 Deep Learning Track (Document Ranking Task) on the MS MARCO document collection using relevance judgments from NIST.
 
@@ -10,10 +10,14 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is treated as a unit of indexing
 + **Expansion Condition:** doc2query-T5
 
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
 
-The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl19-doc-docTTTTTquery-per-doc.yaml).
-Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl19-doc-docTTTTTquery-per-doc.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl19-doc-docTTTTTquery.yaml).
+Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl19-doc-docTTTTTquery.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
 
 ## Indexing
 
@@ -22,14 +26,15 @@ Typical indexing command:
 ```
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
-  -input /path/to/msmarco-doc-docTTTTTquery-per-doc \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc \
+  -input /path/to/msmarco-doc-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-doc-docTTTTTquery/ \
   -generator DefaultLuceneDocumentGenerator \
-  -threads 1 -storePositions -storeDocvectors -storeRaw \
-  >& logs/log.msmarco-doc-docTTTTTquery-per-doc &
+  -threads 7 -storePositions -storeDocvectors -storeRaw \
+  >& logs/log.msmarco-doc-docTTTTTquery &
 ```
 
-The directory `/path/to/msmarco-doc-docTTTTTquery-per-doc/` should be a directory containing the expanded document collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-docTTTTTquery/` should contain the expanded document corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
@@ -43,40 +48,40 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc \
+  -index indexes/lucene-index.msmarco-doc-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-default.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-docTTTTTquery.bm25-default.topics.dl19-doc.txt \
   -bm25 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc \
+  -index indexes/lucene-index.msmarco-doc-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-default+rm3.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-docTTTTTquery.bm25-default+rm3.topics.dl19-doc.txt \
   -bm25 -rm3 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc \
+  -index indexes/lucene-index.msmarco-doc-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-tuned.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-docTTTTTquery.bm25-tuned.topics.dl19-doc.txt \
   -bm25 -bm25.k1 4.68 -bm25.b 0.87 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc \
+  -index indexes/lucene-index.msmarco-doc-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-tuned+rm3.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-docTTTTTquery.bm25-tuned+rm3.topics.dl19-doc.txt \
   -bm25 -bm25.k1 4.68 -bm25.b 0.87 -rm3 -hits 100 &
 ```
 
 Evaluation can be performed using `trec_eval`:
 
 ```
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-default.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-docTTTTTquery.bm25-default.topics.dl19-doc.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-default+rm3.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-docTTTTTquery.bm25-default+rm3.topics.dl19-doc.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-tuned.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-docTTTTTquery.bm25-tuned.topics.dl19-doc.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-tuned+rm3.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-docTTTTTquery.bm25-tuned+rm3.topics.dl19-doc.txt
 ```
 
 ## Effectiveness
@@ -85,7 +90,7 @@ With the above commands, you should be able to reproduce the following results:
 
 MAP                                     | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|
-[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.2699    | 0.3044    | 0.2620    | 0.2812    |
+[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.2700    | 0.3045    | 0.2620    | 0.2814    |
 
 
 R@100                                   | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      |
@@ -95,7 +100,7 @@ R@100                                   | BM25 (default)| +RM3      | BM25 (tune
 
 nDCG@10                                 | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|
-[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.5968    | 0.5895    | 0.5967    | 0.6075    |
+[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.5968    | 0.5897    | 0.5972    | 0.6080    |
 
 Explanation of settings:
 
diff --git a/docs/regressions-dl19-doc-docTTTTTquery-per-passage.md b/docs/regressions-dl19-doc-segmented-docTTTTTquery.md
similarity index 63%
rename from docs/regressions-dl19-doc-docTTTTTquery-per-passage.md
rename to docs/regressions-dl19-doc-segmented-docTTTTTquery.md
index c9fafd645b..7503748baf 100644
--- a/docs/regressions-dl19-doc-docTTTTTquery-per-passage.md
+++ b/docs/regressions-dl19-doc-segmented-docTTTTTquery.md
@@ -1,4 +1,4 @@
-# Anserini: Regressions for [DL19 (Doc)](https://trec.nist.gov/data/deep2019.html) w/ per-passage docTTTTTquery
+# Anserini: Regressions for [DL19 (Doc)](https://trec.nist.gov/data/deep2019.html) Segmented w/ docTTTTTquery
 
 This page describes experiments, integrated into Anserini's regression testing framework, for the TREC 2019 Deep Learning Track (Document Ranking Task) on the MS MARCO document collection using relevance judgments from NIST.
 
@@ -10,11 +10,15 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
 + **Expansion Condition:** doc2query-T5
 
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
+In the passage (i.e., segment) indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
 
-The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl19-doc-docTTTTTquery-per-passage.yaml).
-Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl19-doc-docTTTTTquery-per-passage.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl19-doc-segmented-docTTTTTquery.yaml).
+Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl19-doc-segmented-docTTTTTquery.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
 
 ## Indexing
 
@@ -23,14 +27,15 @@ Typical indexing command:
 ```
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
-  -input /path/to/msmarco-doc-docTTTTTquery-per-passage \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage \
+  -input /path/to/msmarco-doc-segmented-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/ \
   -generator DefaultLuceneDocumentGenerator \
-  -threads 1 -storePositions -storeDocvectors -storeRaw \
-  >& logs/log.msmarco-doc-docTTTTTquery-per-passage &
+  -threads 16 -storePositions -storeDocvectors -storeRaw \
+  >& logs/log.msmarco-doc-segmented-docTTTTTquery &
 ```
 
-The directory `/path/to/msmarco-doc-docTTTTTquery-per-passage/` should be a directory containing the expanded document collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-segmented-docTTTTTquery/` should be a directory containing the expanded segmented corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
@@ -44,40 +49,40 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-default.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-default.topics.dl19-doc.txt \
   -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-default+rm3.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-default+rm3.topics.dl19-doc.txt \
   -bm25 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-tuned.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-tuned.topics.dl19-doc.txt \
   -bm25 -bm25.k1 2.56 -bm25.b 0.59 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-tuned+rm3.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-tuned+rm3.topics.dl19-doc.txt \
   -bm25 -bm25.k1 2.56 -bm25.b 0.59 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 ```
 
 Evaluation can be performed using `trec_eval`:
 
 ```
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-default.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-default.topics.dl19-doc.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-default+rm3.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-default+rm3.topics.dl19-doc.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-tuned.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-tuned.topics.dl19-doc.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-tuned+rm3.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-tuned+rm3.topics.dl19-doc.txt
 ```
 
 ## Effectiveness
@@ -86,17 +91,17 @@ With the above commands, you should be able to reproduce the following results:
 
 MAP                                     | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|
-[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.2791    | 0.3025    | 0.2655    | 0.2895    |
+[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.2798    | 0.3021    | 0.2658    | 0.2893    |
 
 
 R@100                                   | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|
-[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.4092    | 0.4394    | 0.4020    | 0.4235    |
+[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.4093    | 0.4392    | 0.4026    | 0.4237    |
 
 
 nDCG@10                                 | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|
-[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.6099    | 0.6318    | 0.6271    | 0.6256    |
+[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.6119    | 0.6297    | 0.6273    | 0.6239    |
 
 Explanation of settings:
 
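For readers of this diff, the MaxP step mentioned in the segmented pages (each document takes the score of its highest-scoring segment) can be sketched in a few lines of Python. This is only an illustration of the idea, not Anserini's implementation. The function name `max_passage`, the sample ids such as `D1#0`, and the assumption that a segment id is the document id followed by `#` and a segment index are all hypothetical; the `#` delimiter and the cutoff of 100 mirror the `-selectMaxPassage.delimiter "#"` and `-selectMaxPassage.hits 100` options shown in the commands.

```python
from typing import Dict, Iterable, List, Tuple


def max_passage(
    hits: Iterable[Tuple[str, float]],
    delimiter: str = "#",
    max_docs: int = 100,
) -> List[Tuple[str, float]]:
    """Collapse segment-level hits into a document ranking (MaxP).

    Assumes (hypothetically) that each segment id looks like
    "<docid><delimiter><segment index>", e.g. "D1#0".
    """
    best: Dict[str, float] = {}
    for segment_id, score in hits:
        # Everything before the first delimiter is taken to be the document id.
        doc_id = segment_id.split(delimiter, 1)[0]
        # Keep only the highest segment score seen so far for this document.
        if doc_id not in best or score > best[doc_id]:
            best[doc_id] = score
    # Sort by descending score; break ties by document id for determinism.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:max_docs]


# Made-up segment hits for three documents (scores are illustrative only).
segment_hits = [
    ("D1#0", 12.5),
    ("D2#3", 11.0),
    ("D1#4", 9.75),
    ("D3#1", 11.5),
    ("D2#0", 13.25),
]
doc_ranking = max_passage(segment_hits)
```

In this sketch, retrieving many segments before collapsing (the commands use `-hits 10000`) matters because several top-ranked segments may belong to the same document, so far fewer distinct documents survive the aggregation.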
diff --git a/docs/regressions-dl19-doc-per-passage.md b/docs/regressions-dl19-doc-segmented.md
similarity index 67%
rename from docs/regressions-dl19-doc-per-passage.md
rename to docs/regressions-dl19-doc-segmented.md
index 257d8bd76e..ad90df2ce3 100644
--- a/docs/regressions-dl19-doc-per-passage.md
+++ b/docs/regressions-dl19-doc-segmented.md
@@ -1,4 +1,4 @@
-# Anserini: Regressions for [DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)
+# Anserini: Regressions for [DL19 (Doc)](https://trec.nist.gov/data/deep2019.html) Segmented
 
 This page describes experiments, integrated into Anserini's regression testing framework, for the TREC 2019 Deep Learning Track (Document Ranking Task) on the MS MARCO document collection using relevance judgments from NIST.
 
@@ -10,11 +10,15 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
 + **Expansion Condition:** none
 
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
+In the passage (i.e., segment) indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
 
-The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl19-doc-per-passage.yaml).
-Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl19-doc-per-passage.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl19-doc-segmented.yaml).
+Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl19-doc-segmented.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
 
 ## Indexing
 
@@ -23,14 +27,15 @@ Typical indexing command:
 ```
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
-  -input /path/to/msmarco-doc-per-passage \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -input /path/to/msmarco-doc-segmented \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -generator DefaultLuceneDocumentGenerator \
-  -threads 1 -storePositions -storeDocvectors -storeRaw \
-  >& logs/log.msmarco-doc-per-passage &
+  -threads 16 -storePositions -storeDocvectors -storeRaw \
+  >& logs/log.msmarco-doc-segmented &
 ```
 
-The directory `/path/to/msmarco-doc-per-passage/` should be a directory containing the segmented paragraph collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-segmented/` should be a directory containing the segmented corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
@@ -44,72 +49,72 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-default.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-default.topics.dl19-doc.txt \
   -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-default+rm3.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-default+rm3.topics.dl19-doc.txt \
   -bm25 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-default+ax.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-default+ax.topics.dl19-doc.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-default+prf.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-default+prf.topics.dl19-doc.txt \
   -bm25 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-tuned.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-tuned.topics.dl19-doc.txt \
   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-tuned+rm3.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-tuned+rm3.topics.dl19-doc.txt \
   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-tuned+ax.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-tuned+ax.topics.dl19-doc.txt \
   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-tuned+prf.topics.dl19-doc.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-tuned+prf.topics.dl19-doc.txt \
   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 ```
 
 Evaluation can be performed using `trec_eval`:
 
 ```
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-per-passage.bm25-default.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-segmented.bm25-default.topics.dl19-doc.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-per-passage.bm25-default+rm3.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-segmented.bm25-default+rm3.topics.dl19-doc.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-per-passage.bm25-default+ax.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-segmented.bm25-default+ax.topics.dl19-doc.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-per-passage.bm25-default+prf.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-segmented.bm25-default+prf.topics.dl19-doc.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-per-passage.bm25-tuned.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-segmented.bm25-tuned.topics.dl19-doc.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-per-passage.bm25-tuned+rm3.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-segmented.bm25-tuned+rm3.topics.dl19-doc.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-per-passage.bm25-tuned+ax.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-segmented.bm25-tuned+ax.topics.dl19-doc.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-per-passage.bm25-tuned+prf.topics.dl19-doc.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m ndcg_cut.10 src/main/resources/topics-and-qrels/qrels.dl19-doc.txt runs/run.msmarco-doc-segmented.bm25-tuned+prf.topics.dl19-doc.txt
 ```
 
 ## Effectiveness
@@ -118,17 +123,17 @@ With the above commands, you should be able to reproduce the following results:
 
 MAP                                     | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.2441    | 0.2880    | 0.3015    | 0.2821    | 0.2394    | 0.2656    | 0.2934    | 0.2838    |
+[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.2449    | 0.2884    | 0.2981    | 0.2827    | 0.2398    | 0.2658    | 0.2975    | 0.2828    |
 
 
 R@100                                   | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.3840    | 0.4356    | 0.4501    | 0.4477    | 0.3903    | 0.4126    | 0.4437    | 0.4362    |
+[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.3840    | 0.4355    | 0.4490    | 0.4476    | 0.3903    | 0.4133    | 0.4491    | 0.4361    |
 
 
 nDCG@10                                 | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.5276    | 0.5750    | 0.5590    | 0.5591    | 0.5364    | 0.5379    | 0.5546    | 0.5478    |
+[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.5302    | 0.5764    | 0.5556    | 0.5599    | 0.5389    | 0.5405    | 0.5574    | 0.5476    |
 
 Explanation of settings:
 
diff --git a/docs/regressions-dl19-doc.md b/docs/regressions-dl19-doc.md
index cd8ef510a6..071445ed05 100644
--- a/docs/regressions-dl19-doc.md
+++ b/docs/regressions-dl19-doc.md
@@ -10,26 +10,31 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is treated as a unit of indexing
 + **Expansion Condition:** none
 
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
 
 The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl19-doc.yaml).
 Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl19-doc.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
+
 ## Indexing
 
 Typical indexing command:
 
 ```
 target/appassembler/bin/IndexCollection \
-  -collection CleanTrecCollection \
+  -collection JsonCollection \
   -input /path/to/msmarco-doc \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -generator DefaultLuceneDocumentGenerator \
-  -threads 1 -storePositions -storeDocvectors -storeRaw \
+  -threads 7 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-doc &
 ```
 
-The directory `/path/to/msmarco-doc/` should be a directory containing the official document collection (a single file), in TREC format.
+The directory `/path/to/msmarco-doc/` should be a directory containing the document corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
@@ -43,49 +48,49 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
   -output runs/run.msmarco-doc.bm25-default.topics.dl19-doc.txt \
   -bm25 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
   -output runs/run.msmarco-doc.bm25-default+rm3.topics.dl19-doc.txt \
   -bm25 -rm3 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
   -output runs/run.msmarco-doc.bm25-default+ax.topics.dl19-doc.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
   -output runs/run.msmarco-doc.bm25-default+prf.topics.dl19-doc.txt \
   -bm25 -bm25prf -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
   -output runs/run.msmarco-doc.bm25-tuned.topics.dl19-doc.txt \
   -bm25 -bm25.k1 3.44 -bm25.b 0.87 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
   -output runs/run.msmarco-doc.bm25-tuned+rm3.topics.dl19-doc.txt \
   -bm25 -bm25.k1 3.44 -bm25.b 0.87 -rm3 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
   -output runs/run.msmarco-doc.bm25-tuned+ax.topics.dl19-doc.txt \
   -bm25 -bm25.k1 3.44 -bm25.b 0.87 -axiom -axiom.deterministic -rerankCutoff 20 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-doc.txt -topicreader TsvInt \
   -output runs/run.msmarco-doc.bm25-tuned+prf.topics.dl19-doc.txt \
   -bm25 -bm25.k1 3.44 -bm25.b 0.87 -bm25prf -hits 100 &
@@ -117,17 +122,17 @@ With the above commands, you should be able to reproduce the following results:
 
 MAP                                     | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.2443    | 0.2772    | 0.2452    | 0.2541    | 0.2318    | 0.2700    | 0.2816    | 0.2758    |
+[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.2434    | 0.2774    | 0.2454    | 0.2541    | 0.2311    | 0.2684    | 0.2792    | 0.2774    |
 
 
 R@100                                   | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.3948    | 0.4189    | 0.3945    | 0.4004    | 0.3862    | 0.4193    | 0.4399    | 0.4287    |
+[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.3949    | 0.4189    | 0.3946    | 0.4003    | 0.3853    | 0.4186    | 0.4378    | 0.4295    |
 
 
 nDCG@10                                 | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.5190    | 0.5169    | 0.4730    | 0.5105    | 0.5140    | 0.5485    | 0.5245    | 0.5280    |
+[DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)| 0.5176    | 0.5170    | 0.4732    | 0.5107    | 0.5139    | 0.5445    | 0.5203    | 0.5294    |
 
 Explanation of settings:
 
diff --git a/docs/regressions-dl19-passage-docTTTTTquery.md b/docs/regressions-dl19-passage-docTTTTTquery.md
index fceebed0f0..98dd75084b 100644
--- a/docs/regressions-dl19-passage-docTTTTTquery.md
+++ b/docs/regressions-dl19-passage-docTTTTTquery.md
@@ -17,7 +17,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
   -input /path/to/msmarco-passage-docTTTTTquery \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 9 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-passage-docTTTTTquery &
@@ -38,37 +38,37 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-passage.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-default.topics.dl19-passage.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-passage.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-default+rm3.topics.dl19-passage.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-passage.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-tuned.topics.dl19-passage.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-passage.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-tuned+rm3.topics.dl19-passage.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-passage.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-tuned2.topics.dl19-passage.txt \
   -bm25 -bm25.k1 2.18 -bm25.b 0.86 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-passage.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-tuned2+rm3.topics.dl19-passage.txt \
   -bm25 -bm25.k1 2.18 -bm25.b 0.86 -rm3 &
diff --git a/docs/regressions-dl19-passage.md b/docs/regressions-dl19-passage.md
index 7589791834..aeb07f5201 100644
--- a/docs/regressions-dl19-passage.md
+++ b/docs/regressions-dl19-passage.md
@@ -16,7 +16,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
   -input /path/to/msmarco-passage \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 9 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-passage &
@@ -37,49 +37,49 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-passage.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-default.topics.dl19-passage.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-passage.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-default+rm3.topics.dl19-passage.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-passage.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-default+ax.topics.dl19-passage.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-passage.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-default+prf.topics.dl19-passage.txt \
   -bm25 -bm25prf &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-passage.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-tuned.topics.dl19-passage.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-passage.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-tuned+rm3.topics.dl19-passage.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-passage.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-tuned+ax.topics.dl19-passage.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl19-passage.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-tuned+prf.topics.dl19-passage.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 -bm25prf &
diff --git a/docs/regressions-dl20-doc-docTTTTTquery-per-doc.md b/docs/regressions-dl20-doc-docTTTTTquery.md
similarity index 71%
rename from docs/regressions-dl20-doc-docTTTTTquery-per-doc.md
rename to docs/regressions-dl20-doc-docTTTTTquery.md
index 121e5d9de9..2a131f3d29 100644
--- a/docs/regressions-dl20-doc-docTTTTTquery-per-doc.md
+++ b/docs/regressions-dl20-doc-docTTTTTquery.md
@@ -1,4 +1,4 @@
-# Anserini: Regressions for [DL20 (Doc)](https://trec.nist.gov/data/deep2020.html) w/ per-doc docTTTTTquery
+# Anserini: Regressions for [DL20 (Doc)](https://trec.nist.gov/data/deep2020.html) w/ docTTTTTquery
 
 This page describes experiments, integrated into Anserini's regression testing framework, for the TREC 2020 Deep Learning Track (Document Ranking Task) on the MS MARCO document collection using relevance judgments from NIST.
 
@@ -10,10 +10,14 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is treated as a unit of indexing
 + **Expansion Condition:** doc2query-T5
 
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
 
-The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl20-doc-docTTTTTquery-per-doc.yaml).
-Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl20-doc-docTTTTTquery-per-doc.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl20-doc-docTTTTTquery.yaml).
+Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl20-doc-docTTTTTquery.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
 
 ## Indexing
 
@@ -22,14 +26,15 @@ Typical indexing command:
 ```
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
-  -input /path/to/msmarco-doc-docTTTTTquery-per-doc \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc \
+  -input /path/to/msmarco-doc-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-doc-docTTTTTquery/ \
   -generator DefaultLuceneDocumentGenerator \
-  -threads 1 -storePositions -storeDocvectors -storeRaw \
-  >& logs/log.msmarco-doc-docTTTTTquery-per-doc &
+  -threads 7 -storePositions -storeDocvectors -storeRaw \
+  >& logs/log.msmarco-doc-docTTTTTquery &
 ```
 
-The directory `/path/to/msmarco-doc-docTTTTTquery-per-doc/` should be a directory containing the expanded document collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-docTTTTTquery/` should be a directory containing the expanded document corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
@@ -43,40 +48,40 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc \
+  -index indexes/lucene-index.msmarco-doc-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-default.topics.dl20.txt \
+  -output runs/run.msmarco-doc-docTTTTTquery.bm25-default.topics.dl20.txt \
   -bm25 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc \
+  -index indexes/lucene-index.msmarco-doc-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-default+rm3.topics.dl20.txt \
+  -output runs/run.msmarco-doc-docTTTTTquery.bm25-default+rm3.topics.dl20.txt \
   -bm25 -rm3 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc \
+  -index indexes/lucene-index.msmarco-doc-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-tuned.topics.dl20.txt \
+  -output runs/run.msmarco-doc-docTTTTTquery.bm25-tuned.topics.dl20.txt \
   -bm25 -bm25.k1 4.68 -bm25.b 0.87 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc \
+  -index indexes/lucene-index.msmarco-doc-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-tuned+rm3.topics.dl20.txt \
+  -output runs/run.msmarco-doc-docTTTTTquery.bm25-tuned+rm3.topics.dl20.txt \
   -bm25 -bm25.k1 4.68 -bm25.b 0.87 -rm3 -hits 100 &
 ```
 
 Evaluation can be performed using `trec_eval`:
 
 ```
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-default.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-docTTTTTquery.bm25-default.topics.dl20.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-default+rm3.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-docTTTTTquery.bm25-default+rm3.topics.dl20.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-tuned.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-docTTTTTquery.bm25-tuned.topics.dl20.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-tuned+rm3.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-docTTTTTquery.bm25-tuned+rm3.topics.dl20.txt
 ```
 
 ## Effectiveness
@@ -85,7 +90,7 @@ With the above commands, you should be able to reproduce the following results:
 
 MAP                                     | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|
-[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.4230    | 0.4228    | 0.4098    | 0.4104    |
+[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.4230    | 0.4229    | 0.4099    | 0.4104    |
 
 
 nDCG@10                                 | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      |
@@ -100,7 +105,7 @@ MRR                                     | BM25 (default)| +RM3      | BM25 (tune
 
 R@100                                   | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|
-[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.6412    | 0.6555    | 0.6178    | 0.6127    |
+[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.6414    | 0.6555    | 0.6178    | 0.6127    |
 
 Explanation of settings:
 
diff --git a/docs/regressions-dl20-doc-docTTTTTquery-per-passage.md b/docs/regressions-dl20-doc-segmented-docTTTTTquery.md
similarity index 66%
rename from docs/regressions-dl20-doc-docTTTTTquery-per-passage.md
rename to docs/regressions-dl20-doc-segmented-docTTTTTquery.md
index 7b64bc21e5..d59268c9c4 100644
--- a/docs/regressions-dl20-doc-docTTTTTquery-per-passage.md
+++ b/docs/regressions-dl20-doc-segmented-docTTTTTquery.md
@@ -1,4 +1,4 @@
-# Anserini: Regressions for [DL20 (Doc)](https://trec.nist.gov/data/deep2020.html) w/ per-passage docTTTTTquery
+# Anserini: Regressions for [DL20 (Doc)](https://trec.nist.gov/data/deep2020.html) Segmented w/ docTTTTTquery
 
 This page describes experiments, integrated into Anserini's regression testing framework, for the TREC 2020 Deep Learning Track (Document Ranking Task) on the MS MARCO document collection using relevance judgments from NIST.
 
@@ -10,11 +10,15 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
 + **Expansion Condition:** doc2query-T5
 
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
+In the passage (i.e., segment) indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
 
-The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl20-doc-docTTTTTquery-per-passage.yaml).
-Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl20-doc-docTTTTTquery-per-passage.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl20-doc-segmented-docTTTTTquery.yaml).
+Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl20-doc-segmented-docTTTTTquery.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
 
 ## Indexing
 
@@ -23,14 +27,15 @@ Typical indexing command:
 ```
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
-  -input /path/to/msmarco-doc-docTTTTTquery-per-passage \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage \
+  -input /path/to/msmarco-doc-segmented-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/ \
   -generator DefaultLuceneDocumentGenerator \
-  -threads 1 -storePositions -storeDocvectors -storeRaw \
-  >& logs/log.msmarco-doc-docTTTTTquery-per-passage &
+  -threads 16 -storePositions -storeDocvectors -storeRaw \
+  >& logs/log.msmarco-doc-segmented-docTTTTTquery &
 ```
 
-The directory `/path/to/msmarco-doc-docTTTTTquery-per-passage/` should be a directory containing the expanded document collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-segmented-docTTTTTquery/` should be a directory containing the expanded segmented corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
@@ -44,40 +49,40 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-default.topics.dl20.txt \
+  -output runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-default.topics.dl20.txt \
   -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-default+rm3.topics.dl20.txt \
+  -output runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-default+rm3.topics.dl20.txt \
   -bm25 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-tuned.topics.dl20.txt \
+  -output runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-tuned.topics.dl20.txt \
   -bm25 -bm25.k1 2.56 -bm25.b 0.59 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-tuned+rm3.topics.dl20.txt \
+  -output runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-tuned+rm3.topics.dl20.txt \
   -bm25 -bm25.k1 2.56 -bm25.b 0.59 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 ```
 
 Evaluation can be performed using `trec_eval`:
 
 ```
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-default.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-default.topics.dl20.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-default+rm3.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-default+rm3.topics.dl20.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-tuned.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-tuned.topics.dl20.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-tuned+rm3.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-tuned+rm3.topics.dl20.txt
 ```
 
 ## Effectiveness
@@ -86,12 +91,12 @@ With the above commands, you should be able to reproduce the following results:
 
 MAP                                     | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|
-[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.4150    | 0.4269    | 0.4042    | 0.4023    |
+[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.4150    | 0.4268    | 0.4047    | 0.4025    |
 
 
 nDCG@10                                 | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|
-[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.5957    | 0.5848    | 0.5931    | 0.5723    |
+[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.5957    | 0.5850    | 0.5943    | 0.5724    |
 
 
 MRR                                     | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      |
@@ -101,7 +106,7 @@ MRR                                     | BM25 (default)| +RM3      | BM25 (tune
 
 R@100                                   | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|
-[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.6201    | 0.6443    | 0.6192    | 0.6392    |
+[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.6201    | 0.6443    | 0.6195    | 0.6394    |
 
 Explanation of settings:
 
diff --git a/docs/regressions-dl20-doc-per-passage.md b/docs/regressions-dl20-doc-segmented.md
similarity index 66%
rename from docs/regressions-dl20-doc-per-passage.md
rename to docs/regressions-dl20-doc-segmented.md
index b587a3bf18..1ba926308f 100644
--- a/docs/regressions-dl20-doc-per-passage.md
+++ b/docs/regressions-dl20-doc-segmented.md
@@ -1,4 +1,4 @@
-# Anserini: Regressions for [DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)
+# Anserini: Regressions for [DL20 (Doc)](https://trec.nist.gov/data/deep2020.html) Segmented
 
 This page describes experiments, integrated into Anserini's regression testing framework, for the TREC 2020 Deep Learning Track (Document Ranking Task) on the MS MARCO document collection using relevance judgments from NIST.
 
@@ -10,11 +10,15 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
 + **Expansion Condition:** none
 
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
+In the passage (i.e., segment) indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
 
-The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl20-doc-per-passage.yaml).
-Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl20-doc-per-passage.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl20-doc-segmented.yaml).
+Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl20-doc-segmented.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
 
 ## Indexing
 
@@ -23,14 +27,15 @@ Typical indexing command:
 ```
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
-  -input /path/to/msmarco-doc-per-passage \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -input /path/to/msmarco-doc-segmented \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -generator DefaultLuceneDocumentGenerator \
-  -threads 1 -storePositions -storeDocvectors -storeRaw \
-  >& logs/log.msmarco-doc-per-passage &
+  -threads 16 -storePositions -storeDocvectors -storeRaw \
+  >& logs/log.msmarco-doc-segmented &
 ```
 
-The directory `/path/to/msmarco-doc-per-passage/` should be a directory containing the segmented paragraph collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-segmented/` should be a directory containing the segmented corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
@@ -44,72 +49,72 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-default.topics.dl20.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-default.topics.dl20.txt \
   -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-default+rm3.topics.dl20.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-default+rm3.topics.dl20.txt \
   -bm25 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-default+ax.topics.dl20.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-default+ax.topics.dl20.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-default+prf.topics.dl20.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-default+prf.topics.dl20.txt \
   -bm25 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-tuned.topics.dl20.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-tuned.topics.dl20.txt \
   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-tuned+rm3.topics.dl20.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-tuned+rm3.topics.dl20.txt \
   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-tuned+ax.topics.dl20.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-tuned+ax.topics.dl20.txt \
   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-tuned+prf.topics.dl20.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-tuned+prf.topics.dl20.txt \
   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100 &
 ```
 
 Evaluation can be performed using `trec_eval`:
 
 ```
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-per-passage.bm25-default.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-segmented.bm25-default.topics.dl20.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-per-passage.bm25-default+rm3.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-segmented.bm25-default+rm3.topics.dl20.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-per-passage.bm25-default+ax.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-segmented.bm25-default+ax.topics.dl20.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-per-passage.bm25-default+prf.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-segmented.bm25-default+prf.topics.dl20.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-per-passage.bm25-tuned.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-segmented.bm25-tuned.topics.dl20.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-per-passage.bm25-tuned+rm3.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-segmented.bm25-tuned+rm3.topics.dl20.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-per-passage.bm25-tuned+ax.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-segmented.bm25-tuned+ax.topics.dl20.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-per-passage.bm25-tuned+prf.topics.dl20.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m ndcg_cut.10 -c -m recip_rank -c -m recall.100 src/main/resources/topics-and-qrels/qrels.dl20-doc.txt runs/run.msmarco-doc-segmented.bm25-tuned+prf.topics.dl20.txt
 ```
 
 ## Effectiveness
@@ -118,22 +123,22 @@ With the above commands, you should be able to reproduce the following results:
 
 MAP                                     | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.3584    | 0.3769    | 0.3854    | 0.3672    | 0.3456    | 0.3471    | 0.3495    | 0.3629    |
+[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.3586    | 0.3774    | 0.3868    | 0.3686    | 0.3458    | 0.3472    | 0.3486    | 0.3627    |
 
 
 nDCG@10                                 | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.5271    | 0.5159    | 0.5250    | 0.5217    | 0.5213    | 0.4983    | 0.4942    | 0.5260    |
+[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.5281    | 0.5179    | 0.5227    | 0.5238    | 0.5213    | 0.4979    | 0.4948    | 0.5251    |
 
 
 MRR                                     | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.8479    | 0.8136    | 0.8123    | 0.7911    | 0.8684    | 0.7807    | 0.8102    | 0.8478    |
+[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.8479    | 0.8136    | 0.8028    | 0.7911    | 0.8684    | 0.7807    | 0.8019    | 0.8478    |
 
 
 R@100                                   | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.5823    | 0.6224    | 0.6332    | 0.5994    | 0.5715    | 0.6013    | 0.6086    | 0.6064    |
+[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.5823    | 0.6224    | 0.6362    | 0.6012    | 0.5723    | 0.6025    | 0.6114    | 0.6048    |
 
 Explanation of settings:
 
diff --git a/docs/regressions-dl20-doc.md b/docs/regressions-dl20-doc.md
index 9d44ade798..8bd43b3758 100644
--- a/docs/regressions-dl20-doc.md
+++ b/docs/regressions-dl20-doc.md
@@ -10,26 +10,31 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is treated as a unit of indexing
 + **Expansion Condition:** none
 
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
 
 The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/dl20-doc.yaml).
 Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/dl20-doc.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
+
 ## Indexing
 
 Typical indexing command:
 
 ```
 target/appassembler/bin/IndexCollection \
-  -collection CleanTrecCollection \
+  -collection JsonCollection \
   -input /path/to/msmacro-doc \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -generator DefaultLuceneDocumentGenerator \
-  -threads 1 -storePositions -storeDocvectors -storeRaw \
+  -threads 7 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmacro-doc &
 ```
 
-The directory `/path/to/msmarco-doc/` should be a directory containing the official document collection (a single file), in TREC format.
+The directory `/path/to/msmarco-doc/` should contain the document corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
@@ -43,37 +48,37 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmacro-doc.bm25-default.topics.dl20.txt \
   -bm25 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmacro-doc.bm25-default+rm3.topics.dl20.txt \
   -bm25 -rm3 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmacro-doc.bm25-tuned.topics.dl20.txt \
   -bm25 -bm25.k1 3.44 -bm25.b 0.87 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmacro-doc.bm25-tuned+rm3.topics.dl20.txt \
   -bm25 -bm25.k1 3.44 -bm25.b 0.87 -rm3 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmacro-doc.bm25-tuned2.topics.dl20.txt \
   -bm25 -bm25.k1 4.46 -bm25.b 0.82 -hits 100 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmacro-doc.bm25-tuned2+rm3.topics.dl20.txt \
   -bm25 -bm25.k1 4.46 -bm25.b 0.82 -rm3 -hits 100 &
@@ -101,22 +106,22 @@ With the above commands, you should be able to reproduce the following results:
 
 MAP                                     | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      | BM25 (tuned2)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
-[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.3791    | 0.4006    | 0.3630    | 0.3588    | 0.3583    | 0.3618    |
+[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.3793    | 0.4014    | 0.3631    | 0.3592    | 0.3581    | 0.3619    |
 
 
 nDCG@10                                 | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      | BM25 (tuned2)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
-[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.5271    | 0.5248    | 0.5087    | 0.5117    | 0.5078    | 0.5202    |
+[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.5286    | 0.5225    | 0.5070    | 0.5124    | 0.5061    | 0.5238    |
 
 
 MRR                                     | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      | BM25 (tuned2)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
-[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.8521    | 0.8541    | 0.8641    | 0.8188    | 0.8541    | 0.8458    |
+[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.8521    | 0.8541    | 0.8641    | 0.8186    | 0.8522    | 0.8582    |
 
 
 R@100                                   | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      | BM25 (tuned2)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
-[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.6110    | 0.6392    | 0.5926    | 0.5983    | 0.5860    | 0.5998    |
+[DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)| 0.6110    | 0.6414    | 0.5935    | 0.5977    | 0.5860    | 0.5995    |
 
 Explanation of settings:
 
diff --git a/docs/regressions-dl20-passage-docTTTTTquery.md b/docs/regressions-dl20-passage-docTTTTTquery.md
index 7bfe7b4232..fd1771189b 100644
--- a/docs/regressions-dl20-passage-docTTTTTquery.md
+++ b/docs/regressions-dl20-passage-docTTTTTquery.md
@@ -17,7 +17,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
   -input /path/to/msmarco-passage-docTTTTTquery \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 9 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-passage-docTTTTTquery &
@@ -38,37 +38,37 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-default.topics.dl20.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-default+rm3.topics.dl20.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-tuned.topics.dl20.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-tuned+rm3.topics.dl20.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-tuned2.topics.dl20.txt \
   -bm25 -bm25.k1 2.18 -bm25.b 0.86 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-tuned2+rm3.topics.dl20.txt \
   -bm25 -bm25.k1 2.18 -bm25.b 0.86 -rm3 &
diff --git a/docs/regressions-dl20-passage.md b/docs/regressions-dl20-passage.md
index 6e99131df4..d346ba9abe 100644
--- a/docs/regressions-dl20-passage.md
+++ b/docs/regressions-dl20-passage.md
@@ -16,7 +16,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
   -input /path/to/msmarco-passage \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 9 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-passage &
@@ -37,49 +37,49 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-default.topics.dl20.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-default+rm3.topics.dl20.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-default+ax.topics.dl20.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-default+prf.topics.dl20.txt \
   -bm25 -bm25prf &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-tuned.topics.dl20.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-tuned+rm3.topics.dl20.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-tuned+ax.topics.dl20.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl20.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-tuned+prf.topics.dl20.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 -bm25prf &
diff --git a/docs/regressions-dl21-doc-segmented-unicoil-noexp-0shot.md b/docs/regressions-dl21-doc-segmented-unicoil-noexp-0shot.md
index ff38bbb9f4..adf55fbd98 100644
--- a/docs/regressions-dl21-doc-segmented-unicoil-noexp-0shot.md
+++ b/docs/regressions-dl21-doc-segmented-unicoil-noexp-0shot.md
@@ -21,7 +21,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonVectorCollection \
   -input /path/to/msmarco-v2-doc-segmented-unicoil-noexp-0shot \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-noexp-0shot \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-noexp-0shot/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 18 -impact -pretokenized \
   >& logs/log.msmarco-v2-doc-segmented-unicoil-noexp-0shot &
@@ -41,7 +41,7 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-noexp-0shot \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-noexp-0shot/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.unicoil-noexp.0shot.tsv.gz -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc-segmented-unicoil-noexp-0shot.unicoil-noexp-0shot.topics.dl21.unicoil-noexp.0shot.tsv.gz \
   -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 -impact -pretokenized &
diff --git a/docs/regressions-dl21-doc-segmented.md b/docs/regressions-dl21-doc-segmented.md
index c1d8021aeb..ae1670692d 100644
--- a/docs/regressions-dl21-doc-segmented.md
+++ b/docs/regressions-dl21-doc-segmented.md
@@ -25,7 +25,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MsMarcoV2DocCollection \
   -input /path/to/msmarco-v2-doc-segmented \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 18 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-v2-doc-segmented &
@@ -46,25 +46,25 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc-segmented.bm25-default.topics.dl21.txt \
   -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc-segmented.bm25-default+rm3.topics.dl21.txt \
   -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc-segmented.bm25-default+ax.topics.dl21.txt \
   -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc-segmented.bm25-default+prf.topics.dl21.txt \
   -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 -bm25 -bm25prf &
diff --git a/docs/regressions-dl21-doc.md b/docs/regressions-dl21-doc.md
index 1fddb1bd49..33e15b71a1 100644
--- a/docs/regressions-dl21-doc.md
+++ b/docs/regressions-dl21-doc.md
@@ -25,7 +25,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MsMarcoV2DocCollection \
   -input /path/to/msmarco-v2-doc \
-  -index indexes/lucene-index.msmarco-v2-doc \
+  -index indexes/lucene-index.msmarco-v2-doc/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 18 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-v2-doc &
@@ -46,25 +46,25 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc \
+  -index indexes/lucene-index.msmarco-v2-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc.bm25-default.topics.dl21.txt \
   -hits 1000 -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc \
+  -index indexes/lucene-index.msmarco-v2-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc.bm25-default+rm3.topics.dl21.txt \
   -hits 1000 -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc \
+  -index indexes/lucene-index.msmarco-v2-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc.bm25-default+ax.topics.dl21.txt \
   -hits 1000 -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc \
+  -index indexes/lucene-index.msmarco-v2-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc.bm25-default+prf.topics.dl21.txt \
   -hits 1000 -bm25 -bm25prf &
diff --git a/docs/regressions-dl21-passage-augmented.md b/docs/regressions-dl21-passage-augmented.md
index baf66c709d..7f741f8b31 100644
--- a/docs/regressions-dl21-passage-augmented.md
+++ b/docs/regressions-dl21-passage-augmented.md
@@ -20,7 +20,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MsMarcoV2PassageCollection \
   -input /path/to/msmarco-v2-passage-augmented \
-  -index indexes/lucene-index.msmarco-v2-passage-augmented \
+  -index indexes/lucene-index.msmarco-v2-passage-augmented/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 18 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-v2-passage-augmented &
@@ -41,25 +41,25 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage-augmented \
+  -index indexes/lucene-index.msmarco-v2-passage-augmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage-augmented.bm25-default.topics.dl21.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage-augmented \
+  -index indexes/lucene-index.msmarco-v2-passage-augmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage-augmented.bm25-default+rm3.topics.dl21.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage-augmented \
+  -index indexes/lucene-index.msmarco-v2-passage-augmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage-augmented.bm25-default+ax.topics.dl21.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage-augmented \
+  -index indexes/lucene-index.msmarco-v2-passage-augmented/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage-augmented.bm25-default+prf.topics.dl21.txt \
   -bm25 -bm25prf &
diff --git a/docs/regressions-dl21-passage-unicoil-noexp-0shot.md b/docs/regressions-dl21-passage-unicoil-noexp-0shot.md
index ece55f7d87..72d270a653 100644
--- a/docs/regressions-dl21-passage-unicoil-noexp-0shot.md
+++ b/docs/regressions-dl21-passage-unicoil-noexp-0shot.md
@@ -21,7 +21,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonVectorCollection \
   -input /path/to/msmarco-v2-passage-unicoil-noexp-0shot \
-  -index indexes/lucene-index.msmarco-v2-passage-unicoil-noexp-0shot \
+  -index indexes/lucene-index.msmarco-v2-passage-unicoil-noexp-0shot/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 18 -impact -pretokenized \
   >& logs/log.msmarco-v2-passage-unicoil-noexp-0shot &
@@ -41,7 +41,7 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage-unicoil-noexp-0shot \
+  -index indexes/lucene-index.msmarco-v2-passage-unicoil-noexp-0shot/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.unicoil-noexp.0shot.tsv.gz -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage-unicoil-noexp-0shot.unicoil-noexp-0shot.topics.dl21.unicoil-noexp.0shot.tsv.gz \
   -impact -pretokenized &
diff --git a/docs/regressions-dl21-passage.md b/docs/regressions-dl21-passage.md
index fdcd1ca289..ec3bb511ac 100644
--- a/docs/regressions-dl21-passage.md
+++ b/docs/regressions-dl21-passage.md
@@ -20,7 +20,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MsMarcoV2PassageCollection \
   -input /path/to/msmarco-v2-passage \
-  -index indexes/lucene-index.msmarco-v2-passage \
+  -index indexes/lucene-index.msmarco-v2-passage/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 18 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-v2-passage &
@@ -41,25 +41,25 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage \
+  -index indexes/lucene-index.msmarco-v2-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage.bm25-default.topics.dl21.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage \
+  -index indexes/lucene-index.msmarco-v2-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage.bm25-default+rm3.topics.dl21.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage \
+  -index indexes/lucene-index.msmarco-v2-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage.bm25-default+ax.topics.dl21.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage \
+  -index indexes/lucene-index.msmarco-v2-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.dl21.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage.bm25-default+prf.topics.dl21.txt \
   -bm25 -bm25prf &
diff --git a/docs/regressions-fever.md b/docs/regressions-fever.md
index 0f1d341a1c..daaae55207 100644
--- a/docs/regressions-fever.md
+++ b/docs/regressions-fever.md
@@ -13,7 +13,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection FeverParagraphCollection \
   -input /path/to/fever \
-  -index indexes/lucene-index.fever-paragraph \
+  -index indexes/lucene-index.fever-paragraph/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.fever &
@@ -33,13 +33,13 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.fever-paragraph \
+  -index indexes/lucene-index.fever-paragraph/ \
   -topics src/main/resources/topics-and-qrels/topics.fever.dev.txt -topicreader TsvInt \
   -output runs/run.fever.bm25-default.topics.fever.dev.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.fever-paragraph \
+  -index indexes/lucene-index.fever-paragraph/ \
   -topics src/main/resources/topics-and-qrels/topics.fever.dev.txt -topicreader TsvInt \
   -output runs/run.fever.bm25-tuned.topics.fever.dev.txt \
   -bm25 -bm25.k1 0.9 -bm25.b 0.1 &
diff --git a/docs/regressions-fire12-bn.md b/docs/regressions-fire12-bn.md
index 635d26dad4..07c212fe98 100644
--- a/docs/regressions-fire12-bn.md
+++ b/docs/regressions-fire12-bn.md
@@ -14,7 +14,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection CleanTrecCollection \
   -input /path/to/fire12-bn \
-  -index indexes/lucene-index.fire12-bn \
+  -index indexes/lucene-index.fire12-bn/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 16 -storePositions -storeDocvectors -storeRaw -language bn \
   >& logs/log.fire12-bn &
@@ -36,7 +36,7 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.fire12-bn \
+  -index indexes/lucene-index.fire12-bn/ \
   -topics src/main/resources/topics-and-qrels/topics.fire12bn.176-225.txt -topicreader Trec \
   -output runs/run.fire12-bn.bm25.topics.fire12bn.176-225.txt \
   -bm25 -language bn &
diff --git a/docs/regressions-fire12-en.md b/docs/regressions-fire12-en.md
index 4224afa714..1ca9638db6 100644
--- a/docs/regressions-fire12-en.md
+++ b/docs/regressions-fire12-en.md
@@ -14,7 +14,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection CleanTrecCollection \
   -input /path/to/fire12-en \
-  -index indexes/lucene-index.fire12-en \
+  -index indexes/lucene-index.fire12-en/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 16 -storePositions -storeDocvectors -storeRaw -language en \
   >& logs/log.fire12-en &
@@ -36,7 +36,7 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.fire12-en \
+  -index indexes/lucene-index.fire12-en/ \
   -topics src/main/resources/topics-and-qrels/topics.fire12en.176-225.txt -topicreader Trec \
   -output runs/run.fire12-en.bm25.topics.fire12en.176-225.txt \
   -bm25 -language en &
diff --git a/docs/regressions-fire12-hi.md b/docs/regressions-fire12-hi.md
index a36557e2d7..8208dbc686 100644
--- a/docs/regressions-fire12-hi.md
+++ b/docs/regressions-fire12-hi.md
@@ -14,7 +14,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection CleanTrecCollection \
   -input /path/to/fire12-hi \
-  -index indexes/lucene-index.fire12-hi \
+  -index indexes/lucene-index.fire12-hi/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 16 -storePositions -storeDocvectors -storeRaw -language hi \
   >& logs/log.fire12-hi &
@@ -36,7 +36,7 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.fire12-hi \
+  -index indexes/lucene-index.fire12-hi/ \
   -topics src/main/resources/topics-and-qrels/topics.fire12hi.176-225.txt -topicreader Trec \
   -output runs/run.fire12-hi.bm25.topics.fire12hi.176-225.txt \
   -bm25 -language hi &
diff --git a/docs/regressions-gov2.md b/docs/regressions-gov2.md
index 980aa0a037..91bad5b258 100644
--- a/docs/regressions-gov2.md
+++ b/docs/regressions-gov2.md
@@ -12,7 +12,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection TrecwebCollection \
   -input /path/to/gov2 \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 44 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.gov2 &
@@ -37,97 +37,97 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte04.701-750.txt -topicreader Trec \
   -output runs/run.gov2.bm25.topics.terabyte04.701-750.txt \
   -bm25 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte05.751-800.txt -topicreader Trec \
   -output runs/run.gov2.bm25.topics.terabyte05.751-800.txt \
   -bm25 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte06.801-850.txt -topicreader Trec \
   -output runs/run.gov2.bm25.topics.terabyte06.801-850.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte04.701-750.txt -topicreader Trec \
   -output runs/run.gov2.bm25+rm3.topics.terabyte04.701-750.txt \
   -bm25 -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte05.751-800.txt -topicreader Trec \
   -output runs/run.gov2.bm25+rm3.topics.terabyte05.751-800.txt \
   -bm25 -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte06.801-850.txt -topicreader Trec \
   -output runs/run.gov2.bm25+rm3.topics.terabyte06.801-850.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte04.701-750.txt -topicreader Trec \
   -output runs/run.gov2.bm25+ax.topics.terabyte04.701-750.txt \
   -bm25 -axiom -axiom.beta 0.1 -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte05.751-800.txt -topicreader Trec \
   -output runs/run.gov2.bm25+ax.topics.terabyte05.751-800.txt \
   -bm25 -axiom -axiom.beta 0.1 -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte06.801-850.txt -topicreader Trec \
   -output runs/run.gov2.bm25+ax.topics.terabyte06.801-850.txt \
   -bm25 -axiom -axiom.beta 0.1 -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte04.701-750.txt -topicreader Trec \
   -output runs/run.gov2.ql.topics.terabyte04.701-750.txt \
   -qld &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte05.751-800.txt -topicreader Trec \
   -output runs/run.gov2.ql.topics.terabyte05.751-800.txt \
   -qld &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte06.801-850.txt -topicreader Trec \
   -output runs/run.gov2.ql.topics.terabyte06.801-850.txt \
   -qld &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte04.701-750.txt -topicreader Trec \
   -output runs/run.gov2.ql+rm3.topics.terabyte04.701-750.txt \
   -qld -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte05.751-800.txt -topicreader Trec \
   -output runs/run.gov2.ql+rm3.topics.terabyte05.751-800.txt \
   -qld -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte06.801-850.txt -topicreader Trec \
   -output runs/run.gov2.ql+rm3.topics.terabyte06.801-850.txt \
   -qld -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte04.701-750.txt -topicreader Trec \
   -output runs/run.gov2.ql+ax.topics.terabyte04.701-750.txt \
   -qld -axiom -axiom.beta 0.1 -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte05.751-800.txt -topicreader Trec \
   -output runs/run.gov2.ql+ax.topics.terabyte05.751-800.txt \
   -qld -axiom -axiom.beta 0.1 -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.gov2 \
+  -index indexes/lucene-index.gov2/ \
   -topics src/main/resources/topics-and-qrels/topics.terabyte06.801-850.txt -topicreader Trec \
   -output runs/run.gov2.ql+ax.topics.terabyte06.801-850.txt \
   -qld -axiom -axiom.beta 0.1 -axiom.deterministic -rerankCutoff 20 &
diff --git a/docs/regressions-mb11.md b/docs/regressions-mb11.md
index 812c8b23d6..41b5ece3ab 100644
--- a/docs/regressions-mb11.md
+++ b/docs/regressions-mb11.md
@@ -15,7 +15,7 @@ Indexing the Tweets2011 collection:
 target/appassembler/bin/IndexCollection \
   -collection TweetCollection \
   -input /path/to/mb11 \
-  -index indexes/lucene-index.mb11 \
+  -index indexes/lucene-index.mb11/ \
   -generator TweetGenerator \
   -threads 44 -storePositions -storeDocvectors -storeRaw -uniqueDocid -tweet.keepUrls -tweet.stemming \
   >& logs/log.mb11 &
@@ -43,67 +43,67 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb11 \
+  -index indexes/lucene-index.mb11/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2011.txt -topicreader Microblog \
   -output runs/run.mb11.bm25.topics.microblog2011.txt \
   -searchtweets -bm25 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb11 \
+  -index indexes/lucene-index.mb11/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2012.txt -topicreader Microblog \
   -output runs/run.mb11.bm25.topics.microblog2012.txt \
   -searchtweets -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb11 \
+  -index indexes/lucene-index.mb11/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2011.txt -topicreader Microblog \
   -output runs/run.mb11.bm25+rm3.topics.microblog2011.txt \
   -searchtweets -bm25 -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb11 \
+  -index indexes/lucene-index.mb11/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2012.txt -topicreader Microblog \
   -output runs/run.mb11.bm25+rm3.topics.microblog2012.txt \
   -searchtweets -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb11 \
+  -index indexes/lucene-index.mb11/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2011.txt -topicreader Microblog \
   -output runs/run.mb11.bm25+ax.topics.microblog2011.txt \
   -searchtweets -bm25 -axiom -axiom.beta 1.0 -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb11 \
+  -index indexes/lucene-index.mb11/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2012.txt -topicreader Microblog \
   -output runs/run.mb11.bm25+ax.topics.microblog2012.txt \
   -searchtweets -bm25 -axiom -axiom.beta 1.0 -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb11 \
+  -index indexes/lucene-index.mb11/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2011.txt -topicreader Microblog \
   -output runs/run.mb11.ql.topics.microblog2011.txt \
   -searchtweets -qld &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb11 \
+  -index indexes/lucene-index.mb11/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2012.txt -topicreader Microblog \
   -output runs/run.mb11.ql.topics.microblog2012.txt \
   -searchtweets -qld &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb11 \
+  -index indexes/lucene-index.mb11/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2011.txt -topicreader Microblog \
   -output runs/run.mb11.ql+rm3.topics.microblog2011.txt \
   -searchtweets -qld -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb11 \
+  -index indexes/lucene-index.mb11/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2012.txt -topicreader Microblog \
   -output runs/run.mb11.ql+rm3.topics.microblog2012.txt \
   -searchtweets -qld -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb11 \
+  -index indexes/lucene-index.mb11/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2011.txt -topicreader Microblog \
   -output runs/run.mb11.ql+ax.topics.microblog2011.txt \
   -searchtweets -qld -axiom -axiom.beta 1.0 -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb11 \
+  -index indexes/lucene-index.mb11/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2012.txt -topicreader Microblog \
   -output runs/run.mb11.ql+ax.topics.microblog2012.txt \
   -searchtweets -qld -axiom -axiom.beta 1.0 -axiom.deterministic -rerankCutoff 20 &
diff --git a/docs/regressions-mb13.md b/docs/regressions-mb13.md
index b3b5692d2a..0fff7a1fcb 100644
--- a/docs/regressions-mb13.md
+++ b/docs/regressions-mb13.md
@@ -15,7 +15,7 @@ Indexing the Tweets2013 collection:
 target/appassembler/bin/IndexCollection \
   -collection TweetCollection \
   -input /path/to/mb13 \
-  -index indexes/lucene-index.mb13 \
+  -index indexes/lucene-index.mb13/ \
   -generator TweetGenerator \
   -threads 44 -storePositions -storeDocvectors -storeRaw -uniqueDocid -optimize -tweet.keepUrls -tweet.stemming \
   >& logs/log.mb13 &
@@ -43,67 +43,67 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb13 \
+  -index indexes/lucene-index.mb13/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2013.txt -topicreader Microblog \
   -output runs/run.mb13.bm25.topics.microblog2013.txt \
   -searchtweets -bm25 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb13 \
+  -index indexes/lucene-index.mb13/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2014.txt -topicreader Microblog \
   -output runs/run.mb13.bm25.topics.microblog2014.txt \
   -searchtweets -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb13 \
+  -index indexes/lucene-index.mb13/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2013.txt -topicreader Microblog \
   -output runs/run.mb13.bm25+rm3.topics.microblog2013.txt \
   -searchtweets -bm25 -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb13 \
+  -index indexes/lucene-index.mb13/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2014.txt -topicreader Microblog \
   -output runs/run.mb13.bm25+rm3.topics.microblog2014.txt \
   -searchtweets -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb13 \
+  -index indexes/lucene-index.mb13/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2013.txt -topicreader Microblog \
   -output runs/run.mb13.bm25+ax.topics.microblog2013.txt \
   -searchtweets -bm25 -axiom -axiom.beta 1.0 -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb13 \
+  -index indexes/lucene-index.mb13/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2014.txt -topicreader Microblog \
   -output runs/run.mb13.bm25+ax.topics.microblog2014.txt \
   -searchtweets -bm25 -axiom -axiom.beta 1.0 -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb13 \
+  -index indexes/lucene-index.mb13/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2013.txt -topicreader Microblog \
   -output runs/run.mb13.ql.topics.microblog2013.txt \
   -searchtweets -qld &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb13 \
+  -index indexes/lucene-index.mb13/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2014.txt -topicreader Microblog \
   -output runs/run.mb13.ql.topics.microblog2014.txt \
   -searchtweets -qld &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb13 \
+  -index indexes/lucene-index.mb13/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2013.txt -topicreader Microblog \
   -output runs/run.mb13.ql+rm3.topics.microblog2013.txt \
   -searchtweets -qld -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb13 \
+  -index indexes/lucene-index.mb13/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2014.txt -topicreader Microblog \
   -output runs/run.mb13.ql+rm3.topics.microblog2014.txt \
   -searchtweets -qld -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb13 \
+  -index indexes/lucene-index.mb13/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2013.txt -topicreader Microblog \
   -output runs/run.mb13.ql+ax.topics.microblog2013.txt \
   -searchtweets -qld -axiom -axiom.beta 1.0 -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mb13 \
+  -index indexes/lucene-index.mb13/ \
   -topics src/main/resources/topics-and-qrels/topics.microblog2014.txt -topicreader Microblog \
   -output runs/run.mb13.ql+ax.topics.microblog2014.txt \
   -searchtweets -qld -axiom -axiom.beta 1.0 -axiom.deterministic -rerankCutoff 20 &
diff --git a/docs/regressions-mrtydi-v1.1-ar.md b/docs/regressions-mrtydi-v1.1-ar.md
index 8c85707539..e9a98b1db3 100644
--- a/docs/regressions-mrtydi-v1.1-ar.md
+++ b/docs/regressions-mrtydi-v1.1-ar.md
@@ -13,7 +13,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MrTyDiCollection \
   -input /path/to/mrtydi-v1.1-ar \
-  -index indexes/lucene-index.mrtydi-v1.1-arabic \
+  -index indexes/lucene-index.mrtydi-v1.1-arabic/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw -language ar \
   >& logs/log.mrtydi-v1.1-ar &
@@ -28,17 +28,17 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-arabic \
+  -index indexes/lucene-index.mrtydi-v1.1-arabic/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-ar.train.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-ar.bm25.topics.mrtydi-v1.1-ar.train.txt.gz \
   -bm25 -hits 100 -language ar &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-arabic \
+  -index indexes/lucene-index.mrtydi-v1.1-arabic/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-ar.dev.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-ar.bm25.topics.mrtydi-v1.1-ar.dev.txt.gz \
   -bm25 -hits 100 -language ar &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-arabic \
+  -index indexes/lucene-index.mrtydi-v1.1-arabic/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-ar.test.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-ar.bm25.topics.mrtydi-v1.1-ar.test.txt.gz \
   -bm25 -hits 100 -language ar &
diff --git a/docs/regressions-mrtydi-v1.1-bn.md b/docs/regressions-mrtydi-v1.1-bn.md
index 45354cf198..f27a1dcadd 100644
--- a/docs/regressions-mrtydi-v1.1-bn.md
+++ b/docs/regressions-mrtydi-v1.1-bn.md
@@ -13,7 +13,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MrTyDiCollection \
   -input /path/to/mrtydi-v1.1-bn \
-  -index indexes/lucene-index.mrtydi-v1.1-bengali \
+  -index indexes/lucene-index.mrtydi-v1.1-bengali/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw -language bn \
   >& logs/log.mrtydi-v1.1-bn &
@@ -28,17 +28,17 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-bengali \
+  -index indexes/lucene-index.mrtydi-v1.1-bengali/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-bn.train.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-bn.bm25.topics.mrtydi-v1.1-bn.train.txt.gz \
   -bm25 -hits 100 -language bn &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-bengali \
+  -index indexes/lucene-index.mrtydi-v1.1-bengali/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-bn.dev.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-bn.bm25.topics.mrtydi-v1.1-bn.dev.txt.gz \
   -bm25 -hits 100 -language bn &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-bengali \
+  -index indexes/lucene-index.mrtydi-v1.1-bengali/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-bn.test.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-bn.bm25.topics.mrtydi-v1.1-bn.test.txt.gz \
   -bm25 -hits 100 -language bn &
diff --git a/docs/regressions-mrtydi-v1.1-en.md b/docs/regressions-mrtydi-v1.1-en.md
index 0cf3e41f32..65dfc491c9 100644
--- a/docs/regressions-mrtydi-v1.1-en.md
+++ b/docs/regressions-mrtydi-v1.1-en.md
@@ -13,7 +13,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MrTyDiCollection \
   -input /path/to/mrtydi-v1.1-en \
-  -index indexes/lucene-index.mrtydi-v1.1-english \
+  -index indexes/lucene-index.mrtydi-v1.1-english/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw -language en \
   >& logs/log.mrtydi-v1.1-en &
@@ -28,17 +28,17 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-english \
+  -index indexes/lucene-index.mrtydi-v1.1-english/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-en.train.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-en.bm25.topics.mrtydi-v1.1-en.train.txt.gz \
   -bm25 -hits 100 -language en &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-english \
+  -index indexes/lucene-index.mrtydi-v1.1-english/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-en.dev.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-en.bm25.topics.mrtydi-v1.1-en.dev.txt.gz \
   -bm25 -hits 100 -language en &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-english \
+  -index indexes/lucene-index.mrtydi-v1.1-english/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-en.test.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-en.bm25.topics.mrtydi-v1.1-en.test.txt.gz \
   -bm25 -hits 100 -language en &
diff --git a/docs/regressions-mrtydi-v1.1-fi.md b/docs/regressions-mrtydi-v1.1-fi.md
index df7c366795..c6a666b478 100644
--- a/docs/regressions-mrtydi-v1.1-fi.md
+++ b/docs/regressions-mrtydi-v1.1-fi.md
@@ -13,7 +13,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MrTyDiCollection \
   -input /path/to/mrtydi-v1.1-fi \
-  -index indexes/lucene-index.mrtydi-v1.1-finnish \
+  -index indexes/lucene-index.mrtydi-v1.1-finnish/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw -language fi \
   >& logs/log.mrtydi-v1.1-fi &
@@ -28,17 +28,17 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-finnish \
+  -index indexes/lucene-index.mrtydi-v1.1-finnish/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-fi.train.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-fi.bm25.topics.mrtydi-v1.1-fi.train.txt.gz \
   -bm25 -hits 100 -language fi &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-finnish \
+  -index indexes/lucene-index.mrtydi-v1.1-finnish/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-fi.dev.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-fi.bm25.topics.mrtydi-v1.1-fi.dev.txt.gz \
   -bm25 -hits 100 -language fi &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-finnish \
+  -index indexes/lucene-index.mrtydi-v1.1-finnish/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-fi.test.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-fi.bm25.topics.mrtydi-v1.1-fi.test.txt.gz \
   -bm25 -hits 100 -language fi &
diff --git a/docs/regressions-mrtydi-v1.1-id.md b/docs/regressions-mrtydi-v1.1-id.md
index 5f4a73881b..fe9ad21dca 100644
--- a/docs/regressions-mrtydi-v1.1-id.md
+++ b/docs/regressions-mrtydi-v1.1-id.md
@@ -13,7 +13,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MrTyDiCollection \
   -input /path/to/mrtydi-v1.1-id \
-  -index indexes/lucene-index.mrtydi-v1.1-indonesian \
+  -index indexes/lucene-index.mrtydi-v1.1-indonesian/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw -language id \
   >& logs/log.mrtydi-v1.1-id &
@@ -28,17 +28,17 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-indonesian \
+  -index indexes/lucene-index.mrtydi-v1.1-indonesian/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-id.train.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-id.bm25.topics.mrtydi-v1.1-id.train.txt.gz \
   -bm25 -hits 100 -language id &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-indonesian \
+  -index indexes/lucene-index.mrtydi-v1.1-indonesian/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-id.dev.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-id.bm25.topics.mrtydi-v1.1-id.dev.txt.gz \
   -bm25 -hits 100 -language id &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-indonesian \
+  -index indexes/lucene-index.mrtydi-v1.1-indonesian/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-id.test.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-id.bm25.topics.mrtydi-v1.1-id.test.txt.gz \
   -bm25 -hits 100 -language id &
diff --git a/docs/regressions-mrtydi-v1.1-ja.md b/docs/regressions-mrtydi-v1.1-ja.md
index 2fc54626df..74a88188f6 100644
--- a/docs/regressions-mrtydi-v1.1-ja.md
+++ b/docs/regressions-mrtydi-v1.1-ja.md
@@ -13,7 +13,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MrTyDiCollection \
   -input /path/to/mrtydi-v1.1-ja \
-  -index indexes/lucene-index.mrtydi-v1.1-japanese \
+  -index indexes/lucene-index.mrtydi-v1.1-japanese/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw -language ja \
   >& logs/log.mrtydi-v1.1-ja &
@@ -28,17 +28,17 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-japanese \
+  -index indexes/lucene-index.mrtydi-v1.1-japanese/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-ja.train.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-ja.bm25.topics.mrtydi-v1.1-ja.train.txt.gz \
   -bm25 -hits 100 -language ja &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-japanese \
+  -index indexes/lucene-index.mrtydi-v1.1-japanese/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-ja.dev.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-ja.bm25.topics.mrtydi-v1.1-ja.dev.txt.gz \
   -bm25 -hits 100 -language ja &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-japanese \
+  -index indexes/lucene-index.mrtydi-v1.1-japanese/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-ja.test.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-ja.bm25.topics.mrtydi-v1.1-ja.test.txt.gz \
   -bm25 -hits 100 -language ja &
diff --git a/docs/regressions-mrtydi-v1.1-ko.md b/docs/regressions-mrtydi-v1.1-ko.md
index 086b9f7013..e751484625 100644
--- a/docs/regressions-mrtydi-v1.1-ko.md
+++ b/docs/regressions-mrtydi-v1.1-ko.md
@@ -13,7 +13,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MrTyDiCollection \
   -input /path/to/mrtydi-v1.1-ko \
-  -index indexes/lucene-index.mrtydi-v1.1-korean \
+  -index indexes/lucene-index.mrtydi-v1.1-korean/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw -language ko \
   >& logs/log.mrtydi-v1.1-ko &
@@ -28,17 +28,17 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-korean \
+  -index indexes/lucene-index.mrtydi-v1.1-korean/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-ko.train.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-ko.bm25.topics.mrtydi-v1.1-ko.train.txt.gz \
   -bm25 -hits 100 -language ko &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-korean \
+  -index indexes/lucene-index.mrtydi-v1.1-korean/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-ko.dev.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-ko.bm25.topics.mrtydi-v1.1-ko.dev.txt.gz \
   -bm25 -hits 100 -language ko &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-korean \
+  -index indexes/lucene-index.mrtydi-v1.1-korean/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-ko.test.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-ko.bm25.topics.mrtydi-v1.1-ko.test.txt.gz \
   -bm25 -hits 100 -language ko &
diff --git a/docs/regressions-mrtydi-v1.1-ru.md b/docs/regressions-mrtydi-v1.1-ru.md
index 3a3fa9ba64..674d6955d9 100644
--- a/docs/regressions-mrtydi-v1.1-ru.md
+++ b/docs/regressions-mrtydi-v1.1-ru.md
@@ -13,7 +13,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MrTyDiCollection \
   -input /path/to/mrtydi-v1.1-ru \
-  -index indexes/lucene-index.mrtydi-v1.1-russian \
+  -index indexes/lucene-index.mrtydi-v1.1-russian/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw -language ru \
   >& logs/log.mrtydi-v1.1-ru &
@@ -28,17 +28,17 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-russian \
+  -index indexes/lucene-index.mrtydi-v1.1-russian/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-ru.train.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-ru.bm25.topics.mrtydi-v1.1-ru.train.txt.gz \
   -bm25 -hits 100 -language ru &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-russian \
+  -index indexes/lucene-index.mrtydi-v1.1-russian/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-ru.dev.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-ru.bm25.topics.mrtydi-v1.1-ru.dev.txt.gz \
   -bm25 -hits 100 -language ru &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-russian \
+  -index indexes/lucene-index.mrtydi-v1.1-russian/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-ru.test.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-ru.bm25.topics.mrtydi-v1.1-ru.test.txt.gz \
   -bm25 -hits 100 -language ru &
diff --git a/docs/regressions-mrtydi-v1.1-sw.md b/docs/regressions-mrtydi-v1.1-sw.md
index a32c9a331d..7c11768aff 100644
--- a/docs/regressions-mrtydi-v1.1-sw.md
+++ b/docs/regressions-mrtydi-v1.1-sw.md
@@ -13,7 +13,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MrTyDiCollection \
   -input /path/to/mrtydi-v1.1-sw \
-  -index indexes/lucene-index.mrtydi-v1.1-swahili \
+  -index indexes/lucene-index.mrtydi-v1.1-swahili/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw -pretokenized \
   >& logs/log.mrtydi-v1.1-sw &
@@ -28,17 +28,17 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-swahili \
+  -index indexes/lucene-index.mrtydi-v1.1-swahili/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-sw.train.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-sw.bm25.topics.mrtydi-v1.1-sw.train.txt.gz \
   -bm25 -hits 100 -pretokenized &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-swahili \
+  -index indexes/lucene-index.mrtydi-v1.1-swahili/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-sw.dev.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-sw.bm25.topics.mrtydi-v1.1-sw.dev.txt.gz \
   -bm25 -hits 100 -pretokenized &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-swahili \
+  -index indexes/lucene-index.mrtydi-v1.1-swahili/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-sw.test.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-sw.bm25.topics.mrtydi-v1.1-sw.test.txt.gz \
   -bm25 -hits 100 -pretokenized &
diff --git a/docs/regressions-mrtydi-v1.1-te.md b/docs/regressions-mrtydi-v1.1-te.md
index 4cce65e874..bfce2b842b 100644
--- a/docs/regressions-mrtydi-v1.1-te.md
+++ b/docs/regressions-mrtydi-v1.1-te.md
@@ -13,7 +13,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MrTyDiCollection \
   -input /path/to/mrtydi-v1.1-te \
-  -index indexes/lucene-index.mrtydi-v1.1-telugu \
+  -index indexes/lucene-index.mrtydi-v1.1-telugu/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw -pretokenized \
   >& logs/log.mrtydi-v1.1-te &
@@ -28,17 +28,17 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-telugu \
+  -index indexes/lucene-index.mrtydi-v1.1-telugu/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-te.train.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-te.bm25.topics.mrtydi-v1.1-te.train.txt.gz \
   -bm25 -hits 100 -pretokenized &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-telugu \
+  -index indexes/lucene-index.mrtydi-v1.1-telugu/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-te.dev.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-te.bm25.topics.mrtydi-v1.1-te.dev.txt.gz \
   -bm25 -hits 100 -pretokenized &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-telugu \
+  -index indexes/lucene-index.mrtydi-v1.1-telugu/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-te.test.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-te.bm25.topics.mrtydi-v1.1-te.test.txt.gz \
   -bm25 -hits 100 -pretokenized &
diff --git a/docs/regressions-mrtydi-v1.1-th.md b/docs/regressions-mrtydi-v1.1-th.md
index f37c2d0a69..a9b32dc761 100644
--- a/docs/regressions-mrtydi-v1.1-th.md
+++ b/docs/regressions-mrtydi-v1.1-th.md
@@ -13,7 +13,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MrTyDiCollection \
   -input /path/to/mrtydi-v1.1-th \
-  -index indexes/lucene-index.mrtydi-v1.1-thai \
+  -index indexes/lucene-index.mrtydi-v1.1-thai/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 1 -storePositions -storeDocvectors -storeRaw -language th \
   >& logs/log.mrtydi-v1.1-th &
@@ -28,17 +28,17 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-thai \
+  -index indexes/lucene-index.mrtydi-v1.1-thai/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-th.train.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-th.bm25.topics.mrtydi-v1.1-th.train.txt.gz \
   -bm25 -hits 100 -language th &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-thai \
+  -index indexes/lucene-index.mrtydi-v1.1-thai/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-th.dev.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-th.bm25.topics.mrtydi-v1.1-th.dev.txt.gz \
   -bm25 -hits 100 -language th &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.mrtydi-v1.1-thai \
+  -index indexes/lucene-index.mrtydi-v1.1-thai/ \
   -topics src/main/resources/topics-and-qrels/topics.mrtydi-v1.1-th.test.txt.gz -topicreader TsvInt \
   -output runs/run.mrtydi-v1.1-th.bm25.topics.mrtydi-v1.1-th.test.txt.gz \
   -bm25 -hits 100 -language th &
diff --git a/docs/regressions-msmarco-doc-docTTTTTquery-per-passage-v3.md b/docs/regressions-msmarco-doc-docTTTTTquery-per-passage-v3.md
deleted file mode 100644
index 02db374d6e..0000000000
--- a/docs/regressions-msmarco-doc-docTTTTTquery-per-passage-v3.md
+++ /dev/null
@@ -1,136 +0,0 @@
-# Anserini: Regressions for MS MARCO Document Ranking
-
-This page documents regression experiments for the [MS MARCO document ranking task](https://github.com/microsoft/MSMARCO-Document-Ranking), which is integrated into Anserini's regression testing framework.
-Note that there are four different regression conditions for this task, and this page describes the following:
-
-+ **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
-+ **Expansion Condition:** doc2query-T5
-
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
-
-**NOTE**: This is the `msmarco-doc-docTTTTTquery-per-passage-v3` variant (there's also `msmarco-doc-docTTTTTquery-per-passage`), see [this page](experiments-msmarco-doc-doc2query-details.md) for detailed notes about differences between these variants.
-
-The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/msmarco-doc-docTTTTTquery-per-passage-v3.yaml).
-Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery-per-passage-v3.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
-
-## Indexing
-
-Typical indexing command:
-
-```
-target/appassembler/bin/IndexCollection \
-  -collection JsonCollection \
-  -input /path/to/msmarco-doc-docTTTTTquery-per-passage-v3 \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage-v3 \
-  -generator DefaultLuceneDocumentGenerator \
-  -threads 16 -storePositions -storeDocvectors -storeRaw \
-  >& logs/log.msmarco-doc-docTTTTTquery-per-passage-v3 &
-```
-
-The directory `/path/to/msmarco-doc-docTTTTTquery-per-passage/` should be a directory containing the expanded document collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
-
-For additional details, see explanation of [common indexing options](common-indexing-options.md).
-
-## Retrieval
-
-Topics and qrels are stored in [`src/main/resources/topics-and-qrels/`](../src/main/resources/topics-and-qrels/).
-The regression experiments here evaluate on the 5193 dev set questions.
-
-After indexing has completed, you should be able to perform retrieval as follows:
-
-```
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage-v3 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-passage-v3.bm25-default.topics.msmarco-doc.dev.txt \
-  -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage-v3 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-passage-v3.bm25-tuned.topics.msmarco-doc.dev.txt \
-  -bm25 -bm25.k1 2.56 -bm25.b 0.59 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-```
-
-Evaluation can be performed using `trec_eval`:
-
-```
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-docTTTTTquery-per-passage-v3.bm25-default.topics.msmarco-doc.dev.txt
-
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-docTTTTTquery-per-passage-v3.bm25-tuned.topics.msmarco-doc.dev.txt
-```
-
-## Effectiveness
-
-With the above commands, you should be able to reproduce the following results:
-
-MAP                                     | BM25 (default)| BM25 (tuned)|
-:---------------------------------------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.3184    | 0.3213    |
-
-
-R@100                                   | BM25 (default)| BM25 (tuned)|
-:---------------------------------------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.8479    | 0.8627    |
-
-
-R@1000                                  | BM25 (default)| BM25 (tuned)|
-:---------------------------------------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.9490    | 0.9530    |
-
-Explanation of settings:
-
-+ The setting "default" refers the default BM25 settings of `k1=0.9`, `b=0.4`.
-+ The setting "tuned" refers to `k1=2.56`, `b=0.59`, tuned to optimize for recall@100 (i.e., for first-stage retrieval) on 2019/12.
-
-In these runs, we are retrieving the top 1000 hits for each query and using `trec_eval` to evaluate all 1000 hits.
-Since we're in the passage condition, we fetch the 10000 passages and select the top 1000 documents using MaxP.
-This lets us measure R@100 and R@1000; the latter is particularly important when these runs are used as first-stage retrieval.
-Beware, an official MS MARCO document ranking task leaderboard submission comprises only 100 hits per query.
-See [this page](experiments-msmarco-doc-leaderboard.md) for details on Anserini baseline runs that were submitted to the official leaderboard.
-
-The MaxP passage retrieval functionality is available in `SearchCollection`.
-To generate an MS MARCO submission with the BM25 default parameters, corresponding to "BM25 (default)" above:
-
-```bash
-$ target/appassembler/bin/SearchCollection -topicreader TsvString \
-   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt \
-   -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage-v3.pos+docvectors+raw \
-   -output runs/run.msmarco-doc-docTTTTTquery-per-passage-v3.bm25-default.txt -format msmarco \
-   -bm25 -bm25.k1 0.9 -bm25.b 0.4 -hits 1000 \
-   -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
-
-$ python tools/scripts/msmarco/msmarco_doc_eval.py \
-   --judgments src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt \
-   --run runs/run.msmarco-doc-docTTTTTquery-per-passage-v3.bm25-default.txt
-
-#####################
-MRR @100: 0.317905445196054
-QueriesRanked: 5193
-#####################
-```
-
-Note that the above command uses `-format msmarco` to directly generate a run in the MS MARCO output format.
-
-To generate an MS MARCO submission with the BM25 tuned parameters, corresponding to "BM25 (tuned)" above:
-
-```bash
-$ target/appassembler/bin/SearchCollection -topicreader TsvString \
-   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt \
-   -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage-v3.pos+docvectors+raw \
-   -output runs/run.msmarco-doc-docTTTTTquery-per-passage-v3.bm25-tuned.txt -format msmarco \
-   -bm25 -bm25.k1 2.56 -bm25.b 0.59 -hits 1000 \
-   -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
-
-$ python tools/scripts/msmarco/msmarco_doc_eval.py \
-   --judgments src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt \
-   --run runs/run.msmarco-doc-docTTTTTquery-per-passage-v3.bm25-tuned.txt
-
-#####################
-MRR @100: 0.3209184381409182
-QueriesRanked: 5193
-#####################
-```
-
-Again, note that the above command uses `-format msmarco` to directly generate a run in the MS MARCO output format.
\ No newline at end of file
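
Note on the page removed above: its MaxP step (score each document by its highest-scoring passage, then keep the top documents) can be sketched as below. This is a minimal illustration, not Anserini's `SearchCollection` implementation. The function name `max_passage` and the passage id layout `<docid>#<segment>` are assumptions for the example; only the `#` delimiter and the 10000-passages-to-1000-documents cutoffs come from the commands in the page.

```python
def max_passage(hits, delimiter="#", k=1000):
    """Collapse ranked passage hits into a document ranking (MaxP).

    hits: iterable of (passage_id, score) pairs. The id layout
    '<docid><delimiter><segment>' is an assumption made for this sketch.
    """
    best = {}
    for passage_id, score in hits:
        # Everything before the first delimiter is treated as the document id.
        docid = passage_id.split(delimiter, 1)[0]
        # Keep only the highest passage score seen for each document.
        if docid not in best or score > best[docid]:
            best[docid] = score
    # Rank documents by their best passage score and keep the top k.
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical passage hits for one query.
hits = [("D1#0", 7.2), ("D2#3", 6.9), ("D1#4", 9.1), ("D3#1", 5.0), ("D2#0", 8.4)]
print(max_passage(hits, k=2))
```

Retrieving more passages than the number of documents wanted (10000 versus 1000 in the regression commands) matters because several top passages may collapse into the same document.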
diff --git a/docs/regressions-msmarco-doc-docTTTTTquery-per-doc.md b/docs/regressions-msmarco-doc-docTTTTTquery.md
similarity index 76%
rename from docs/regressions-msmarco-doc-docTTTTTquery-per-doc.md
rename to docs/regressions-msmarco-doc-docTTTTTquery.md
index 102ce83759..029149c8b8 100644
--- a/docs/regressions-msmarco-doc-docTTTTTquery-per-doc.md
+++ b/docs/regressions-msmarco-doc-docTTTTTquery.md
@@ -6,10 +6,14 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is treated as a unit of indexing
 + **Expansion Condition:** doc2query-T5
 
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
 
-The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/msmarco-doc-docTTTTTquery-per-doc.yaml).
-Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery-per-doc.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/msmarco-doc-docTTTTTquery.yaml).
+Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
 
 ## Indexing
 
@@ -18,14 +22,15 @@ Typical indexing command:
 ```
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
-  -input /path/to/msmarco-doc-docTTTTTquery-per-doc \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc \
+  -input /path/to/msmarco-doc-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-doc-docTTTTTquery/ \
   -generator DefaultLuceneDocumentGenerator \
-  -threads 1 -storePositions -storeDocvectors -storeRaw \
-  >& logs/log.msmarco-doc-docTTTTTquery-per-doc &
+  -threads 7 -storePositions -storeDocvectors -storeRaw \
+  >& logs/log.msmarco-doc-docTTTTTquery &
 ```
 
-The directory `/path/to/msmarco-doc-docTTTTTquery-per-doc/` should be a directory containing the expanded document collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-docTTTTTquery/` should be a directory containing the expanded document corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
@@ -38,24 +43,24 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc \
+  -index indexes/lucene-index.msmarco-doc-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-default.topics.msmarco-doc.dev.txt \
+  -output runs/run.msmarco-doc-docTTTTTquery.bm25-default.topics.msmarco-doc.dev.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc \
+  -index indexes/lucene-index.msmarco-doc-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-tuned.topics.msmarco-doc.dev.txt \
+  -output runs/run.msmarco-doc-docTTTTTquery.bm25-tuned.topics.msmarco-doc.dev.txt \
   -bm25 -bm25.k1 4.68 -bm25.b 0.87 &
 ```
 
 Evaluation can be performed using `trec_eval`:
 
 ```
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-default.topics.msmarco-doc.dev.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-docTTTTTquery.bm25-default.topics.msmarco-doc.dev.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-docTTTTTquery-per-doc.bm25-tuned.topics.msmarco-doc.dev.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-docTTTTTquery.bm25-tuned.topics.msmarco-doc.dev.txt
 ```
 
 ## Effectiveness
@@ -64,12 +69,12 @@ With the above commands, you should be able to reproduce the following results:
 
 MAP                                     | BM25 (default)| BM25 (tuned)|
 :---------------------------------------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.2886    | 0.3270    |
+[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.2886    | 0.3273    |
 
 
 R@100                                   | BM25 (default)| BM25 (tuned)|
 :---------------------------------------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.7990    | 0.8608    |
+[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.7993    | 0.8612    |
 
 
 R@1000                                  | BM25 (default)| BM25 (tuned)|
diff --git a/docs/regressions-msmarco-doc-per-passage-v2.md b/docs/regressions-msmarco-doc-per-passage-v2.md
deleted file mode 100644
index a798f521cd..0000000000
--- a/docs/regressions-msmarco-doc-per-passage-v2.md
+++ /dev/null
@@ -1,184 +0,0 @@
-# Anserini: Regressions for MS MARCO Document Ranking
-
-This page documents regression experiments for the [MS MARCO document ranking task](https://github.com/microsoft/MSMARCO-Document-Ranking), which is integrated into Anserini's regression testing framework.
-Note that there are four different regression conditions for this task, and this page describes the following:
-
-+ **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
-+ **Expansion Condition:** none
-
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
-
-**NOTE**: This is the `msmarco-doc-per-passage-v2` variant (there's also `msmarco-doc-per-passage` and `msmarco-doc-per-passage-v3`), see [this page](experiments-msmarco-doc-doc2query-details.md) for detailed notes about differences between these variants.
-
-The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/msmarco-doc-per-passage-v2.yaml).
-Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/msmarco-doc-per-passage-v2.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
-
-## Indexing
-
-Typical indexing command:
-
-```
-target/appassembler/bin/IndexCollection \
-  -collection JsonCollection \
-  -input /path/to/msmarco-doc-per-passage-v2 \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v2 \
-  -generator DefaultLuceneDocumentGenerator \
-  -threads 16 -storePositions -storeDocvectors -storeRaw \
-  >& logs/log.msmarco-doc-per-passage-v2 &
-```
-
-The directory `/path/to/msmarco-doc-per-passage/` should be a directory containing the segmented paragraph collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
-
-For additional details, see explanation of [common indexing options](common-indexing-options.md).
-
-## Retrieval
-
-Topics and qrels are stored in [`src/main/resources/topics-and-qrels/`](../src/main/resources/topics-and-qrels/).
-The regression experiments here evaluate on the 5193 dev set questions.
-
-After indexing has completed, you should be able to perform retrieval as follows:
-
-```
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v2 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v2.bm25-default.topics.msmarco-doc.dev.txt \
-  -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v2 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v2.bm25-default+rm3.topics.msmarco-doc.dev.txt \
-  -bm25 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v2 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v2.bm25-default+ax.topics.msmarco-doc.dev.txt \
-  -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v2 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v2.bm25-default+prf.topics.msmarco-doc.dev.txt \
-  -bm25 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v2 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v2.bm25-tuned.topics.msmarco-doc.dev.txt \
-  -bm25 -bm25.k1 2.16 -bm25.b 0.61 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v2 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v2.bm25-tuned+rm3.topics.msmarco-doc.dev.txt \
-  -bm25 -bm25.k1 2.16 -bm25.b 0.61 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v2 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v2.bm25-tuned+ax.topics.msmarco-doc.dev.txt \
-  -bm25 -bm25.k1 2.16 -bm25.b 0.61 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v2 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v2.bm25-tuned+prf.topics.msmarco-doc.dev.txt \
-  -bm25 -bm25.k1 2.16 -bm25.b 0.61 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-```
-
-Evaluation can be performed using `trec_eval`:
-
-```
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v2.bm25-default.topics.msmarco-doc.dev.txt
-
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v2.bm25-default+rm3.topics.msmarco-doc.dev.txt
-
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v2.bm25-default+ax.topics.msmarco-doc.dev.txt
-
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v2.bm25-default+prf.topics.msmarco-doc.dev.txt
-
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v2.bm25-tuned.topics.msmarco-doc.dev.txt
-
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v2.bm25-tuned+rm3.topics.msmarco-doc.dev.txt
-
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v2.bm25-tuned+ax.topics.msmarco-doc.dev.txt
-
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v2.bm25-tuned+prf.topics.msmarco-doc.dev.txt
-```
-
-## Effectiveness
-
-With the above commands, you should be able to reproduce the following results:
-
-MAP                                     | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
-:---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.2609    | 0.2324    | 0.2170    | 0.2189    | 0.2639    | 0.2342    | 0.2250    | 0.2184    |
-
-
-R@100                                   | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
-:---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.7737    | 0.7768    | 0.7578    | 0.7570    | 0.7884    | 0.7793    | 0.7730    | 0.7520    |
-
-
-R@1000                                  | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
-:---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.9095    | 0.9266    | 0.9207    | 0.9135    | 0.9222    | 0.9239    | 0.9268    | 0.9101    |
-
-Explanation of settings:
-
-+ The setting "default" refers the default BM25 settings of `k1=0.9`, `b=0.4`.
-+ The setting "tuned" refers to `k1=2.16`, `b=0.61`, tuned to optimize for recall@100 (i.e., for first-stage retrieval) on 2019/12.
-
-In these runs, we are retrieving the top 1000 hits for each query and using `trec_eval` to evaluate all 1000 hits.
-Since we're in the passage condition, we fetch the 10000 passages and select the top 1000 documents using MaxP.
-This lets us measure R@100 and R@1000; the latter is particularly important when these runs are used as first-stage retrieval.
-Beware, an official MS MARCO document ranking task leaderboard submission comprises only 100 hits per query.
-See [this page](experiments-msmarco-doc-leaderboard.md) for details on Anserini baseline runs that were submitted to the official leaderboard.
-
-The MaxP passage retrieval functionality is available in `SearchCollection`.
-To generate an MS MARCO submission with the BM25 default parameters, corresponding to "BM25 (default)" above:
-
-```bash
-$ target/appassembler/bin/SearchCollection -topicreader TsvString \
-   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt \
-   -index indexes/lucene-index.msmarco-doc-per-passage-v2.pos+docvectors+raw \
-   -output runs/run.msmarco-doc-per-passage-v2.bm25-default.txt -format msmarco \
-   -bm25 -bm25.k1 0.9 -bm25.b 0.4 -hits 1000 \
-   -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
-
-$ python tools/scripts/msmarco/msmarco_doc_eval.py \
-   --judgments src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt \
-   --run runs/run.msmarco-doc-per-passage-v2.bm25-default.txt
-
-#####################
-MRR @100: 0.26029445206377066
-QueriesRanked: 5193
-#####################
-```
-
-Note that the above command uses `-format msmarco` to directly generate a run in the MS MARCO output format.
-
-To generate an MS MARCO submission with the BM25 tuned parameters, corresponding to "BM25 (tuned)" above:
-
-```bash
-$ target/appassembler/bin/SearchCollection -topicreader TsvString \
-   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt \
-   -index indexes/lucene-index.msmarco-doc-per-passage-v2.pos+docvectors+raw \
-   -output runs/run.msmarco-doc-per-passage-v2.bm25-tuned.txt -format msmarco \
-   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -hits 1000 \
-   -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
-
-$ python tools/scripts/msmarco/msmarco_doc_eval.py \
-   --judgments src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt \
-   --run runs/run.msmarco-doc-per-passage-v2.bm25-tuned.txt
-
-#####################
-MRR @100: 0.2633426142578288
-QueriesRanked: 5193
-#####################
-```
-
-Again, note that the above command uses `-format msmarco` to directly generate a run in the MS MARCO output format.
diff --git a/docs/regressions-msmarco-doc-per-passage-v3.md b/docs/regressions-msmarco-doc-per-passage-v3.md
deleted file mode 100644
index a4c8ec5259..0000000000
--- a/docs/regressions-msmarco-doc-per-passage-v3.md
+++ /dev/null
@@ -1,184 +0,0 @@
-# Anserini: Regressions for MS MARCO Document Ranking
-
-This page documents regression experiments for the [MS MARCO document ranking task](https://github.com/microsoft/MSMARCO-Document-Ranking), which is integrated into Anserini's regression testing framework.
-Note that there are four different regression conditions for this task, and this page describes the following:
-
-+ **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
-+ **Expansion Condition:** none
-
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
-
-**NOTE**: This is the `msmarco-doc-per-passage-v3` variant (there's also `msmarco-doc-per-passage` and `msmarco-doc-per-passage-v2`), see [this page](experiments-msmarco-doc-doc2query-details.md) for detailed notes about differences between these variants.
-
-The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/msmarco-doc-per-passage-v3.yaml).
-Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/msmarco-doc-per-passage-v3.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
-
-## Indexing
-
-Typical indexing command:
-
-```
-target/appassembler/bin/IndexCollection \
-  -collection JsonCollection \
-  -input /path/to/msmarco-doc-per-passage-v3 \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v3 \
-  -generator DefaultLuceneDocumentGenerator \
-  -threads 16 -storePositions -storeDocvectors -storeRaw \
-  >& logs/log.msmarco-doc-per-passage-v3 &
-```
-
-The directory `/path/to/msmarco-doc-per-passage/` should be a directory containing the segmented paragraph collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
-
-For additional details, see explanation of [common indexing options](common-indexing-options.md).
-
-## Retrieval
-
-Topics and qrels are stored in [`src/main/resources/topics-and-qrels/`](../src/main/resources/topics-and-qrels/).
-The regression experiments here evaluate on the 5193 dev set questions.
-
-After indexing has completed, you should be able to perform retrieval as follows:
-
-```
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v3 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v3.bm25-default.topics.msmarco-doc.dev.txt \
-  -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v3 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v3.bm25-default+rm3.topics.msmarco-doc.dev.txt \
-  -bm25 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v3 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v3.bm25-default+ax.topics.msmarco-doc.dev.txt \
-  -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v3 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v3.bm25-default+prf.topics.msmarco-doc.dev.txt \
-  -bm25 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v3 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v3.bm25-tuned.topics.msmarco-doc.dev.txt \
-  -bm25 -bm25.k1 2.16 -bm25.b 0.61 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v3 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v3.bm25-tuned+rm3.topics.msmarco-doc.dev.txt \
-  -bm25 -bm25.k1 2.16 -bm25.b 0.61 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v3 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v3.bm25-tuned+ax.topics.msmarco-doc.dev.txt \
-  -bm25 -bm25.k1 2.16 -bm25.b 0.61 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-
-target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage-v3 \
-  -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage-v3.bm25-tuned+prf.topics.msmarco-doc.dev.txt \
-  -bm25 -bm25.k1 2.16 -bm25.b 0.61 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
-```
-
-Evaluation can be performed using `trec_eval`:
-
-```
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v3.bm25-default.topics.msmarco-doc.dev.txt
-
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v3.bm25-default+rm3.topics.msmarco-doc.dev.txt
-
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v3.bm25-default+ax.topics.msmarco-doc.dev.txt
-
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v3.bm25-default+prf.topics.msmarco-doc.dev.txt
-
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v3.bm25-tuned.topics.msmarco-doc.dev.txt
-
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v3.bm25-tuned+rm3.topics.msmarco-doc.dev.txt
-
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v3.bm25-tuned+ax.topics.msmarco-doc.dev.txt
-
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage-v3.bm25-tuned+prf.topics.msmarco-doc.dev.txt
-```
-
-## Effectiveness
-
-With the above commands, you should be able to reproduce the following results:
-
-MAP                                     | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
-:---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.2690    | 0.2419    | 0.2208    | 0.2325    | 0.2762    | 0.2450    | 0.2330    | 0.2276    |
-
-
-R@100                                   | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
-:---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.7847    | 0.7882    | 0.7710    | 0.7722    | 0.8013    | 0.7961    | 0.7888    | 0.7687    |
-
-
-R@1000                                  | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
-:---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.9178    | 0.9355    | 0.9264    | 0.9185    | 0.9311    | 0.9363    | 0.9353    | 0.9157    |
-
-Explanation of settings:
-
-+ The setting "default" refers the default BM25 settings of `k1=0.9`, `b=0.4`.
-+ The setting "tuned" refers to `k1=2.16`, `b=0.61`, tuned to optimize for recall@100 (i.e., for first-stage retrieval) on 2019/12.
-
-In these runs, we are retrieving the top 1000 hits for each query and using `trec_eval` to evaluate all 1000 hits.
-Since we're in the passage condition, we fetch the 10000 passages and select the top 1000 documents using MaxP.
-This lets us measure R@100 and R@1000; the latter is particularly important when these runs are used as first-stage retrieval.
-Beware, an official MS MARCO document ranking task leaderboard submission comprises only 100 hits per query.
-See [this page](experiments-msmarco-doc-leaderboard.md) for details on Anserini baseline runs that were submitted to the official leaderboard.
-
-The MaxP passage retrieval functionality is available in `SearchCollection`.
-To generate an MS MARCO submission with the BM25 default parameters, corresponding to "BM25 (default)" above:
-
-```bash
-$ target/appassembler/bin/SearchCollection -topicreader TsvString \
-   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt \
-   -index indexes/lucene-index.msmarco-doc-per-passage-v3.pos+docvectors+raw \
-   -output runs/run.msmarco-doc-per-passage-v3.bm25-default.txt -format msmarco \
-   -bm25 -bm25.k1 0.9 -bm25.b 0.4 -hits 1000 \
-   -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
-
-$ python tools/scripts/msmarco/msmarco_doc_eval.py \
-   --judgments src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt \
-   --run runs/run.msmarco-doc-per-passage-v3.bm25-default.txt
-
-#####################
-MRR @100: 0.26851990908986706
-QueriesRanked: 5193
-#####################
-```
-
-Note that the above command uses `-format msmarco` to directly generate a run in the MS MARCO output format.
-
-To generate an MS MARCO submission with the BM25 tuned parameters, corresponding to "BM25 (tuned)" above:
-
-```bash
-$ target/appassembler/bin/SearchCollection -topicreader TsvString \
-   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt \
-   -index indexes/lucene-index.msmarco-doc-per-passage-v3.pos+docvectors+raw \
-   -output runs/run.msmarco-doc-per-passage-v3.bm25-tuned.txt -format msmarco \
-   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -hits 1000 \
-   -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
-
-$ python tools/scripts/msmarco/msmarco_doc_eval.py \
-   --judgments src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt \
-   --run runs/run.msmarco-doc-per-passage-v3.bm25-tuned.txt
-
-#####################
-MRR @100: 0.27551963417683756
-QueriesRanked: 5193
-#####################
-```
-
-Again, note that the above command uses `-format msmarco` to directly generate a run in the MS MARCO output format.
diff --git a/docs/regressions-msmarco-doc-docTTTTTquery-per-passage.md b/docs/regressions-msmarco-doc-segmented-docTTTTTquery.md
similarity index 75%
rename from docs/regressions-msmarco-doc-docTTTTTquery-per-passage.md
rename to docs/regressions-msmarco-doc-segmented-docTTTTTquery.md
index 860aabbc19..32910387f4 100644
--- a/docs/regressions-msmarco-doc-docTTTTTquery-per-passage.md
+++ b/docs/regressions-msmarco-doc-segmented-docTTTTTquery.md
@@ -6,13 +6,15 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
 + **Expansion Condition:** doc2query-T5
 
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
 All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+In the passage (i.e., segment) indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
 
-**NOTE**: This is the `msmarco-doc-docTTTTTquery-per-passage` variant (there's also `msmarco-doc-docTTTTTquery-per-passage-v3`), see [this page](experiments-msmarco-doc-doc2query-details.md) for detailed notes about differences between these variants.
+The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/msmarco-doc-segmented-docTTTTTquery.yaml).
+Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/msmarco-doc-segmented-docTTTTTquery.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
-The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/msmarco-doc-docTTTTTquery-per-passage.yaml).
-Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery-per-passage.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
 
 ## Indexing
 
@@ -21,14 +23,15 @@ Typical indexing command:
 ```
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
-  -input /path/to/msmarco-doc-docTTTTTquery-per-passage \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage \
+  -input /path/to/msmarco-doc-segmented-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/ \
   -generator DefaultLuceneDocumentGenerator \
-  -threads 1 -storePositions -storeDocvectors -storeRaw \
-  >& logs/log.msmarco-doc-docTTTTTquery-per-passage &
+  -threads 16 -storePositions -storeDocvectors -storeRaw \
+  >& logs/log.msmarco-doc-segmented-docTTTTTquery &
 ```
 
-The directory `/path/to/msmarco-doc-docTTTTTquery-per-passage/` should be a directory containing the expanded document collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-segmented-docTTTTTquery/` should be a directory containing the expanded segmented corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
@@ -41,24 +44,24 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-default.topics.msmarco-doc.dev.txt \
+  -output runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-default.topics.msmarco-doc.dev.txt \
   -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-tuned.topics.msmarco-doc.dev.txt \
+  -output runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-tuned.topics.msmarco-doc.dev.txt \
   -bm25 -bm25.k1 2.56 -bm25.b 0.59 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 ```
 
 Evaluation can be performed using `trec_eval`:
 
 ```
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-default.topics.msmarco-doc.dev.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-default.topics.msmarco-doc.dev.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-docTTTTTquery-per-passage.bm25-tuned.topics.msmarco-doc.dev.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-segmented-docTTTTTquery.bm25-tuned.topics.msmarco-doc.dev.txt
 ```
 
 ## Effectiveness
@@ -67,12 +70,12 @@ With the above commands, you should be able to reproduce the following results:
 
 MAP                                     | BM25 (default)| BM25 (tuned)|
 :---------------------------------------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.3182    | 0.3211    |
+[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.3184    | 0.3213    |
 
 
 R@100                                   | BM25 (default)| BM25 (tuned)|
 :---------------------------------------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.8481    | 0.8627    |
+[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.8479    | 0.8627    |
 
 
 R@1000                                  | BM25 (default)| BM25 (tuned)|
diff --git a/docs/regressions-msmarco-doc-per-passage.md b/docs/regressions-msmarco-doc-segmented.md
similarity index 73%
rename from docs/regressions-msmarco-doc-per-passage.md
rename to docs/regressions-msmarco-doc-segmented.md
index 6e31874cef..060dc4ca89 100644
--- a/docs/regressions-msmarco-doc-per-passage.md
+++ b/docs/regressions-msmarco-doc-segmented.md
@@ -6,13 +6,15 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
 + **Expansion Condition:** none
 
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
+In the passage (i.e., segment) indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
 
-**NOTE**: This is the `msmarco-doc-per-passage` variant (there's also `msmarco-doc-per-passage-v2` and `msmarco-doc-per-passage-v3`), see [this page](experiments-msmarco-doc-doc2query-details.md) for detailed notes about differences between these variants.
+The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/msmarco-doc-segmented.yaml).
+Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/msmarco-doc-segmented.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
-The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/msmarco-doc-per-passage.yaml).
-Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/msmarco-doc-per-passage.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
 
 ## Indexing
 
@@ -21,14 +23,15 @@ Typical indexing command:
 ```
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
-  -input /path/to/msmarco-doc-per-passage \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -input /path/to/msmarco-doc-segmented \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -generator DefaultLuceneDocumentGenerator \
-  -threads 1 -storePositions -storeDocvectors -storeRaw \
-  >& logs/log.msmarco-doc-per-passage &
+  -threads 16 -storePositions -storeDocvectors -storeRaw \
+  >& logs/log.msmarco-doc-segmented &
 ```
 
-The directory `/path/to/msmarco-doc-per-passage/` should be a directory containing the segmented paragraph collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-segmented/` should be a directory containing the segmented corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
@@ -41,72 +44,72 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-default.topics.msmarco-doc.dev.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-default.topics.msmarco-doc.dev.txt \
   -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-default+rm3.topics.msmarco-doc.dev.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-default+rm3.topics.msmarco-doc.dev.txt \
   -bm25 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-default+ax.topics.msmarco-doc.dev.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-default+ax.topics.msmarco-doc.dev.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-default+prf.topics.msmarco-doc.dev.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-default+prf.topics.msmarco-doc.dev.txt \
   -bm25 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-tuned.topics.msmarco-doc.dev.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-tuned.topics.msmarco-doc.dev.txt \
   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-tuned+rm3.topics.msmarco-doc.dev.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-tuned+rm3.topics.msmarco-doc.dev.txt \
   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-tuned+ax.topics.msmarco-doc.dev.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-tuned+ax.topics.msmarco-doc.dev.txt \
   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc-per-passage \
+  -index indexes/lucene-index.msmarco-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
-  -output runs/run.msmarco-doc-per-passage.bm25-tuned+prf.topics.msmarco-doc.dev.txt \
+  -output runs/run.msmarco-doc-segmented.bm25-tuned+prf.topics.msmarco-doc.dev.txt \
   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 ```
 
 Evaluation can be performed using `trec_eval`:
 
 ```
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage.bm25-default.topics.msmarco-doc.dev.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-segmented.bm25-default.topics.msmarco-doc.dev.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage.bm25-default+rm3.topics.msmarco-doc.dev.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-segmented.bm25-default+rm3.topics.msmarco-doc.dev.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage.bm25-default+ax.topics.msmarco-doc.dev.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-segmented.bm25-default+ax.topics.msmarco-doc.dev.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage.bm25-default+prf.topics.msmarco-doc.dev.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-segmented.bm25-default+prf.topics.msmarco-doc.dev.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage.bm25-tuned.topics.msmarco-doc.dev.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-segmented.bm25-tuned.topics.msmarco-doc.dev.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage.bm25-tuned+rm3.topics.msmarco-doc.dev.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-segmented.bm25-tuned+rm3.topics.msmarco-doc.dev.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage.bm25-tuned+ax.topics.msmarco-doc.dev.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-segmented.bm25-tuned+ax.topics.msmarco-doc.dev.txt
 
-tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-per-passage.bm25-tuned+prf.topics.msmarco-doc.dev.txt
+tools/eval/trec_eval.9.0.4/trec_eval -c -m map -c -m recall.100 -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt runs/run.msmarco-doc-segmented.bm25-tuned+prf.topics.msmarco-doc.dev.txt
 ```
 
 ## Effectiveness
@@ -115,17 +118,17 @@ With the above commands, you should be able to reproduce the following results:
 
 MAP                                     | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.2688    | 0.2416    | 0.2229    | 0.2325    | 0.2756    | 0.2443    | 0.2350    | 0.2271    |
+[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.2690    | 0.2419    | 0.2208    | 0.2325    | 0.2762    | 0.2450    | 0.2330    | 0.2276    |
 
 
 R@100                                   | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.7849    | 0.7876    | 0.7703    | 0.7714    | 0.8009    | 0.7955    | 0.7909    | 0.7685    |
+[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.7847    | 0.7882    | 0.7710    | 0.7722    | 0.8013    | 0.7961    | 0.7888    | 0.7687    |
 
 
 R@1000                                  | BM25 (default)| +RM3      | +Ax       | +PRF      | BM25 (tuned)| +RM3      | +Ax       | +PRF      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.9180    | 0.9355    | 0.9266    | 0.9187    | 0.9311    | 0.9359    | 0.9341    | 0.9162    |
+[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.9178    | 0.9355    | 0.9264    | 0.9185    | 0.9311    | 0.9363    | 0.9353    | 0.9157    |
 
 Explanation of settings:
 
diff --git a/docs/regressions-msmarco-doc.md b/docs/regressions-msmarco-doc.md
index e157d6d213..487cd4f46b 100644
--- a/docs/regressions-msmarco-doc.md
+++ b/docs/regressions-msmarco-doc.md
@@ -6,26 +6,31 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is treated as a unit of indexing
 + **Expansion Condition:** none
 
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
 
 The exact configurations for these regressions are stored in [this YAML file](../src/main/resources/regression/msmarco-doc.yaml).
 Note that this page is automatically generated from [this template](../src/main/resources/docgen/templates/msmarco-doc.template) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+The rebuilt regressions yield slightly different end-to-end scores, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
+
 ## Indexing
 
 Typical indexing command:
 
 ```
 target/appassembler/bin/IndexCollection \
-  -collection CleanTrecCollection \
+  -collection JsonCollection \
   -input /path/to/msmarco-doc \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -generator DefaultLuceneDocumentGenerator \
-  -threads 1 -storePositions -storeDocvectors -storeRaw \
+  -threads 7 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-doc &
 ```
 
-The directory `/path/to/msmarco-doc/` should be a directory containing the official document collection (a single file), in TREC format.
+The directory `/path/to/msmarco-doc/` should contain the document corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
@@ -38,37 +43,37 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-doc.bm25-default.topics.msmarco-doc.dev.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-doc.bm25-default+rm3.topics.msmarco-doc.dev.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-doc.bm25-tuned.topics.msmarco-doc.dev.txt \
   -bm25 -bm25.k1 3.44 -bm25.b 0.87 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-doc.bm25-tuned+rm3.topics.msmarco-doc.dev.txt \
   -bm25 -bm25.k1 3.44 -bm25.b 0.87 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-doc.bm25-tuned2.topics.msmarco-doc.dev.txt \
   -bm25 -bm25.k1 4.46 -bm25.b 0.82 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-doc \
+  -index indexes/lucene-index.msmarco-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-doc.bm25-tuned2+rm3.topics.msmarco-doc.dev.txt \
   -bm25 -bm25.k1 4.46 -bm25.b 0.82 -rm3 &
@@ -96,17 +101,17 @@ With the above commands, you should be able to reproduce the following results:
 
 MAP                                     | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      | BM25 (tuned2)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.2310    | 0.1632    | 0.2788    | 0.2289    | 0.2775    | 0.2238    |
+[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.2305    | 0.1631    | 0.2784    | 0.2289    | 0.2774    | 0.2239    |
 
 
 R@100                                   | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      | BM25 (tuned2)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.7279    | 0.6765    | 0.8065    | 0.7872    | 0.8076    | 0.7789    |
+[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.7281    | 0.6767    | 0.8069    | 0.7878    | 0.8070    | 0.7791    |
 
 
 R@1000                                  | BM25 (default)| +RM3      | BM25 (tuned)| +RM3      | BM25 (tuned2)| +RM3      |
 :---------------------------------------|-----------|-----------|-----------|-----------|-----------|-----------|
-[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.8856    | 0.8785    | 0.9326    | 0.9320    | 0.9357    | 0.9307    |
+[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)| 0.8856    | 0.8791    | 0.9324    | 0.9314    | 0.9357    | 0.9305    |
 
 Explanation of settings:
 
diff --git a/docs/regressions-msmarco-passage-deepimpact.md b/docs/regressions-msmarco-passage-deepimpact.md
index 0f7008c4e7..3c913074ce 100644
--- a/docs/regressions-msmarco-passage-deepimpact.md
+++ b/docs/regressions-msmarco-passage-deepimpact.md
@@ -18,7 +18,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonVectorCollection \
   -input /path/to/msmarco-passage-deepimpact \
-  -index indexes/lucene-index.msmarco-passage-deepimpact \
+  -index indexes/lucene-index.msmarco-passage-deepimpact/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 16 -impact -pretokenized \
   >& logs/log.msmarco-passage-deepimpact &
@@ -38,7 +38,7 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-deepimpact \
+  -index indexes/lucene-index.msmarco-passage-deepimpact/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.deepimpact.tsv.gz -topicreader TsvInt \
   -output runs/run.msmarco-passage-deepimpact.deepimpact.topics.msmarco-passage.dev-subset.deepimpact.tsv.gz \
   -impact -pretokenized &
diff --git a/docs/regressions-msmarco-passage-distill-splade-max.md b/docs/regressions-msmarco-passage-distill-splade-max.md
index 36ecaeecc2..9b3d9df809 100644
--- a/docs/regressions-msmarco-passage-distill-splade-max.md
+++ b/docs/regressions-msmarco-passage-distill-splade-max.md
@@ -18,7 +18,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonVectorCollection \
   -input /path/to/msmarco-passage-distill-splade-max \
-  -index indexes/lucene-index.msmarco-passage-distill-splade-max \
+  -index indexes/lucene-index.msmarco-passage-distill-splade-max/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 16 -impact -pretokenized \
   >& logs/log.msmarco-passage-distill-splade-max &
@@ -38,7 +38,7 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-distill-splade-max \
+  -index indexes/lucene-index.msmarco-passage-distill-splade-max/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.distill-splade-max.tsv.gz -topicreader TsvInt \
   -output runs/run.msmarco-passage-distill-splade-max.distill-splade-max.topics.msmarco-passage.dev-subset.distill-splade-max.tsv.gz \
   -impact -pretokenized &
diff --git a/docs/regressions-msmarco-passage-doc2query.md b/docs/regressions-msmarco-passage-doc2query.md
index 76c39f0842..24052505fc 100644
--- a/docs/regressions-msmarco-passage-doc2query.md
+++ b/docs/regressions-msmarco-passage-doc2query.md
@@ -18,7 +18,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
   -input /path/to/msmarco-passage-doc2query \
-  -index indexes/lucene-index.msmarco-passage-doc2query \
+  -index indexes/lucene-index.msmarco-passage-doc2query/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 9 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-passage-doc2query &
@@ -38,25 +38,25 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-doc2query \
+  -index indexes/lucene-index.msmarco-passage-doc2query/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-doc2query.bm25-default.topics.msmarco-passage.dev-subset.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-doc2query \
+  -index indexes/lucene-index.msmarco-passage-doc2query/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-doc2query.bm25-default+rm3.topics.msmarco-passage.dev-subset.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-doc2query \
+  -index indexes/lucene-index.msmarco-passage-doc2query/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-doc2query.bm25-tuned.topics.msmarco-passage.dev-subset.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-doc2query \
+  -index indexes/lucene-index.msmarco-passage-doc2query/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-doc2query.bm25-tuned+rm3.topics.msmarco-passage.dev-subset.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 -rm3 &
diff --git a/docs/regressions-msmarco-passage-docTTTTTquery.md b/docs/regressions-msmarco-passage-docTTTTTquery.md
index 162b772ef8..6b37c997d8 100644
--- a/docs/regressions-msmarco-passage-docTTTTTquery.md
+++ b/docs/regressions-msmarco-passage-docTTTTTquery.md
@@ -17,7 +17,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
   -input /path/to/msmarco-passage-docTTTTTquery \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 9 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-passage-docTTTTTquery &
@@ -37,37 +37,37 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-default.topics.msmarco-passage.dev-subset.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-default+rm3.topics.msmarco-passage.dev-subset.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-tuned.topics.msmarco-passage.dev-subset.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-tuned+rm3.topics.msmarco-passage.dev-subset.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-tuned2.topics.msmarco-passage.dev-subset.txt \
   -bm25 -bm25.k1 2.18 -bm25.b 0.86 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-docTTTTTquery \
+  -index indexes/lucene-index.msmarco-passage-docTTTTTquery/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage-docTTTTTquery.bm25-tuned2+rm3.topics.msmarco-passage.dev-subset.txt \
   -bm25 -bm25.k1 2.18 -bm25.b 0.86 -rm3 &
diff --git a/docs/regressions-msmarco-passage-unicoil-tilde-expansion.md b/docs/regressions-msmarco-passage-unicoil-tilde-expansion.md
index a323047c50..5fc200f55f 100644
--- a/docs/regressions-msmarco-passage-unicoil-tilde-expansion.md
+++ b/docs/regressions-msmarco-passage-unicoil-tilde-expansion.md
@@ -18,7 +18,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonVectorCollection \
   -input /path/to/msmarco-passage-unicoil-tilde-expansion \
-  -index indexes/lucene-index.msmarco-passage-unicoil-tilde-expansion \
+  -index indexes/lucene-index.msmarco-passage-unicoil-tilde-expansion/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 16 -impact -pretokenized \
   >& logs/log.msmarco-passage-unicoil-tilde-expansion &
@@ -38,7 +38,7 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-unicoil-tilde-expansion \
+  -index indexes/lucene-index.msmarco-passage-unicoil-tilde-expansion/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.unicoil-tilde-expansion.tsv.gz -topicreader TsvInt \
   -output runs/run.msmarco-passage-unicoil-tilde-expansion.unicoil-tilde-expansion.topics.msmarco-passage.dev-subset.unicoil-tilde-expansion.tsv.gz \
   -impact -pretokenized &
diff --git a/docs/regressions-msmarco-passage-unicoil.md b/docs/regressions-msmarco-passage-unicoil.md
index 4c4510b648..426a149450 100644
--- a/docs/regressions-msmarco-passage-unicoil.md
+++ b/docs/regressions-msmarco-passage-unicoil.md
@@ -18,7 +18,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonVectorCollection \
   -input /path/to/msmarco-passage-unicoil \
-  -index indexes/lucene-index.msmarco-passage-unicoil \
+  -index indexes/lucene-index.msmarco-passage-unicoil/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 16 -impact -pretokenized \
   >& logs/log.msmarco-passage-unicoil &
@@ -38,7 +38,7 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage-unicoil \
+  -index indexes/lucene-index.msmarco-passage-unicoil/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.unicoil.tsv.gz -topicreader TsvInt \
   -output runs/run.msmarco-passage-unicoil.unicoil.topics.msmarco-passage.dev-subset.unicoil.tsv.gz \
   -impact -pretokenized &
diff --git a/docs/regressions-msmarco-passage.md b/docs/regressions-msmarco-passage.md
index 9326eca732..b30a865bb1 100644
--- a/docs/regressions-msmarco-passage.md
+++ b/docs/regressions-msmarco-passage.md
@@ -14,7 +14,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonCollection \
   -input /path/to/msmarco-passage \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 9 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-passage &
@@ -34,49 +34,49 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-default.topics.msmarco-passage.dev-subset.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-default+rm3.topics.msmarco-passage.dev-subset.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-default+ax.topics.msmarco-passage.dev-subset.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-default+prf.topics.msmarco-passage.dev-subset.txt \
   -bm25 -bm25prf &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-tuned.topics.msmarco-passage.dev-subset.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-tuned+rm3.topics.msmarco-passage.dev-subset.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-tuned+ax.topics.msmarco-passage.dev-subset.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-passage \
+  -index indexes/lucene-index.msmarco-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.txt -topicreader TsvInt \
   -output runs/run.msmarco-passage.bm25-tuned+prf.topics.msmarco-passage.dev-subset.txt \
   -bm25 -bm25.k1 0.82 -bm25.b 0.68 -bm25prf &
diff --git a/docs/regressions-msmarco-v2-doc-segmented-unicoil-noexp-0shot.md b/docs/regressions-msmarco-v2-doc-segmented-unicoil-noexp-0shot.md
index 3da1c35b49..fc48d0930c 100644
--- a/docs/regressions-msmarco-v2-doc-segmented-unicoil-noexp-0shot.md
+++ b/docs/regressions-msmarco-v2-doc-segmented-unicoil-noexp-0shot.md
@@ -14,7 +14,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonVectorCollection \
   -input /path/to/msmarco-v2-doc-segmented-unicoil-noexp-0shot \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-noexp-0shot \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-noexp-0shot/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 18 -impact -pretokenized \
   >& logs/log.msmarco-v2-doc-segmented-unicoil-noexp-0shot &
@@ -32,12 +32,12 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-noexp-0shot \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-noexp-0shot/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev.unicoil-noexp.0shot.tsv.gz -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc-segmented-unicoil-noexp-0shot.unicoil-noexp-0shot.topics.msmarco-v2-doc.dev.unicoil-noexp.0shot.tsv.gz \
   -impact -pretokenized -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-noexp-0shot \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-noexp-0shot/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev2.unicoil-noexp.0shot.tsv.gz -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc-segmented-unicoil-noexp-0shot.unicoil-noexp-0shot.topics.msmarco-v2-doc.dev2.unicoil-noexp.0shot.tsv.gz \
   -impact -pretokenized -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
diff --git a/docs/regressions-msmarco-v2-doc-segmented.md b/docs/regressions-msmarco-v2-doc-segmented.md
index 01b4e60d1e..e684aa4216 100644
--- a/docs/regressions-msmarco-v2-doc-segmented.md
+++ b/docs/regressions-msmarco-v2-doc-segmented.md
@@ -15,7 +15,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MsMarcoV2DocCollection \
   -input /path/to/msmarco-v2-doc-segmented \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 18 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-v2-doc-segmented &
@@ -35,45 +35,45 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc-segmented.bm25-default.topics.msmarco-v2-doc.dev.txt \
   -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc-segmented.bm25-default.topics.msmarco-v2-doc.dev2.txt \
   -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc-segmented.bm25-default+rm3.topics.msmarco-v2-doc.dev.txt \
   -bm25 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc-segmented.bm25-default+rm3.topics.msmarco-v2-doc.dev2.txt \
   -bm25 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc-segmented.bm25-default+ax.topics.msmarco-v2-doc.dev.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc-segmented.bm25-default+ax.topics.msmarco-v2-doc.dev2.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc-segmented.bm25-default+prf.topics.msmarco-v2-doc.dev.txt \
   -bm25 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc-segmented \
+  -index indexes/lucene-index.msmarco-v2-doc-segmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc-segmented.bm25-default+prf.topics.msmarco-v2-doc.dev2.txt \
   -bm25 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000 &
diff --git a/docs/regressions-msmarco-v2-doc.md b/docs/regressions-msmarco-v2-doc.md
index 8ae40241af..4f79faf401 100644
--- a/docs/regressions-msmarco-v2-doc.md
+++ b/docs/regressions-msmarco-v2-doc.md
@@ -15,7 +15,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MsMarcoV2DocCollection \
   -input /path/to/msmarco-v2-doc \
-  -index indexes/lucene-index.msmarco-v2-doc \
+  -index indexes/lucene-index.msmarco-v2-doc/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 18 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-v2-doc &
@@ -35,45 +35,45 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc \
+  -index indexes/lucene-index.msmarco-v2-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc.bm25-default.topics.msmarco-v2-doc.dev.txt \
   -bm25 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc \
+  -index indexes/lucene-index.msmarco-v2-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc.bm25-default.topics.msmarco-v2-doc.dev2.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc \
+  -index indexes/lucene-index.msmarco-v2-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc.bm25-default+rm3.topics.msmarco-v2-doc.dev.txt \
   -bm25 -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc \
+  -index indexes/lucene-index.msmarco-v2-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc.bm25-default+rm3.topics.msmarco-v2-doc.dev2.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc \
+  -index indexes/lucene-index.msmarco-v2-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc.bm25-default+ax.topics.msmarco-v2-doc.dev.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc \
+  -index indexes/lucene-index.msmarco-v2-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc.bm25-default+ax.topics.msmarco-v2-doc.dev2.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc \
+  -index indexes/lucene-index.msmarco-v2-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc.bm25-default+prf.topics.msmarco-v2-doc.dev.txt \
   -bm25 -bm25prf &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-doc \
+  -index indexes/lucene-index.msmarco-v2-doc/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-doc.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-doc.bm25-default+prf.topics.msmarco-v2-doc.dev2.txt \
   -bm25 -bm25prf &
diff --git a/docs/regressions-msmarco-v2-passage-augmented.md b/docs/regressions-msmarco-v2-passage-augmented.md
index bb3c586fe1..a330c7082f 100644
--- a/docs/regressions-msmarco-v2-passage-augmented.md
+++ b/docs/regressions-msmarco-v2-passage-augmented.md
@@ -14,7 +14,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MsMarcoV2PassageCollection \
   -input /path/to/msmarco-v2-passage-augmented \
-  -index indexes/lucene-index.msmarco-v2-passage-augmented \
+  -index indexes/lucene-index.msmarco-v2-passage-augmented/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 18 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-v2-passage-augmented &
@@ -34,45 +34,45 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage-augmented \
+  -index indexes/lucene-index.msmarco-v2-passage-augmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage-augmented.bm25-default.topics.msmarco-v2-passage.dev.txt \
   -bm25 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage-augmented \
+  -index indexes/lucene-index.msmarco-v2-passage-augmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage-augmented.bm25-default.topics.msmarco-v2-passage.dev2.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage-augmented \
+  -index indexes/lucene-index.msmarco-v2-passage-augmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage-augmented.bm25-default+rm3.topics.msmarco-v2-passage.dev.txt \
   -bm25 -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage-augmented \
+  -index indexes/lucene-index.msmarco-v2-passage-augmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage-augmented.bm25-default+rm3.topics.msmarco-v2-passage.dev2.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage-augmented \
+  -index indexes/lucene-index.msmarco-v2-passage-augmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage-augmented.bm25-default+ax.topics.msmarco-v2-passage.dev.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage-augmented \
+  -index indexes/lucene-index.msmarco-v2-passage-augmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage-augmented.bm25-default+ax.topics.msmarco-v2-passage.dev2.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage-augmented \
+  -index indexes/lucene-index.msmarco-v2-passage-augmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage-augmented.bm25-default+prf.topics.msmarco-v2-passage.dev.txt \
   -bm25 -bm25prf &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage-augmented \
+  -index indexes/lucene-index.msmarco-v2-passage-augmented/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage-augmented.bm25-default+prf.topics.msmarco-v2-passage.dev2.txt \
   -bm25 -bm25prf &
diff --git a/docs/regressions-msmarco-v2-passage-unicoil-noexp-0shot.md b/docs/regressions-msmarco-v2-passage-unicoil-noexp-0shot.md
index 17764a47c0..7f1e90b0b4 100644
--- a/docs/regressions-msmarco-v2-passage-unicoil-noexp-0shot.md
+++ b/docs/regressions-msmarco-v2-passage-unicoil-noexp-0shot.md
@@ -14,7 +14,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection JsonVectorCollection \
   -input /path/to/msmarco-v2-passage-unicoil-noexp-0shot \
-  -index indexes/lucene-index.msmarco-v2-passage-unicoil-noexp-0shot \
+  -index indexes/lucene-index.msmarco-v2-passage-unicoil-noexp-0shot/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 18 -impact -pretokenized \
   >& logs/log.msmarco-v2-passage-unicoil-noexp-0shot &
@@ -32,12 +32,12 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage-unicoil-noexp-0shot \
+  -index indexes/lucene-index.msmarco-v2-passage-unicoil-noexp-0shot/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev.unicoil-noexp.0shot.tsv.gz -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage-unicoil-noexp-0shot.unicoil-noexp-0shot.topics.msmarco-v2-passage.dev.unicoil-noexp.0shot.tsv.gz \
   -impact -pretokenized &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage-unicoil-noexp-0shot \
+  -index indexes/lucene-index.msmarco-v2-passage-unicoil-noexp-0shot/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev2.unicoil-noexp.0shot.tsv.gz -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage-unicoil-noexp-0shot.unicoil-noexp-0shot.topics.msmarco-v2-passage.dev2.unicoil-noexp.0shot.tsv.gz \
   -impact -pretokenized &
diff --git a/docs/regressions-msmarco-v2-passage.md b/docs/regressions-msmarco-v2-passage.md
index 1150be220d..de3de6107c 100644
--- a/docs/regressions-msmarco-v2-passage.md
+++ b/docs/regressions-msmarco-v2-passage.md
@@ -15,7 +15,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection MsMarcoV2PassageCollection \
   -input /path/to/msmarco-v2-passage \
-  -index indexes/lucene-index.msmarco-v2-passage \
+  -index indexes/lucene-index.msmarco-v2-passage/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 18 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.msmarco-v2-passage &
@@ -35,45 +35,45 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage \
+  -index indexes/lucene-index.msmarco-v2-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage.bm25-default.topics.msmarco-v2-passage.dev.txt \
   -bm25 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage \
+  -index indexes/lucene-index.msmarco-v2-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage.bm25-default.topics.msmarco-v2-passage.dev2.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage \
+  -index indexes/lucene-index.msmarco-v2-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage.bm25-default+rm3.topics.msmarco-v2-passage.dev.txt \
   -bm25 -rm3 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage \
+  -index indexes/lucene-index.msmarco-v2-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage.bm25-default+rm3.topics.msmarco-v2-passage.dev2.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage \
+  -index indexes/lucene-index.msmarco-v2-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage.bm25-default+ax.topics.msmarco-v2-passage.dev.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage \
+  -index indexes/lucene-index.msmarco-v2-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage.bm25-default+ax.topics.msmarco-v2-passage.dev2.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage \
+  -index indexes/lucene-index.msmarco-v2-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage.bm25-default+prf.topics.msmarco-v2-passage.dev.txt \
   -bm25 -bm25prf &
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.msmarco-v2-passage \
+  -index indexes/lucene-index.msmarco-v2-passage/ \
   -topics src/main/resources/topics-and-qrels/topics.msmarco-v2-passage.dev2.txt -topicreader TsvInt \
   -output runs/run.msmarco-v2-passage.bm25-default+prf.topics.msmarco-v2-passage.dev2.txt \
   -bm25 -bm25prf &
diff --git a/docs/regressions-ntcir8-zh.md b/docs/regressions-ntcir8-zh.md
index 7fdb6c5e19..9dab361776 100644
--- a/docs/regressions-ntcir8-zh.md
+++ b/docs/regressions-ntcir8-zh.md
@@ -14,7 +14,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection CleanTrecCollection \
   -input /path/to/ntcir8-zh \
-  -index indexes/lucene-index.ntcir8-zh \
+  -index indexes/lucene-index.ntcir8-zh/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 16 -storePositions -storeDocvectors -storeRaw -language zh -uniqueDocid -optimize \
   >& logs/log.ntcir8-zh &
@@ -38,7 +38,7 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.ntcir8-zh \
+  -index indexes/lucene-index.ntcir8-zh/ \
   -topics src/main/resources/topics-and-qrels/topics.ntcir8zh.eval.txt -topicreader TsvString \
   -output runs/run.ntcir8-zh.bm25.topics.ntcir8zh.eval.txt \
   -bm25 -language zh &
diff --git a/docs/regressions-robust05.md b/docs/regressions-robust05.md
index 1d22933aa4..074b3abcc8 100644
--- a/docs/regressions-robust05.md
+++ b/docs/regressions-robust05.md
@@ -12,7 +12,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection TrecCollection \
   -input /path/to/robust05 \
-  -index indexes/lucene-index.robust05 \
+  -index indexes/lucene-index.robust05/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 16 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.robust05 &
@@ -33,37 +33,37 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.robust05 \
+  -index indexes/lucene-index.robust05/ \
   -topics src/main/resources/topics-and-qrels/topics.robust05.txt -topicreader Trec \
   -output runs/run.robust05.bm25.topics.robust05.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.robust05 \
+  -index indexes/lucene-index.robust05/ \
   -topics src/main/resources/topics-and-qrels/topics.robust05.txt -topicreader Trec \
   -output runs/run.robust05.bm25+rm3.topics.robust05.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.robust05 \
+  -index indexes/lucene-index.robust05/ \
   -topics src/main/resources/topics-and-qrels/topics.robust05.txt -topicreader Trec \
   -output runs/run.robust05.bm25+ax.topics.robust05.txt \
   -bm25 -axiom -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.robust05 \
+  -index indexes/lucene-index.robust05/ \
   -topics src/main/resources/topics-and-qrels/topics.robust05.txt -topicreader Trec \
   -output runs/run.robust05.ql.topics.robust05.txt \
   -qld &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.robust05 \
+  -index indexes/lucene-index.robust05/ \
   -topics src/main/resources/topics-and-qrels/topics.robust05.txt -topicreader Trec \
   -output runs/run.robust05.ql+rm3.topics.robust05.txt \
   -qld -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.robust05 \
+  -index indexes/lucene-index.robust05/ \
   -topics src/main/resources/topics-and-qrels/topics.robust05.txt -topicreader Trec \
   -output runs/run.robust05.ql+ax.topics.robust05.txt \
   -qld -axiom -axiom.deterministic -rerankCutoff 20 &
diff --git a/docs/regressions-trec02-ar.md b/docs/regressions-trec02-ar.md
index 161bf4a766..8f6342feb5 100644
--- a/docs/regressions-trec02-ar.md
+++ b/docs/regressions-trec02-ar.md
@@ -14,7 +14,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection CleanTrecCollection \
   -input /path/to/trec02-ar \
-  -index indexes/lucene-index.trec02-ar \
+  -index indexes/lucene-index.trec02-ar/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 16 -storePositions -storeDocvectors -storeRaw -language ar \
   >& logs/log.trec02-ar &
@@ -38,7 +38,7 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.trec02-ar \
+  -index indexes/lucene-index.trec02-ar/ \
   -topics src/main/resources/topics-and-qrels/topics.trec02ar-ar.txt -topicreader Trec \
   -output runs/run.trec02-ar.bm25.topics.trec02ar-ar.txt \
   -bm25 -language ar &
diff --git a/docs/regressions-wt10g.md b/docs/regressions-wt10g.md
index f4b20896be..a56782de18 100644
--- a/docs/regressions-wt10g.md
+++ b/docs/regressions-wt10g.md
@@ -12,7 +12,7 @@ Typical indexing command:
 target/appassembler/bin/IndexCollection \
   -collection TrecwebCollection \
   -input /path/to/wt10g \
-  -index indexes/lucene-index.wt10g \
+  -index indexes/lucene-index.wt10g/ \
   -generator DefaultLuceneDocumentGenerator \
   -threads 16 -storePositions -storeDocvectors -storeRaw \
   >& logs/log.wt10g &
@@ -33,37 +33,37 @@ After indexing has completed, you should be able to perform retrieval as follows
 
 ```
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wt10g \
+  -index indexes/lucene-index.wt10g/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.451-550.txt -topicreader Trec \
   -output runs/run.wt10g.bm25.topics.adhoc.451-550.txt \
   -bm25 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wt10g \
+  -index indexes/lucene-index.wt10g/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.451-550.txt -topicreader Trec \
   -output runs/run.wt10g.bm25+rm3.topics.adhoc.451-550.txt \
   -bm25 -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wt10g \
+  -index indexes/lucene-index.wt10g/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.451-550.txt -topicreader Trec \
   -output runs/run.wt10g.bm25+ax.topics.adhoc.451-550.txt \
   -bm25 -axiom -axiom.beta 0.1 -axiom.deterministic -rerankCutoff 20 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wt10g \
+  -index indexes/lucene-index.wt10g/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.451-550.txt -topicreader Trec \
   -output runs/run.wt10g.ql.topics.adhoc.451-550.txt \
   -qld &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wt10g \
+  -index indexes/lucene-index.wt10g/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.451-550.txt -topicreader Trec \
   -output runs/run.wt10g.ql+rm3.topics.adhoc.451-550.txt \
   -qld -rm3 &
 
 target/appassembler/bin/SearchCollection \
-  -index indexes/lucene-index.wt10g \
+  -index indexes/lucene-index.wt10g/ \
   -topics src/main/resources/topics-and-qrels/topics.adhoc.451-550.txt -topicreader Trec \
   -output runs/run.wt10g.ql+ax.topics.adhoc.451-550.txt \
   -qld -axiom -axiom.beta 0.1 -axiom.deterministic -rerankCutoff 20 &
diff --git a/src/main/python/run_regression.py b/src/main/python/run_regression.py
index b61e50e2ce..e2ec0f5c57 100644
--- a/src/main/python/run_regression.py
+++ b/src/main/python/run_regression.py
@@ -135,6 +135,7 @@ def construct_search_commands(yaml_data):
 def evaluate_and_verify(yaml_data, dry_run):
     fail_str = '\033[91m[FAIL]\033[0m '
     ok_str = '   [OK] '
+    failures = False
 
     logger.info('='*10 + ' Verifying Results: ' + yaml_data['corpus'] + ' ' + '='*10)
     for model in yaml_data['models']:
@@ -161,13 +162,14 @@ def evaluate_and_verify(yaml_data, dry_run):
                 if is_close(expected, actual):
                     logger.info(ok_str + result_str)
                 else:
-                    # Fail fast.
-                    logger.error(fail_str + result_str + ' - Failure encountered. Aborting!')
-                    sys.exit()
+                    logger.error(fail_str + result_str)
+                    failures = True
 
-    # If we've gotten to here and it's not a dry run, then all the runs have passed.
     if not dry_run:
-        logger.info("All Tests Passed!")
+        if failures:
+            logger.info('\033[91mFailed tests!\033[0m')
+        else:
+            logger.info("All Tests Passed!")
 
 
 def run_search(cmd):
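The hunk above replaces fail-fast verification with a collect-and-report pattern: every metric is checked, each mismatch is logged, and one summary is printed at the end. Below is a minimal standalone sketch of that pattern. The helper name `verify_all`, the metric names, and the tolerances are assumptions made for illustration; the script's own `is_close` is not shown in this hunk and may behave differently.

```python
import logging
import math

logger = logging.getLogger('regression_sketch')


def verify_all(expected, actual, rel_tol=1e-9, abs_tol=1e-4):
    """Check every metric and return True only if all of them match.

    Hypothetical helper: unlike a fail-fast loop, it keeps going after a
    mismatch, so a single run reports every failing metric.
    """
    failures = False
    for name, want in expected.items():
        got = actual.get(name)
        if got is not None and math.isclose(want, got, rel_tol=rel_tol, abs_tol=abs_tol):
            logger.info('[OK] %s expected=%.4f actual=%.4f', name, want, got)
        else:
            # Record the failure and continue with the remaining metrics.
            logger.error('[FAIL] %s expected=%s actual=%s', name, want, got)
            failures = True
    return not failures
```

One possible design choice, not something this diff does, would be for the caller to exit with a non-zero status when the function returns `False`, so that automated jobs can detect the failure.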
diff --git a/src/main/resources/docgen/templates/dl19-doc-docTTTTTquery-per-doc.template b/src/main/resources/docgen/templates/dl19-doc-docTTTTTquery.template
similarity index 79%
rename from src/main/resources/docgen/templates/dl19-doc-docTTTTTquery-per-doc.template
rename to src/main/resources/docgen/templates/dl19-doc-docTTTTTquery.template
index d6aa6c1899..4fc44e5b0d 100644
--- a/src/main/resources/docgen/templates/dl19-doc-docTTTTTquery-per-doc.template
+++ b/src/main/resources/docgen/templates/dl19-doc-docTTTTTquery.template
@@ -1,4 +1,4 @@
-# Anserini: Regressions for [DL19 (Doc)](https://trec.nist.gov/data/deep2019.html) w/ per-doc docTTTTTquery
+# Anserini: Regressions for [DL19 (Doc)](https://trec.nist.gov/data/deep2019.html) w/ docTTTTTquery
 
 This page describes experiments, integrated into Anserini's regression testing framework, for the TREC 2019 Deep Learning Track (Document Ranking Task) on the MS MARCO document collection using relevance judgments from NIST.
 
@@ -10,11 +10,15 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is treated as a unit of indexing
 + **Expansion Condition:** doc2query-T5
 
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
 
 The exact configurations for these regressions are stored in [this YAML file](${yaml}).
 Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
+
 ## Indexing
 
 Typical indexing command:
@@ -23,7 +27,8 @@ Typical indexing command:
 ${index_cmds}
 ```
 
-The directory `/path/to/msmarco-doc-docTTTTTquery-per-doc/` should be a directory containing the expanded document collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-docTTTTTquery/` should be a directory containing the expanded document corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
diff --git a/src/main/resources/docgen/templates/dl19-doc-docTTTTTquery-per-passage.template b/src/main/resources/docgen/templates/dl19-doc-segmented-docTTTTTquery.template
similarity index 74%
rename from src/main/resources/docgen/templates/dl19-doc-docTTTTTquery-per-passage.template
rename to src/main/resources/docgen/templates/dl19-doc-segmented-docTTTTTquery.template
index 2e1051c649..46a06a0edd 100644
--- a/src/main/resources/docgen/templates/dl19-doc-docTTTTTquery-per-passage.template
+++ b/src/main/resources/docgen/templates/dl19-doc-segmented-docTTTTTquery.template
@@ -1,4 +1,4 @@
-# Anserini: Regressions for [DL19 (Doc)](https://trec.nist.gov/data/deep2019.html) w/ per-passage docTTTTTquery
+# Anserini: Regressions for [DL19 (Doc)](https://trec.nist.gov/data/deep2019.html) Segmented w/ docTTTTTquery
 
 This page describes experiments, integrated into Anserini's regression testing framework, for the TREC 2019 Deep Learning Track (Document Ranking Task) on the MS MARCO document collection using relevance judgments from NIST.
 
@@ -10,12 +10,16 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
 + **Expansion Condition:** doc2query-T5
 
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
+In the passage (i.e., segment) indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
 
 The exact configurations for these regressions are stored in [this YAML file](${yaml}).
 Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
+
 ## Indexing
 
 Typical indexing command:
@@ -24,7 +28,8 @@ Typical indexing command:
 ${index_cmds}
 ```
 
-The directory `/path/to/msmarco-doc-docTTTTTquery-per-passage/` should be a directory containing the expanded document collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-segmented-docTTTTTquery/` should be a directory containing the expanded segmented corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
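The MaxP aggregation described in this template (take the score of the highest-scoring passage as the document score) can be sketched as follows. This is a minimal illustration and not Anserini's implementation: the function name `maxp` is hypothetical, and the `docid#segment` form of the passage ids is an assumption made for the example.

```python
def maxp(passage_hits, k=1000):
    """Collapse a passage ranking into a document ranking with MaxP.

    passage_hits: iterable of (passage_id, score) pairs, where passage_id is
    assumed to look like 'D123#4' (document id, '#', segment number).
    Returns up to k (docid, score) pairs sorted by descending score.
    """
    best = {}
    for passage_id, score in passage_hits:
        docid = passage_id.split('#', 1)[0]
        # Keep only the highest-scoring passage for each document.
        if docid not in best or score > best[docid]:
            best[docid] = score
    # Sort by score (descending), breaking ties by docid for determinism.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


hits = [('D1#0', 3.2), ('D2#0', 4.1), ('D1#1', 5.0), ('D2#3', 1.7)]
print(maxp(hits))  # [('D1', 5.0), ('D2', 4.1)]
```

Because several passages from one document collapse into a single entry, the passage run generally has to be retrieved deeper than the number of documents wanted in the final ranking.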
diff --git a/src/main/resources/docgen/templates/dl19-doc-per-passage.template b/src/main/resources/docgen/templates/dl19-doc-segmented.template
similarity index 75%
rename from src/main/resources/docgen/templates/dl19-doc-per-passage.template
rename to src/main/resources/docgen/templates/dl19-doc-segmented.template
index 138f722c43..2f225d5b82 100644
--- a/src/main/resources/docgen/templates/dl19-doc-per-passage.template
+++ b/src/main/resources/docgen/templates/dl19-doc-segmented.template
@@ -1,4 +1,4 @@
-# Anserini: Regressions for [DL19 (Doc)](https://trec.nist.gov/data/deep2019.html)
+# Anserini: Regressions for [DL19 (Doc)](https://trec.nist.gov/data/deep2019.html) Segmented
 
 This page describes experiments, integrated into Anserini's regression testing framework, for the TREC 2019 Deep Learning Track (Document Ranking Task) on the MS MARCO document collection using relevance judgments from NIST.
 
@@ -10,12 +10,16 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
 + **Expansion Condition:** none
 
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
+In the passage (i.e., segment) indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
 
 The exact configurations for these regressions are stored in [this YAML file](${yaml}).
 Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
+
 ## Indexing
 
 Typical indexing command:
@@ -24,7 +28,8 @@ Typical indexing command:
 ${index_cmds}
 ```
 
-The directory `/path/to/msmarco-doc-per-passage/` should be a directory containing the segmented paragraph collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-segmented/` should be a directory containing the segmented corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
diff --git a/src/main/resources/docgen/templates/dl19-doc.template b/src/main/resources/docgen/templates/dl19-doc.template
index bebfa2142d..5a5292ee7b 100644
--- a/src/main/resources/docgen/templates/dl19-doc.template
+++ b/src/main/resources/docgen/templates/dl19-doc.template
@@ -10,11 +10,15 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is treated as a unit of indexing
 + **Expansion Condition:** none
 
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
 
 The exact configurations for these regressions are stored in [this YAML file](${yaml}).
 Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
+
 ## Indexing
 
 Typical indexing command:
@@ -23,7 +27,8 @@ Typical indexing command:
 ${index_cmds}
 ```
 
-The directory `/path/to/msmarco-doc/` should be a directory containing the official document collection (a single file), in TREC format.
+The directory `/path/to/msmarco-doc/` should be a directory containing the document corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
diff --git a/src/main/resources/docgen/templates/dl20-doc-docTTTTTquery-per-doc.template b/src/main/resources/docgen/templates/dl20-doc-docTTTTTquery.template
similarity index 80%
rename from src/main/resources/docgen/templates/dl20-doc-docTTTTTquery-per-doc.template
rename to src/main/resources/docgen/templates/dl20-doc-docTTTTTquery.template
index 5864d15613..01541f4074 100644
--- a/src/main/resources/docgen/templates/dl20-doc-docTTTTTquery-per-doc.template
+++ b/src/main/resources/docgen/templates/dl20-doc-docTTTTTquery.template
@@ -1,4 +1,4 @@
-# Anserini: Regressions for [DL20 (Doc)](https://trec.nist.gov/data/deep2020.html) w/ per-doc docTTTTTquery
+# Anserini: Regressions for [DL20 (Doc)](https://trec.nist.gov/data/deep2020.html) w/ docTTTTTquery
 
 This page describes experiments, integrated into Anserini's regression testing framework, for the TREC 2020 Deep Learning Track (Document Ranking Task) on the MS MARCO document collection using relevance judgments from NIST.
 
@@ -10,11 +10,15 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is treated as a unit of indexing
 + **Expansion Condition:** doc2query-T5
 
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
 
 The exact configurations for these regressions are stored in [this YAML file](${yaml}).
 Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
+
 ## Indexing
 
 Typical indexing command:
@@ -23,7 +27,8 @@ Typical indexing command:
 ${index_cmds}
 ```
 
-The directory `/path/to/msmarco-doc-docTTTTTquery-per-doc/` should be a directory containing the expanded document collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-docTTTTTquery/` should be a directory containing the expanded document corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
diff --git a/src/main/resources/docgen/templates/dl20-doc-docTTTTTquery-per-passage.template b/src/main/resources/docgen/templates/dl20-doc-segmented-docTTTTTquery.template
similarity index 74%
rename from src/main/resources/docgen/templates/dl20-doc-docTTTTTquery-per-passage.template
rename to src/main/resources/docgen/templates/dl20-doc-segmented-docTTTTTquery.template
index 9dcb73b3f9..a5fafdbcf4 100644
--- a/src/main/resources/docgen/templates/dl20-doc-docTTTTTquery-per-passage.template
+++ b/src/main/resources/docgen/templates/dl20-doc-segmented-docTTTTTquery.template
@@ -1,4 +1,4 @@
-# Anserini: Regressions for [DL20 (Doc)](https://trec.nist.gov/data/deep2020.html) w/ per-passage docTTTTTquery
+# Anserini: Regressions for [DL20 (Doc)](https://trec.nist.gov/data/deep2020.html) Segmented w/ docTTTTTquery
 
 This page describes experiments, integrated into Anserini's regression testing framework, for the TREC 2020 Deep Learning Track (Document Ranking Task) on the MS MARCO document collection using relevance judgments from NIST.
 
@@ -10,12 +10,16 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
 + **Expansion Condition:** doc2query-T5
 
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
+In the passage (i.e., segment) indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
 
 The exact configurations for these regressions are stored in [this YAML file](${yaml}).
 Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
+
 ## Indexing
 
 Typical indexing command:
@@ -24,7 +28,8 @@ Typical indexing command:
 ${index_cmds}
 ```
 
-The directory `/path/to/msmarco-doc-docTTTTTquery-per-passage/` should be a directory containing the expanded document collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-segmented-docTTTTTquery/` should be a directory containing the expanded segmented corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
diff --git a/src/main/resources/docgen/templates/dl20-doc-per-passage.template b/src/main/resources/docgen/templates/dl20-doc-segmented.template
similarity index 75%
rename from src/main/resources/docgen/templates/dl20-doc-per-passage.template
rename to src/main/resources/docgen/templates/dl20-doc-segmented.template
index 9d3093c0fa..27bbeaa5f9 100644
--- a/src/main/resources/docgen/templates/dl20-doc-per-passage.template
+++ b/src/main/resources/docgen/templates/dl20-doc-segmented.template
@@ -1,4 +1,4 @@
-# Anserini: Regressions for [DL20 (Doc)](https://trec.nist.gov/data/deep2020.html)
+# Anserini: Regressions for [DL20 (Doc)](https://trec.nist.gov/data/deep2020.html) Segmented
 
 This page describes experiments, integrated into Anserini's regression testing framework, for the TREC 2020 Deep Learning Track (Document Ranking Task) on the MS MARCO document collection using relevance judgments from NIST.
 
@@ -10,12 +10,16 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
 + **Expansion Condition:** none
 
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
+In the passage (i.e., segment) indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
 
 The exact configurations for these regressions are stored in [this YAML file](${yaml}).
 Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
+
 ## Indexing
 
 Typical indexing command:
@@ -24,7 +28,8 @@ Typical indexing command:
 ${index_cmds}
 ```
 
-The directory `/path/to/msmarco-doc-per-passage/` should be a directory containing the segmented paragraph collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-segmented/` should be a directory containing the segmented corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
diff --git a/src/main/resources/docgen/templates/dl20-doc.template b/src/main/resources/docgen/templates/dl20-doc.template
index 679201a0bc..d39af387e9 100644
--- a/src/main/resources/docgen/templates/dl20-doc.template
+++ b/src/main/resources/docgen/templates/dl20-doc.template
@@ -10,11 +10,15 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is treated as a unit of indexing
 + **Expansion Condition:** none
 
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
 
 The exact configurations for these regressions are stored in [this YAML file](${yaml}).
 Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
+
 ## Indexing
 
 Typical indexing command:
@@ -23,7 +27,8 @@ Typical indexing command:
 ${index_cmds}
 ```
 
-The directory `/path/to/msmarco-doc/` should be a directory containing the official document collection (a single file), in TREC format.
+The directory `/path/to/msmarco-doc/` should be a directory containing the document corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
diff --git a/src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery-per-passage-v3.template b/src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery-per-passage-v3.template
deleted file mode 100644
index 77ff70f2dd..0000000000
--- a/src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery-per-passage-v3.template
+++ /dev/null
@@ -1,106 +0,0 @@
-# Anserini: Regressions for MS MARCO Document Ranking
-
-This page documents regression experiments for the [MS MARCO document ranking task](https://github.com/microsoft/MSMARCO-Document-Ranking), which is integrated into Anserini's regression testing framework.
-Note that there are four different regression conditions for this task, and this page describes the following:
-
-+ **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
-+ **Expansion Condition:** doc2query-T5
-
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
-
-**NOTE**: This is the `msmarco-doc-docTTTTTquery-per-passage-v3` variant (there's also `msmarco-doc-docTTTTTquery-per-passage`), see [this page](experiments-msmarco-doc-doc2query-details.md) for detailed notes about differences between these variants.
-
-The exact configurations for these regressions are stored in [this YAML file](${yaml}).
-Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
-
-## Indexing
-
-Typical indexing command:
-
-```
-${index_cmds}
-```
-
-The directory `/path/to/msmarco-doc-docTTTTTquery-per-passage/` should be a directory containing the expanded document collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
-
-For additional details, see explanation of [common indexing options](common-indexing-options.md).
-
-## Retrieval
-
-Topics and qrels are stored in [`src/main/resources/topics-and-qrels/`](../src/main/resources/topics-and-qrels/).
-The regression experiments here evaluate on the 5193 dev set questions.
-
-After indexing has completed, you should be able to perform retrieval as follows:
-
-```
-${ranking_cmds}
-```
-
-Evaluation can be performed using `trec_eval`:
-
-```
-${eval_cmds}
-```
-
-## Effectiveness
-
-With the above commands, you should be able to reproduce the following results:
-
-${effectiveness}
-
-Explanation of settings:
-
-+ The setting "default" refers the default BM25 settings of `k1=0.9`, `b=0.4`.
-+ The setting "tuned" refers to `k1=2.56`, `b=0.59`, tuned to optimize for recall@100 (i.e., for first-stage retrieval) on 2019/12.
-
-In these runs, we are retrieving the top 1000 hits for each query and using `trec_eval` to evaluate all 1000 hits.
-Since we're in the passage condition, we fetch the 10000 passages and select the top 1000 documents using MaxP.
-This lets us measure R@100 and R@1000; the latter is particularly important when these runs are used as first-stage retrieval.
-Beware, an official MS MARCO document ranking task leaderboard submission comprises only 100 hits per query.
-See [this page](experiments-msmarco-doc-leaderboard.md) for details on Anserini baseline runs that were submitted to the official leaderboard.
-
-The MaxP passage retrieval functionality is available in `SearchCollection`.
-To generate an MS MARCO submission with the BM25 default parameters, corresponding to "BM25 (default)" above:
-
-```bash
-$ target/appassembler/bin/SearchCollection -topicreader TsvString \
-   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt \
-   -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage-v3.pos+docvectors+raw \
-   -output runs/run.msmarco-doc-docTTTTTquery-per-passage-v3.bm25-default.txt -format msmarco \
-   -bm25 -bm25.k1 0.9 -bm25.b 0.4 -hits 1000 \
-   -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
-
-$ python tools/scripts/msmarco/msmarco_doc_eval.py \
-   --judgments src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt \
-   --run runs/run.msmarco-doc-docTTTTTquery-per-passage-v3.bm25-default.txt
-
-#####################
-MRR @100: 0.317905445196054
-QueriesRanked: 5193
-#####################
-```
-
-Note that the above command uses `-format msmarco` to directly generate a run in the MS MARCO output format.
-
-To generate an MS MARCO submission with the BM25 tuned parameters, corresponding to "BM25 (tuned)" above:
-
-```bash
-$ target/appassembler/bin/SearchCollection -topicreader TsvString \
-   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt \
-   -index indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage-v3.pos+docvectors+raw \
-   -output runs/run.msmarco-doc-docTTTTTquery-per-passage-v3.bm25-tuned.txt -format msmarco \
-   -bm25 -bm25.k1 2.56 -bm25.b 0.59 -hits 1000 \
-   -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
-
-$ python tools/scripts/msmarco/msmarco_doc_eval.py \
-   --judgments src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt \
-   --run runs/run.msmarco-doc-docTTTTTquery-per-passage-v3.bm25-tuned.txt
-
-#####################
-MRR @100: 0.3209184381409182
-QueriesRanked: 5193
-#####################
-```
-
-Again, note that the above command uses `-format msmarco` to directly generate a run in the MS MARCO output format.
\ No newline at end of file
diff --git a/src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery-per-doc.template b/src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery.template
similarity index 85%
rename from src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery-per-doc.template
rename to src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery.template
index e447ab706d..17cff0dab3 100644
--- a/src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery-per-doc.template
+++ b/src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery.template
@@ -6,11 +6,15 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is treated as a unit of indexing
 + **Expansion Condition:** doc2query-T5
 
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
 
 The exact configurations for these regressions are stored in [this YAML file](${yaml}).
 Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
+
 ## Indexing
 
 Typical indexing command:
@@ -19,7 +23,8 @@ Typical indexing command:
 ${index_cmds}
 ```
 
-The directory `/path/to/msmarco-doc-docTTTTTquery-per-doc/` should be a directory containing the expanded document collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-docTTTTTquery/` should be a directory containing the expanded document corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
diff --git a/src/main/resources/docgen/templates/msmarco-doc-per-passage-v2.template b/src/main/resources/docgen/templates/msmarco-doc-per-passage-v2.template
deleted file mode 100644
index 9af82d7f6c..0000000000
--- a/src/main/resources/docgen/templates/msmarco-doc-per-passage-v2.template
+++ /dev/null
@@ -1,106 +0,0 @@
-# Anserini: Regressions for MS MARCO Document Ranking
-
-This page documents regression experiments for the [MS MARCO document ranking task](https://github.com/microsoft/MSMARCO-Document-Ranking), which is integrated into Anserini's regression testing framework.
-Note that there are four different regression conditions for this task, and this page describes the following:
-
-+ **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
-+ **Expansion Condition:** none
-
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
-
-**NOTE**: This is the `msmarco-doc-per-passage-v2` variant (there's also `msmarco-doc-per-passage` and `msmarco-doc-per-passage-v3`), see [this page](experiments-msmarco-doc-doc2query-details.md) for detailed notes about differences between these variants.
-
-The exact configurations for these regressions are stored in [this YAML file](${yaml}).
-Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
-
-## Indexing
-
-Typical indexing command:
-
-```
-${index_cmds}
-```
-
-The directory `/path/to/msmarco-doc-per-passage/` should be a directory containing the segmented paragraph collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
-
-For additional details, see explanation of [common indexing options](common-indexing-options.md).
-
-## Retrieval
-
-Topics and qrels are stored in [`src/main/resources/topics-and-qrels/`](../src/main/resources/topics-and-qrels/).
-The regression experiments here evaluate on the 5193 dev set questions.
-
-After indexing has completed, you should be able to perform retrieval as follows:
-
-```
-${ranking_cmds}
-```
-
-Evaluation can be performed using `trec_eval`:
-
-```
-${eval_cmds}
-```
-
-## Effectiveness
-
-With the above commands, you should be able to reproduce the following results:
-
-${effectiveness}
-
-Explanation of settings:
-
-+ The setting "default" refers the default BM25 settings of `k1=0.9`, `b=0.4`.
-+ The setting "tuned" refers to `k1=2.16`, `b=0.61`, tuned to optimize for recall@100 (i.e., for first-stage retrieval) on 2019/12.
-
-In these runs, we are retrieving the top 1000 hits for each query and using `trec_eval` to evaluate all 1000 hits.
-Since we're in the passage condition, we fetch the 10000 passages and select the top 1000 documents using MaxP.
-This lets us measure R@100 and R@1000; the latter is particularly important when these runs are used as first-stage retrieval.
-Beware, an official MS MARCO document ranking task leaderboard submission comprises only 100 hits per query.
-See [this page](experiments-msmarco-doc-leaderboard.md) for details on Anserini baseline runs that were submitted to the official leaderboard.
-
-The MaxP passage retrieval functionality is available in `SearchCollection`.
-To generate an MS MARCO submission with the BM25 default parameters, corresponding to "BM25 (default)" above:
-
-```bash
-$ target/appassembler/bin/SearchCollection -topicreader TsvString \
-   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt \
-   -index indexes/lucene-index.msmarco-doc-per-passage-v2.pos+docvectors+raw \
-   -output runs/run.msmarco-doc-per-passage-v2.bm25-default.txt -format msmarco \
-   -bm25 -bm25.k1 0.9 -bm25.b 0.4 -hits 1000 \
-   -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
-
-$ python tools/scripts/msmarco/msmarco_doc_eval.py \
-   --judgments src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt \
-   --run runs/run.msmarco-doc-per-passage-v2.bm25-default.txt
-
-#####################
-MRR @100: 0.26029445206377066
-QueriesRanked: 5193
-#####################
-```
-
-Note that the above command uses `-format msmarco` to directly generate a run in the MS MARCO output format.
-
-To generate an MS MARCO submission with the BM25 tuned parameters, corresponding to "BM25 (tuned)" above:
-
-```bash
-$ target/appassembler/bin/SearchCollection -topicreader TsvString \
-   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt \
-   -index indexes/lucene-index.msmarco-doc-per-passage-v2.pos+docvectors+raw \
-   -output runs/run.msmarco-doc-per-passage-v2.bm25-tuned.txt -format msmarco \
-   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -hits 1000 \
-   -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
-
-$ python tools/scripts/msmarco/msmarco_doc_eval.py \
-   --judgments src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt \
-   --run runs/run.msmarco-doc-per-passage-v2.bm25-tuned.txt
-
-#####################
-MRR @100: 0.2633426142578288
-QueriesRanked: 5193
-#####################
-```
-
-Again, note that the above command uses `-format msmarco` to directly generate a run in the MS MARCO output format.
diff --git a/src/main/resources/docgen/templates/msmarco-doc-per-passage-v3.template b/src/main/resources/docgen/templates/msmarco-doc-per-passage-v3.template
deleted file mode 100644
index 92c4ec6507..0000000000
--- a/src/main/resources/docgen/templates/msmarco-doc-per-passage-v3.template
+++ /dev/null
@@ -1,106 +0,0 @@
-# Anserini: Regressions for MS MARCO Document Ranking
-
-This page documents regression experiments for the [MS MARCO document ranking task](https://github.com/microsoft/MSMARCO-Document-Ranking), which is integrated into Anserini's regression testing framework.
-Note that there are four different regression conditions for this task, and this page describes the following:
-
-+ **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
-+ **Expansion Condition:** none
-
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
-
-**NOTE**: This is the `msmarco-doc-per-passage-v3` variant (there's also `msmarco-doc-per-passage` and `msmarco-doc-per-passage-v2`), see [this page](experiments-msmarco-doc-doc2query-details.md) for detailed notes about differences between these variants.
-
-The exact configurations for these regressions are stored in [this YAML file](${yaml}).
-Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
-
-## Indexing
-
-Typical indexing command:
-
-```
-${index_cmds}
-```
-
-The directory `/path/to/msmarco-doc-per-passage/` should be a directory containing the segmented paragraph collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
-
-For additional details, see explanation of [common indexing options](common-indexing-options.md).
-
-## Retrieval
-
-Topics and qrels are stored in [`src/main/resources/topics-and-qrels/`](../src/main/resources/topics-and-qrels/).
-The regression experiments here evaluate on the 5193 dev set questions.
-
-After indexing has completed, you should be able to perform retrieval as follows:
-
-```
-${ranking_cmds}
-```
-
-Evaluation can be performed using `trec_eval`:
-
-```
-${eval_cmds}
-```
-
-## Effectiveness
-
-With the above commands, you should be able to reproduce the following results:
-
-${effectiveness}
-
-Explanation of settings:
-
-+ The setting "default" refers the default BM25 settings of `k1=0.9`, `b=0.4`.
-+ The setting "tuned" refers to `k1=2.16`, `b=0.61`, tuned to optimize for recall@100 (i.e., for first-stage retrieval) on 2019/12.
-
-In these runs, we are retrieving the top 1000 hits for each query and using `trec_eval` to evaluate all 1000 hits.
-Since we're in the passage condition, we fetch the 10000 passages and select the top 1000 documents using MaxP.
-This lets us measure R@100 and R@1000; the latter is particularly important when these runs are used as first-stage retrieval.
-Beware, an official MS MARCO document ranking task leaderboard submission comprises only 100 hits per query.
-See [this page](experiments-msmarco-doc-leaderboard.md) for details on Anserini baseline runs that were submitted to the official leaderboard.
-
-The MaxP passage retrieval functionality is available in `SearchCollection`.
-To generate an MS MARCO submission with the BM25 default parameters, corresponding to "BM25 (default)" above:
-
-```bash
-$ target/appassembler/bin/SearchCollection -topicreader TsvString \
-   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt \
-   -index indexes/lucene-index.msmarco-doc-per-passage-v3.pos+docvectors+raw \
-   -output runs/run.msmarco-doc-per-passage-v3.bm25-default.txt -format msmarco \
-   -bm25 -bm25.k1 0.9 -bm25.b 0.4 -hits 1000 \
-   -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
-
-$ python tools/scripts/msmarco/msmarco_doc_eval.py \
-   --judgments src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt \
-   --run runs/run.msmarco-doc-per-passage-v3.bm25-default.txt
-
-#####################
-MRR @100: 0.26851990908986706
-QueriesRanked: 5193
-#####################
-```
-
-Note that the above command uses `-format msmarco` to directly generate a run in the MS MARCO output format.
-
-To generate an MS MARCO submission with the BM25 tuned parameters, corresponding to "BM25 (tuned)" above:
-
-```bash
-$ target/appassembler/bin/SearchCollection -topicreader TsvString \
-   -topics src/main/resources/topics-and-qrels/topics.msmarco-doc.dev.txt \
-   -index indexes/lucene-index.msmarco-doc-per-passage-v3.pos+docvectors+raw \
-   -output runs/run.msmarco-doc-per-passage-v3.bm25-tuned.txt -format msmarco \
-   -bm25 -bm25.k1 2.16 -bm25.b 0.61 -hits 1000 \
-   -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
-
-$ python tools/scripts/msmarco/msmarco_doc_eval.py \
-   --judgments src/main/resources/topics-and-qrels/qrels.msmarco-doc.dev.txt \
-   --run runs/run.msmarco-doc-per-passage-v3.bm25-tuned.txt
-
-#####################
-MRR @100: 0.27551963417683756
-QueriesRanked: 5193
-#####################
-```
-
-Again, note that the above command uses `-format msmarco` to directly generate a run in the MS MARCO output format.
diff --git a/src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery-per-passage.template b/src/main/resources/docgen/templates/msmarco-doc-segmented-docTTTTTquery.template
similarity index 85%
rename from src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery-per-passage.template
rename to src/main/resources/docgen/templates/msmarco-doc-segmented-docTTTTTquery.template
index 31e1f226c5..b4a8243859 100644
--- a/src/main/resources/docgen/templates/msmarco-doc-docTTTTTquery-per-passage.template
+++ b/src/main/resources/docgen/templates/msmarco-doc-segmented-docTTTTTquery.template
@@ -6,14 +6,16 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
 + **Expansion Condition:** doc2query-T5
 
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
 All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
-
-**NOTE**: This is the `msmarco-doc-docTTTTTquery-per-passage` variant (there's also `msmarco-doc-docTTTTTquery-per-passage-v3`), see [this page](experiments-msmarco-doc-doc2query-details.md) for detailed notes about differences between these variants.
+In the passage (i.e., segment) indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
 
 The exact configurations for these regressions are stored in [this YAML file](${yaml}).
 Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
+
 ## Indexing
 
 Typical indexing command:
@@ -22,7 +24,8 @@ Typical indexing command:
 ${index_cmds}
 ```
 
-The directory `/path/to/msmarco-doc-docTTTTTquery-per-passage/` should be a directory containing the expanded document collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-segmented-docTTTTTquery/` should be a directory containing the expanded segmented corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
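
For reference, the MaxP aggregation described in the template above (each document takes the score of its highest-scoring passage) can be sketched as follows. This is a minimal illustration, not Anserini's implementation. The function name `max_passage`, the `(segment_id, score)` input shape, and the `D1#0`-style segment ids are assumptions made for the example. Only the `#` delimiter and the cutoff of 100 documents are taken from the `-selectMaxPassage.delimiter "#"` and `-selectMaxPassage.hits 100` parameters in the regression YAML files.

```python
from typing import Dict, Iterable, List, Tuple


def max_passage(
    hits: Iterable[Tuple[str, float]],
    delimiter: str = "#",
    k: int = 100,
) -> List[Tuple[str, float]]:
    """Collapse passage-level hits into a document ranking (MaxP).

    Each hit is (segment_id, score). The segment id is assumed to be
    "<docid><delimiter><segment number>", e.g. "D1#0" (hypothetical format).
    """
    best: Dict[str, float] = {}
    for segment_id, score in hits:
        # Everything before the first delimiter identifies the parent document.
        docid = segment_id.split(delimiter, 1)[0]
        # Keep only the highest passage score seen for each document.
        if docid not in best or score > best[docid]:
            best[docid] = score
    # Sort by descending score; break ties by docid so the output is deterministic.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical passage hits for three documents.
hits = [("D1#0", 7.2), ("D2#3", 9.1), ("D1#4", 8.5), ("D3#1", 6.0)]
print(max_passage(hits, k=2))
```

Retrieving many more passages than the final document cutoff (10000 versus 100 in the YAML parameters) leaves room for several passages from the same document to collapse into one entry while still filling the document ranking.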
diff --git a/src/main/resources/docgen/templates/msmarco-doc-per-passage.template b/src/main/resources/docgen/templates/msmarco-doc-segmented.template
similarity index 84%
rename from src/main/resources/docgen/templates/msmarco-doc-per-passage.template
rename to src/main/resources/docgen/templates/msmarco-doc-segmented.template
index d2bed3f107..b3336f9915 100644
--- a/src/main/resources/docgen/templates/msmarco-doc-per-passage.template
+++ b/src/main/resources/docgen/templates/msmarco-doc-segmented.template
@@ -6,14 +6,16 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is first segmented into passages, each passage is treated as a unit of indexing
 + **Expansion Condition:** none
 
-In the passage indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
-
-**NOTE**: This is the `msmarco-doc-per-passage` variant (there's also `msmarco-doc-per-passage-v2` and `msmarco-doc-per-passage-v3`), see [this page](experiments-msmarco-doc-doc2query-details.md) for detailed notes about differences between these variants.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
+In the passage (i.e., segment) indexing condition, we select the score of the highest-scoring passage from a document as the score for that document to produce a document ranking; this is known as the MaxP technique.
 
 The exact configurations for these regressions are stored in [this YAML file](${yaml}).
 Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
+
 ## Indexing
 
 Typical indexing command:
@@ -22,7 +24,8 @@ Typical indexing command:
 ${index_cmds}
 ```
 
-The directory `/path/to/msmarco-doc-per-passage/` should be a directory containing the segmented paragraph collection; see [this link](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini) for how to prepare this collection.
+The directory `/path/to/msmarco-doc-segmented/` should be a directory containing the segmented corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
diff --git a/src/main/resources/docgen/templates/msmarco-doc.template b/src/main/resources/docgen/templates/msmarco-doc.template
index 005a026f2d..371e266420 100644
--- a/src/main/resources/docgen/templates/msmarco-doc.template
+++ b/src/main/resources/docgen/templates/msmarco-doc.template
@@ -6,11 +6,15 @@ Note that there are four different regression conditions for this task, and this
 + **Indexing Condition:** each MS MARCO document is treated as a unit of indexing
 + **Expansion Condition:** none
 
-All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery#reproducing-ms-marco-document-ranking-results-with-anserini), in the context of doc2query-T5.
+All four conditions are described in detail [here](https://github.com/castorini/docTTTTTquery), in the context of doc2query-T5.
 
 The exact configurations for these regressions are stored in [this YAML file](${yaml}).
 Note that this page is automatically generated from [this template](${template}) as part of Anserini's regression pipeline, so do not modify this page directly; modify the template instead.
 
+Note that in November 2021 we discovered issues in our regression tests, documented [here](experiments-msmarco-doc-doc2query-details.md).
+As a result, we have had to rebuild all our regressions from the raw corpus.
+These new versions yield end-to-end scores that are slightly different, so if numbers reported in a paper do not exactly match the numbers here, this may be the reason.
+
 ## Indexing
 
 Typical indexing command:
@@ -19,7 +23,8 @@ Typical indexing command:
 ${index_cmds}
 ```
 
-The directory `/path/to/msmarco-doc/` should be a directory containing the official document collection (a single file), in TREC format.
+The directory `/path/to/msmarco-doc/` should be a directory containing the document corpus in Anserini's jsonl format.
+See [this page](experiments-msmarco-doc-doc2query-details.md) for how to prepare the corpus.
 
 For additional details, see explanation of [common indexing options](common-indexing-options.md).
 
diff --git a/src/main/resources/regression/backgroundlinking18.yaml b/src/main/resources/regression/backgroundlinking18.yaml
index ecd2fe3aab..8d1f41050a 100644
--- a/src/main/resources/regression/backgroundlinking18.yaml
+++ b/src/main/resources/regression/backgroundlinking18.yaml
@@ -2,7 +2,7 @@
 corpus: wapo.v2
 corpus_path: collections/newswire/WashingtonPost.v2/data/
 
-index_path: indexes/lucene-index.wapo.v2
+index_path: indexes/lucene-index.wapo.v2/
 collection_class: WashingtonPostCollection
 generator_class: WashingtonPostGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/backgroundlinking19.yaml b/src/main/resources/regression/backgroundlinking19.yaml
index 3b4ba22c70..8b41e4efb9 100644
--- a/src/main/resources/regression/backgroundlinking19.yaml
+++ b/src/main/resources/regression/backgroundlinking19.yaml
@@ -2,7 +2,7 @@
 corpus: wapo.v2
 corpus_path: collections/newswire/WashingtonPost.v2/data/
 
-index_path: indexes/lucene-index.wapo.v2
+index_path: indexes/lucene-index.wapo.v2/
 collection_class: WashingtonPostCollection
 generator_class: WashingtonPostGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/backgroundlinking20.yaml b/src/main/resources/regression/backgroundlinking20.yaml
index 6f968ba68c..071d7faa41 100644
--- a/src/main/resources/regression/backgroundlinking20.yaml
+++ b/src/main/resources/regression/backgroundlinking20.yaml
@@ -2,7 +2,7 @@
 corpus: wapo.v3
 corpus_path: collections/newswire/WashingtonPost.v3/data/
 
-index_path: indexes/lucene-index.wapo.v3
+index_path: indexes/lucene-index.wapo.v3/
 collection_class: WashingtonPostCollection
 generator_class: WashingtonPostGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/cacm.yaml b/src/main/resources/regression/cacm.yaml
index 900ef7ff84..1095f84b6a 100644
--- a/src/main/resources/regression/cacm.yaml
+++ b/src/main/resources/regression/cacm.yaml
@@ -2,7 +2,7 @@
 corpus: cacm
 corpus_path: src/main/resources/cacm/
 
-index_path: indexes/lucene-index.cacm
+index_path: indexes/lucene-index.cacm/
 collection_class: HtmlCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 8
diff --git a/src/main/resources/regression/car17v1.5.yaml b/src/main/resources/regression/car17v1.5.yaml
index caac95df2c..4ddbea3337 100644
--- a/src/main/resources/regression/car17v1.5.yaml
+++ b/src/main/resources/regression/car17v1.5.yaml
@@ -2,7 +2,7 @@
 corpus: car-paragraphCorpus.v1.5
 corpus_path: collections/car/paragraphCorpus.v1.5/
 
-index_path: indexes/lucene-index.car-paragraphCorpus.v1.5
+index_path: indexes/lucene-index.car-paragraphCorpus.v1.5/
 collection_class: CarCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/car17v2.0-doc2query.yaml b/src/main/resources/regression/car17v2.0-doc2query.yaml
index ee8274e868..6a52c9c5d1 100644
--- a/src/main/resources/regression/car17v2.0-doc2query.yaml
+++ b/src/main/resources/regression/car17v2.0-doc2query.yaml
@@ -2,7 +2,7 @@
 corpus: car-paragraphCorpus.v2.0-doc2query
 corpus_path: collections/car/paragraphCorpus.v2.0-expanded-topk10/
 
-index_path: indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query
+index_path: indexes/lucene-index.car-paragraphCorpus.v2.0-doc2query/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 30
diff --git a/src/main/resources/regression/car17v2.0.yaml b/src/main/resources/regression/car17v2.0.yaml
index eed8c60a59..95ae7e54b9 100644
--- a/src/main/resources/regression/car17v2.0.yaml
+++ b/src/main/resources/regression/car17v2.0.yaml
@@ -2,7 +2,7 @@
 corpus: car-paragraphCorpus.v2.0
 corpus_path: collections/car/paragraphCorpus.v2.0/
 
-index_path: indexes/lucene-index.car-paragraphCorpus.v2.0
+index_path: indexes/lucene-index.car-paragraphCorpus.v2.0/
 collection_class: CarCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/clef06-fr.yaml b/src/main/resources/regression/clef06-fr.yaml
index fbe2abc5f5..731aff95d5 100644
--- a/src/main/resources/regression/clef06-fr.yaml
+++ b/src/main/resources/regression/clef06-fr.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: clef06-fr
-corpus_path: collections/newswire/clir/clef2006-fr.json
+corpus_path: collections/newswire/clir/clef2006-fr.json/
 
-index_path: indexes/lucene-index.clef06-fr
+index_path: indexes/lucene-index.clef06-fr/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/core17.yaml b/src/main/resources/regression/core17.yaml
index 7f21294f28..b078c52eb6 100644
--- a/src/main/resources/regression/core17.yaml
+++ b/src/main/resources/regression/core17.yaml
@@ -2,7 +2,7 @@
 corpus: nyt
 corpus_path: collections/newswire/NYTcorpus/
 
-index_path: indexes/lucene-index.nyt
+index_path: indexes/lucene-index.nyt/
 collection_class: NewYorkTimesCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/core18.yaml b/src/main/resources/regression/core18.yaml
index 4220c6d75a..f925461df8 100644
--- a/src/main/resources/regression/core18.yaml
+++ b/src/main/resources/regression/core18.yaml
@@ -2,7 +2,7 @@
 corpus: wapo.v2
 corpus_path: collections/newswire/WashingtonPost.v2/data/
 
-index_path: indexes/lucene-index.wapo.v2
+index_path: indexes/lucene-index.wapo.v2/
 collection_class: WashingtonPostCollection
 generator_class: WashingtonPostGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/cw09b.yaml b/src/main/resources/regression/cw09b.yaml
index cbf8c973af..55f84805c7 100644
--- a/src/main/resources/regression/cw09b.yaml
+++ b/src/main/resources/regression/cw09b.yaml
@@ -2,7 +2,7 @@
 corpus: cw09b
 corpus_path: collections/web/ClueWeb09b/
 
-index_path: indexes/lucene-index.cw09b
+index_path: indexes/lucene-index.cw09b/
 collection_class: ClueWeb09Collection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 44
diff --git a/src/main/resources/regression/cw12.yaml b/src/main/resources/regression/cw12.yaml
index 0fdc024cbe..754f52d502 100644
--- a/src/main/resources/regression/cw12.yaml
+++ b/src/main/resources/regression/cw12.yaml
@@ -2,7 +2,7 @@
 corpus: cw12
 corpus_path: collections/web/ClueWeb12/
 
-index_path: indexes/lucene-index.cw12
+index_path: indexes/lucene-index.cw12/
 collection_class: ClueWeb12Collection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 44
diff --git a/src/main/resources/regression/cw12b13.yaml b/src/main/resources/regression/cw12b13.yaml
index 8ac14fb729..284ae49312 100644
--- a/src/main/resources/regression/cw12b13.yaml
+++ b/src/main/resources/regression/cw12b13.yaml
@@ -2,7 +2,7 @@
 corpus: cw12b13
 corpus_path: collections/web/ClueWeb12-B13/
 
-index_path: indexes/lucene-index.cw12b13
+index_path: indexes/lucene-index.cw12b13/
 collection_class: ClueWeb12Collection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 44
diff --git a/src/main/resources/regression/disk12.yaml b/src/main/resources/regression/disk12.yaml
index a5dd12302e..a8dcd9e810 100644
--- a/src/main/resources/regression/disk12.yaml
+++ b/src/main/resources/regression/disk12.yaml
@@ -2,7 +2,7 @@
 corpus: disk12
 corpus_path: collections/newswire/disk12/
 
-index_path: indexes/lucene-index.disk12
+index_path: indexes/lucene-index.disk12/
 collection_class: TrecCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/disk45.yaml b/src/main/resources/regression/disk45.yaml
index 4d4e893635..9fdf8fc7cc 100644
--- a/src/main/resources/regression/disk45.yaml
+++ b/src/main/resources/regression/disk45.yaml
@@ -2,7 +2,7 @@
 corpus: disk45
 corpus_path: collections/newswire/disk45/
 
-index_path: indexes/lucene-index.disk45
+index_path: indexes/lucene-index.disk45/
 collection_class: TrecCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/dl19-doc-docTTTTTquery-per-doc.yaml b/src/main/resources/regression/dl19-doc-docTTTTTquery.yaml
similarity index 83%
rename from src/main/resources/regression/dl19-doc-docTTTTTquery-per-doc.yaml
rename to src/main/resources/regression/dl19-doc-docTTTTTquery.yaml
index a47bc460f8..bdb464b733 100644
--- a/src/main/resources/regression/dl19-doc-docTTTTTquery-per-doc.yaml
+++ b/src/main/resources/regression/dl19-doc-docTTTTTquery.yaml
@@ -1,16 +1,16 @@
 ---
-corpus: msmarco-doc-docTTTTTquery-per-doc
-corpus_path: collections/msmarco/doc-docTTTTTquery-per-doc
+corpus: msmarco-doc-docTTTTTquery
+corpus_path: collections/msmarco/msmarco-doc-docTTTTTquery/
 
-index_path: indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc
+index_path: indexes/lucene-index.msmarco-doc-docTTTTTquery/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
-index_threads: 1
+index_threads: 7
 index_options: -storePositions -storeDocvectors -storeRaw
 index_stats:
-  documents: 3213834
-  documents (non-empty): 3213834
-  total terms: 3748332076
+  documents: 3213835
+  documents (non-empty): 3213835
+  total terms: 3748333319
 
 metrics:
   - metric: MAP
@@ -50,7 +50,7 @@ models:
     params: -bm25 -hits 100  # Note, this is different DL 2019 passage ranking!
     results:
       MAP:
-        - 0.2699
+        - 0.2700
       nDCG@10:
         - 0.5968
       R@100:
@@ -60,9 +60,9 @@ models:
     params: -bm25 -rm3 -hits 100  # Note, this is different DL 2019 passage ranking!
     results:
       MAP:
-        - 0.3044
+        - 0.3045
       nDCG@10:
-        - 0.5895
+        - 0.5897
       R@100:
         - 0.4465
   - name: bm25-tuned
@@ -72,7 +72,7 @@ models:
       MAP:
         - 0.2620
       nDCG@10:
-        - 0.5967
+        - 0.5972
       R@100:
         - 0.3992
   - name: bm25-tuned+rm3
@@ -80,8 +80,8 @@ models:
     params: -bm25 -bm25.k1 4.68 -bm25.b 0.87 -rm3 -hits 100  # Note, this is different DL 2019 passage ranking!
     results:
       MAP:
-        - 0.2812
+        - 0.2814
       nDCG@10:
-        - 0.6075
+        - 0.6080
       R@100:
         - 0.4119
\ No newline at end of file
diff --git a/src/main/resources/regression/dl19-doc-docTTTTTquery-per-passage.yaml b/src/main/resources/regression/dl19-doc-segmented-docTTTTTquery.yaml
similarity index 79%
rename from src/main/resources/regression/dl19-doc-docTTTTTquery-per-passage.yaml
rename to src/main/resources/regression/dl19-doc-segmented-docTTTTTquery.yaml
index f6f2b63fc1..25253cda3a 100644
--- a/src/main/resources/regression/dl19-doc-docTTTTTquery-per-passage.yaml
+++ b/src/main/resources/regression/dl19-doc-segmented-docTTTTTquery.yaml
@@ -1,16 +1,16 @@
 ---
-corpus: msmarco-doc-docTTTTTquery-per-passage
-corpus_path: collections/msmarco/doc-docTTTTTquery-per-passage
+corpus: msmarco-doc-segmented-docTTTTTquery
+corpus_path: collections/msmarco/msmarco-doc-segmented-docTTTTTquery/
 
-index_path: indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage
+index_path: indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
-index_threads: 1
+index_threads: 16
 index_options: -storePositions -storeDocvectors -storeRaw
 index_stats:
-  documents: 20544550
-  documents (non-empty): 20544550
-  total terms: 4203956960
+  documents: 20545677
+  documents (non-empty): 20545677
+  total terms: 4206639543
 
 metrics:
   - metric: MAP
@@ -50,38 +50,38 @@ models:
     params: -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.2791
+        - 0.2798
       nDCG@10:
-        - 0.6099
+        - 0.6119
       R@100:
-        - 0.4092
+        - 0.4093
   - name: bm25-default+rm3
     display: +RM3
     params: -bm25 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.3025
+        - 0.3021
       nDCG@10:
-        - 0.6318
+        - 0.6297
       R@100:
-        - 0.4394
+        - 0.4392
   - name: bm25-tuned
     display: BM25 (tuned)
     params: -bm25 -bm25.k1 2.56 -bm25.b 0.59 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.2655
+        - 0.2658
       nDCG@10:
-        - 0.6271
+        - 0.6273
       R@100:
-        - 0.4020
+        - 0.4026
   - name: bm25-tuned+rm3
     display: +RM3
     params: -bm25 -bm25.k1 2.56 -bm25.b 0.59 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.2895
+        - 0.2893
       nDCG@10:
-        - 0.6256
+        - 0.6239
       R@100:
-        - 0.4235
\ No newline at end of file
+        - 0.4237
\ No newline at end of file
diff --git a/src/main/resources/regression/dl19-doc-per-passage.yaml b/src/main/resources/regression/dl19-doc-segmented.yaml
similarity index 82%
rename from src/main/resources/regression/dl19-doc-per-passage.yaml
rename to src/main/resources/regression/dl19-doc-segmented.yaml
index 478684d48d..ce5b32ce9d 100644
--- a/src/main/resources/regression/dl19-doc-per-passage.yaml
+++ b/src/main/resources/regression/dl19-doc-segmented.yaml
@@ -1,16 +1,16 @@
 ---
-corpus: msmarco-doc-per-passage
-corpus_path: collections/msmarco/doc-per-passage/
+corpus: msmarco-doc-segmented
+corpus_path: collections/msmarco/msmarco-doc-segmented/
 
-index_path: indexes/lucene-index.msmarco-doc-per-passage
+index_path: indexes/lucene-index.msmarco-doc-segmented/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
-index_threads: 1
+index_threads: 16
 index_options: -storePositions -storeDocvectors -storeRaw
 index_stats:
-  documents: 20544550
-  documents (non-empty): 20544550
-  total terms: 3197886407
+  documents: 20545677
+  documents (non-empty): 20545677
+  total terms: 3200515914
 
 metrics:
   - metric: MAP
@@ -50,9 +50,9 @@ models:
     params: -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.2441
+        - 0.2449
       nDCG@10:
-        - 0.5276
+        - 0.5302
       R@100:
         - 0.3840
   - name: bm25-default+rm3
@@ -60,39 +60,39 @@ models:
     params: -bm25 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.2880
+        - 0.2884
       nDCG@10:
-        - 0.5750
+        - 0.5764
       R@100:
-        - 0.4356
+        - 0.4355
   - name: bm25-default+ax
     display: +Ax
     params: -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.3015
+        - 0.2981
       nDCG@10:
-        - 0.5590
+        - 0.5556
       R@100:
-        - 0.4501
+        - 0.4490
   - name: bm25-default+prf
     display: +PRF
     params: -bm25 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.2821
+        - 0.2827
       nDCG@10:
-        - 0.5591
+        - 0.5599
       R@100:
-        - 0.4477
+        - 0.4476
   - name: bm25-tuned
     display: BM25 (tuned)
     params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.2394
+        - 0.2398
       nDCG@10:
-        - 0.5364
+        - 0.5389
       R@100:
         - 0.3903
   - name: bm25-tuned+rm3
@@ -100,28 +100,28 @@ models:
     params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.2656
+        - 0.2658
       nDCG@10:
-        - 0.5379
+        - 0.5405
       R@100:
-        - 0.4126
+        - 0.4133
   - name: bm25-tuned+ax
     display: +Ax
     params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.2934
+        - 0.2975
       nDCG@10:
-        - 0.5546
+        - 0.5574
       R@100:
-        - 0.4437
+        - 0.4491
   - name: bm25-tuned+prf
     display: +PRF
     params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.2838
+        - 0.2828
       nDCG@10:
-        - 0.5478
+        - 0.5476
       R@100:
-        - 0.4362
\ No newline at end of file
+        - 0.4361
\ No newline at end of file
diff --git a/src/main/resources/regression/dl19-doc.yaml b/src/main/resources/regression/dl19-doc.yaml
index d72635896c..eb17d15689 100644
--- a/src/main/resources/regression/dl19-doc.yaml
+++ b/src/main/resources/regression/dl19-doc.yaml
@@ -1,16 +1,16 @@
 ---
 corpus: msmarco-doc
-corpus_path: collections/msmarco/doc/
+corpus_path: collections/msmarco/msmarco-doc/
 
-index_path: indexes/lucene-index.msmarco-doc
-collection_class: CleanTrecCollection
+index_path: indexes/lucene-index.msmarco-doc/
+collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
-index_threads: 1
+index_threads: 7
 index_options: -storePositions -storeDocvectors -storeRaw
 index_stats:
   documents: 3213835
   documents (non-empty): 3213835
-  total terms: 2748636047
+  total terms: 2742209690
 
 metrics:
   - metric: MAP
@@ -50,19 +50,19 @@ models:
     params: -bm25 -hits 100  # Note, this is different DL 2019 passage ranking!
     results:
       MAP:
-        - 0.2443
+        - 0.2434
       nDCG@10:
-        - 0.5190
+        - 0.5176
       R@100:
-        - 0.3948
+        - 0.3949
   - name: bm25-default+rm3
     display: +RM3
     params: -bm25 -rm3 -hits 100  # Note, this is different DL 2019 passage ranking!
     results:
       MAP:
-        - 0.2772
+        - 0.2774
       nDCG@10:
-        - 0.5169
+        - 0.5170
       R@100:
         - 0.4189
   - name: bm25-default+ax
@@ -70,11 +70,11 @@ models:
     params: -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -hits 100  # Note, this is different DL 2019 passage ranking!
     results:
       MAP:
-        - 0.2452
+        - 0.2454
       nDCG@10:
-        - 0.4730
+        - 0.4732
       R@100:
-        - 0.3945
+        - 0.3946
   - name: bm25-default+prf
     display: +PRF
     params: -bm25 -bm25prf -hits 100  # Note, this is different DL 2019 passage ranking!
@@ -82,46 +82,46 @@ models:
       MAP:
         - 0.2541
       nDCG@10:
-        - 0.5105
+        - 0.5107
       R@100:
-        - 0.4004
+        - 0.4003
   - name: bm25-tuned
     display: BM25 (tuned)
     params: -bm25 -bm25.k1 3.44 -bm25.b 0.87 -hits 100  # Note, this is different DL 2019 passage ranking!
     results:
       MAP:
-        - 0.2318
+        - 0.2311
       nDCG@10:
-        - 0.5140
+        - 0.5139
       R@100:
-        - 0.3862
+        - 0.3853
   - name: bm25-tuned+rm3
     display: +RM3
     params: -bm25 -bm25.k1 3.44 -bm25.b 0.87 -rm3 -hits 100  # Note, this is different DL 2019 passage ranking!
     results:
       MAP:
-        - 0.2700
+        - 0.2684
       nDCG@10:
-        - 0.5485
+        - 0.5445
       R@100:
-        - 0.4193
+        - 0.4186
   - name: bm25-tuned+ax
     display: +Ax
     params: -bm25 -bm25.k1 3.44 -bm25.b 0.87 -axiom -axiom.deterministic -rerankCutoff 20 -hits 100  # Note, this is different DL 2019 passage ranking!
     results:
       MAP:
-        - 0.2816
+        - 0.2792
       nDCG@10:
-        - 0.5245
+        - 0.5203
       R@100:
-        - 0.4399
+        - 0.4378
   - name: bm25-tuned+prf
     display: +PRF
     params: -bm25 -bm25.k1 3.44 -bm25.b 0.87 -bm25prf -hits 100  # Note, this is different DL 2019 passage ranking!
     results:
       MAP:
-        - 0.2758
+        - 0.2774
       nDCG@10:
-        - 0.5280
+        - 0.5294
       R@100:
-        - 0.4287
+        - 0.4295
diff --git a/src/main/resources/regression/dl19-passage-docTTTTTquery.yaml b/src/main/resources/regression/dl19-passage-docTTTTTquery.yaml
index 053df1e3c9..d6e65da564 100644
--- a/src/main/resources/regression/dl19-passage-docTTTTTquery.yaml
+++ b/src/main/resources/regression/dl19-passage-docTTTTTquery.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-passage-docTTTTTquery
-corpus_path: collections/msmarco/passage-docTTTTTquery
+corpus_path: collections/msmarco/passage-docTTTTTquery/
 
-index_path: indexes/lucene-index.msmarco-passage-docTTTTTquery
+index_path: indexes/lucene-index.msmarco-passage-docTTTTTquery/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 9
diff --git a/src/main/resources/regression/dl19-passage.yaml b/src/main/resources/regression/dl19-passage.yaml
index 87eb7b6577..517aefd6dc 100644
--- a/src/main/resources/regression/dl19-passage.yaml
+++ b/src/main/resources/regression/dl19-passage.yaml
@@ -2,7 +2,7 @@
 corpus: msmarco-passage
 corpus_path: collections/msmarco/passage/
 
-index_path: indexes/lucene-index.msmarco-passage
+index_path: indexes/lucene-index.msmarco-passage/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 9
diff --git a/src/main/resources/regression/dl20-doc-docTTTTTquery-per-doc.yaml b/src/main/resources/regression/dl20-doc-docTTTTTquery.yaml
similarity index 87%
rename from src/main/resources/regression/dl20-doc-docTTTTTquery-per-doc.yaml
rename to src/main/resources/regression/dl20-doc-docTTTTTquery.yaml
index b2b7a765ff..8cc0f78136 100644
--- a/src/main/resources/regression/dl20-doc-docTTTTTquery-per-doc.yaml
+++ b/src/main/resources/regression/dl20-doc-docTTTTTquery.yaml
@@ -1,16 +1,16 @@
 ---
-corpus: msmarco-doc-docTTTTTquery-per-doc
-corpus_path: collections/msmarco/doc-docTTTTTquery-per-doc
+corpus: msmarco-doc-docTTTTTquery
+corpus_path: collections/msmarco/msmarco-doc-docTTTTTquery/
 
-index_path: indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc
+index_path: indexes/lucene-index.msmarco-doc-docTTTTTquery/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
-index_threads: 1
+index_threads: 7
 index_options: -storePositions -storeDocvectors -storeRaw
 index_stats:
-  documents: 3213834
-  documents (non-empty): 3213834
-  total terms: 3748332076
+  documents: 3213835
+  documents (non-empty): 3213835
+  total terms: 3748333319
 
 metrics:
   - metric: MAP
@@ -63,13 +63,13 @@ models:
       MRR:
         - 0.9369
       R@100:
-        - 0.6412
+        - 0.6414
   - name: bm25-default+rm3
     display: +RM3
     params: -bm25 -rm3 -hits 100  # Note, this is different DL 2020 passage ranking!
     results:
       MAP:
-        - 0.4228
+        - 0.4229
       nDCG@10:
         - 0.5407
       MRR:
@@ -81,7 +81,7 @@ models:
     params: -bm25 -bm25.k1 4.68 -bm25.b 0.87 -hits 100  # Note, this is different DL 2020 passage ranking!
     results:
       MAP:
-        - 0.4098
+        - 0.4099
       nDCG@10:
         - 0.5852
       MRR:
diff --git a/src/main/resources/regression/dl20-doc-docTTTTTquery-per-passage.yaml b/src/main/resources/regression/dl20-doc-segmented-docTTTTTquery.yaml
similarity index 84%
rename from src/main/resources/regression/dl20-doc-docTTTTTquery-per-passage.yaml
rename to src/main/resources/regression/dl20-doc-segmented-docTTTTTquery.yaml
index 70691d5bd9..d14df4363a 100644
--- a/src/main/resources/regression/dl20-doc-docTTTTTquery-per-passage.yaml
+++ b/src/main/resources/regression/dl20-doc-segmented-docTTTTTquery.yaml
@@ -1,16 +1,16 @@
 ---
-corpus: msmarco-doc-docTTTTTquery-per-passage
-corpus_path: collections/msmarco/doc-docTTTTTquery-per-passage
+corpus: msmarco-doc-segmented-docTTTTTquery
+corpus_path: collections/msmarco/msmarco-doc-segmented-docTTTTTquery/
 
-index_path: indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage
+index_path: indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
-index_threads: 1
+index_threads: 16
 index_options: -storePositions -storeDocvectors -storeRaw
 index_stats:
-  documents: 20544550
-  documents (non-empty): 20544550
-  total terms: 4203956960
+  documents: 20545677
+  documents (non-empty): 20545677
+  total terms: 4206639543
 
 metrics:
   - metric: MAP
@@ -69,9 +69,9 @@ models:
     params: -bm25 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.4269
+        - 0.4268
       nDCG@10:
-        - 0.5848
+        - 0.5850
       MRR:
         - 0.8944
       R@100:
@@ -81,22 +81,22 @@ models:
     params: -bm25 -bm25.k1 2.56 -bm25.b 0.59 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.4042
+        - 0.4047
       nDCG@10:
-        - 0.5931
+        - 0.5943
       MRR:
         - 0.9469
       R@100:
-        - 0.6192
+        - 0.6195
   - name: bm25-tuned+rm3
     display: +RM3
     params: -bm25 -bm25.k1 2.56 -bm25.b 0.59 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.4023
+        - 0.4025
       nDCG@10:
-        - 0.5723
+        - 0.5724
       MRR:
         - 0.9150
       R@100:
-        - 0.6392
\ No newline at end of file
+        - 0.6394
\ No newline at end of file
diff --git a/src/main/resources/regression/dl20-doc-per-passage.yaml b/src/main/resources/regression/dl20-doc-segmented.yaml
similarity index 83%
rename from src/main/resources/regression/dl20-doc-per-passage.yaml
rename to src/main/resources/regression/dl20-doc-segmented.yaml
index c8a6cbbc1e..23b70854eb 100644
--- a/src/main/resources/regression/dl20-doc-per-passage.yaml
+++ b/src/main/resources/regression/dl20-doc-segmented.yaml
@@ -1,16 +1,16 @@
 ---
-corpus: msmarco-doc-per-passage
-corpus_path: collections/msmarco/doc-per-passage/
+corpus: msmarco-doc-segmented
+corpus_path: collections/msmarco/msmarco-doc-segmented/
 
-index_path: indexes/lucene-index.msmarco-doc-per-passage
+index_path: indexes/lucene-index.msmarco-doc-segmented/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
-index_threads: 1
+index_threads: 16
 index_options: -storePositions -storeDocvectors -storeRaw
 index_stats:
-  documents: 20544550
-  documents (non-empty): 20544550
-  total terms: 3197886407
+  documents: 20545677
+  documents (non-empty): 20545677
+  total terms: 3200515914
 
 metrics:
   - metric: MAP
@@ -57,9 +57,9 @@ models:
     params: -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.3584
+        - 0.3586
       nDCG@10:
-        - 0.5271
+        - 0.5281
       MRR:
         - 0.8479
       R@100:
@@ -69,9 +69,9 @@ models:
     params: -bm25 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.3769
+        - 0.3774
       nDCG@10:
-        - 0.5159
+        - 0.5179
       MRR:
         - 0.8136
       R@100:
@@ -81,70 +81,70 @@ models:
     params: -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.3854
+        - 0.3868
       nDCG@10:
-        - 0.5250
+        - 0.5227
       MRR:
-        - 0.8123
+        - 0.8028
       R@100:
-        - 0.6332
+        - 0.6362
   - name: bm25-default+prf
     display: +PRF
     params: -bm25 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.3672
+        - 0.3686
       nDCG@10:
-        - 0.5217
+        - 0.5238
       MRR:
         - 0.7911
       R@100:
-        - 0.5994
+        - 0.6012
   - name: bm25-tuned
     display: BM25 (tuned)
     params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.3456
+        - 0.3458
       nDCG@10:
         - 0.5213
       MRR:
         - 0.8684
       R@100:
-        - 0.5715
+        - 0.5723
   - name: bm25-tuned+rm3
     display: +RM3
     params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.3471
+        - 0.3472
       nDCG@10:
-        - 0.4983
+        - 0.4979
       MRR:
         - 0.7807
       R@100:
-        - 0.6013
+        - 0.6025
   - name: bm25-tuned+ax
     display: +Ax
     params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.3495
+        - 0.3486
       nDCG@10:
-        - 0.4942
+        - 0.4948
       MRR:
-        - 0.8102
+        - 0.8019
       R@100:
-        - 0.6086
+        - 0.6114
   - name: bm25-tuned+prf
     display: +PRF
     params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 100
     results:
       MAP:
-        - 0.3629
+        - 0.3627
       nDCG@10:
-        - 0.5260
+        - 0.5251
       MRR:
         - 0.8478
       R@100:
-        - 0.6064
\ No newline at end of file
+        - 0.6048
\ No newline at end of file
diff --git a/src/main/resources/regression/dl20-doc.yaml b/src/main/resources/regression/dl20-doc.yaml
index 49e2285f12..bfdc842667 100644
--- a/src/main/resources/regression/dl20-doc.yaml
+++ b/src/main/resources/regression/dl20-doc.yaml
@@ -1,16 +1,16 @@
 ---
 corpus: msmacro-doc
-corpus_path: collections/msmarco/doc/
+corpus_path: collections/msmarco/msmarco-doc/
 
-index_path: indexes/lucene-index.msmarco-doc
-collection_class: CleanTrecCollection
+index_path: indexes/lucene-index.msmarco-doc/
+collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
-index_threads: 1
+index_threads: 7
 index_options: -storePositions -storeDocvectors -storeRaw
 index_stats:
   documents: 3213835
   documents (non-empty): 3213835
-  total terms: 2748636047
+  total terms: 2742209690
 
 metrics:
   - metric: MAP
@@ -57,9 +57,9 @@ models:
     params: -bm25 -hits 100  # Note, this is different DL 2020 passage ranking!
     results:
       MAP:
-        - 0.3791
+        - 0.3793
       nDCG@10:
-        - 0.5271
+        - 0.5286
       MRR:
         - 0.8521
       R@100:
@@ -69,47 +69,47 @@ models:
     params: -bm25 -rm3 -hits 100  # Note, this is different DL 2020 passage ranking!
     results:
       MAP:
-        - 0.4006
+        - 0.4014
       nDCG@10:
-        - 0.5248
+        - 0.5225
       MRR:
         - 0.8541
       R@100:
-        - 0.6392
+        - 0.6414
   - name: bm25-tuned
     display: BM25 (tuned)
     params: -bm25 -bm25.k1 3.44 -bm25.b 0.87 -hits 100  # Note, this is different DL 2020 passage ranking!
     results:
       MAP:
-        - 0.3630
+        - 0.3631
       nDCG@10:
-        - 0.5087
+        - 0.5070
       MRR:
         - 0.8641
       R@100:
-        - 0.5926
+        - 0.5935
   - name: bm25-tuned+rm3
     display: +RM3
     params: -bm25 -bm25.k1 3.44 -bm25.b 0.87 -rm3 -hits 100  # Note, this is different DL 2020 passage ranking!
     results:
       MAP:
-        - 0.3588
+        - 0.3592
       nDCG@10:
-        - 0.5117
+        - 0.5124
       MRR:
-        - 0.8188
+        - 0.8186
       R@100:
-        - 0.5983
+        - 0.5977
   - name: bm25-tuned2
     display: BM25 (tuned2)
     params: -bm25 -bm25.k1 4.46 -bm25.b 0.82 -hits 100  # Note, this is different DL 2020 passage ranking!
     results:
       MAP:
-        - 0.3583
+        - 0.3581
       nDCG@10:
-        - 0.5078
+        - 0.5061
       MRR:
-        - 0.8541
+        - 0.8522
       R@100:
         - 0.5860
   - name: bm25-tuned2+rm3
@@ -117,10 +117,10 @@ models:
     params: -bm25 -bm25.k1 4.46 -bm25.b 0.82 -rm3 -hits 100  # Note, this is different DL 2020 passage ranking!
     results:
       MAP:
-        - 0.3618
+        - 0.3619
       nDCG@10:
-        - 0.5202
+        - 0.5238
       MRR:
-        - 0.8458
+        - 0.8582
       R@100:
-        - 0.5998
+        - 0.5995
diff --git a/src/main/resources/regression/dl20-passage-docTTTTTquery.yaml b/src/main/resources/regression/dl20-passage-docTTTTTquery.yaml
index 3865d9d632..91e640d338 100644
--- a/src/main/resources/regression/dl20-passage-docTTTTTquery.yaml
+++ b/src/main/resources/regression/dl20-passage-docTTTTTquery.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-passage-docTTTTTquery
-corpus_path: collections/msmarco/passage-docTTTTTquery
+corpus_path: collections/msmarco/passage-docTTTTTquery/
 
-index_path: indexes/lucene-index.msmarco-passage-docTTTTTquery
+index_path: indexes/lucene-index.msmarco-passage-docTTTTTquery/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 9
diff --git a/src/main/resources/regression/dl20-passage.yaml b/src/main/resources/regression/dl20-passage.yaml
index 1130fec069..b296ed495c 100644
--- a/src/main/resources/regression/dl20-passage.yaml
+++ b/src/main/resources/regression/dl20-passage.yaml
@@ -2,7 +2,7 @@
 corpus: msmarco-passage
 corpus_path: collections/msmarco/passage/
 
-index_path: indexes/lucene-index.msmarco-passage
+index_path: indexes/lucene-index.msmarco-passage/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 9
diff --git a/src/main/resources/regression/dl21-doc-segmented-unicoil-noexp-0shot.yaml b/src/main/resources/regression/dl21-doc-segmented-unicoil-noexp-0shot.yaml
index 7d66b9fd82..5688f049c9 100644
--- a/src/main/resources/regression/dl21-doc-segmented-unicoil-noexp-0shot.yaml
+++ b/src/main/resources/regression/dl21-doc-segmented-unicoil-noexp-0shot.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-v2-doc-segmented-unicoil-noexp-0shot
-corpus_path: collections/msmarco/msmarco-doc-v2-seg-unicoil-noexp-0shot-b8
+corpus_path: collections/msmarco/msmarco-doc-v2-seg-unicoil-noexp-0shot-b8/
 
-index_path: indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-noexp-0shot
+index_path: indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-noexp-0shot/
 collection_class: JsonVectorCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 18
diff --git a/src/main/resources/regression/dl21-doc-segmented.yaml b/src/main/resources/regression/dl21-doc-segmented.yaml
index 31c7aa366b..86253172d8 100644
--- a/src/main/resources/regression/dl21-doc-segmented.yaml
+++ b/src/main/resources/regression/dl21-doc-segmented.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-v2-doc-segmented
-corpus_path: collections/msmarco/msmarco_v2_doc_segmented
+corpus_path: collections/msmarco/msmarco_v2_doc_segmented/
 
-index_path: indexes/lucene-index.msmarco-v2-doc-segmented
+index_path: indexes/lucene-index.msmarco-v2-doc-segmented/
 collection_class: MsMarcoV2DocCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 18
diff --git a/src/main/resources/regression/dl21-doc.yaml b/src/main/resources/regression/dl21-doc.yaml
index c02ea04de6..42f2146191 100644
--- a/src/main/resources/regression/dl21-doc.yaml
+++ b/src/main/resources/regression/dl21-doc.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-v2-doc
-corpus_path: collections/msmarco/msmarco_v2_doc
+corpus_path: collections/msmarco/msmarco_v2_doc/
 
-index_path: indexes/lucene-index.msmarco-v2-doc
+index_path: indexes/lucene-index.msmarco-v2-doc/
 collection_class: MsMarcoV2DocCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 18
diff --git a/src/main/resources/regression/dl21-passage-augmented.yaml b/src/main/resources/regression/dl21-passage-augmented.yaml
index 2b6d689305..ac1d2d4422 100644
--- a/src/main/resources/regression/dl21-passage-augmented.yaml
+++ b/src/main/resources/regression/dl21-passage-augmented.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-v2-passage-augmented
-corpus_path: collections/msmarco/msmarco_v2_passage_augmented
+corpus_path: collections/msmarco/msmarco_v2_passage_augmented/
 
-index_path: indexes/lucene-index.msmarco-v2-passage-augmented
+index_path: indexes/lucene-index.msmarco-v2-passage-augmented/
 collection_class: MsMarcoV2PassageCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 18
diff --git a/src/main/resources/regression/dl21-passage-unicoil-noexp-0shot.yaml b/src/main/resources/regression/dl21-passage-unicoil-noexp-0shot.yaml
index 8c57021218..95dd185e12 100644
--- a/src/main/resources/regression/dl21-passage-unicoil-noexp-0shot.yaml
+++ b/src/main/resources/regression/dl21-passage-unicoil-noexp-0shot.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-v2-passage-unicoil-noexp-0shot
-corpus_path: collections/msmarco/msmarco-passage-v2-unicoil-noexp-0shot-b8
+corpus_path: collections/msmarco/msmarco-passage-v2-unicoil-noexp-0shot-b8/
 
-index_path: indexes/lucene-index.msmarco-v2-passage-unicoil-noexp-0shot
+index_path: indexes/lucene-index.msmarco-v2-passage-unicoil-noexp-0shot/
 collection_class: JsonVectorCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 18
diff --git a/src/main/resources/regression/dl21-passage.yaml b/src/main/resources/regression/dl21-passage.yaml
index af093963f8..10acdb17d8 100644
--- a/src/main/resources/regression/dl21-passage.yaml
+++ b/src/main/resources/regression/dl21-passage.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-v2-passage
-corpus_path: collections/msmarco/msmarco_v2_passage
+corpus_path: collections/msmarco/msmarco_v2_passage/
 
-index_path: indexes/lucene-index.msmarco-v2-passage
+index_path: indexes/lucene-index.msmarco-v2-passage/
 collection_class: MsMarcoV2PassageCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 18
diff --git a/src/main/resources/regression/fever.yaml b/src/main/resources/regression/fever.yaml
index f888871f62..73e5d90b53 100644
--- a/src/main/resources/regression/fever.yaml
+++ b/src/main/resources/regression/fever.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: fever
-corpus_path: collections/fever/wiki-pages
+corpus_path: collections/fever/wiki-pages/
 
-index_path: indexes/lucene-index.fever-paragraph
+index_path: indexes/lucene-index.fever-paragraph/
 collection_class: FeverParagraphCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/fire12-bn.yaml b/src/main/resources/regression/fire12-bn.yaml
index 6265ff5bb3..f65cec6f45 100644
--- a/src/main/resources/regression/fire12-bn.yaml
+++ b/src/main/resources/regression/fire12-bn.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: fire12-bn
-corpus_path: collections/fire/bengali/bn.docs.2012.19032012
+corpus_path: collections/fire/bengali/bn.docs.2012.19032012/
 
-index_path: indexes/lucene-index.fire12-bn
+index_path: indexes/lucene-index.fire12-bn/
 collection_class: CleanTrecCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/fire12-en.yaml b/src/main/resources/regression/fire12-en.yaml
index 267cb38073..fe275805a2 100644
--- a/src/main/resources/regression/fire12-en.yaml
+++ b/src/main/resources/regression/fire12-en.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: fire12-en
-corpus_path: collections/fire/english/en.docs.2011
+corpus_path: collections/fire/english/en.docs.2011/
 
-index_path: indexes/lucene-index.fire12-en
+index_path: indexes/lucene-index.fire12-en/
 collection_class: CleanTrecCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/fire12-hi.yaml b/src/main/resources/regression/fire12-hi.yaml
index 940e8c4cb0..e2f8bcd2c4 100644
--- a/src/main/resources/regression/fire12-hi.yaml
+++ b/src/main/resources/regression/fire12-hi.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: fire12-hi
-corpus_path: collections/fire/hindi/hi.docs.2011
+corpus_path: collections/fire/hindi/hi.docs.2011/
 
-index_path: indexes/lucene-index.fire12-hi
+index_path: indexes/lucene-index.fire12-hi/
 collection_class: CleanTrecCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/gov2.yaml b/src/main/resources/regression/gov2.yaml
index 2364099650..477865555b 100644
--- a/src/main/resources/regression/gov2.yaml
+++ b/src/main/resources/regression/gov2.yaml
@@ -2,7 +2,7 @@
 corpus: gov2
 corpus_path: collections/web/gov2/gov2-corpus/
 
-index_path: indexes/lucene-index.gov2
+index_path: indexes/lucene-index.gov2/
 collection_class: TrecwebCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 44
diff --git a/src/main/resources/regression/mb11.yaml b/src/main/resources/regression/mb11.yaml
index 9725e39130..875b9f7a0c 100644
--- a/src/main/resources/regression/mb11.yaml
+++ b/src/main/resources/regression/mb11.yaml
@@ -2,7 +2,7 @@
 corpus: mb11
 corpus_path: collections/twitter/Tweets2011-corpus/json.gold/
 
-index_path: indexes/lucene-index.mb11
+index_path: indexes/lucene-index.mb11/
 collection_class: TweetCollection
 generator_class: TweetGenerator
 index_threads: 44
diff --git a/src/main/resources/regression/mb13.yaml b/src/main/resources/regression/mb13.yaml
index 02dd4633ab..c736616dc8 100644
--- a/src/main/resources/regression/mb13.yaml
+++ b/src/main/resources/regression/mb13.yaml
@@ -2,7 +2,7 @@
 corpus: mb13
 corpus_path: collections/twitter/Tweets2013-corpus/data/
 
-index_path: indexes/lucene-index.mb13
+index_path: indexes/lucene-index.mb13/
 collection_class: TweetCollection
 generator_class: TweetGenerator
 index_threads: 44
diff --git a/src/main/resources/regression/mrtydi-v1.1-ar.yaml b/src/main/resources/regression/mrtydi-v1.1-ar.yaml
index 12129b11bc..122a46af00 100644
--- a/src/main/resources/regression/mrtydi-v1.1-ar.yaml
+++ b/src/main/resources/regression/mrtydi-v1.1-ar.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: mrtydi-v1.1-ar
-corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-arabic
+corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-arabic/
 
-index_path: indexes/lucene-index.mrtydi-v1.1-arabic
+index_path: indexes/lucene-index.mrtydi-v1.1-arabic/
 collection_class: MrTyDiCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/mrtydi-v1.1-bn.yaml b/src/main/resources/regression/mrtydi-v1.1-bn.yaml
index b4ce928a6d..b083bc5706 100644
--- a/src/main/resources/regression/mrtydi-v1.1-bn.yaml
+++ b/src/main/resources/regression/mrtydi-v1.1-bn.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: mrtydi-v1.1-bn
-corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-bengali
+corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-bengali/
 
-index_path: indexes/lucene-index.mrtydi-v1.1-bengali
+index_path: indexes/lucene-index.mrtydi-v1.1-bengali/
 collection_class: MrTyDiCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/mrtydi-v1.1-en.yaml b/src/main/resources/regression/mrtydi-v1.1-en.yaml
index c6960c7209..0c703b6c06 100644
--- a/src/main/resources/regression/mrtydi-v1.1-en.yaml
+++ b/src/main/resources/regression/mrtydi-v1.1-en.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: mrtydi-v1.1-en
-corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-english
+corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-english/
 
-index_path: indexes/lucene-index.mrtydi-v1.1-english
+index_path: indexes/lucene-index.mrtydi-v1.1-english/
 collection_class: MrTyDiCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/mrtydi-v1.1-fi.yaml b/src/main/resources/regression/mrtydi-v1.1-fi.yaml
index bbaa676a9d..73b850ef7b 100644
--- a/src/main/resources/regression/mrtydi-v1.1-fi.yaml
+++ b/src/main/resources/regression/mrtydi-v1.1-fi.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: mrtydi-v1.1-fi
-corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-finnish
+corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-finnish/
 
-index_path: indexes/lucene-index.mrtydi-v1.1-finnish
+index_path: indexes/lucene-index.mrtydi-v1.1-finnish/
 collection_class: MrTyDiCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/mrtydi-v1.1-id.yaml b/src/main/resources/regression/mrtydi-v1.1-id.yaml
index 09a1a0f128..22e39782a1 100644
--- a/src/main/resources/regression/mrtydi-v1.1-id.yaml
+++ b/src/main/resources/regression/mrtydi-v1.1-id.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: mrtydi-v1.1-id
-corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-indonesian
+corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-indonesian/
 
-index_path: indexes/lucene-index.mrtydi-v1.1-indonesian
+index_path: indexes/lucene-index.mrtydi-v1.1-indonesian/
 collection_class: MrTyDiCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/mrtydi-v1.1-ja.yaml b/src/main/resources/regression/mrtydi-v1.1-ja.yaml
index a5e809ab16..a63271c9a6 100644
--- a/src/main/resources/regression/mrtydi-v1.1-ja.yaml
+++ b/src/main/resources/regression/mrtydi-v1.1-ja.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: mrtydi-v1.1-ja
-corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-japanese
+corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-japanese/
 
-index_path: indexes/lucene-index.mrtydi-v1.1-japanese
+index_path: indexes/lucene-index.mrtydi-v1.1-japanese/
 collection_class: MrTyDiCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/mrtydi-v1.1-ko.yaml b/src/main/resources/regression/mrtydi-v1.1-ko.yaml
index 4fe56a2edc..8265665cbc 100644
--- a/src/main/resources/regression/mrtydi-v1.1-ko.yaml
+++ b/src/main/resources/regression/mrtydi-v1.1-ko.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: mrtydi-v1.1-ko
-corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-korean
+corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-korean/
 
-index_path: indexes/lucene-index.mrtydi-v1.1-korean
+index_path: indexes/lucene-index.mrtydi-v1.1-korean/
 collection_class: MrTyDiCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/mrtydi-v1.1-ru.yaml b/src/main/resources/regression/mrtydi-v1.1-ru.yaml
index 3af935fd9c..51b8c6fce1 100644
--- a/src/main/resources/regression/mrtydi-v1.1-ru.yaml
+++ b/src/main/resources/regression/mrtydi-v1.1-ru.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: mrtydi-v1.1-ru
-corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-russian
+corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-russian/
 
-index_path: indexes/lucene-index.mrtydi-v1.1-russian
+index_path: indexes/lucene-index.mrtydi-v1.1-russian/
 collection_class: MrTyDiCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/mrtydi-v1.1-sw.yaml b/src/main/resources/regression/mrtydi-v1.1-sw.yaml
index 7d13f89855..67350a3fda 100644
--- a/src/main/resources/regression/mrtydi-v1.1-sw.yaml
+++ b/src/main/resources/regression/mrtydi-v1.1-sw.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: mrtydi-v1.1-sw
-corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-swahili
+corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-swahili/
 
-index_path: indexes/lucene-index.mrtydi-v1.1-swahili
+index_path: indexes/lucene-index.mrtydi-v1.1-swahili/
 collection_class: MrTyDiCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/mrtydi-v1.1-te.yaml b/src/main/resources/regression/mrtydi-v1.1-te.yaml
index 8d67d97c28..42930ef65a 100644
--- a/src/main/resources/regression/mrtydi-v1.1-te.yaml
+++ b/src/main/resources/regression/mrtydi-v1.1-te.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: mrtydi-v1.1-te
-corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-telugu
+corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-telugu/
 
-index_path: indexes/lucene-index.mrtydi-v1.1-telugu
+index_path: indexes/lucene-index.mrtydi-v1.1-telugu/
 collection_class: MrTyDiCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/mrtydi-v1.1-th.yaml b/src/main/resources/regression/mrtydi-v1.1-th.yaml
index 2333601557..035efda906 100644
--- a/src/main/resources/regression/mrtydi-v1.1-th.yaml
+++ b/src/main/resources/regression/mrtydi-v1.1-th.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: mrtydi-v1.1-th
-corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-thai
+corpus_path: collections/mr-tydi-corpus/mrtydi-v1.1-thai/
 
-index_path: indexes/lucene-index.mrtydi-v1.1-thai
+index_path: indexes/lucene-index.mrtydi-v1.1-thai/
 collection_class: MrTyDiCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 1
diff --git a/src/main/resources/regression/msmarco-doc-docTTTTTquery-per-passage.yaml b/src/main/resources/regression/msmarco-doc-docTTTTTquery-per-passage.yaml
deleted file mode 100644
index d24f633cbe..0000000000
--- a/src/main/resources/regression/msmarco-doc-docTTTTTquery-per-passage.yaml
+++ /dev/null
@@ -1,67 +0,0 @@
----
-corpus: msmarco-doc-docTTTTTquery-per-passage
-corpus_path: collections/msmarco/doc-docTTTTTquery-per-passage
-
-index_path: indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage
-collection_class: JsonCollection
-generator_class: DefaultLuceneDocumentGenerator
-index_threads: 1
-index_options: -storePositions -storeDocvectors -storeRaw
-index_stats:
-  documents: 20544550
-  documents (non-empty): 20544550
-  total terms: 4203956960
-
-metrics:
-  - metric: MAP
-    command: tools/eval/trec_eval.9.0.4/trec_eval
-    params: -c -m map
-    separator: "\t"
-    parse_index: 2
-    metric_precision: 4
-    can_combine: true
-  - metric: R@100
-    command: tools/eval/trec_eval.9.0.4/trec_eval
-    params: -c -m recall.100
-    separator: "\t"
-    parse_index: 2
-    metric_precision: 4
-    can_combine: true
-  - metric: R@1000
-    command: tools/eval/trec_eval.9.0.4/trec_eval
-    params: -c -m recall.1000
-    separator: "\t"
-    parse_index: 2
-    metric_precision: 4
-    can_combine: true
-
-topic_reader: TsvInt
-topic_root: src/main/resources/topics-and-qrels/
-qrels_root: src/main/resources/topics-and-qrels/
-topics:
-  - name: "[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)"
-    id: dev
-    path: topics.msmarco-doc.dev.txt
-    qrel: qrels.msmarco-doc.dev.txt
-
-models:
-  - name: bm25-default
-    display: BM25 (default)
-    params: -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.3182
-      R@100:
-        - 0.8481
-      R@1000:
-        - 0.9490
-  - name: bm25-tuned
-    display: BM25 (tuned)
-    params: -bm25 -bm25.k1 2.56 -bm25.b 0.59 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.3211
-      R@100:
-        - 0.8627
-      R@1000:
-        - 0.9530
\ No newline at end of file
diff --git a/src/main/resources/regression/msmarco-doc-docTTTTTquery-per-doc.yaml b/src/main/resources/regression/msmarco-doc-docTTTTTquery.yaml
similarity index 80%
rename from src/main/resources/regression/msmarco-doc-docTTTTTquery-per-doc.yaml
rename to src/main/resources/regression/msmarco-doc-docTTTTTquery.yaml
index b42a5e735d..7dfb87b655 100644
--- a/src/main/resources/regression/msmarco-doc-docTTTTTquery-per-doc.yaml
+++ b/src/main/resources/regression/msmarco-doc-docTTTTTquery.yaml
@@ -1,16 +1,16 @@
 ---
-corpus: msmarco-doc-docTTTTTquery-per-doc
-corpus_path: collections/msmarco/doc-docTTTTTquery-per-doc
+corpus: msmarco-doc-docTTTTTquery
+corpus_path: collections/msmarco/msmarco-doc-docTTTTTquery/
 
-index_path: indexes/lucene-index.msmarco-doc-docTTTTTquery-per-doc
+index_path: indexes/lucene-index.msmarco-doc-docTTTTTquery/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
-index_threads: 1
+index_threads: 7
 index_options: -storePositions -storeDocvectors -storeRaw
 index_stats:
-  documents: 3213834
-  documents (non-empty): 3213834
-  total terms: 3748332076
+  documents: 3213835
+  documents (non-empty): 3213835
+  total terms: 3748333319
 
 metrics:
   - metric: MAP
@@ -52,7 +52,7 @@ models:
       MAP:
         - 0.2886
       R@100:
-        - 0.7990
+        - 0.7993
       R@1000:
         - 0.9259
   - name: bm25-tuned
@@ -60,8 +60,8 @@ models:
     params: -bm25 -bm25.k1 4.68 -bm25.b 0.87
     results:
       MAP:
-        - 0.3270
+        - 0.3273
       R@100:
-        - 0.8608
+        - 0.8612
       R@1000:
         - 0.9553
\ No newline at end of file
diff --git a/src/main/resources/regression/msmarco-doc-per-passage-v2.yaml b/src/main/resources/regression/msmarco-doc-per-passage-v2.yaml
deleted file mode 100644
index 4bd83f7fef..0000000000
--- a/src/main/resources/regression/msmarco-doc-per-passage-v2.yaml
+++ /dev/null
@@ -1,127 +0,0 @@
----
-corpus: msmarco-doc-per-passage-v2
-corpus_path: collections/msmarco/doc-per-passage-v2/
-
-index_path: indexes/lucene-index.msmarco-doc-per-passage-v2
-collection_class: JsonCollection
-generator_class: DefaultLuceneDocumentGenerator
-index_threads: 16
-index_options: -storePositions -storeDocvectors -storeRaw
-index_stats:
-  documents: 20545677
-  documents (non-empty): 20545612
-  total terms: 3056059952
-
-metrics:
-  - metric: MAP
-    command: tools/eval/trec_eval.9.0.4/trec_eval
-    params: -c -m map
-    separator: "\t"
-    parse_index: 2
-    metric_precision: 4
-    can_combine: true
-  - metric: R@100
-    command: tools/eval/trec_eval.9.0.4/trec_eval
-    params: -c -m recall.100
-    separator: "\t"
-    parse_index: 2
-    metric_precision: 4
-    can_combine: true
-  - metric: R@1000
-    command: tools/eval/trec_eval.9.0.4/trec_eval
-    params: -c -m recall.1000
-    separator: "\t"
-    parse_index: 2
-    metric_precision: 4
-    can_combine: true
-
-topic_reader: TsvInt
-topic_root: src/main/resources/topics-and-qrels/
-qrels_root: src/main/resources/topics-and-qrels/
-topics:
-  - name: "[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)"
-    id: dev
-    path: topics.msmarco-doc.dev.txt
-    qrel: qrels.msmarco-doc.dev.txt
-
-models:
-  - name: bm25-default
-    display: BM25 (default)
-    params: -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2609
-      R@100:
-        - 0.7737
-      R@1000:
-        - 0.9095
-  - name: bm25-default+rm3
-    display: +RM3
-    params: -bm25 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2324
-      R@100:
-        - 0.7768
-      R@1000:
-        - 0.9266
-  - name: bm25-default+ax
-    display: +Ax
-    params: -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2170
-      R@100:
-        - 0.7578
-      R@1000:
-        - 0.9207
-  - name: bm25-default+prf
-    display: +PRF
-    params: -bm25 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2189
-      R@100:
-        - 0.7570
-      R@1000:
-        - 0.9135
-  - name: bm25-tuned
-    display: BM25 (tuned)
-    params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2639
-      R@100:
-        - 0.7884
-      R@1000:
-        - 0.9222
-  - name: bm25-tuned+rm3
-    display: +RM3
-    params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2342
-      R@100:
-        - 0.7793
-      R@1000:
-        - 0.9239
-  - name: bm25-tuned+ax
-    display: +Ax
-    params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2250
-      R@100:
-        - 0.7730
-      R@1000:
-        - 0.9268
-  - name: bm25-tuned+prf
-    display: +PRF
-    params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2184
-      R@100:
-        - 0.7520
-      R@1000:
-        - 0.9101
\ No newline at end of file
diff --git a/src/main/resources/regression/msmarco-doc-per-passage.yaml b/src/main/resources/regression/msmarco-doc-per-passage.yaml
deleted file mode 100644
index 65b483f69f..0000000000
--- a/src/main/resources/regression/msmarco-doc-per-passage.yaml
+++ /dev/null
@@ -1,127 +0,0 @@
----
-corpus: msmarco-doc-per-passage
-corpus_path: collections/msmarco/doc-per-passage/
-
-index_path: indexes/lucene-index.msmarco-doc-per-passage
-collection_class: JsonCollection
-generator_class: DefaultLuceneDocumentGenerator
-index_threads: 1
-index_options: -storePositions -storeDocvectors -storeRaw
-index_stats:
-  documents: 20544550
-  documents (non-empty): 20544550
-  total terms: 3197886407
-
-metrics:
-  - metric: MAP
-    command: tools/eval/trec_eval.9.0.4/trec_eval
-    params: -c -m map
-    separator: "\t"
-    parse_index: 2
-    metric_precision: 4
-    can_combine: true
-  - metric: R@100
-    command: tools/eval/trec_eval.9.0.4/trec_eval
-    params: -c -m recall.100
-    separator: "\t"
-    parse_index: 2
-    metric_precision: 4
-    can_combine: true
-  - metric: R@1000
-    command: tools/eval/trec_eval.9.0.4/trec_eval
-    params: -c -m recall.1000
-    separator: "\t"
-    parse_index: 2
-    metric_precision: 4
-    can_combine: true
-
-topic_reader: TsvInt
-topic_root: src/main/resources/topics-and-qrels/
-qrels_root: src/main/resources/topics-and-qrels/
-topics:
-  - name: "[MS MARCO Doc: Dev](https://github.com/microsoft/MSMARCO-Document-Ranking)"
-    id: dev
-    path: topics.msmarco-doc.dev.txt
-    qrel: qrels.msmarco-doc.dev.txt
-
-models:
-  - name: bm25-default
-    display: BM25 (default)
-    params: -bm25 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2688
-      R@100:
-        - 0.7849
-      R@1000:
-        - 0.9180
-  - name: bm25-default+rm3
-    display: +RM3
-    params: -bm25 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2416
-      R@100:
-        - 0.7876
-      R@1000:
-        - 0.9355
-  - name: bm25-default+ax
-    display: +Ax
-    params: -bm25 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2229
-      R@100:
-        - 0.7703
-      R@1000:
-        - 0.9266
-  - name: bm25-default+prf
-    display: +PRF
-    params: -bm25 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2325
-      R@100:
-        - 0.7714
-      R@1000:
-        - 0.9187
-  - name: bm25-tuned
-    display: BM25 (tuned)
-    params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2756
-      R@100:
-        - 0.8009
-      R@1000:
-        - 0.9311
-  - name: bm25-tuned+rm3
-    display: +RM3
-    params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -rm3 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2443
-      R@100:
-        - 0.7955
-      R@1000:
-        - 0.9359
-  - name: bm25-tuned+ax
-    display: +Ax
-    params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -axiom -axiom.deterministic -rerankCutoff 20 -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2350
-      R@100:
-        - 0.7909
-      R@1000:
-        - 0.9341
-  - name: bm25-tuned+prf
-    display: +PRF
-    params: -bm25 -bm25.k1 2.16 -bm25.b 0.61 -bm25prf -hits 10000 -selectMaxPassage -selectMaxPassage.delimiter "#" -selectMaxPassage.hits 1000
-    results:
-      MAP:
-        - 0.2271
-      R@100:
-        - 0.7685
-      R@1000:
-        - 0.9162
\ No newline at end of file
diff --git a/src/main/resources/regression/msmarco-doc-docTTTTTquery-per-passage-v3.yaml b/src/main/resources/regression/msmarco-doc-segmented-docTTTTTquery.yaml
similarity index 89%
rename from src/main/resources/regression/msmarco-doc-docTTTTTquery-per-passage-v3.yaml
rename to src/main/resources/regression/msmarco-doc-segmented-docTTTTTquery.yaml
index 1092f9ba29..9ad09552e0 100644
--- a/src/main/resources/regression/msmarco-doc-docTTTTTquery-per-passage-v3.yaml
+++ b/src/main/resources/regression/msmarco-doc-segmented-docTTTTTquery.yaml
@@ -1,8 +1,8 @@
 ---
-corpus: msmarco-doc-docTTTTTquery-per-passage-v3
-corpus_path: collections/msmarco/doc-docTTTTTquery-per-passage-v3
+corpus: msmarco-doc-segmented-docTTTTTquery
+corpus_path: collections/msmarco/msmarco-doc-segmented-docTTTTTquery/
 
-index_path: indexes/lucene-index.msmarco-doc-docTTTTTquery-per-passage-v3
+index_path: indexes/lucene-index.msmarco-doc-segmented-docTTTTTquery/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/msmarco-doc-per-passage-v3.yaml b/src/main/resources/regression/msmarco-doc-segmented.yaml
similarity index 95%
rename from src/main/resources/regression/msmarco-doc-per-passage-v3.yaml
rename to src/main/resources/regression/msmarco-doc-segmented.yaml
index cd9a7a41ff..785de81f82 100644
--- a/src/main/resources/regression/msmarco-doc-per-passage-v3.yaml
+++ b/src/main/resources/regression/msmarco-doc-segmented.yaml
@@ -1,8 +1,8 @@
 ---
-corpus: msmarco-doc-per-passage-v3
-corpus_path: collections/msmarco/doc-per-passage-v3/
+corpus: msmarco-doc-segmented
+corpus_path: collections/msmarco/msmarco-doc-segmented/
 
-index_path: indexes/lucene-index.msmarco-doc-per-passage-v3
+index_path: indexes/lucene-index.msmarco-doc-segmented/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/msmarco-doc.yaml b/src/main/resources/regression/msmarco-doc.yaml
index ae43147a13..59644ebb7c 100644
--- a/src/main/resources/regression/msmarco-doc.yaml
+++ b/src/main/resources/regression/msmarco-doc.yaml
@@ -1,16 +1,16 @@
 ---
 corpus: msmarco-doc
-corpus_path: collections/msmarco/doc/
+corpus_path: collections/msmarco/msmarco-doc/
 
-index_path: indexes/lucene-index.msmarco-doc
-collection_class: CleanTrecCollection
+index_path: indexes/lucene-index.msmarco-doc/
+collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
-index_threads: 1
+index_threads: 7
 index_options: -storePositions -storeDocvectors -storeRaw
 index_stats:
   documents: 3213835
   documents (non-empty): 3213835
-  total terms: 2748636047
+  total terms: 2742209690
 
 metrics:
   - metric: MAP
@@ -50,9 +50,9 @@ models:
     params: -bm25
     results:
       MAP:
-        - 0.2310
+        - 0.2305
       R@100:
-        - 0.7279
+        - 0.7281
       R@1000:
         - 0.8856
   - name: bm25-default+rm3
@@ -60,21 +60,21 @@ models:
     params: -bm25 -rm3
     results:
       MAP:
-        - 0.1632
+        - 0.1631
       R@100:
-        - 0.6765
+        - 0.6767
       R@1000:
-        - 0.8785
+        - 0.8791
   - name: bm25-tuned
     display: BM25 (tuned)
     params: -bm25 -bm25.k1 3.44 -bm25.b 0.87
     results:
       MAP:
-        - 0.2788
+        - 0.2784
       R@100:
-        - 0.8065
+        - 0.8069
       R@1000:
-        - 0.9326
+        - 0.9324
   - name: bm25-tuned+rm3
     display: +RM3
     params: -bm25 -bm25.k1 3.44 -bm25.b 0.87 -rm3
@@ -82,17 +82,17 @@ models:
       MAP:
         - 0.2289
       R@100:
-        - 0.7872
+        - 0.7878
       R@1000:
-        - 0.9320
+        - 0.9314
   - name: bm25-tuned2
     display: BM25 (tuned2)
     params: -bm25 -bm25.k1 4.46 -bm25.b 0.82
     results:
       MAP:
-        - 0.2775
+        - 0.2774
       R@100:
-        - 0.8076
+        - 0.8070
       R@1000:
         - 0.9357
   - name: bm25-tuned2+rm3
@@ -100,8 +100,8 @@ models:
     params: -bm25 -bm25.k1 4.46 -bm25.b 0.82 -rm3
     results:
       MAP:
-        - 0.2238
+        - 0.2239
       R@100:
-        - 0.7789
+        - 0.7791
       R@1000:
-        - 0.9307
\ No newline at end of file
+        - 0.9305
\ No newline at end of file
diff --git a/src/main/resources/regression/msmarco-passage-deepimpact.yaml b/src/main/resources/regression/msmarco-passage-deepimpact.yaml
index 1b8a77470d..04a9fafc6b 100644
--- a/src/main/resources/regression/msmarco-passage-deepimpact.yaml
+++ b/src/main/resources/regression/msmarco-passage-deepimpact.yaml
@@ -2,7 +2,7 @@
 corpus: msmarco-passage-deepimpact
 corpus_path: collections/msmarco/msmarco-passage-deepimpact-b8/
 
-index_path: indexes/lucene-index.msmarco-passage-deepimpact
+index_path: indexes/lucene-index.msmarco-passage-deepimpact/
 collection_class: JsonVectorCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/msmarco-passage-distill-splade-max.yaml b/src/main/resources/regression/msmarco-passage-distill-splade-max.yaml
index 584affe9a2..2ec05ae797 100644
--- a/src/main/resources/regression/msmarco-passage-distill-splade-max.yaml
+++ b/src/main/resources/regression/msmarco-passage-distill-splade-max.yaml
@@ -2,7 +2,7 @@
 corpus: msmarco-passage-distill-splade-max
 corpus_path: collections/msmarco/msmarco-passage-distill-splade-max/
 
-index_path: indexes/lucene-index.msmarco-passage-distill-splade-max
+index_path: indexes/lucene-index.msmarco-passage-distill-splade-max/
 collection_class: JsonVectorCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/msmarco-passage-doc2query.yaml b/src/main/resources/regression/msmarco-passage-doc2query.yaml
index 39a0e6dc44..27c08fbb03 100644
--- a/src/main/resources/regression/msmarco-passage-doc2query.yaml
+++ b/src/main/resources/regression/msmarco-passage-doc2query.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-passage-doc2query
-corpus_path: collections/msmarco/passage-expanded-topk10
+corpus_path: collections/msmarco/passage-expanded-topk10/
 
-index_path: indexes/lucene-index.msmarco-passage-doc2query
+index_path: indexes/lucene-index.msmarco-passage-doc2query/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 9
diff --git a/src/main/resources/regression/msmarco-passage-docTTTTTquery.yaml b/src/main/resources/regression/msmarco-passage-docTTTTTquery.yaml
index 6990f8ca6c..b377dab96f 100644
--- a/src/main/resources/regression/msmarco-passage-docTTTTTquery.yaml
+++ b/src/main/resources/regression/msmarco-passage-docTTTTTquery.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-passage-docTTTTTquery
-corpus_path: collections/msmarco/passage-docTTTTTquery
+corpus_path: collections/msmarco/passage-docTTTTTquery/
 
-index_path: indexes/lucene-index.msmarco-passage-docTTTTTquery
+index_path: indexes/lucene-index.msmarco-passage-docTTTTTquery/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 9
diff --git a/src/main/resources/regression/msmarco-passage-unicoil-tilde-expansion.yaml b/src/main/resources/regression/msmarco-passage-unicoil-tilde-expansion.yaml
index d9ac30c8ca..0e9ebbe91d 100644
--- a/src/main/resources/regression/msmarco-passage-unicoil-tilde-expansion.yaml
+++ b/src/main/resources/regression/msmarco-passage-unicoil-tilde-expansion.yaml
@@ -2,7 +2,7 @@
 corpus: msmarco-passage-unicoil-tilde-expansion
 corpus_path: collections/msmarco/msmarco-passage-unicoil-tilde-expansion-b8/
 
-index_path: indexes/lucene-index.msmarco-passage-unicoil-tilde-expansion
+index_path: indexes/lucene-index.msmarco-passage-unicoil-tilde-expansion/
 collection_class: JsonVectorCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/msmarco-passage-unicoil.yaml b/src/main/resources/regression/msmarco-passage-unicoil.yaml
index 8224af6066..f40c3cad6e 100644
--- a/src/main/resources/regression/msmarco-passage-unicoil.yaml
+++ b/src/main/resources/regression/msmarco-passage-unicoil.yaml
@@ -2,7 +2,7 @@
 corpus: msmarco-passage-unicoil
 corpus_path: collections/msmarco/msmarco-passage-unicoil-b8/
 
-index_path: indexes/lucene-index.msmarco-passage-unicoil
+index_path: indexes/lucene-index.msmarco-passage-unicoil/
 collection_class: JsonVectorCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/msmarco-passage.yaml b/src/main/resources/regression/msmarco-passage.yaml
index e43719f712..0724306f9b 100644
--- a/src/main/resources/regression/msmarco-passage.yaml
+++ b/src/main/resources/regression/msmarco-passage.yaml
@@ -2,7 +2,7 @@
 corpus: msmarco-passage
 corpus_path: collections/msmarco/passage/
 
-index_path: indexes/lucene-index.msmarco-passage
+index_path: indexes/lucene-index.msmarco-passage/
 collection_class: JsonCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 9
diff --git a/src/main/resources/regression/msmarco-v2-doc-segmented-unicoil-noexp-0shot.yaml b/src/main/resources/regression/msmarco-v2-doc-segmented-unicoil-noexp-0shot.yaml
index 14da7a5dd3..e10485c8f9 100644
--- a/src/main/resources/regression/msmarco-v2-doc-segmented-unicoil-noexp-0shot.yaml
+++ b/src/main/resources/regression/msmarco-v2-doc-segmented-unicoil-noexp-0shot.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-v2-doc-segmented-unicoil-noexp-0shot
-corpus_path: collections/msmarco/msmarco-doc-v2-seg-unicoil-noexp-0shot-b8
+corpus_path: collections/msmarco/msmarco-doc-v2-seg-unicoil-noexp-0shot-b8/
 
-index_path: indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-noexp-0shot
+index_path: indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-noexp-0shot/
 collection_class: JsonVectorCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 18
diff --git a/src/main/resources/regression/msmarco-v2-doc-segmented.yaml b/src/main/resources/regression/msmarco-v2-doc-segmented.yaml
index 332867d0fa..5972a7d33c 100644
--- a/src/main/resources/regression/msmarco-v2-doc-segmented.yaml
+++ b/src/main/resources/regression/msmarco-v2-doc-segmented.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-v2-doc-segmented
-corpus_path: collections/msmarco/msmarco_v2_doc_segmented
+corpus_path: collections/msmarco/msmarco_v2_doc_segmented/
 
-index_path: indexes/lucene-index.msmarco-v2-doc-segmented
+index_path: indexes/lucene-index.msmarco-v2-doc-segmented/
 collection_class: MsMarcoV2DocCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 18
diff --git a/src/main/resources/regression/msmarco-v2-doc.yaml b/src/main/resources/regression/msmarco-v2-doc.yaml
index 74b0c328f9..c890fccba8 100644
--- a/src/main/resources/regression/msmarco-v2-doc.yaml
+++ b/src/main/resources/regression/msmarco-v2-doc.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-v2-doc
-corpus_path: collections/msmarco/msmarco_v2_doc
+corpus_path: collections/msmarco/msmarco_v2_doc/
 
-index_path: indexes/lucene-index.msmarco-v2-doc
+index_path: indexes/lucene-index.msmarco-v2-doc/
 collection_class: MsMarcoV2DocCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 18
diff --git a/src/main/resources/regression/msmarco-v2-passage-augmented.yaml b/src/main/resources/regression/msmarco-v2-passage-augmented.yaml
index 02e091834e..8e158e44fe 100644
--- a/src/main/resources/regression/msmarco-v2-passage-augmented.yaml
+++ b/src/main/resources/regression/msmarco-v2-passage-augmented.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-v2-passage-augmented
-corpus_path: collections/msmarco/msmarco_v2_passage_augmented
+corpus_path: collections/msmarco/msmarco_v2_passage_augmented/
 
-index_path: indexes/lucene-index.msmarco-v2-passage-augmented
+index_path: indexes/lucene-index.msmarco-v2-passage-augmented/
 collection_class: MsMarcoV2PassageCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 18
diff --git a/src/main/resources/regression/msmarco-v2-passage-unicoil-noexp-0shot.yaml b/src/main/resources/regression/msmarco-v2-passage-unicoil-noexp-0shot.yaml
index bf3be720d6..8b81f03b3b 100644
--- a/src/main/resources/regression/msmarco-v2-passage-unicoil-noexp-0shot.yaml
+++ b/src/main/resources/regression/msmarco-v2-passage-unicoil-noexp-0shot.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-v2-passage-unicoil-noexp-0shot
-corpus_path: collections/msmarco/msmarco-passage-v2-unicoil-noexp-0shot-b8
+corpus_path: collections/msmarco/msmarco-passage-v2-unicoil-noexp-0shot-b8/
 
-index_path: indexes/lucene-index.msmarco-v2-passage-unicoil-noexp-0shot
+index_path: indexes/lucene-index.msmarco-v2-passage-unicoil-noexp-0shot/
 collection_class: JsonVectorCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 18
diff --git a/src/main/resources/regression/msmarco-v2-passage.yaml b/src/main/resources/regression/msmarco-v2-passage.yaml
index a13b07deb5..e452109e59 100644
--- a/src/main/resources/regression/msmarco-v2-passage.yaml
+++ b/src/main/resources/regression/msmarco-v2-passage.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: msmarco-v2-passage
-corpus_path: collections/msmarco/msmarco_v2_passage
+corpus_path: collections/msmarco/msmarco_v2_passage/
 
-index_path: indexes/lucene-index.msmarco-v2-passage
+index_path: indexes/lucene-index.msmarco-v2-passage/
 collection_class: MsMarcoV2PassageCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 18
diff --git a/src/main/resources/regression/ntcir8-zh.yaml b/src/main/resources/regression/ntcir8-zh.yaml
index 3daee543b5..119bcb4086 100644
--- a/src/main/resources/regression/ntcir8-zh.yaml
+++ b/src/main/resources/regression/ntcir8-zh.yaml
@@ -2,7 +2,7 @@
 corpus: ntcir8-zh
 corpus_path: collections/newswire/clir/ntcir.zh/ntcir8-zh/
 
-index_path: indexes/lucene-index.ntcir8-zh
+index_path: indexes/lucene-index.ntcir8-zh/
 collection_class: CleanTrecCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/robust05.yaml b/src/main/resources/regression/robust05.yaml
index f061608270..baec2eaf7d 100644
--- a/src/main/resources/regression/robust05.yaml
+++ b/src/main/resources/regression/robust05.yaml
@@ -2,7 +2,7 @@
 corpus: robust05
 corpus_path: collections/newswire/AQUAINT/
 
-index_path: indexes/lucene-index.robust05
+index_path: indexes/lucene-index.robust05/
 collection_class: TrecCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/trec02-ar.yaml b/src/main/resources/regression/trec02-ar.yaml
index 26e0fee49f..480777463a 100644
--- a/src/main/resources/regression/trec02-ar.yaml
+++ b/src/main/resources/regression/trec02-ar.yaml
@@ -1,8 +1,8 @@
 ---
 corpus: trec02-ar
-corpus_path: collections/newswire/clir/trec.ar/arabic_newswire_a_ldc2001t55/transcripts
+corpus_path: collections/newswire/clir/trec.ar/arabic_newswire_a_ldc2001t55/transcripts/
 
-index_path: indexes/lucene-index.trec02-ar
+index_path: indexes/lucene-index.trec02-ar/
 collection_class: CleanTrecCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16
diff --git a/src/main/resources/regression/wt10g.yaml b/src/main/resources/regression/wt10g.yaml
index 9d4867b542..4a9f01f5b7 100644
--- a/src/main/resources/regression/wt10g.yaml
+++ b/src/main/resources/regression/wt10g.yaml
@@ -2,7 +2,7 @@
 corpus: wt10g
 corpus_path: collections/web/wt10g/
 
-index_path: indexes/lucene-index.wt10g
+index_path: indexes/lucene-index.wt10g/
 collection_class: TrecwebCollection
 generator_class: DefaultLuceneDocumentGenerator
 index_threads: 16