diff --git a/.gitmodules b/.gitmodules
index b3c8cb19..a9456006 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -1,3 +1,3 @@
-[submodule "eval"]
-	path = eval
-	url = https://github.com/castorini/anserini-eval.git
+[submodule "tools"]
+	path = tools
+	url = https://github.com/castorini/anserini-tools.git
diff --git a/docs/experiments-msmarco-document.md b/docs/experiments-msmarco-document.md
index 63903992..b6c82125 100644
--- a/docs/experiments-msmarco-document.md
+++ b/docs/experiments-msmarco-document.md
@@ -95,14 +95,6 @@ It is worth noting again that you might need to modify the batch size to best fi

 Upon completion, the re-ranked run file `runs/run.monot5.doc_fh.dev.tsv` will be available in the `runs` directory.

-We can use the official MS MARCO evaluation script to verify the MRR@10:
-
-```
-python eval/msmarco_eval.py data/msmarco_doc_ans_small/fh/qrels.dev.small.tsv runs/run.monot5.doc_fh.dev.tsv
-```
-
-You should see the same result.
-
 We can modify the argument for `--dataset` to `data/msmarco_doc_ans_small/sh` to re-rank the second half of the dataset, and don't forget to change output file name.

 The results are as follows:
diff --git a/docs/experiments-msmarco-passage.md b/docs/experiments-msmarco-passage.md
index 032e5d31..f4ad040b 100644
--- a/docs/experiments-msmarco-passage.md
+++ b/docs/experiments-msmarco-passage.md
@@ -49,7 +49,7 @@ unzip data/msmarco_ans_small.zip -d data
 As a sanity check, we can evaluate the first-stage retrieved documents using the official MS MARCO evaluation script.

 ```
-python eval/msmarco_eval.py data/msmarco_ans_small/qrels.dev.small.tsv data/msmarco_ans_small/run.dev.small.tsv
+python tools/eval/msmarco_eval.py data/msmarco_ans_small/qrels.dev.small.tsv data/msmarco_ans_small/run.dev.small.tsv
 ```

 The output should be:
@@ -105,7 +105,7 @@ The re-ranked run file `run.monobert.ans_small.dev.tsv` will also be available i
 We can use the official MS MARCO evaluation script to verify the MRR@10:

 ```
-python eval/msmarco_eval.py data/msmarco_ans_small/qrels.dev.small.tsv runs/run.monobert.ans_small.dev.tsv
+python tools/eval/msmarco_eval.py data/msmarco_ans_small/qrels.dev.small.tsv runs/run.monobert.ans_small.dev.tsv
 ```

 You should see the same result. Great, let's move on to monoT5!
@@ -145,7 +145,7 @@ Upon completion, the re-ranked run file `run.monot5.ans_small.dev.tsv` will be a
 We can use the official MS MARCO evaluation script to verify the MRR@10:

 ```
-python eval/msmarco_eval.py data/msmarco_ans_small/qrels.dev.small.tsv runs/run.monot5.ans_small.dev.tsv
+python tools/eval/msmarco_eval.py data/msmarco_ans_small/qrels.dev.small.tsv runs/run.monot5.ans_small.dev.tsv
 ```

 You should see the same result.
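Since this change renames the `eval` submodule to `tools` (now pointing at `anserini-tools`), an existing clone will likely need its submodules refreshed before `tools/eval/msmarco_eval.py` exists on disk. A minimal sketch of that refresh, assuming the stale `eval/` checkout can simply be deleted (a fresh `git clone --recursive` would work as well):

```
# Refresh submodules after the eval -> tools rename.
# Assumption: the old eval/ working tree is no longer needed and can be removed.
rm -rf eval                               # drop the stale eval/ checkout, if present
git submodule sync --recursive            # pick up the new path/URL from .gitmodules
git submodule update --init --recursive   # check out anserini-tools into tools/
```

With the submodule in place, the `tools/eval/msmarco_eval.py` invocations shown in the updated docs should run as written.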