From 55055a081864e780914d12f9459e47c46e07c220 Mon Sep 17 00:00:00 2001
From: HangCui0510
Date: Wed, 8 Jul 2020 12:51:06 -0400
Subject: [PATCH 1/7] Update Replication Log

---
 docs/experiments-msmarco-document.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/experiments-msmarco-document.md b/docs/experiments-msmarco-document.md
index b6c82125..fc9628a9 100644
--- a/docs/experiments-msmarco-document.md
+++ b/docs/experiments-msmarco-document.md
@@ -115,3 +115,5 @@ If you were able to replicate these results, please submit a PR adding to the re
 
 
 ## Replication Log
+
++ Results replicated by [@HangCui0510](https://github.com/HangCui0510) on 2020-05-29 (commit [`f2e078e`](https://github.com/HangCui0510/pygaggle/commit/f2e078e47c87156925a9151632753be861ec403d)) (Tesla P100)

From fffc10315be7f7d3fcbb76b2902c21ba3908c4e7 Mon Sep 17 00:00:00 2001
From: HangCui0510
Date: Wed, 8 Jul 2020 12:52:48 -0400
Subject: [PATCH 2/7] Update Replication Log

---
 docs/experiments-msmarco-document.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/experiments-msmarco-document.md b/docs/experiments-msmarco-document.md
index fc9628a9..f5444d77 100644
--- a/docs/experiments-msmarco-document.md
+++ b/docs/experiments-msmarco-document.md
@@ -116,4 +116,4 @@ If you were able to replicate these results, please submit a PR adding to the re
 
 ## Replication Log
 
-+ Results replicated by [@HangCui0510](https://github.com/HangCui0510) on 2020-05-29 (commit [`f2e078e`](https://github.com/HangCui0510/pygaggle/commit/f2e078e47c87156925a9151632753be861ec403d)) (Tesla P100)
++ Results replicated by [@HangCui0510](https://github.com/HangCui0510) on 2020-07-08 (commit [`f2e078e`](https://github.com/HangCui0510/pygaggle/commit/f2e078e47c87156925a9151632753be861ec403d)) (Tesla P100)

From 314bebfcc0ae8b18f57e92a1a1c3e50d469b37ff Mon Sep 17 00:00:00 2001
From: HangCui0510
Date: Wed, 8 Jul 2020 12:57:18 -0400
Subject: [PATCH 3/7] Update requirements

---
 requirements.txt | 4 ++--
 1
file changed, 2 insertions(+), 2 deletions(-)

diff --git a/requirements.txt b/requirements.txt
index e97e587a..1c76c3a1 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -8,6 +8,6 @@ scipy>=1.4
 spacy==2.2.4
 tensorboard>=2.1.0
 tensorflow>=2.2.0rc1
-tokenizers>=0.7
+tokenizers==0.7
 tqdm==4.45.0
-transformers>=2.9.0
+transformers==2.10.0

From 472f59260014a9e9d3398752df4a4131cff97550 Mon Sep 17 00:00:00 2001
From: HangCui0510
Date: Thu, 9 Jul 2020 05:08:01 -0400
Subject: [PATCH 4/7] Create CovidQA doc

---
 docs/CovidQA.md | 79 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)
 create mode 100644 docs/CovidQA.md

diff --git a/docs/CovidQA.md b/docs/CovidQA.md
new file mode 100644
index 00000000..a8ae8796
--- /dev/null
+++ b/docs/CovidQA.md
@@ -0,0 +1,79 @@
+# PyGaggle: Neural Ranking Baselines on [MS MARCO Passage Retrieval](https://github.com/microsoft/MSMARCO-Passage-Ranking)
+
+This page contains instructions for running various neural reranking baselines on the CovidQA ranking task.
+
+Note 1: Run the following instructions at root of this repo.
+Note 2: Make sure that you have access to a GPU
+Note 3: Installation must have been done from source and make sure the [anserini-eval](https://github.com/castorini/anserini-eval) submodule is pulled.
+To do this, first clone the repository recursively.
+
+```
+git clone --recursive https://github.com/castorini/pygaggle.git
+```
+
+Then install PyGaggle using:
+
+```
+pip install pygaggle/
+```
+
+## Re-Ranking with Random
+
+```
+python -um pygaggle.run.evaluate_kaggle_highlighter --method random \
+    --dataset data/kaggle-lit-review-0.2.json \
+    --index-dir indexes/lucene-index-cord19-paragraph-2020-05-12
+```
+
+The following output will be visible after it has finished:
+
+```
+precision@1 0.0
+recall@3 0.0199546485260771
+recall@50 0.3247165532879819
+recall@1000 1.0
+mrr 0.03999734528458418
+mrr@10 0.020888672929489253
+```
+
+## Re-Ranking with BM25
+
+```
+python -um pygaggle.run.evaluate_kaggle_highlighter --method bm25 \
+    --dataset data/kaggle-lit-review-0.2.json \
+    --index-dir indexes/lucene-index-cord19-paragraph-2020-05-12
+```
+
+The following output will be visible after it has finished:
+
+```
+precision@1 0.14685314685314685
+recall@3 0.2199546485260771
+recall@50 0.6582766439909296
+recall@1000 0.6820861678004534
+mrr 0.24651188194041115
+mrr@10 0.2267060792570997
+```
+
+It takes about 10 seconds to re-rank this subset on CovidQA using a P100.
+
+## Re-Ranking with monoT5-Base
+
+```
+python -um pygaggle.run.evaluate_kaggle_highlighter --method t5 \
+    --dataset data/kaggle-lit-review-0.2.json \
+    --index-dir indexes/lucene-index-cord19-paragraph-2020-05-12
+```
+
+The following output will be visible after it has finished:
+
+```
+precision@1 0.2789115646258503
+recall@3 0.41854551344347257
+recall@50 0.92555879494655
+recall@1000 1.0
+mrr 0.417982565405279
+mrr@10 0.4045405463772811
+```
+
+It takes about 17 minutes to re-rank this subset on CovidQA using a P100.
\ No newline at end of file

From 55a9bbb48139145d336a6a1fac8e63b4c7f1ec4b Mon Sep 17 00:00:00 2001
From: HangCui0510
Date: Thu, 9 Jul 2020 05:09:33 -0400
Subject: [PATCH 5/7] Create CovidQA doc

---
 docs/CovidQA.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/CovidQA.md b/docs/CovidQA.md
index a8ae8796..862b5f29 100644
--- a/docs/CovidQA.md
+++ b/docs/CovidQA.md
@@ -1,4 +1,4 @@
-# PyGaggle: Neural Ranking Baselines on [MS MARCO Passage Retrieval](https://github.com/microsoft/MSMARCO-Passage-Ranking)
+# PyGaggle: Neural Ranking Baselines on CovidQA
 
 This page contains instructions for running various neural reranking baselines on the CovidQA ranking task.
 

From 998211b4f4b73aa2dc2f032ebef01241880d6c0a Mon Sep 17 00:00:00 2001
From: HangCui0510
Date: Thu, 9 Jul 2020 19:35:39 -0400
Subject: [PATCH 6/7] Create CovidQA Docs

---
 docs/CovidQA.md | 87 ++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 79 insertions(+), 8 deletions(-)

diff --git a/docs/CovidQA.md b/docs/CovidQA.md
index 862b5f29..e3672dac 100644
--- a/docs/CovidQA.md
+++ b/docs/CovidQA.md
@@ -19,6 +19,8 @@ pip install pygaggle/
 
 ## Re-Ranking with Random
 
+NL Question:
+
 ```
 python -um pygaggle.run.evaluate_kaggle_highlighter --method random \
     --dataset data/kaggle-lit-review-0.2.json \
@@ -32,12 +34,34 @@ precision@1 0.0
 recall@3 0.0199546485260771
 recall@50 0.3247165532879819
 recall@1000 1.0
-mrr 0.03999734528458418
-mrr@10 0.020888672929489253
+mrr 0.03999734528458418
+mrr@10 0.020888672929489253
+```
+
+Keyword Query
+
+```
+python -um pygaggle.run.evaluate_kaggle_highlighter --method random \
+    --split kq \
+    --dataset data/kaggle-lit-review-0.2.json \
+    --index-dir indexes/lucene-index-cord19-paragraph-2020-05-12
+```
+
+The following output will be visible after it has finished:
+
+```
+precision@1 0.0
+recall@3 0.0199546485260771
+recall@50 0.3247165532879819
+recall@1000 1.0
+mrr 0.03999734528458418
+mrr@10 0.020888672929489253
 ```
 
 ## Re-Ranking with BM25
 
+NL
Question:
+
 ```
 python -um pygaggle.run.evaluate_kaggle_highlighter --method bm25 \
     --dataset data/kaggle-lit-review-0.2.json \
@@ -51,13 +75,35 @@ precision@1 0.14685314685314685
 recall@3 0.2199546485260771
 recall@50 0.6582766439909296
 recall@1000 0.6820861678004534
-mrr 0.24651188194041115
-mrr@10 0.2267060792570997
+mrr 0.24651188194041115
+mrr@10 0.2267060792570997
+```
+
+Keyword Query:
+
+```
+python -um pygaggle.run.evaluate_kaggle_highlighter --method bm25 \
+    --split kq \
+    --dataset data/kaggle-lit-review-0.2.json \
+    --index-dir indexes/lucene-index-cord19-paragraph-2020-05-12
+```
+
+The following output will be visible after it has finished:
+
+```
+precision@1 0.14685314685314685
+recall@3 0.22675736961451243
+recall@50 0.6650793650793649
+recall@1000 0.6888888888888888
+mrr 0.249090910278702
+mrr@10 0.22846344887161213
 ```
 
 It takes about 10 seconds to re-rank this subset on CovidQA using a P100.
 
-## Re-Ranking with monoT5-Base
+## Re-Ranking with monoT5
+
+NL Question:
 
 ```
 python -um pygaggle.run.evaluate_kaggle_highlighter --method t5 \
@@ -72,8 +118,33 @@ precision@1 0.2789115646258503
 recall@3 0.41854551344347257
 recall@50 0.92555879494655
 recall@1000 1.0
-mrr 0.417982565405279
-mrr@10 0.4045405463772811
+mrr 0.417982565405279
+mrr@10 0.4045405463772811
 ```
 
-It takes about 17 minutes to re-rank this subset on CovidQA using a P100.
\ No newline at end of file
+Keyword Query:
+
+```
+python -um pygaggle.run.evaluate_kaggle_highlighter --method t5 \
+    --split kq \
+    --dataset data/kaggle-lit-review-0.2.json \
+    --index-dir indexes/lucene-index-cord19-paragraph-2020-05-12
+```
+
+The following output will be visible after it has finished:
+
+```
+precision@1 0.24489795918367346
+recall@3 0.38566569484936825
+recall@50 0.9231778425655977
+recall@1000 1.0
+mrr 0.37988285486956513
+mrr@10 0.3671336788683727
+```
+
+It takes about 17 minutes to re-rank this subset on CovidQA using a P100.
+
+If you were able to replicate these results, please submit a PR adding to the replication log!
+
+
+## Replication Log
\ No newline at end of file

From abb49bbd990f4a6e192e50c12b27d4b9e0477f6d Mon Sep 17 00:00:00 2001
From: HangCui0510
Date: Mon, 13 Jul 2020 15:08:38 -0400
Subject: [PATCH 7/7] Match file names, delete CovidQA from Readme

---
 README.md | 111 ++------------------
 docs/{CovidQA.md => experiments-CovidQA.md} | 2 +-
 2 files changed, 10 insertions(+), 103 deletions(-)
 rename docs/{CovidQA.md => experiments-CovidQA.md} (98%)

diff --git a/README.md b/README.md
index d5fb3f9c..35a0dde4 100644
--- a/README.md
+++ b/README.md
@@ -20,6 +20,15 @@ Currently, this repo contains implementations of the rerankers for [CovidQA](htt
 
 0. Install [Anserini](https://github.com/castorini/anserini).
 
+## Additional Instructions
+
+0. Clone the repo with `git clone --recursive https://github.com/castorini/pygaggle.git`
+
+0. Make you sure you have an installation of [Python 3.6+](https://www.python.org/downloads/). All `python` commands below refer to this.
+
+0. For pip, do `pip install -r requirements.txt`
+    * If you prefer Anaconda, use `conda env create -f environment.yml && conda activate pygaggle`.
+
 # A simple reranking example
 
 The code below exemplifies how to score two documents for a given query using a T5 reranker from [Document Ranking with a Pretrained
@@ -56,105 +65,3 @@ scores = [result.score for result in reranker.rerank(query, documents)]
 # scores = [-0.1782158613204956, -0.36637523770332336]
 ```
 
-# Evaluations
-
-## Additional Instructions
-
-0. Clone the repo with `git clone --recursive https://github.com/castorini/pygaggle.git`
-
-0. Make you sure you have an installation of [Python 3.6+](https://www.python.org/downloads/). All `python` commands below refer to this.
-
-0. For pip, do `pip install -r requirements.txt`
-    * If you prefer Anaconda, use `conda env create -f environment.yml && conda activate pygaggle`.
-
-
-## Running rerankers on CovidQA
-
-For a full list of mostly self-explanatory environment variables, see [this file](https://github.com/castorini/pygaggle/blob/master/pygaggle/settings.py#L7).
-
-BM25 uses the CPU. If you don't have a GPU for the transformer models, pass `--device cpu` (PyTorch device string format) to the script.
-
-*Note: Run the following evaluations at root of this repo.*
-
-### Unsupervised Methods
-
-**BM25**:
-
-```bash
-python -um pygaggle.run.evaluate_kaggle_highlighter --method bm25
-```
-
-**BERT**:
-
-```bash
-python -um pygaggle.run.evaluate_kaggle_highlighter --method transformer --model-name bert-base-cased
-```
-
-**SciBERT**:
-
-```bash
-python -um pygaggle.run.evaluate_kaggle_highlighter --method transformer --model-name allenai/scibert_scivocab_cased
-```
-
-**BioBERT**:
-
-```bash
-python -um pygaggle.run.evaluate_kaggle_highlighter --method transformer --model-name biobert
-```
-
-### Supervised Methods
-
-**T5 (fine-tuned on MS MARCO)**:
-
-```bash
-python -um pygaggle.run.evaluate_kaggle_highlighter --method t5
-```
-
-**BioBERT (fine-tuned on SQuAD v1.1)**:
-
-0. `mkdir biobert-squad && cd biobert-squad`
-
-0. Download the weights, vocab, and config from the [BioBERT repository](https://github.com/dmis-lab/bioasq-biobert) to `biobert-squad`.
-
-0. Untar the model and rename some files in `biobert-squad`:
-
-```bash
-tar -xvzf BERT-pubmed-1000000-SQuAD.tar.gz
-mv bert_config.json config.json
-for filename in model.ckpt*; do
-    mv $filename $(python -c "import re; print(re.sub(r'ckpt-\\d+', 'ckpt', '$filename'))");
-done
-```
-
-0. Evaluate the model:
-
-```bash
-cd .. # go to root of this of repo
-python -um pygaggle.run.evaluate_kaggle_highlighter --method qa_transformer --model-name
-```
-
-**BioBERT (fine-tuned on MS MARCO)**:
-
-0. Download the weights, vocab, and config from our Google Storage bucket. This requires an installation of [gsutil](https://cloud.google.com/storage/docs/gsutil_install?hl=ru).
-
-```bash
-mkdir biobert-marco && cd biobert-marco
-gsutil cp "gs://neuralresearcher_data/doc2query/experiments/exp374/model.ckpt-100000*" .
-gsutil cp gs://neuralresearcher_data/biobert_models/biobert_v1.1_pubmed/bert_config.json config.json
-gsutil cp gs://neuralresearcher_data/biobert_models/biobert_v1.1_pubmed/vocab.txt .
-```
-
-0. Rename the files:
-
-```bash
-for filename in model.ckpt*; do
-    mv $filename $(python -c "import re; print(re.sub(r'ckpt-\\d+', 'ckpt', '$filename'))");
-done
-```
-
-0. Evaluate the model:
-
-```bash
-cd .. # go to root of this repo
-python -um pygaggle.run.evaluate_kaggle_highlighter --method seq_class_transformer --model-name
-```
diff --git a/docs/CovidQA.md b/docs/experiments-CovidQA.md
similarity index 98%
rename from docs/CovidQA.md
rename to docs/experiments-CovidQA.md
index e3672dac..d7bc6c03 100644
--- a/docs/CovidQA.md
+++ b/docs/experiments-CovidQA.md
@@ -99,7 +99,7 @@ mrr 0.249090910278702
 mrr@10 0.22846344887161213
 ```
 
-It takes about 10 seconds to re-rank this subset on CovidQA using a P100.
+It takes about 10 seconds to re-rank this subset on CovidQA
 
 ## Re-Ranking with monoT5