This is a baseline system for the AIO4 competition built on Binary Passage Retriever (BPR).
BPR is an efficient passage retrieval model for large document collections. It integrates a learning-to-hash technique into Dense Passage Retriever (DPR) so that passage embeddings are represented as compact binary codes rather than continuous vectors, which substantially reduces the memory footprint without a loss of accuracy on several QA datasets (see the BPR repository for more detail).
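The idea behind the binary codes can be illustrated in a few lines of NumPy. The sketch below is only conceptual: it hashes continuous embeddings with the sign function and ranks passages by Hamming distance, whereas the actual BPR model learns the hash function end-to-end and reranks the Hamming-distance candidates with the continuous query embedding.

import numpy as np

rng = np.random.default_rng(0)

# Toy continuous embeddings: 1,000 passages and one query, 768 dimensions each.
passage_embs = rng.standard_normal((1000, 768)).astype(np.float32)
query_emb = rng.standard_normal(768).astype(np.float32)

# Hash to binary codes with the sign function and pack the bits into bytes:
# 768 bits = 96 bytes per passage instead of 768 * 4 bytes for float32.
passage_codes = np.packbits(passage_embs > 0, axis=1)  # shape (1000, 96), dtype uint8
query_code = np.packbits(query_emb > 0)                # shape (96,)

# Candidate generation by Hamming distance (popcount of the XOR-ed codes).
hamming = np.unpackbits(passage_codes ^ query_code, axis=1).sum(axis=1)
print(np.argsort(hamming)[:10])  # indices of the 10 closest passages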
# Update pip and setuptools first
pip install -U pip setuptools
# Optional: Install the same versions of the dependencies we used
pip install -r requirements.txt
# Install the aio4_bpr_baseline module as well as its dependencies
pip install -e .
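To confirm that the installation succeeded, a quick import check is enough (only the import is assumed here; whether the package exposes version metadata depends on the project setup):

# Verify that the package installed by `pip install -e .` is importable.
import aio4_bpr_baseline
print(aio4_bpr_baseline.__name__)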
You can download the pretrained model checkpoints and the passage embeddings used in the experiments. See the Running Pipeline of Retriever and Reader section for instructions on using them.
Note: The following example is run on a server with 4 GPUs, each with 16 GB of memory.
1. Download datasets
mkdir data
wget https://github.com/cl-tohoku/quiz-datasets/releases/download/v1.0.0/datasets.jawiki-20220404-c400-large.aio_02_train.jsonl.gz -P data
wget https://github.com/cl-tohoku/quiz-datasets/releases/download/v1.0.0/datasets.jawiki-20220404-c400-large.aio_02_dev.jsonl.gz -P data
wget https://github.com/cl-tohoku/quiz-datasets/releases/download/v1.0.1/passages.jawiki-20220404-c400-large.jsonl.gz -P data
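To sanity-check a download, you can print the first record of any of the gzipped JSON Lines files (a small inspection snippet; the file name below is one of the files fetched above):

import gzip
import json

# Print the first record of a gzipped JSON Lines file to inspect its fields.
path = "data/datasets.jawiki-20220404-c400-large.aio_02_dev.jsonl.gz"
with gzip.open(path, "rt", encoding="utf-8") as f:
    first_record = json.loads(next(f))
print(json.dumps(first_record, ensure_ascii=False, indent=2))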
2. Preprocess the datasets
mkdir -p work/aio_02/data
python -m aio4_bpr_baseline.utils.convert_passages \
--passages_file data/passages.jawiki-20220404-c400-large.jsonl.gz \
--output_passages_file work/aio_02/data/passages.jsonl.gz \
--output_pid_idx_map_file work/aio_02/data/pid_idx_map.json.gz
python -m aio4_bpr_baseline.utils.convert_dataset \
--dataset_file data/datasets.jawiki-20220404-c400-large.aio_02_train.jsonl.gz \
--pid_idx_map_file work/aio_02/data/pid_idx_map.json.gz \
--output_dataset_file work/aio_02/data/retriever_train.jsonl.gz
python -m aio4_bpr_baseline.utils.convert_dataset \
--dataset_file data/datasets.jawiki-20220404-c400-large.aio_02_dev.jsonl.gz \
--pid_idx_map_file work/aio_02/data/pid_idx_map.json.gz \
--output_dataset_file work/aio_02/data/retriever_dev.jsonl.gz
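As a quick consistency check (a minimal sketch, assuming pid_idx_map.json.gz holds a single JSON object that maps passage IDs to row indices of the converted passages file, which is how the converted datasets refer to passages), you can verify that the map covers every converted passage:

import gzip
import json

# Count the converted passages and compare with the size of the pid -> index map.
with gzip.open("work/aio_02/data/passages.jsonl.gz", "rt", encoding="utf-8") as f:
    num_passages = sum(1 for _ in f)
with gzip.open("work/aio_02/data/pid_idx_map.json.gz", "rt", encoding="utf-8") as f:
    pid_idx_map = json.load(f)
print(num_passages, len(pid_idx_map))  # the two counts should match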
1. Train a biencoder
python -m aio4_bpr_baseline.lightning_cli fit \
--config aio4_bpr_baseline/configs/retriever/bpr/biencoder.yaml \
--model.train_dataset_file work/aio_02/data/retriever_train.jsonl.gz \
--model.val_dataset_file work/aio_02/data/retriever_dev.jsonl.gz \
--model.passages_file work/aio_02/data/passages.jsonl.gz \
--trainer.default_root_dir work/aio_02/biencoder
2. Build passage embeddings
python -m aio4_bpr_baseline.lightning_cli predict \
--config aio4_bpr_baseline/configs/retriever/bpr/embedder.yaml \
--model.biencoder_ckpt_file work/aio_02/biencoder/lightning_logs/version_0/checkpoints/last.ckpt \
--model.passages_file work/aio_02/data/passages.jsonl.gz \
--trainer.default_root_dir work/aio_02/embedder
python -m aio4_bpr_baseline.utils.gather_numpy_predictions \
--predictions_dir work/aio_02/embedder/lightning_logs/version_0/predictions \
--output_file work/aio_02/embedder/lightning_logs/version_0/prediction.npy
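The gathered prediction.npy should contain one passage embedding per row; for BPR these are binary codes, and the exact dtype and width depend on the embedder configuration, so treat the expected values as assumptions. A quick shape check:

import numpy as np

# Load the gathered passage embeddings and report their shape and dtype.
# With uint8 bit packing, a 768-bit binary code occupies 96 bytes per passage.
passage_embeddings = np.load("work/aio_02/embedder/lightning_logs/version_0/prediction.npy")
print(passage_embeddings.shape, passage_embeddings.dtype)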
3. Retrieve passages for questions
python -m aio4_bpr_baseline.lightning_cli predict \
--config aio4_bpr_baseline/configs/retriever/bpr/retriever.yaml \
--model.biencoder_ckpt_file work/aio_02/biencoder/lightning_logs/version_0/checkpoints/last.ckpt \
--model.passage_embeddings_file work/aio_02/embedder/lightning_logs/version_0/prediction.npy \
--model.predict_dataset_file work/aio_02/data/retriever_train.jsonl.gz \
--trainer.default_root_dir work/aio_02/retriever/train
python -m aio4_bpr_baseline.utils.gather_json_predictions \
--predictions_dir work/aio_02/retriever/train/lightning_logs/version_0/predictions \
--output_file work/aio_02/retriever/train/lightning_logs/version_0/prediction.json.gz
python -m aio4_bpr_baseline.lightning_cli predict \
--config aio4_bpr_baseline/configs/retriever/bpr/retriever.yaml \
--model.biencoder_ckpt_file work/aio_02/biencoder/lightning_logs/version_0/checkpoints/last.ckpt \
--model.passage_embeddings_file work/aio_02/embedder/lightning_logs/version_0/prediction.npy \
--model.predict_dataset_file work/aio_02/data/retriever_dev.jsonl.gz \
--trainer.default_root_dir work/aio_02/retriever/dev
python -m aio4_bpr_baseline.utils.gather_json_predictions \
--predictions_dir work/aio_02/retriever/dev/lightning_logs/version_0/predictions \
--output_file work/aio_02/retriever/dev/lightning_logs/version_0/prediction.json.gz
4. Evaluate the retriever performance
python -m aio4_bpr_baseline.utils.evaluate_retriever \
--dataset_file work/aio_02/data/retriever_train.jsonl.gz \
--passages_file work/aio_02/data/passages.jsonl.gz \
--prediction_file work/aio_02/retriever/train/lightning_logs/version_0/prediction.json.gz \
--answer_match_type nfkc \
--output_file work/aio_02/data/reader_train.jsonl.gz
# Recall@1: 0.7597
# Recall@2: 0.8409
# Recall@5: 0.8869
# Recall@10: 0.9019
# Recall@20: 0.9123
# Recall@50: 0.9233
# Recall@100: 0.9311
# MRR@10: 0.8158
python -m aio4_bpr_baseline.utils.evaluate_retriever \
--dataset_file work/aio_02/data/retriever_dev.jsonl.gz \
--passages_file work/aio_02/data/passages.jsonl.gz \
--prediction_file work/aio_02/retriever/dev/lightning_logs/version_0/prediction.json.gz \
--answer_match_type nfkc \
--output_file work/aio_02/data/reader_dev.jsonl.gz
# Recall@1: 0.5500
# Recall@2: 0.6720
# Recall@5: 0.7750
# Recall@10: 0.8220
# Recall@20: 0.8500
# Recall@50: 0.8740
# Recall@100: 0.9010
# MRR@10: 0.6463
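For reference, Recall@k is the fraction of questions for which at least one of the top-k retrieved passages contains a gold answer (with --answer_match_type nfkc, matching is done after NFKC normalization), and MRR@10 averages the reciprocal rank of the first such passage, counting 0 when none appears in the top 10. A minimal sketch of the computation, assuming the rank of the first answer-bearing passage has already been determined for each question (None meaning no hit):

# Toy 1-based ranks of the first answer-bearing passage per question; None = no hit.
ranks = [1, 3, None, 2, 15, 1, None, 7]

def recall_at_k(ranks, k):
    return sum(r is not None and r <= k for r in ranks) / len(ranks)

def mrr_at_k(ranks, k):
    return sum(1.0 / r for r in ranks if r is not None and r <= k) / len(ranks)

for k in (1, 2, 5, 10):
    print(f"Recall@{k}: {recall_at_k(ranks, k):.4f}")
print(f"MRR@10: {mrr_at_k(ranks, 10):.4f}")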
1. Train a reader
python -m aio4_bpr_baseline.lightning_cli fit \
--config aio4_bpr_baseline/configs/reader/extractive_reader/reader.yaml \
--model.train_dataset_file work/aio_02/data/reader_train.jsonl.gz \
--model.val_dataset_file work/aio_02/data/reader_dev.jsonl.gz \
--model.passages_file work/aio_02/data/passages.jsonl.gz \
--trainer.default_root_dir work/aio_02/reader
2. Predict answers for questions
python -m aio4_bpr_baseline.lightning_cli predict \
--config aio4_bpr_baseline/configs/reader/extractive_reader/reader_predict.yaml \
--model.reader_ckpt_file work/aio_02/reader/lightning_logs/version_0/checkpoints/last.ckpt \
--model.predict_dataset_file work/aio_02/data/reader_dev.jsonl.gz \
--model.passages_file work/aio_02/data/passages.jsonl.gz \
--trainer.default_root_dir work/aio_02/reader_predict/aio_02_dev
python -m aio4_bpr_baseline.utils.gather_json_predictions \
--predictions_dir work/aio_02/reader_predict/aio_02_dev/lightning_logs/version_0/predictions \
--output_file work/aio_02/reader_predict/aio_02_dev/lightning_logs/version_0/prediction.jsonl.gz
3. Evaluate the reader performance
python -m aio4_bpr_baseline.utils.evaluate_reader \
--dataset_file work/aio_02/data/reader_dev.jsonl.gz \
--passages_file work/aio_02/data/passages.jsonl.gz \
--prediction_file work/aio_02/reader_predict/aio_02_dev/lightning_logs/version_0/prediction.jsonl.gz \
--answer_normalization_mode nfkc
# Exact Match: 0.5680
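Exact match compares the predicted answer against the gold answers after normalization; with --answer_normalization_mode nfkc, both sides are Unicode NFKC-normalized so that, for example, full-width and half-width characters compare equal. A minimal sketch of that comparison (illustrative only; the evaluation script may apply additional normalization):

import unicodedata

def nfkc_exact_match(prediction, gold_answers):
    # NFKC-normalize both sides so width/compatibility variants compare equal.
    def normalize(s):
        return unicodedata.normalize("NFKC", s).strip()
    return normalize(prediction) in {normalize(a) for a in gold_answers}

print(nfkc_exact_match("ＡＢＣ", ["ABC"]))  # True: full-width letters normalize to ASCII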
1. Prepare the checkpoints of the pretrained models
# Copy the models you have trained above
cp work/aio_02/biencoder/lightning_logs/version_0/checkpoints/last.ckpt work/biencoder.ckpt
cp work/aio_02/reader/lightning_logs/version_0/checkpoints/last.ckpt work/reader.ckpt
cp work/aio_02/embedder/lightning_logs/version_0/prediction.npy work/passage_embeddings.npy
cp work/aio_02/data/passages.jsonl.gz work/passages.json.gz
# Or download the publicly available ones
wget https://storage.googleapis.com/aio-public-tokyo/aio4_bpr_baseline_models/biencoder.ckpt -P work
wget https://storage.googleapis.com/aio-public-tokyo/aio4_bpr_baseline_models/reader.ckpt -P work
wget https://storage.googleapis.com/aio-public-tokyo/aio4_bpr_baseline_models/passage_embeddings.npy -P work
wget https://storage.googleapis.com/aio-public-tokyo/aio4_bpr_baseline_models/passages.json.gz -P work
2. Predict answers for the questions in AIO4 development data
python -m aio4_bpr_baseline.lightning_cli predict \
--config aio4_bpr_baseline/configs/pipeline_aio4/bpr_extractive_reader/pipeline.yaml \
--model.biencoder_ckpt_file work/biencoder.ckpt \
--model.reader_ckpt_file work/reader.ckpt \
--model.passage_embeddings_file work/passage_embeddings.npy \
--model.passages_file work/passages.json.gz \
--model.predict_dataset_file data/aio_04_dev_unlabeled_v1.0.jsonl \
--model.predict_num_passages 10 \
--model.predict_answer_score_threshold 0.5 \
--trainer.default_root_dir work/aio_02/pipeline_aio4/aio_04_dev
python -m aio4_bpr_baseline.utils.gather_json_predictions \
--predictions_dir work/aio_02/pipeline_aio4/aio_04_dev/lightning_logs/version_0/predictions \
--output_file work/aio_02/pipeline_aio4/aio_04_dev/lightning_logs/version_0/prediction.jsonl
3. Compute the scores
python -m compute_score \
--prediction_file work/aio_02/pipeline_aio4/aio_04_dev/lightning_logs/version_0/prediction.jsonl \
--gold_file data/aio_04_dev_v1.0.jsonl \
--limit_num_wrong_answers 3
# num_questions: 500
# num_correct: 288
# num_missed: 196
# num_failed: 16
# accuracy: 57.6%
# accuracy_score: 288.000
# position_score: 76.380
# total_score: 364.380
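In this output, accuracy corresponds to num_correct / num_questions (288 / 500 = 57.6%), and total_score is the sum of accuracy_score and position_score (288.000 + 76.380 = 364.380).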
4. Predict answers for the questions in AIO4 leaderboard test data
python -m aio4_bpr_baseline.lightning_cli predict \
--config aio4_bpr_baseline/configs/pipeline_aio4/bpr_extractive_reader/pipeline.yaml \
--model.biencoder_ckpt_file work/biencoder.ckpt \
--model.reader_ckpt_file work/reader.ckpt \
--model.passage_embeddings_file work/passage_embeddings.npy \
--model.passages_file work/passages.json.gz \
--model.predict_batch_size 1 \
--model.predict_dataset_file data/aio_04_test_lb_unlabeled_v1.0.jsonl \
--model.predict_num_passages 10 \
--model.predict_answer_score_threshold 0.5 \
--trainer.default_root_dir work/aio_02/pipeline_aio4/aio_04_test_lb
python -m aio4_bpr_baseline.utils.gather_json_predictions \
--predictions_dir work/aio_02/pipeline_aio4/aio_04_test_lb/lightning_logs/version_0/predictions \
--output_file work/aio_02/pipeline_aio4/aio_04_test_lb/lightning_logs/version_0/prediction.jsonl
1. Build and run a Docker image
Make sure that the following files are placed under the work/ directory. If not, follow the first step in the Running Pipeline of Retriever and Reader section above.
biencoder.ckpt
reader.ckpt
passage_embeddings.npy
passages.json.gz
docker build -t aio4-bpr-baseline .
docker run --gpus 1 --rm -p 8000:8000 aio4-bpr-baseline
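Once the container is running, the evaluation script in the next step sends the unlabeled questions to the API served on port 8000. If you want to probe the server by hand first, a rough sketch follows; the endpoint path and the request/response fields are assumptions, so consult evaluate_docker_api (or the server code in this repository) for the actual route and schema:

import json
import requests  # third-party: pip install requests

# Hypothetical request: the /answer route and the payload fields are assumptions,
# not the documented API; check evaluate_docker_api for the real schema.
response = requests.post(
    "http://localhost:8000/answer",
    json={"qid": "AIO04-0001", "position": 10, "question": "日本の首都はどこ?"},
    timeout=60,
)
response.raise_for_status()
print(json.dumps(response.json(), ensure_ascii=False, indent=2))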
2. Predict answers for the questions in AIO4 leaderboard test data
python3 -m evaluate_docker_api \
--test_unlabelded_file data/aio_04_test_lb_unlabeled_v1.0.jsonl \
--output_prediction_file work/aio_04_test_lb_prediction_v1.0.jsonl
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
If you find this work useful, please cite the following paper:
Efficient Passage Retrieval with Hashing for Open-domain Question Answering
@inproceedings{yamada2021bpr,
  title={Efficient Passage Retrieval with Hashing for Open-domain Question Answering},
  author={Ikuya Yamada and Akari Asai and Hannaneh Hajishirzi},
  booktitle={ACL},
  year={2021}
}