This repository contains the data and source code of the paper Learning to Identify Follow-up Questions in Conversational Question Answering.
This work studies how a machine can learn to identify follow-up questions in conversational QA settings. To find answers effectively, a system must be able to determine whether a question is a follow-up to the current conversation. We introduce a new follow-up question identification task and propose a three-way attentive pooling network that determines the suitability of a follow-up question by capturing pair-wise interactions between the associated passage, the conversation history, and a candidate follow-up question.
If you use the data, source code or models from this work, please cite our paper:
@inproceedings{kundu2020lif,
author = {Kundu, Souvik and Lin, Qian and Ng, Hwee Tou},
title = {Learning to Identify Follow-up Questions in Conversational Question Answering},
booktitle = {Proceedings of ACL},
year = {2020},
}
We use Python 3.6 and AllenNLP. Install the packages listed in the requirements.txt file.
Run download_data.sh:
sh download_data.sh
The data files will be downloaded into data/dataset, and the embedding file will be downloaded into data/embeddings.
A new model (Three-way Attentive Pooling Network) can be trained from scratch by executing the following example command:
allennlp train training_configs/l2af_3way_ap.json -s models/3way_ap --include-package l2af
Similarly, a new BERT-based baseline model can be trained using the following command:
allennlp train training_configs/bert_baseline.json -s models/bert_baseline --include-package l2af
To generate predictions for the test instances with a trained model, run the following command:
allennlp predict models/3way_ap/model.tar.gz data/dataset/test_i.jsonl \
--output-file models/3way_ap/test_i_predictions.jsonl \
--batch-size 32 \
--silent \
--cuda-device 0 \
--predictor l2af_predictor_binary \
--include-package l2af
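The file passed to --output-file should contain one JSON object per line. As a minimal sketch for inspecting it, the helper below reads such a file into a list of dictionaries. The function name load_jsonl is hypothetical and not part of this repository, and the field names inside each prediction are not documented here, so check them against your own output before relying on them.

```python
import json
from pathlib import Path
from typing import Dict, List


def load_jsonl(path: str) -> List[Dict]:
    """Read a JSON Lines file into a list of dicts, skipping blank lines.

    Hypothetical helper for illustration; it is not part of the l2af package.
    """
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # ignore empty lines between records
                records.append(json.loads(line))
    return records
```

For example, calling load_jsonl("models/3way_ap/test_i_predictions.jsonl") and printing the keys of the first record shows which fields the predictor emitted.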
You can also download our pre-trained models and use them for prediction. The models can be downloaded by running download_pretrained_models.sh:
sh download_pretrained_models.sh
This should download the models into data/pretrained-model.
Evaluation requires prediction files for both the dev set and the test set. The dev set predictions are needed because the decision threshold is estimated from performance on the dev set. To run the evaluation:
python evaluator.py --dev_pred_file /path/to/dev_predictions.jsonl \
--test_pred_file /path/to/test_predictions.jsonl
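The idea of estimating a threshold on the dev set can be illustrated with a small sketch. It assumes each prediction carries a follow-up probability and a binary gold label, and that the threshold is chosen to maximize F1 on the dev set. Both the metric and the function names below are assumptions made for illustration; evaluator.py may select the threshold differently.

```python
from typing import Sequence, Tuple


def f1_at_threshold(scores: Sequence[float], labels: Sequence[int],
                    threshold: float) -> float:
    """F1 of the positive class when predicting 1 for scores >= threshold."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
    if tp == 0:
        return 0.0  # precision and recall are both zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(dev_scores: Sequence[float],
                   dev_labels: Sequence[int]) -> Tuple[float, float]:
    """Return (threshold, dev F1), trying every distinct dev score as a cutoff.

    Assumption: F1 is the selection criterion. Ties keep the lowest threshold.
    """
    best_t, best_f1 = 0.5, -1.0
    for candidate in sorted(set(dev_scores)):
        f1 = f1_at_threshold(dev_scores, dev_labels, candidate)
        if f1 > best_f1:
            best_t, best_f1 = candidate, f1
    return best_t, best_f1
```

The threshold returned for the dev set would then be passed unchanged to f1_at_threshold together with the test scores and labels, so that the test set plays no part in choosing it.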
The code and models in this repository are licensed under the GNU General Public License Version 3. For commercial use of this code and models, separate commercial licensing is also available. Please contact:
- Souvik Kundu ([email protected])
- Qian Lin ([email protected])
- Hwee Tou Ng ([email protected])