This repository contains the implementation of the paper "Enhance Incomplete Utterance Restoration by Joint Learning Token Extraction and Text Generation".
The paper introduces a model for incomplete utterance restoration (IUR). Unlike prior studies that work only on extraction or abstraction datasets, we design a simple but effective model that works in both IUR scenarios. Our design follows the nature of IUR, in which tokens omitted from the context contribute to restoration. Based on this, we build a Picker that identifies the omitted tokens. To support the Picker, we design two label-creation methods (soft and hard labels), which work even when the omitted tokens are not annotated. Restoration is then performed by a Generator with the help of the Picker through joint learning. Promising results on four benchmark datasets covering the extraction and abstraction scenarios show that our model outperforms pretrained T5 and non-generative language-model methods in both rich and limited training-data settings.
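For a concrete picture of the joint setup before reading train.py, the sketch below shows one way such a Picker/Generator combination could be wired together. It is only a minimal illustration under assumptions: the class name `JointPickerGenerator`, the loss weight `pick_weight`, the hard 0/1 picking labels, and the use of Hugging Face's T5 are chosen for the example and may not match the repository's actual implementation.

```python
# Illustrative sketch only (assumed names, not the repository's train.py):
# a token-level "Picker" trained jointly with a seq2seq "Generator" on T5.
import torch.nn as nn
import torch.nn.functional as F
from transformers import T5ForConditionalGeneration


class JointPickerGenerator(nn.Module):
    def __init__(self, model_name: str = "t5-base", pick_weight: float = 0.5):
        super().__init__()
        # Generator: pretrained encoder-decoder that emits the restored utterance.
        self.seq2seq = T5ForConditionalGeneration.from_pretrained(model_name)
        # Picker: per-token binary classifier over the encoder hidden states,
        # predicting whether each context token was omitted from the utterance.
        self.picker = nn.Linear(self.seq2seq.config.d_model, 2)
        self.pick_weight = pick_weight

    def forward(self, input_ids, attention_mask, labels, pick_labels):
        # Generator loss: teacher-forced cross-entropy on the restored utterance.
        out = self.seq2seq(
            input_ids=input_ids, attention_mask=attention_mask, labels=labels
        )
        gen_loss = out.loss

        # Picker loss: tag each input token as omitted (1) or not (0);
        # positions labelled -100 (e.g. padding) are ignored. With hard labels
        # this is plain cross-entropy; soft labels would use soft targets.
        pick_logits = self.picker(out.encoder_last_hidden_state)  # (B, L, 2)
        pick_loss = F.cross_entropy(
            pick_logits.view(-1, 2), pick_labels.view(-1), ignore_index=-100
        )

        # Joint objective: generation loss plus weighted picking loss.
        return gen_loss + self.pick_weight * pick_loss
```

A training step would compute this combined loss on a batch and backpropagate once, which is what joint learning of token extraction and text generation refers to here.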
To set up the environment and run training and evaluation:

conda env create -f environment.yml
conda activate jointiur
python train.py
python test.py
If you would like to cite our paper in your work, please use the following reference:
@ARTICLE{Inoue2022-tb,
  title         = "Enhance Incomplete Utterance Restoration by Joint Learning Token Extraction and Text Generation",
  author        = "Inoue, Shumpei and Liu, Tsungwei and Son, Nguyen Hong and Nguyen, Minh-Tien",
  month         = apr,
  year          = 2022,
  archivePrefix = "arXiv",
  primaryClass  = "cs.CL",
  eprint        = "2204.03958"
}