GenPPN

This repository maintains code for EMNLP 2023 paper:

Enhancing Task-oriented Dialogue Systems with Generative Post-processing Networks [Paper]

Abstract: Recently, post-processing networks (PPNs), which modify the outputs of arbitrary modules including non-differentiable ones in task-oriented dialogue systems, have been proposed. PPNs have successfully improved the dialogue performance by post-processing natural language understanding, dialogue state tracking, and dialogue policy modules with a classification-based approach. However, they cannot be applied to natural language generation (NLG) modules because the post-processing of the utterance output by the NLG module requires a generative approach. In this study, we propose a new post-processing component for NLG, generative post-processing networks (GenPPNs). For optimizing GenPPNs via reinforcement learning, the reward function incorporates dialogue act contribution, a new measure to evaluate the contribution of GenPPN-generated utterances with regard to task completion in dialogue. Through simulation and human evaluation experiments based on the MultiWOZ dataset, we confirmed that GenPPNs improve the task completion performance of task-oriented dialogue systems.

Setup

Python requirement
- Python >= 3.9

Clone this repository with a submodule (ConvLab-2)

git clone --recursive [email protected]:nu-dialogue/GenPPN.git
cd GenPPN

Install requirements
- Make sure to specify the suitable version of torch in requirements.txt for your environment.
- Install
```
pip install -r requirements.txt
```

Install ConvLab-2

Comment out the following requirements in ConvLab-2/setup.py for avoiding conflict.

'torch>=1.2.0,<1.6.0',
'scikit-learn==0.20.3',
'transformers>=2.3.0,<3.0.0',
'tensorboardX==1.7',
'overrides==4.1.2',
'allennlp==0.9.0',
'spacy==2.1.9',

Install
```
cd ConvLab-2
pip install -e .
cd ../
```

Reproduce

To accelerate dialogue sampling, we used multi-node, multi-GPU parallel computing through MPI on HPC. In this document, we present the run commands with mpirun, but DDP with torchrun or huggingface accelerate launch can also be used by removing the --dist_type mpi argument. Also, in the following, the experimental procedure using SC-GPT is described, but you can experiment with different NLG models by changing --system_nlg_name. Further details of the arguments can be found by using --help.

Note

The specific HPC job script we used can be found in job_scripts/ directory.

Warning

Different hyperparameters from this example, including the number of GPUs, will likely not reproduce the results on our paper.

1. Setup modules in ConvLab-2

We need to setup the following modules, which need to be trained or require pre-trained checkpoints. Please see the instructions in README.md in the directory of each model.

Module	Models (link to README.md)
User NLU	BERT NLU
System NLU	BERT NLU
System NLG	SC-LSTM, SC-GPT, GPT2 + RL, Template

2. (Optional) Test baseline system

Evaluate the baseline system by using 1024 test dialogues (8 process x 128 dialogues).

RUN_DPATH="outputs/baseline_test/sys_scgpt"

mpirun -n 8 -machinefile $PJM_O_NODEINF -display-devel-map -map-by ppr:2:socket \
    python run_baseline_system.py \
        --dist_type mpi \
        --run_dpath $RUN_DPATH \
        --system_nlg_name scgpt_nlg \
        --do_test

The result scores will be saved to $RUN_DPATH/eval_summary.json.

3. Sample dialogue data to initialize DA contribution

Sample 16k turns (2000 turns x 8 processes), equivalent to at least 1k dialogues, of dialogue data from the baseline system prior to PPO training.

RUN_DPATH="outputs/dialogue_data/sys_scgpt"

mpirun -n 8 -machinefile $PJM_O_NODEINF -display-devel-map -map-by ppr:2:socket \
    python run_baseline_system.py \
        --dist_type mpi \
        --run_dpath $RUN_DPATH \
        --system_nlg_name scgpt_nlg \
        --turns_per_process 2000

The sampled dialogues will be saved to $RUN_DPATH/dialogues/*.

4. PPO training of GenPPN

Fine-tune Alpaca 7B (using 16 x V100 32GB GPUs). The learning curves will be tracked by wandb.

Note

The actual training logs of GenPPN for each NLG model can be found in this wandb project.

# Specify the path to output directory
RUN_DPATH="outputs/rl/sys_scgpt-absmax"

# Run
mpirun -n 16 -machinefile $PJM_O_NODEINF -display-devel-map -map-by ppr:2:socket \
    python run_rl_alpaca_lora.py \
        --dist_type mpi \
        --run_dpath $RUN_DPATH \
        --system_nlg_name scgpt_nlg \
        --base_model_name ohashi56225/alpaca-7b \
        --wandb_project_name genppn-emnlp2023 \
        --do_train \
        --total_iterations 200 \
        --batch_size_per_device 32 \
        --mini_batch_size_per_device 2 \
        --gradient_accumulation_steps 2 \
        --num_epochs 4 \
        --learning_rate 1e-5 \
        --reward_factors da_contribution_absmax \
        --adaptive_dac \
        --dac_dialogue_dpath_to_init "outputs/dialogue_data/sys_scgpt/dialogues" \
        --dac_num_dialogues_to_init 1000

This will save the trained checkpoints to $RUN_DPATH/checkpoints/*.

5. Test GenPPN

Select the checkpoint with the best Task Success from the wandb learning curve and test it.

# Specify the trained model path
ADAPTER_DPATH="outputs/rl/sys_scgpt-absmax/checkpoints/iteration-150"

# Run
mpirun -n 16 -machinefile $PJM_O_NODEINF -display-devel-map -map-by ppr:2:socket \
    python run_rl_alpaca_lora.py \
        --dist_type mpi \
        --run_dpath $ADAPTER_DPATH \
        --system_nlg_name scgpt_nlg \
        --base_model_name ohashi56225/alpaca-7b \
        --adapter_checkpoint_path $ADAPTER_DPATH \
        --do_test

This will save the sampled dialogues to $ADAPTER_DPATH/test_dialogues/* and the result scores to $ADAPTER_DPATH/test_summary.json.

Citation

@inproceedings{ohashi-higashinaka-2023-enhancing,
    title = "Enhancing Task-oriented Dialogue Systems with Generative Post-processing Networks",
    author = "Ohashi, Atsumoto and Higashinaka, Ryuichiro",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    year = "2023",
    url = "https://aclanthology.org/2023.emnlp-main.231",
    pages = "3815--3828",
}

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
ConvLab-2 @ b014b81		ConvLab-2 @ b014b81
assets		assets
job_scripts		job_scripts
ppn		ppn
system		system
user_simulator		user_simulator
utils		utils
.gitignore		.gitignore
.gitmodules		.gitmodules
README.md		README.md
arguments.py		arguments.py
requirements.txt		requirements.txt
run_baseline_system.py		run_baseline_system.py
run_rl_alpaca_lora.py		run_rl_alpaca_lora.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

GenPPN

Setup

Reproduce

1. Setup modules in ConvLab-2

2. (Optional) Test baseline system

3. Sample dialogue data to initialize DA contribution

4. PPO training of GenPPN

5. Test GenPPN

Citation

About

Releases

Packages

Languages

nu-dialogue/GenPPN

Folders and files

Latest commit

History

Repository files navigation

GenPPN

Setup

Reproduce

1. Setup modules in ConvLab-2

2. (Optional) Test baseline system

3. Sample dialogue data to initialize DA contribution

4. PPO training of GenPPN

5. Test GenPPN

Citation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages