We provide the source code for the paper "Towards Abstractive Grounded Summarization of Podcast Transcripts" accepted at ACL'22. If you find the code useful, please cite the following paper.
@inproceedings{song-etal-2022-grounded,
  title = "Towards Abstractive Grounded Summarization of Podcast Transcripts",
  author = "Song, Kaiqiang and
    Li, Chen and
    Wang, Xiaoyang and
    Yu, Dong and
    Liu, Fei",
  booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics",
  year = "2022"
}
We propose a grounded summarization system that links each summary sentence to a chunk of the original transcript and to the corresponding audio/video recording. This allows a human evaluator to quickly verify the summary content against the source clips.
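As a rough illustration of what "grounded" means here, the sketch below pairs a summary sentence with a transcript chunk and its time range. It is not the code in this repository: `Chunk`, `GroundedSentence`, and `split_into_chunks` are hypothetical names, and the fixed-size, non-overlapping token chunks and per-token timestamps are assumptions made only for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Chunk:
    """A contiguous span of transcript tokens and its time range (hypothetical)."""
    index: int
    tokens: List[str]
    start_sec: float
    end_sec: float


@dataclass
class GroundedSentence:
    """A summary sentence paired with the chunk it is linked to (hypothetical)."""
    text: str
    chunk: Chunk


def split_into_chunks(
    tokens: List[str],
    times: List[Tuple[float, float]],
    chunk_size: int,
) -> List[Chunk]:
    """Split tokens into non-overlapping chunks of at most chunk_size tokens.

    times[i] is the assumed (start, end) time in seconds of tokens[i].
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if len(tokens) != len(times):
        raise ValueError("tokens and times must have the same length")
    chunks: List[Chunk] = []
    for start in range(0, len(tokens), chunk_size):
        end = min(start + chunk_size, len(tokens))
        chunks.append(
            Chunk(
                index=len(chunks),
                tokens=tokens[start:end],
                start_sec=times[start][0],    # start time of the first token
                end_sec=times[end - 1][1],    # end time of the last token
            )
        )
    return chunks


# Toy transcript: nine tokens, each assumed to last half a second.
tokens = "welcome to the show today we talk about summarization".split()
times = [(i * 0.5, i * 0.5 + 0.5) for i in range(len(tokens))]
chunks = split_into_chunks(tokens, times, chunk_size=5)

# Link a summary sentence to the chunk that supports it.
sentence = GroundedSentence("The hosts talk about summarization.", chunks[1])
print(sentence.text, "->", " ".join(sentence.chunk.tokens),
      f"[{sentence.chunk.start_sec}s to {sentence.chunk.end_sec}s]")
```

With a structure like this, an evaluator could jump from a summary sentence straight to the linked time range in the recording instead of scanning the whole transcript.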
- 03/22/2022 ArXiv Paper released.
- 03/04/2022 Trained model and processed testing data released.
- 03/03/2022 Code released. The paper link, trained model, and processed testing data will be released soon.
- 02/23/2022 Paper accepted at ACL 2022.
Follow the four steps below to generate grounded podcast summaries, or directly download the generated summaries from this link.
Download the code
git clone https://github.com/tencent-ailab/GrndPodcastSum.git
cd GrndPodcastSum
Download the Trained Models to the GrndPodcastSum directory and unzip them.
unzip model.zip
Download the Processed Test Set (1027) to the GrndPodcastSum directory and unzip it.
unzip data.zip
Create the environment using the .yml file.
conda env create -f env.yml
conda activate GrndPodcastSum
Calculate the chunk embeddings offline.
sh offline.sh
Use the Grnd-token-nonoverlap model to generate summaries.
sh test.sh
Copyright 2022 Tencent
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
This repository is for research purposes only. It is not an officially supported Tencent product.