Fix BART CNN/DM fine-tuning instructions (#1650)
Summary:
The first step in the CNN/DM fine-tuning instructions for BART is misleading (see facebookresearch/fairseq#1391). This PR fixes the README and adds links to facebookresearch/fairseq#1391 as well as to a repository with CNN/DM processing code adjusted for BART.
Pull Request resolved: facebookresearch/fairseq#1650

Differential Revision: D19606689

fbshipit-source-id: 4f1771f47d3650035a911ab393ab6df2193c1bf9
artmatsak authored and yzpang committed Feb 19, 2021
1 parent 279dd9c commit 7e4059f
Showing 1 changed file with 3 additions and 1 deletion.
examples/bart/README.cnn.md
@@ -1,6 +1,8 @@
# Fine-tuning BART on CNN-Dailymail summarization task

-### 1) Follow instructions [here](https://github.com/abisee/cnn-dailymail) to download and process into data-files with non-tokenized cased samples.
+### 1) Download the CNN and Daily Mail data and preprocess it into data files with non-tokenized cased samples.
+
+Follow the instructions [here](https://github.com/abisee/cnn-dailymail) to download the original CNN and Daily Mail datasets. To preprocess the data, refer to the pointers in [this issue](https://github.com/pytorch/fairseq/issues/1391) or check out the code [here](https://github.com/artmatsak/cnn-dailymail).

### 2) BPE preprocess:
```bash
...
```