TETRA (Text Revision of ACL papers) corpus

TETRA consists of document-level revisions of articles published at ACL-related venues. It is designed around an annotation scheme that can handle edit types beyond the sentence level (such as argument flow) in addition to conventional word- and phrase-level edit types.

See the paper for more information.

Data format

The dataset is formatted in XML, with one XML file per paper. Here's a sample of the annotations in the dataset:

Each file contains the following information in XML tags.

meta information
- doc id: ID of the paper (ACL Anthology)
- editor: ID of the human expert who revised the paper
- format: venue type (conference (Conf) or workshop (WS))
- position: first author's position (Non-student (NS) or Student (S))
- region: region of the affiliation (Native (N) or Non-native (NN))

edit information
- edit type: type of the edit
- crr: the edit made by the human expert
- comments: the expert's rationale for the edit
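
To illustrate how the fields listed above might be read programmatically, here is a minimal Python sketch. It is not the official loader: the element names (`doc`, `edit`), the attribute names, and the embedded `SAMPLE` string are all assumptions made for illustration, so check them against an actual file in the dataset before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: element and attribute names are assumptions derived
# from the field list above, NOT the verified TETRA schema. The ID and text
# are invented placeholders.
SAMPLE = """<doc id="X00-0000" editor="A" format="Conf" position="S" region="NN">
We propose <edit type="grammar" crr="a new method" comments="missing article">new method</edit> for parsing.
</doc>"""

# Meta fields assumed to be stored as attributes on the root element.
META_FIELDS = ("id", "editor", "format", "position", "region")


def parse_document(xml_text):
    """Return (meta, edits) for one XML document string."""
    root = ET.fromstring(xml_text)
    meta = {name: root.get(name) for name in META_FIELDS}
    edits = [
        {
            "type": node.get("type"),
            "original": (node.text or "").strip(),  # text inside the edit tag
            "crr": node.get("crr"),
            "comments": node.get("comments"),
        }
        for node in root.iter("edit")  # every <edit> element in the document
    ]
    return meta, edits


meta, edits = parse_document(SAMPLE)
print(meta)
print(edits)
```

For real files, `ET.parse(path).getroot()` can replace `ET.fromstring`, with the field names adjusted to whatever the files actually use.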

Citation

Thank you for your interest in our dataset. If you use it in your research, please cite:

@misc{mita2022automated,
      title={Towards Automated Document Revision: Grammatical Error Correction, Fluency Edits, and Beyond}, 
      author={Masato Mita and Keisuke Sakaguchi and Masato Hagiwara and Tomoya Mizumoto and Jun Suzuki and Kentaro Inui},
      year={2022},
      eprint={2205.11484},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

