Releases: CopticScriptorium/corpora
July 2015
Same as the June v1.4 release, with a minor metadata correction in the Abraham Our Father corpus (corpus name modified).
June 2015
Updated corpora in TEI XML, PAULA, and relANNIS formats. This release reflects revisions to our data model as well as additions to our corpora. The TEI XML files were generated by a script that automatically converts the data into EpiDoc TEI.
26 March 2015 release
Full corpora in relANNIS and PAULA formats.
TEI XML files are not yet available for all corpora and have not been updated since v.1.2 (December 2014).
The PAULA format contains all of our corpus data in XML.
This version reflects minor changes to visualization and metadata of the corpus of Shenoute, Acephalous Work #22.
18 March 2015: Full Sahidica NT added to corpora
New to the corpora: PAULA and relANNIS files for the full Sahidica New Testament (from Warren Wells's site). See the files in the bible/sahidica.nt_relANNIS directory and bible/
The Sahidica corpus is annotated solely automatically, using the tokenizer and part-of-speech tagger. It has had no manual annotation or proofreading.
26 January 2015. Minor fix for diplomatic visualization in ANNIS version of besa.letters
December 2014. All corpora in TEI XML, PAULA XML, relANNIS
29 December 2014: All corpora in PAULA XML, TEI XML, and relANNIS formats.
The full set of text and annotations for each corpus is in the PAULA XML format. (TEI XML contains a more limited set of annotations.)
TEI XML includes markup encoded to EpiDoc standards.
NEW corpus December 2014: Shenoute's text, Not Because a Fox Barks.
Fall 2014. All corpora in relANNIS, PAULA XML, and TEI XML formats
8 December 2014: All corpora in PAULA XML, TEI XML, and relANNIS formats.
The full set of text and annotations for each corpus is in the PAULA XML format.
TEI XML includes markup encoded to EpiDoc standards.
3 November 2014: All corpora in PAULA XML and relANNIS
The full set of text and annotations for each corpus is in the PAULA XML format.
TEI coming soon.