GitHub - brannerchinese/ChineseUtilities: Utilities to help me with Chinese-language work

ChineseUtilities

Utilities to help me with Chinese-language work and other NLP tasks

json_texts/: Contains research files in progress, in JSON format. The format specification is in json_format_for_prosody.txt. The program handle_files.py allows encrypted versions of the data files to be pushed to the repo while keeping their contents private.
character_count.py: Count the Chinese characters (only) in a file and return their overall percentages. File to be opened must be in directory DATA.
separate_pinyin/: Takes a string of Pīnyīn as input and returns a list of its component syllables. A second program, count_syllables.py, counts the number of syllables found.
convert_pinyin/: Converts files in Pages (v. 3, "Pages '08") format so that their non-standard tonal diacritics are normalized to Unicode. Does not work with later versions of Pages. A sample font ("shyrbaw" 時報, based on Times) is included in the directory.
statistics/: Small programs that compute statistical tests.
poetry_flask/: The beginnings of a web application to assist the study of medieval Chinese prosody.
hanamin_fonts/: Copy of the HANAMIN fonts for use with this project.
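
To illustrate the kind of counting that character_count.py describes, here is a minimal sketch. It is not the code in this repository. It assumes that "Chinese characters" means code points in the CJK Unified Ideographs block and Extension A, and that "overall percentages" means each character's share of all Chinese characters counted. The names is_chinese and character_percentages are hypothetical.

```python
from collections import Counter


def is_chinese(ch):
    """Return True for CJK Unified Ideographs and Extension A.

    Assumption: these two blocks are enough for illustration; rarer
    extension blocks and CJK punctuation are deliberately ignored.
    """
    cp = ord(ch)
    return 0x4E00 <= cp <= 0x9FFF or 0x3400 <= cp <= 0x4DBF


def character_percentages(text):
    """Map each Chinese character in text to its percentage of all
    Chinese characters found, most frequent first."""
    counts = Counter(ch for ch in text if is_chinese(ch))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: 100 * n / total for ch, n in counts.most_common()}
```

To apply this to a file, one could read its contents as UTF-8 text from the DATA directory and pass the resulting string to character_percentages. Latin letters, digits, whitespace, and punctuation such as 。 fall outside the assumed ranges and are not counted.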

[end]
