BoCheck is a Python package for Tibetan syllable component recognition and spelling checking based on memristors. It lets users tokenize, recognize, and check Tibetan syllables from a string or from a .txt or .docx file.
- Feature 1: Tibetan syllable tokenization.
- Feature 2: Tibetan syllable component recognition.
- Feature 3: Tibetan syllable spelling checking.
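To make the first two features concrete, here is a minimal sketch that uses only the Python standard library, not BoCheck itself. It assumes syllables are delimited by the tsheg (U+0F0B) and clauses end with a shad (U+0F0D). The function names `split_syllables` and `code_point_names` are hypothetical, and BoCheck's own tokenizer and recognizer may work differently.

```python
import unicodedata

TSHEG = "\u0f0b"  # syllable delimiter
SHAD = "\u0f0d"   # clause-final mark


def split_syllables(text):
    """Split a Tibetan string on tsheg, treating shad as a delimiter too."""
    cleaned = text.replace(SHAD, TSHEG)
    return [part.strip() for part in cleaned.split(TSHEG) if part.strip()]


def code_point_names(syllable):
    """Return the Unicode name of each code point in a syllable."""
    return [unicodedata.name(ch) for ch in syllable]


# Two syllables followed by a shad, written with escapes for clarity.
sample = "\u0f66\u0f90\u0f56\u0f66\u0f0b\u0f60\u0f42\u0f53\u0f0d"
syllables = split_syllables(sample)
print(syllables)

# A stacked letter is stored as a base letter plus a "subjoined" letter.
for name in code_point_names(syllables[0]):
    print(name)
```

Listing the code points shows why component recognition is not a plain character lookup: a stack is stored as separate base and subjoined code points, and whether a letter acts as a prefix, root, or suffix depends on its position in the syllable.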
You can install the latest release of BoCheck from PyPI:
pip install BoCheck
Alternatively, you can install it directly from the source:
git clone https://github.com/username/repo-name.git
cd repo-name
pip install .
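After either install route, one way to confirm that the package is visible to your interpreter is to query its installed metadata. This is a small sketch using only the standard library (Python 3.8+). The helper name `installed_version` is hypothetical, and the distribution name `BoCheck` is taken from the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if BoCheck is installed, otherwise None.
print(installed_version("BoCheck"))
```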
Here are some basic examples to help you get started with BoCheck.
import BoCheck

# Any Tibetan string works here; this is a short sample
text = "ཡང་དེབ་འདི་ནི་"

# Use the end-to-end processing
BoCheck.process_text(text, table_path="demo.csv")
BoCheck.process_file("assets/text.txt", table_path=None)
BoCheck.process_file("assets/text.docx", table_path="demo.xlsx",
                     component_recognization=True, spelling_check=True)
from BoCheck import Checker
# Initialize the class
recog = Checker()
text = "ཡང་དེབ་འདི་ནི་ལས་དབང་འཛོམས་པའི་སྐབས་འགན་འཁྲི་ཞིག་ལྡན་པའི་ཐོག་ནས་ཀློག་པ་པོས་ཤོག་ལྷེ་ཕྱེ་ཡི་ཡོད།"
# Component recognition
recognization_result = recog.recognization_text(text)
print(recognization_result)
# Spelling check
check_result = recog.check_text(text)
check_result.to_csv("demo.csv", encoding='utf-8-sig')
print(check_result)
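The example above saves the result with `encoding='utf-8-sig'`. That codec writes a UTF-8 byte order mark at the start of the file, which helps spreadsheet tools such as Excel detect the encoding and display Tibetan text correctly. The sketch below demonstrates the effect with the standard library only. The column names and row values are hypothetical and do not reflect BoCheck's actual output format.

```python
import codecs
import csv
import os
import tempfile

# Hypothetical table: the header and values are illustrative only.
rows = [["syllable", "is_valid"], ["\u0f66\u0f90\u0f56\u0f66", "True"]]

path = os.path.join(tempfile.mkdtemp(), "demo.csv")

# Writing with utf-8-sig puts a byte order mark (BOM) at the start of the file.
with open(path, "w", encoding="utf-8-sig", newline="") as f:
    csv.writer(f).writerows(rows)

# Check the raw bytes for the BOM.
with open(path, "rb") as f:
    has_bom = f.read(3) == codecs.BOM_UTF8

# Reading with utf-8-sig strips the BOM again, so the data round-trips.
with open(path, encoding="utf-8-sig", newline="") as f:
    read_back = list(csv.reader(f))

print(has_bom, read_back == rows)
```

Reading the file back with plain `utf-8` would leave the BOM attached to the first header cell, so use `utf-8-sig` on both sides.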
Full documentation is available at Read the Docs.
Contributions are welcome! If you would like to contribute to this project, please read our Contributing Guide. You can also open an issue or submit a pull request on our GitHub repository.
- Fork the repository
- Create your feature branch (git checkout -b feature/YourFeature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin feature/YourFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.