NgaPi (ငါးပိ)

Ngapi is a semantic chunker based on fastText word embeddings for the Myanmar language. This approach can also be applied to other languages.

Fig. A 19th century Burmese watercolor depicting a ngapi hawker (Source: Wikipedia)

Citation

If you have used the Ngapi semantic chunker, please cite it as follows:
(Ngapi Semantic Chunker ကို သုံးဖြစ်ကြရင် အောက်ပါ citation လုပ်ပေးပါ။ ကျေးဇူးပါ။)

@misc{ngapi_2024,
  author       = {Ye Kyaw Thu},
  title        = {NgaPi Semantic Chunker for Burmese, Version 1.0},
  month        = {12},
  year         = {2024},
  url          = {https://github.com/ye-kyaw-thu/NgaPi},
  note         = {Accessed: 2024-12-25},
  institution  = {LU Lab., Myanmar}
}

Reference

[1]. A 19th century Burmese watercolor depicting a ngapi hawker, https://en.wikipedia.org/wiki/Ngapi
[2]. The 5 Levels Of Text Splitting For Retrieval by Greg Kamradt: https://www.youtube.com/watch?v=8OJC21T2SL4&t=2112s
[3] P. Bojanowski*, E. Grave*, A. Joulin, T. Mikolov, Enriching Word Vectors with Subword Information

@article{bojanowski2016enriching,
  title={Enriching Word Vectors with Subword Information},
  author={Bojanowski, Piotr and Grave, Edouard and Joulin, Armand and Mikolov, Tomas},
  journal={arXiv preprint arXiv:1607.04606},
  year={2016}
}

[4] A. Joulin, E. Grave, P. Bojanowski, T. Mikolov, Bag of Tricks for Efficient Text Classification

@article{joulin2016bag,
  title={Bag of Tricks for Efficient Text Classification},
  author={Joulin, Armand and Grave, Edouard and Bojanowski, Piotr and Mikolov, Tomas},
  journal={arXiv preprint arXiv:1607.01759},
  year={2016}
}

[5] A. Joulin, E. Grave, P. Bojanowski, M. Douze, H. Jégou, T. Mikolov, FastText.zip: Compressing text classification models

@article{joulin2016fasttext,
  title={FastText.zip: Compressing text classification models},
  author={Joulin, Armand and Grave, Edouard and Bojanowski, Piotr and Douze, Matthijs and J{\'e}gou, H{\'e}rve and Mikolov, Tomas},
  journal={arXiv preprint arXiv:1612.03651},
  year={2016}
}

(* These authors contributed equally.)

[6]. RE based syllable breaking tool for Burmese: https://github.com/ye-kyaw-thu/sylbreak

Name		Name	Last commit message	Last commit date
Latest commit History 62 Commits
model		model
pic		pic
test		test
utils		utils
LICENSE		LICENSE
README.md		README.md
ngapi.py		ngapi.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

NgaPi (ငါးပိ)

Citation

Reference

About

Releases

Packages

Languages

License

ye-kyaw-thu/NgaPi

Folders and files

Latest commit

History

Repository files navigation

NgaPi (ငါးပိ)

Citation

Reference

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages