sanskrit-data

Versioned Sanskrit linguistic data.

The data has been compiled from a variety of sources. Taken together, it covers almost all lexical forms found in Classical Sanskrit literature.

Quickstart

git clone https://github.com/sanskrit/data.git && cd data
python bin/make_data.py
ls all-data

The data comes from several sources, each with its own format. make_data.py converts all of the data to a common format and stores the results in the all-data directory. This is the data that downstream systems should use.
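As a quick sanity check after running make_data.py, the sketch below lists whatever files ended up under all-data. It assumes only that the directory exists in the current working directory. The function name list_data_files is hypothetical and not part of this repository, and no assumption is made about the file names or formats that make_data.py produces.

```python
from pathlib import Path


def list_data_files(root="all-data"):
    """Return the relative paths of all files under `root`, sorted.

    Hypothetical helper, not part of the repository. Returns an empty
    list if the directory does not exist (e.g. make_data.py has not
    been run yet).
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(
        p.relative_to(root_path).as_posix()
        for p in root_path.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    files = list_data_files()
    if not files:
        print("No files found; run `python bin/make_data.py` first.")
    for name in files:
        print(name)
```

Run it from the repository root so that the relative path all-data resolves correctly.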

About the data

The data includes verbs, participles, nouns, adjectives, pronouns, indeclinables, morphemes, and sandhi rules. If it's a Sanskrit word, it's probably here.

Each of the data sources used has its own license. Check the LICENSE files in learnsanskrit.org, sanskrit-heritage-site, and monier-williams for details.

All Sanskrit strings are written in SLP1, mainly because it is extremely convenient when processing Sanskrit programmatically. You can convert the data to another scheme, such as IAST or Devanagari, with any of several available transliterators.
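SLP1 is convenient largely because it uses exactly one ASCII character per Sanskrit sound, so converting to a romanization such as IAST reduces to a per-character lookup. The sketch below illustrates this for the basic SLP1 letters. The function slp1_to_iast is a hypothetical helper written for this example, not something this repository provides. It ignores accent marks and other extended symbols, and for real work a dedicated transliteration library is the better choice.

```python
# Basic SLP1 -> IAST table. Every SLP1 key is a single character;
# some IAST values are digraphs (e.g. "kh", "ai").
SLP1_TO_IAST = {
    # vowels
    "a": "a", "A": "ā", "i": "i", "I": "ī", "u": "u", "U": "ū",
    "f": "ṛ", "F": "ṝ", "x": "ḷ", "X": "ḹ",
    "e": "e", "E": "ai", "o": "o", "O": "au",
    # anusvara and visarga
    "M": "ṃ", "H": "ḥ",
    # stops and nasals
    "k": "k", "K": "kh", "g": "g", "G": "gh", "N": "ṅ",
    "c": "c", "C": "ch", "j": "j", "J": "jh", "Y": "ñ",
    "w": "ṭ", "W": "ṭh", "q": "ḍ", "Q": "ḍh", "R": "ṇ",
    "t": "t", "T": "th", "d": "d", "D": "dh", "n": "n",
    "p": "p", "P": "ph", "b": "b", "B": "bh", "m": "m",
    # semivowels, sibilants, and h
    "y": "y", "r": "r", "l": "l", "v": "v",
    "S": "ś", "z": "ṣ", "s": "s", "h": "h",
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(SLP1_TO_IAST)


def slp1_to_iast(text):
    """Convert basic SLP1 text to IAST.

    Characters outside the table (spaces, punctuation, digits) pass
    through unchanged.
    """
    return text.translate(_TABLE)


if __name__ == "__main__":
    for word in ("saMskftam", "rAmaH", "kfzRa"):
        print(word, "->", slp1_to_iast(word))
```

Because no SLP1 letter is a prefix of another, the conversion needs no lookahead. Going the other way, from IAST to SLP1, has to handle digraphs such as "kh" and "ai" and is not a simple character map.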

