Tatar language data quality issues #61

Open
rsabirov opened this issue Oct 14, 2024 · 0 comments

Hello,

Where does the data for the Tatar language come from?

I see a lot of garbage in it; I could barely find any Tatar words.

I would like to improve this.

  1. Do you have a page with guidance on how to train the model?
  2. Once I have trained it, should I open a PR to that repo with just the model itself? Where do you store the raw data for training?

Follow-up to tesseract-ocr/langdata#305
