What about arabic tessdata? #209

Closed
Munderkind opened this issue Aug 7, 2015 · 14 comments

Comments

@Munderkind

I tried the Arabic tessdata from two sources.

Google Code (version 3.02): https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-ocr-3.02.ara.tar.gz&can=2&q=
I get the error "tessdata_manager.SeekToStart(TESSDATA_INTTEMP):Error:Assert failed:in file adaptmatch.cpp, line 511".

The main Tesseract tessdata repo on GitHub: https://github.com/tesseract-ocr/tessdata
I get the same error.

Other languages work well!

@ws233
Collaborator

ws233 commented Aug 7, 2015

If this is a problem with the language file itself, it would be better to report it to the upstream Tesseract repository. You can also create your own language file, as described here.

@Munderkind
Author

In the original "Template Framework Project", Arabic works fine, but in my new project it does not. Strange...

@ws233
Collaborator

ws233 commented Aug 9, 2015

Can you show me your code snippet and the project structure with the tessdata folder in it? (Is the folder blue or yellow?) Could you also provide the full Tesseract output from the console?

@Munderkind
Author

Error and tessdata folder
[image: edited1]

Log 1
[image: edited]

Log 2
[image: edited3]

Code snippet
[image: snippet]

@Munderkind
Author

I found a solution.
[image: ffffffffffuuuu]

After the changes, everything works fine!

@ws233
Collaborator

ws233 commented Aug 10, 2015

@Munderkind, so the cause of the crash is related to G8OCREngineModeCubeOnly, isn't it?
Does Arabic work normally in Tesseract-only mode? If so, please refer to #140. It seems that Cube is dead and no longer supported, so I'll change the ticket title to mention Arabic as well.

@ws233
Collaborator

ws233 commented Aug 10, 2015

@Munderkind, could you also close the ticket if there are no more questions regarding the topic?

@Munderkind
Author

@ws233 Arabic works normally only in G8OCREngineModeCubeOnly.
G8OCREngineModeTesseractOnly and G8OCREngineModeTesseractCubeCombined both crash with errors.

@rmtheis

rmtheis commented Aug 11, 2015

FYI, I see the same behavior running Tesseract on Android: rmtheis/tess-two#12

@ghost

ghost commented Sep 27, 2017

.

@yaman91

yaman91 commented Feb 9, 2018

@Munderkind Thank you so much, you just saved my day!

@banuharshavardhan

banuharshavardhan commented Jul 14, 2020

@Munderkind Neither cubeOnly nor tesseractCubeCombined is working for me. Can you help me?

@banuharshavardhan

Crash --->

[image: Screenshot]

Code --->

[image: Screenshot 2020-07-14 at 1 13 40 PM]

tessdata folder --->

[image: Screenshot 2020-07-14 at 1 15 04 PM]

@Abdullah-Alashi-LP

Hey @Munderkind and @yaman91 @banuharshavardhan

I'm getting the same error. I tried the following:

version [5.0.1] with ara.traineddata
and without cube data

version [5.0.1] with [ara.traineddata](https://github.com/tesseract-ocr/tessdata/tree/4.00)
and included cube data from there

version [4.0.0] with ara.traineddata
and without cube data

version [4.0.0] with ara.traineddata
and included cube data from there

In every case I made sure to set engineMode: .cubeOnly

Could you please add Arabic sample code to the GitHub repo? 🙏🏻

Your help is much appreciated.
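
Since the combinations above differ mainly in whether the cube data is present, here is a minimal sketch of a pre-flight check that lists which expected files are missing from a tessdata folder before the engine is created. It uses only Foundation and does not call the OCR library. The function name `missingTessdataFiles` and the list of cube suffixes are assumptions for illustration, not something confirmed in this thread; compare the suffixes against the tessdata release you actually downloaded.

```swift
import Foundation

// ASSUMPTION: cube companion-file suffixes for a language. This list is
// illustrative; verify it against the tessdata release you are using.
let assumedCubeSuffixes = [
    "cube.bigrams", "cube.fold", "cube.lm", "cube.nn",
    "cube.params", "cube.size", "cube.word-freq",
]

/// Returns the names of expected data files that are NOT present in the
/// given tessdata directory. An empty result means every expected file exists.
/// (Hypothetical helper, not part of any library.)
func missingTessdataFiles(language: String,
                          in tessdataDirectory: URL,
                          cubeSuffixes: [String] = assumedCubeSuffixes) -> [String] {
    // The base model plus one companion file per cube suffix.
    let expected = ["\(language).traineddata"]
        + cubeSuffixes.map { "\(language).\($0)" }

    let fileManager = FileManager.default
    return expected.filter { name in
        let path = tessdataDirectory.appendingPathComponent(name).path
        return !fileManager.fileExists(atPath: path)
    }
}
```

In an app this might be called with something like `Bundle.main.bundleURL.appendingPathComponent("tessdata")` (assuming the folder is copied into the bundle under that name) and language `"ara"`, logging the result before setting `engineMode: .cubeOnly`. Passing an empty `cubeSuffixes` array checks only for the `.traineddata` file.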
