Pass multiple images at once #254

Closed
ShroukMansour opened this issue Feb 25, 2020 · 4 comments


@ShroukMansour

Hi, I know that I can pass multiple images to tesseract on the command line like this: tesseract list.txt list.out.txt, where list.txt is a file containing image paths. How can I do that in pytesseract without using a loop?

@bozhodimitrov
Collaborator

You can pass the path to this list.txt as a string to pytesseract, and it should do the same.
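A minimal sketch of that suggestion, assuming pytesseract and the tesseract binary are installed. The helper names `write_image_list` and `ocr_image_list` and the image file names are hypothetical; the only pytesseract call used is `image_to_string` with a string path.

```python
from pathlib import Path


def write_image_list(image_paths, list_path):
    """Write one image path per line, the layout used for tesseract batch input."""
    list_file = Path(list_path)
    list_file.write_text("".join(f"{p}\n" for p in image_paths), encoding="utf-8")
    return str(list_file)


def ocr_image_list(list_path, lang="eng"):
    """Pass the list file's path (a plain string) to pytesseract in one call."""
    import pytesseract  # imported lazily so write_image_list works without it

    return pytesseract.image_to_string(str(list_path), lang=lang)
```

Usage would look like `ocr_image_list(write_image_list(["page1.png", "page2.png"], "list.txt"))`, where the two PNG names stand in for real files.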

@ShroukMansour
Author

It worked, thank you. Can I also pass it an array of images? @int3l

@bozhodimitrov
Collaborator

If you mean Python lists: unfortunately, there is no built-in support for that at the moment.
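One possible workaround, sketched under assumptions: save each in-memory image to a temporary file, write the paths to a list file, and hand that file to pytesseract once. The helpers `images_to_list_file` and `ocr_images` are hypothetical names, not part of pytesseract, and the images are assumed to expose a PIL-style `save(path)` method and a mode that can be written as PNG.

```python
import tempfile
from pathlib import Path


def images_to_list_file(images, workdir):
    """Save each image as a PNG in workdir and write a list file naming them.

    `images` is any sequence of objects with a PIL-style save(path) method.
    """
    workdir = Path(workdir)
    paths = []
    for index, image in enumerate(images):
        path = workdir / f"page_{index:04d}.png"
        image.save(str(path))
        paths.append(path)
    list_file = workdir / "list.txt"
    list_file.write_text("".join(f"{p}\n" for p in paths), encoding="utf-8")
    return list_file


def ocr_images(images, lang="eng"):
    """OCR a Python list of images with a single pytesseract call."""
    import pytesseract  # assumed installed, along with the tesseract binary

    with tempfile.TemporaryDirectory() as workdir:
        list_file = images_to_list_file(images, workdir)
        return pytesseract.image_to_string(str(list_file), lang=lang)
```

The loop here only writes files; the OCR itself still runs as one tesseract invocation over the list file.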

@sysescool

sysescool commented Sep 27, 2020

> You can pass the path for this list.txt as string to pytesseract and it should do the same.

@int3l
I want to OCR multiple images into one PDF, so I wrote the following, but it did not work:

 import pytesseract

 pdf = pytesseract.image_to_pdf_or_hocr("C:/list.txt", lang='eng', extension='pdf')
 with open(pdfoutputfolder + "result.pdf", 'w+b') as f:
     f.write(pdf)  # image_to_pdf_or_hocr returns bytes by default

I searched and found tesseract-ocr/tesseract#1268, but I would like to do this in Python with pytesseract instead of the CLI.
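The thread does not confirm a pytesseract-only way to do this. As a stopgap, the CLI approach can at least be driven from Python with `subprocess`. This is a sketch, not a pytesseract feature: `build_pdf_command` and `ocr_list_to_pdf` are hypothetical helpers, and it assumes the `tesseract` binary is on the PATH and that the installed version accepts a list file as input for PDF output.

```python
import subprocess


def build_pdf_command(list_path, output_base, lang="eng"):
    """Build the tesseract CLI call that renders the listed images to a PDF."""
    # "pdf" selects tesseract's PDF output; the file is written to output_base + ".pdf".
    return ["tesseract", str(list_path), str(output_base), "-l", lang, "pdf"]


def ocr_list_to_pdf(list_path, output_base, lang="eng"):
    """Run tesseract from Python; raises CalledProcessError if the command fails."""
    subprocess.run(build_pdf_command(list_path, output_base, lang), check=True)
    return f"{output_base}.pdf"
```

For example, `ocr_list_to_pdf("C:/list.txt", "result")` would run `tesseract C:/list.txt result -l eng pdf` and return the expected output path `result.pdf`.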
