[Feature Request] OCR search text in images #296
Comments
Hmmm, OCR is a cool idea indeed. My only concern is finding a good OCR tool that would work with different languages.
This might be helpful - I am thinking we could let each user configure a list of the languages likely to occur in their hoard. These are usually the languages they know, so the list wouldn't be too long (for most people it might be 1-3?). It seems that Tesseract.js supports recognizing multiple languages at the same time when you concatenate the lang codes with +.
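To make the suggestion above concrete, here is a minimal sketch of turning a per-user language list into the single "+"-joined string that Tesseract accepts. The helper name `buildOcrLangString`, the "eng" fallback, and the validation pattern are assumptions for illustration; nothing here is taken from the project's code.

```typescript
// Hypothetical helper (not part of the project): converts a user's
// configured language list into one "+"-joined Tesseract language string.
function buildOcrLangString(configured: string[], fallback = "eng"): string {
  const seen = new Set<string>(); // a Set drops duplicates and keeps insertion order
  for (const raw of configured) {
    const code = raw.trim().toLowerCase();
    // Assumed shape of a language code: three letters plus an optional
    // suffix, e.g. "eng" or "chi_sim". Anything else is skipped.
    if (/^[a-z]{3}(_[a-z]+)?$/.test(code)) {
      seen.add(code);
    }
  }
  // Fall back to a single default language when nothing valid was configured.
  return seen.size > 0 ? Array.from(seen).join("+") : fallback;
}

// Example: a user who reads English, Japanese and Simplified Chinese.
const langs = buildOcrLangString(["eng", " JPN ", "eng", "chi_sim"]);
// The resulting string would then be passed as the language argument to
// Tesseract.js, e.g. Tesseract.recognize(image, langs).
```

Keeping the list short matters because each extra language adds trained data to load, so a per-user setting seems preferable to enabling every language globally.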
Tesseract.js looks cool indeed. We can probably add it to the roadmap at some point.
Without OCR (which allows for searching text within images), hoarding images becomes somewhat pointless.
+1 for OCR in images.
@Arcturuss OCR for uploaded images is something on our roadmap and I'm definitely planning to do it pretty soon.
OCR is now implemented and will be available in the next release.
Thank you! That's great to hear! Appreciate it a lot.
Can you tell me how this is implemented? Do I need to specify any of the ENV variables for it to work? Can't seem to get it to work from photos of pages of text.
@drycounty it's enabled by default. Currently, we don't expose the extracted text; we only index it for search. Try searching for the content of the page and see if it shows up.
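As a rough illustration of "indexed but not exposed", the sketch below keeps OCR text in a field that is matched during search but never returned in results. All type and function names are hypothetical, and the in-memory array stands in for whatever search backend the project actually uses.

```typescript
// Hypothetical shapes; this is not the project's actual schema.
interface IndexedAsset {
  id: string;
  title: string;
  ocrText: string; // text extracted from the image, kept only for matching
}

interface SearchHit {
  id: string;
  title: string;
}

// Case-insensitive substring search over the title and the OCR text.
function searchAssets(index: IndexedAsset[], query: string): SearchHit[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") return [];
  return index
    .filter(
      (a) =>
        a.title.toLowerCase().includes(needle) ||
        a.ocrText.toLowerCase().includes(needle),
    )
    // Only id and title are returned, so the extracted text stays internal.
    .map((a) => ({ id: a.id, title: a.title }));
}

const demoIndex: IndexedAsset[] = [
  { id: "1", title: "scan-0042.png", ocrText: "Invoice total due 30 days" },
  { id: "2", title: "slide.jpg", ocrText: "Quarterly roadmap" },
];
const hits = searchAssets(demoIndex, "invoice");
```

Under this model, the only way to check that OCR worked is to search for a phrase known to be in the image, which matches the advice above.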
It would be especially helpful when you have a lot of screenshots, diagrams, photos of slides, etc., embedded in documents or as standalone image files. Text in images may contain a large amount of information. However, it's not very easy to retrieve that text with traditional file management. It would be greatly appreciated if you could consider making it searchable.