Don't use DPI as a way to refer to word size in documentation #1846
Comments
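I would assume that dpi and ppi are used interchangeably here. Since, as you laid out, the technical meaning of dpi does not make a lot of sense in this case, I think ppi is what was meant. |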
The documentation wiki can be edited by users. Please modify or correct it as
required.
On Fri, Aug 17, 2018 at 1:00 PM, H-Bluhm wrote:
> I would assume that dpi and ppi are used interchangeably here.
> Since, as you laid out, the technical meaning of dpi does not make a lot
> of sense in this case, I think ppi is what was meant.
|
JPEG & PNG both support resolution metadata. Please use it. |
How do I just remove myself completely from all of this? |
@albertoandreottiATgmail: I do not know what your aim is, but you are approaching it from the wrong end. |
@albertoandreottiATgmail and @zdenop, you are talking about different things. Yes, of course it makes a difference whether scanning is done at a high or a low resolution. But that is only a relative value. Scanning a large poster at 70 dpi will give the same picture as scanning a small printout of the poster at 300 dpi. A human won't see any difference when viewing the resulting image file on a screen and will be able to read the text in both cases. So I'd expect that it also makes no difference for Tesseract. Currently it does! An image which was converted from 300 dpi to 600 dpi gives a different (typically better) result with Tesseract, although no information was added and the quality of the image does not improve through such a conversion. Other OCR software does not need or use the resolution information from the input image, as far as I know. |
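The poster comparison above comes down to simple arithmetic: the pixel size of a scanned feature is its physical size multiplied by the scan resolution. A small sketch, where the physical text heights (3 inches on the poster, 0.7 inches on the printout) are invented for illustration and do not come from the thread:

```python
def scanned_height_px(physical_height_in: float, scan_dpi: float) -> int:
    """Pixel height of a feature after scanning: inches times dots per inch."""
    return round(physical_height_in * scan_dpi)


# Hypothetical sizes: the same line of text on a large poster and on a
# reduced printout of that poster.
poster_px = scanned_height_px(3.0, 70)     # large original, low scan resolution
printout_px = scanned_height_px(0.7, 300)  # small copy, high scan resolution

print(poster_px, printout_px)
```

Both scans yield text of the same pixel height, so the pixel data alone cannot tell the two cases apart; only the DPI value stored in the file's metadata differs.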
The explanation the OP expects is already present in another wiki page. Also see Ray's remark in #756 (comment) |
Hi,
here,
https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality#rescaling
you recommend using 300 DPI images. That doesn't make any sense: images don't have a DPI until you print them.
You should give your recommendation in terms of minimal number of pixels for the height of a word, for example. I can have an image where letter 'a' is 20 pixels high, or 200 pixels high. Both images will have different results in terms of performance.
As an independent fact, I can indeed print both images with 300dpi.
Am I missing something?
Alberto.
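The relationship between a DPI figure and the pixel height asked for in the post above can be sketched as follows, using the fact that one typographic point is 1/72 inch. The target height of 40 pixels is a placeholder chosen for illustration only; it is not a value recommended by the Tesseract documentation or by anyone in this thread, and both function names are hypothetical.

```python
POINTS_PER_INCH = 72  # one typographic point is 1/72 inch


def text_height_px(point_size: float, dpi: float) -> float:
    """Nominal pixel height of text of a given point size scanned at a given DPI."""
    return point_size / POINTS_PER_INCH * dpi


def scale_factor(current_px: float, target_px: float) -> float:
    """Resize factor that brings text from its current pixel height to a target height."""
    if current_px <= 0:
        raise ValueError("current_px must be positive")
    return target_px / current_px


# A "300 DPI" recommendation only fixes the pixel height once a font size is assumed:
print(text_height_px(10, 300))  # 10 pt text scanned at 300 DPI

# The two images from the post (letter 'a' at 20 px and at 200 px), rescaled
# towards a purely illustrative target of 40 px:
ILLUSTRATIVE_TARGET_PX = 40
print(scale_factor(20, ILLUSTRATIVE_TARGET_PX))
print(scale_factor(200, ILLUSTRATIVE_TARGET_PX))
```

Stating a guideline as a pixel height makes it independent of whatever DPI value the file happens to carry, which is the point the post is making.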