Auto-rotate feature fails for certain images #747
I added an auto-rotate benchmark to assess this feature more robustly. The benchmark rotates each of our benchmark images by 0.2, 0.1, -0.1, and -0.2 radians (0.1 radians is about 6 degrees) and attempts to un-rotate them using the auto-rotate option. At the time of writing (version 4.0.3), auto-rotate worked correctly for 4 of 6 images rotated +/- 0.1 radians, and for 0 of 6 images rotated +/- 0.2 radians. This indicates that the feature struggles with larger angles. Notably, for the images that were not correctly rotated, no rotation was applied, so none of the images got worse by using the |
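To make the benchmark arithmetic concrete, here is a minimal sketch of how such results could be tallied. The names `radToDeg` and `scoreAutoRotate`, the shape of the result records, and the 0.02 radian tolerance are all assumptions made for illustration; they are not taken from the actual benchmark code.

```javascript
// Convert radians to degrees (0.1 rad is roughly 5.7 degrees).
const radToDeg = (rad) => (rad * 180) / Math.PI;

// Hypothetical tally: each record holds the rotation applied to the test
// image (appliedRad) and the angle auto-rotate estimated (estimatedRad).
// A case counts as correct when the two agree within toleranceRad.
function scoreAutoRotate(results, toleranceRad = 0.02) {
  const summary = {};
  for (const { appliedRad, estimatedRad } of results) {
    // Group +/- angles together, e.g. 0.1 and -0.1 both land under "0.1".
    const key = Math.abs(appliedRad).toFixed(1);
    summary[key] = summary[key] || { correct: 0, total: 0 };
    summary[key].total += 1;
    if (Math.abs(appliedRad - estimatedRad) <= toleranceRad) {
      summary[key].correct += 1;
    }
  }
  return summary;
}
```

An estimate of 0 for a rotated image is counted as a failure here, which matches the failure mode described above: the image is simply left unrotated.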
I updated Tesseract.js and Tesseract.js-core to have improved auto-rotate functionality. Previously auto-rotate worked correctly for 4 of 6 images rotated +/- 0.1 radians, and 0 of 6 images rotated +/- 0.2 radians. After the change auto-rotate worked correctly for 6 of 6 images rotated +/- 0.1 radians and 3 of 6 images rotated +/- 0.2 radians. The change I made was switching from using the Tesseract "reskew angle" to using the "gradient". Although these are both estimates of the rotation of the page, the I do not believe further improvements are possible using only statistics already calculated by Tesseract. In the images where the auto-rotation fails, it looks like the root cause is that Tesseract does not detect text lines to begin with. |
This update has been included in the |
While setting `rotateAuto: true` generally works as expected, for certain documents the angle is incorrectly calculated as 0 degrees. Different images can be tried using the image processing example. An image that the feature does not work for is attached below.

Notably, this is not a reason to avoid `rotateAuto`: the images I've encountered that fail to rotate correctly are not rotated at all, so they end up the same as the input image.
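For anyone trying to reproduce this, the following is a rough sketch of how the option might be passed and how to check whether any rotation was applied. It assumes `worker` is an already initialized Tesseract.js worker and that the result object exposes the applied angle as `data.rotateRadians`; that field name is an assumption here and should be checked against the API docs for your version. `describeRotation` and `recognizeWithAutoRotate` are hypothetical helper names.

```javascript
// Pure helper: report the angle auto-rotate applied and whether it was non-zero.
// Assumes the applied angle is exposed as `data.rotateRadians` (unverified).
function describeRotation(data, epsilon = 1e-6) {
  const angle =
    data && typeof data.rotateRadians === 'number' ? data.rotateRadians : 0;
  return { angle, rotated: Math.abs(angle) > epsilon };
}

// Usage sketch: `worker` is a Tesseract.js worker that has already been
// created and initialized; `image` is any input the worker accepts.
async function recognizeWithAutoRotate(worker, image) {
  const { data } = await worker.recognize(image, { rotateAuto: true });
  return { text: data.text, ...describeRotation(data) };
}
```

If `rotated` comes back `false` for an image that is visibly skewed, that is the failure case described in this issue.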