-
-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Is there an imposed limit to accuracy? #58
Comments
Yes there is indeed an "angle step" value now fixed at 0.1 degrees when doing Hough transformation. Deskew can detect skew angle at "higher resolution" that this - however, the accuracy is limited by this. The reason is simple - time needed for skew detection goes up rapidly with smaller step. Nevertheless, it can be a user option. |
On a document that is 3000 pixels tall, an angle of 0.1 degrees is equivalent to a shear of 5 pixels over the height of the document. Admittedly that is close to being ignorable in my particular application but it would be nice to specify a step that permits exact accuracy, which for the scans I'm working with would be, I presume, 0.02 degrees. But again that's arbitrary and specific to the size of scans I'm working with. But the point I want to make is that I'm willing to put up with slow processing speeds if it avoids having to then examine the output by hand and have a subsequent stage of applying a final shear of a few pixels in order to get a perfect result. |
Ok, as a first step I'll expose the angle step value as a user option so it can be set as command line argument. |
Perhaps also add a parameter for the limits as well (ie what is currently +/- 10 I believe?) I can see how I could automate this by running your code once with default parameters (+/- 10 with step of 0.1), getting the angle at default resolution, then rerunning with a narrower range and smaller step... (eg +/- 1 with step of 0.01) |
I have a document which is about 3000 pixels tall, which deskew does straighten - but the straightened document, although the horizontal lines are very level, exhibits a horizontal shear of 7 pixels between the top rows and the bottom rows. Since the application I'm using this for is segmenting fixed-pitch text into a grid in order to extract characters for OCR, a 7-pixel offset at the foot of the document causes a noticeable grid misalignment when the characters themselves are only something like 20 pixels wide. I was able to manually detect the shear needed to correct this, and apply it with the pnmshear command, but it would have been preferable if deskew had straightened the document completely rather than 99.8% or whatever. If deskew has some accuracy parameter beyond which it doesn't attempt to be more accurate, then an option for accuracy down to a single pixel would be appreciated, even if the runtime overhead is significantly higher.
The text was updated successfully, but these errors were encountered: