fix(ballot-interpreter): improve vertical streak detection #5522
When scanning in VxCentralScan, we sometimes get wider gray areas outside the ballot paper than we expected. Most of this gray is binarized to black, leading to false positives when detecting streaks. This change reduces the likelihood of false positives while preserving detection of streaks within the ballot, including within the timing marks.
Binarizes the debug image to make it clearer what the streak detector was working with.
The comment says to draw on side B, so I updated the code to do that.
Ensures that streaks through timing marks are still detected and semi-wide edge "streaks" do not cause false positives.
@@ -293,7 +293,7 @@ pub fn detect_vertical_streaks(
 ) -> Vec<PixelPosition> {
     const PERCENT_BLACK_PIXELS_IN_STREAK: f32 = 0.75;
     const MAX_WHITE_GAP_PIXELS: PixelUnit = 15;
-    const BORDER_COLUMNS_TO_EXCLUDE: PixelUnit = 5;
+    const BORDER_COLUMNS_TO_EXCLUDE: PixelUnit = 20;
I think this is a reasonable approach and am happy to move forward with it.
I wonder if we could keep this threshold lower, however, if we cropped all black columns from the edges of the image (similar to our cropping logic for the top/bottom)
We might be able to, but in at least one example I've seen the first 4 pixels at the left edge actually do contain some white, but the next 7 are pure black. That's why that example failed with 5. We could succeed with that particular ballot with a value of 12, but only barely. It's possible that 15 would be sufficient, but I deemed the benefit of detecting streaks in that region not to be worth the risk of false positives.
However, this is not based on a lot of evidence and I'm speculating. If we think it's worth doing, I could do a more thorough analysis with the fi-7180 to see what the distribution of the black sides looks like.
Ah ok interesting that there's some white on the edge first. Given what you've seen, a margin of 20 pixels seems reasonable.
We could also consider setting different margin values for different scanning hardware, but it seems like we don't currently have strong evidence requiring that.
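To make the thread above concrete, here is a minimal sketch of why cropping all-black columns from the edge would not have helped in the example described (roughly 4 columns containing some white, followed by 7 pure black columns). The per-column boolean summary and all function names are hypothetical and are not taken from the ballot interpreter; the column counts are the approximate ones quoted in the comment.

```rust
/// Counts how many columns, starting from the left edge, are entirely black.
/// `column_is_all_black[x]` is a hypothetical per-column summary of a
/// binarized scan, not a structure from the real code base.
pub fn leading_all_black_columns(column_is_all_black: &[bool]) -> usize {
    column_is_all_black.iter().take_while(|&&b| b).count()
}

/// Returns the index of the first all-black column at or after `margin`,
/// i.e. the first column a detector that skips `margin` edge columns
/// could still report.
pub fn first_black_column_after_margin(
    column_is_all_black: &[bool],
    margin: usize,
) -> Option<usize> {
    column_is_all_black
        .iter()
        .skip(margin)
        .position(|&b| b)
        .map(|offset| offset + margin)
}

/// Builds the left-edge profile from the comment: 4 columns that contain
/// some white, then 7 pure black columns, then `rest` clean ballot columns.
pub fn example_left_edge(rest: usize) -> Vec<bool> {
    let mut columns = vec![false; 4];
    columns.extend(vec![true; 7]);
    columns.extend(vec![false; rest]);
    columns
}
```

With this profile, `leading_all_black_columns` returns 0 because the outermost column is not fully black, so an edge crop that stops at the first non-black column removes nothing. A margin of 5 still leaves black columns in the inspected region, while a margin of 20 skips past all of them.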
Overview
VxCentralScan sometimes produces images with gray backgrounds that get binarized to black. If that black is inset enough and consistent enough, it can cause false positives when detecting vertical streaks. The approach I took to fix this is to increase the width of the ignored border from 5px to 20px. A dark edge that extends more than 20px into the image could still trigger the same issue, but this makes it less likely to occur.
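The effect of the wider ignored border can be sketched as follows. This is not the actual `detect_vertical_streaks` implementation: the `BinaryImage` type, the function name `detect_streak_columns`, and the exact way the two thresholds are combined are assumptions made for illustration. Only the constant names and values come from the diff.

```rust
/// Hypothetical binarized image: `true` means a black pixel.
pub struct BinaryImage {
    width: usize,
    height: usize,
    pixels: Vec<bool>, // row-major
}

impl BinaryImage {
    /// Creates an all-white image.
    pub fn new(width: usize, height: usize) -> Self {
        Self { width, height, pixels: vec![false; width * height] }
    }

    pub fn set_black(&mut self, x: usize, y: usize) {
        self.pixels[y * self.width + x] = true;
    }

    /// Paints an entire column black, simulating a streak or a dark edge.
    pub fn fill_column(&mut self, x: usize) {
        for y in 0..self.height {
            self.set_black(x, y);
        }
    }

    fn is_black(&self, x: usize, y: usize) -> bool {
        self.pixels[y * self.width + x]
    }
}

// Values taken from the diff; their interpretation below is an assumption.
const PERCENT_BLACK_PIXELS_IN_STREAK: f32 = 0.75;
const MAX_WHITE_GAP_PIXELS: usize = 15;

/// Returns the x positions of columns that look like vertical streaks,
/// skipping `border_columns_to_exclude` columns at each side of the image.
/// Assumed rule: a column is a streak if at least 75% of its pixels are
/// black and its longest run of white pixels is at most 15.
pub fn detect_streak_columns(
    image: &BinaryImage,
    border_columns_to_exclude: usize,
) -> Vec<usize> {
    let start = border_columns_to_exclude.min(image.width);
    let end = image
        .width
        .saturating_sub(border_columns_to_exclude)
        .max(start);

    (start..end)
        .filter(|&x| {
            let mut black = 0usize;
            let mut gap = 0usize;
            let mut longest_gap = 0usize;
            for y in 0..image.height {
                if image.is_black(x, y) {
                    black += 1;
                    gap = 0;
                } else {
                    gap += 1;
                    longest_gap = longest_gap.max(gap);
                }
            }
            let fraction = black as f32 / image.height as f32;
            fraction >= PERCENT_BLACK_PIXELS_IN_STREAK
                && longest_gap <= MAX_WHITE_GAP_PIXELS
        })
        .collect()
}
```

In this sketch, a fully black column at x = 8 is reported when 5 border columns are excluded but not when 20 are, while a streak in the middle of the image is reported in both cases.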
Alternatives
I initially considered switching the order of streak detection and timing mark detection, so that we could search for streaks only within the area bounded by the timing marks and never count a black edge as a real streak. However, @jonahkagan pointed out that this would prevent us from detecting streaks that intersect the L/R timing marks whenever those streaks interfere with timing mark detection itself. I believe that situation is likely: because allowed rotation is limited, a streak near the edge would line up fairly well with the column of timing marks. Therefore, I opted instead to simply increase the number of pixels we ignore at each edge when detecting streaks.
Demo Video or Screenshot
The screenshots below are the debug image for the same scanned ballot, one with 5px ignored and one with 20px ignored. The dark cyan area in each image is the area that was not considered for vertical streak detection.
Old ignored border pixel value (5px)
One incorrect streak detected with this value.

New ignored border pixel value (20px)
No streaks detected with this value.

Testing Plan
Tested with NH Test Ballots with the fi-7180 scanner. Added automated tests to cover finding real mid-ballot streaks, ignoring edge streaks, and finding L/R edge timing mark-intersecting streaks.