Fix Difference of Gaussians prefilter #21
Merged
The panic in the original blockhash `gauss_preproc()` routine was patched in 2d203d6 to not panic, copying the behavior of the original code when run in release mode (i.e. with overflow checks disabled). Unfortunately, both the original code and the patched code are incorrect, and they result in a very destructive preprocessing pass that hurts rather than helps the blockhash algorithm (or any other image algorithm). The code documents what is supposed to happen with a link to the Wikipedia article on the Difference of Gaussians, which describes the purpose of this filter: specifically, when using Gaussian blur with a kernel ratio K2:K1 approximately equal to 1.6, you get a fast approximation of the Laplacian of Gaussian transform, used for object detection (in plain English: it gives you edge/outline detection).
The Wikipedia article has an example of what the DoG filter is supposed to look like; here it is reproduced below with the starting image and the image after the DoG filter has been applied:
Original:

Difference of Gaussians Reference:

The idea is that you blur the image twice, the second blur being stronger than the first. Each blur is effectively a low-pass filter that lets low-frequency components (smooth areas of the image) through, with the cutoff set by the Gaussian kernel size. Taking the difference of two low-pass filters gives you a band-pass filter that lets through the components that survive the weaker blur but are removed by the stronger one, and if the kernel values are chosen correctly for the input image, it isolates detail in a particular frequency band (what a Laplacian transform would do).
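In code, the whole prefilter looks roughly like this; a minimal sketch using the `image` crate's `imageops::blur`, with illustrative names and sigmas (this is not this project's actual `gauss_preproc()`):

```rust
use image::{imageops, GrayImage, Luma};

/// Difference of Gaussians sketch: blur twice, then take the
/// per-pixel saturating difference of the two blurred images.
fn dog_prefilter(img: &GrayImage, sigma1: f32, sigma2: f32) -> GrayImage {
    let weak = imageops::blur(img, sigma1); // weaker blur, e.g. 2.0
    let strong = imageops::blur(img, sigma2); // stronger blur, e.g. 3.2
    GrayImage::from_fn(img.width(), img.height(), |x, y| {
        let lhs = weak.get_pixel(x, y)[0];
        let rhs = strong.get_pixel(x, y)[0];
        // Saturating difference, inverted so the common area stays white;
        // both the direction and the inversion are discussed below.
        Luma([255 - rhs.saturating_sub(lhs)])
    })
}
```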
But all the math aside, if you think about the actual pixel components of the image, you have 0 being black and 255 being white (for a 1-channel greyscale image). If you want the difference between two images, what you really want is to selectively capture something found in one image but not the other. If you perform a wrapping subtraction, you destroy that information: when something isn't found in one image, you wrap around past zero (which would indicate "not found") all the way to 255 (indicating "strongly found"). So it stands to reason that you can only use an absolute difference or a saturating subtraction here, never a wrapping one.
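To make that concrete with plain `u8` arithmetic (the values here are arbitrary):

```rust
let lhs: u8 = 10; // pixel is dark in one blurred image...
let rhs: u8 = 30; // ...and lighter in the other

assert_eq!(lhs.wrapping_sub(rhs), 236); // wraps past 0: reads as "strongly found"
assert_eq!(lhs.saturating_sub(rhs), 0); // clamps at 0: "not found"
assert_eq!(lhs.abs_diff(rhs), 20);      // magnitude of the difference only
```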
Back to the math: the Wikipedia image isn't annotated with the Gaussian kernel values that were used, but as an approximation I have run some tests with the different possible implementations of `diff_inplace()`, using kernel sizes K1 = 2 and K2 = 3.2, to demonstrate why a wrapping subtraction does not make sense here and to show that the correct choice is a saturating subtraction operation.

Wrapping Subtraction (current behavior):

The image above shows the current behavior, normalized to values between 0 and 255, using the following logic:
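That logic, sketched per-pixel (the variable names are mine; `lhs` comes from the weaker blur, `rhs` from the stronger one):

```rust
// The current (broken) behavior: a wrapping per-pixel difference,
// inverted for display as described further down.
let shown = 255 - lhs.wrapping_sub(rhs);
```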
Absolute Difference:

This image shows the inverse of the absolute difference, normalized to values between 0 and 255:
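As a sketch, with the same variable names as above:

```rust
// Absolute difference, inverted for display:
let shown = 255 - lhs.abs_diff(rhs);
```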
(Since this is an absolute difference, it does not matter whether you subtract rhs minus lhs or lhs minus rhs.)
Saturating Subtraction (LHS minus RHS):
There are two possible options here. The original code subtracted rhs from lhs (`lhs.wrapping_sub(rhs)`), so here's what it looks like with the same order but using saturating subtraction instead (again, pixel values normalized from 0 to 255 to darken the result). The code for this is:
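Roughly (a sketch based on the description, not the literal diff):

```rust
// Saturating subtraction in the original lhs-minus-rhs order,
// inverted for display:
let shown = 255 - lhs.saturating_sub(rhs);
```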
Saturating Subtraction (RHS minus LHS):
The second option, matching the logic above, subtracts the image produced with the lower Gaussian kernel size from the one produced with the higher kernel size, i.e. lhs from rhs; again normalized to 0-255:
The code for this is what's in the PR:
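In sketch form (the actual change is in the diff of this PR):

```rust
// Saturating subtraction in the opposite, rhs-minus-lhs order,
// inverted so the common area stays white:
let out = 255 - rhs.saturating_sub(lhs);
```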
You'll notice that I am always posting the inverse of the operation (255 minus the expression) instead of just the expression. This is because 0 is black and 255 is white, so when you subtract normally you end up with the common area black, but by convention the DoG or Laplacian transform keeps the common area white and the difference in black. Whether you invert the result of the subtraction or not, the subsequent blockhash will work, but if you don't invert, then the blockhash (or whatever image algorithm, really) of a black image will match the DoG-preprocessed hash of a white image. It seemed to me that sanity should prevail over performance, and a `u8` inversion is such a cheap operation compared to the actual image hash algorithm that follows (or any resizing or blurring that takes place beforehand) that I believe it would be stupid not to invert the result.

I hope it is clear from the images above that the original code never handled the Difference of Gaussians preproc filter correctly. As you can see just by looking at the results, the wrapping difference doesn't make any sense to use here, even if you ignore the reference Wikipedia image. (Also, I looked for prior art: OpenCV performs a saturating subtraction over the pixel values in general for the operation `img1 - img2`, regardless of the algorithm or filter being applied.)

This is obviously a breaking change, but I think it is fair, and I wouldn't even have to think too hard about releasing this in a semver-compatible minor update since the old code was so broken you would never actually get a sane blockhash out of the result (but releasing it as 2.1 probably makes the most sense!). I'm also going to open a separate PR to change the default Gaussian kernel sigma values, as they don't provide the correct ratio to mimic a Laplacian of Gaussian transform.
One final note: as I mentioned, the images I posted have been normalized so that their pixel values span the full 0-255 range. If you don't do this, some inputs will have lighter overall outlines than others, and in general the result will be very faint. This does not necessarily affect the subsequent image hash operation, because many hashes work on relative values (grey is still darker than the white background), but it is something to consider. The code for normalizing a `Greyscale` image follows, in case you want to consider applying this normalization as a second step after the Difference of Gaussians:
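As a sketch (a simple min-max stretch, written against the `image` crate's `GrayImage` rather than the project's `Greyscale` type):

```rust
use image::GrayImage;

/// Linearly stretch the pixel values of `img` to cover the full 0-255 range.
fn normalize(img: &mut GrayImage) {
    let (min, max) = img
        .pixels()
        .fold((u8::MAX, u8::MIN), |(lo, hi), p| (lo.min(p[0]), hi.max(p[0])));
    if min == max {
        return; // flat image, nothing to stretch
    }
    let range = (max - min) as u16;
    for p in img.pixels_mut() {
        // (p - min) * 255 fits in u16 because p - min <= 255.
        p[0] = ((p[0] - min) as u16 * 255 / range) as u8;
    }
}
```

For reference and posterity, here's what the correct algorithm produces when the output isn't normalized: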
