Improve performance of hashing and reduce memory #489

Merged Aug 24, 2023

vcsjones
Contributor

👋 I had a look at the use of hashing here for detecting duplicate images. Currently, this allocates an array of bytes per pixel row of each image, which can add up to a significant amount of memory.

Instead, we can rely on Span to remove the allocations altogether. Benchmarking a 3024 x 4032 image (the size my iPhone currently takes, so it seems representative):

Before

| Method | Mean | Error | StdDev | Gen0 | Allocated |
|---|---|---|---|---|---|
| Skia_GetHash | 23.08 ms | 0.169 ms | 0.158 ms | 7750.0000 | 46.61 MB |
| ImageSharp_GetHash | 22.93 ms | 0.160 ms | 0.150 ms | 7750.0000 | 46.61 MB |

After

| Method | Mean | Error | StdDev | Allocated |
|---|---|---|---|---|
| Skia_GetHash | 20.19 ms | 0.162 ms | 0.152 ms | 1.21 KB |
| ImageSharp_GetHash | 20.01 ms | 0.047 ms | 0.042 ms | 1.29 KB |

This effectively brings allocations down to about 1.2 kilobytes regardless of the image size, from 46 megabytes for this image.
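For readers unfamiliar with the pattern, here is a minimal sketch of hashing pixel rows through spans rather than per-row byte arrays. It is not the code from this PR: the `pixels` buffer is a hypothetical stand-in for an image, and SHA-256 is an assumed algorithm chosen only for illustration.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Security.Cryptography;

// Hypothetical stand-in for an image: width * height packed 32-bit pixels.
const int width = 4, height = 3;
uint[] pixels = new uint[width * height];
for (int i = 0; i < pixels.Length; i++) pixels[i] = (uint)i;

// Assumption: SHA-256 is used here only as an example algorithm.
// The using declaration disposes the hasher when it goes out of scope.
using IncrementalHash hasher = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);

for (int y = 0; y < height; y++)
{
    // Slice one row out of the existing buffer; no copy is made.
    ReadOnlySpan<uint> row = pixels.AsSpan(y * width, width);

    // Reinterpret the pixels as bytes in place instead of allocating a byte[] per row.
    ReadOnlySpan<byte> rowBytes = MemoryMarshal.AsBytes(row);

    // Feed the row into the running hash.
    hasher.AppendData(rowBytes);
}

byte[] hash = hasher.GetHashAndReset();
Console.WriteLine(Convert.ToHexString(hash));
```

Because each row is a view over memory that already exists, the number of allocations no longer grows with the number of rows.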

A smaller optimization is to place the hash in a stack buffer before converting it to hex. That just saves one small array allocation for the hash itself. This uses Convert.ToHexString since it can natively operate on a ReadOnlySpan<byte> and also uses upper-case lettering, but if you prefer I can create an overload of your extension method that works off of span as well.
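A rough sketch of the stack-buffer idea, again under assumptions: SHA-256 (32-byte output) stands in for whichever algorithm the project actually uses, and the input bytes are made up for the example.

```csharp
using System;
using System.Security.Cryptography;

// Assumption: SHA-256 is an example algorithm; its output is 32 bytes.
using IncrementalHash hasher = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);

// Made-up input data for illustration.
ReadOnlySpan<byte> data = new byte[] { 1, 2, 3, 4 };
hasher.AppendData(data);

// The hash is written into a stack buffer, so no byte[] is allocated for it.
Span<byte> hash = stackalloc byte[32];
int written = hasher.GetHashAndReset(hash);

// Convert.ToHexString accepts a ReadOnlySpan<byte> and emits upper-case hex.
string hex = Convert.ToHexString(hash[..written]);
Console.WriteLine(hex);
```

The only remaining allocation on this path is the resulting string itself.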

Additionally, this fixes a tiny issue where IncrementalHash is not being disposed. This does not leak memory, but results in finalizers getting run during garbage collection.

@Webreaper
Owner

This is totally awesome - thank you! I'll take a look later and merge.

That is a seriously good memory optimisation. 😁👌

@Webreaper Webreaper merged commit cdd2e12 into Webreaper:master Aug 24, 2023
@vcsjones vcsjones deleted the hash-memory branch August 24, 2023 11:35