Hoist some of the calculations from loops of 3DMoments() #1818
Codecov Report
@@ Coverage Diff @@
## master #1818 +/- ##
=======================================
Coverage 87% 87%
=======================================
Files 936 936
Lines 48164 48191 +27
Branches 6037 6037
=======================================
+ Hits 41970 42097 +127
+ Misses 5190 5093 -97
+ Partials 1004 1001 -3
@kunalspathak Thanks for this; it is very interesting. I've also just read your linked runtime issue. If we can add some supplementary comments to explain the optimization and refactor some of the variable naming (prefer
I have also noticed other places where a similar technique could be used, although I didn't measure its performance.
ImageSharp/src/ImageSharp/Processing/Processors/Quantization/WuQuantizer{TPixel}.cs Line 217 in 7d74c4c
ImageSharp/src/ImageSharp/Processing/Processors/Quantization/WuQuantizer{TPixel}.cs Lines 233 to 249 in 7d74c4c
Here, Another interesting method is ImageSharp/src/ImageSharp/Processing/Processors/Quantization/WuQuantizer{TPixel}.cs Lines 629 to 641 in 7d74c4c
@@ -85,7 +85,7 @@ public void PngCoreWu()
 public void PngCoreWuNoDither()
 {
 using var memoryStream = new MemoryStream();
-var options = new PngEncoder { Quantizer = new WuQuantizer(new QuantizerOptions { Dither = null }) };
+var options = new PngEncoder { Quantizer = new WuQuantizer(new QuantizerOptions { Dither = null }), ColorType = PngColorType.Palette };
I made this change so I can get perf coverage for it. Should I keep it or revert it?
Yep. That's an oversight from us.
Very happy to see what other optimizations you can bring as long as they are well commented. Thanks! 👍
Sure, happy to contribute. I think it will be more productive for us to contribute or investigate the performance issues if these benchmarks are in our system. Any progress on #1795?
Leaving that one for @antonfirsov. Likely after we complete #1730 and release V2.
Answered:
#1795 (reply in thread)
This should be ready for review now.
LGTM, thanks! 👍
This workaround will not be needed anymore because codegen will take care of it as part of dotnet/runtime#68061. Here is the diff for the earlier version of C# code: Full diff: https://www.diffchecker.com/DKGAZKq0
That’s .NET 7 though, yeah? We’ll be sticking to LTS release targets from now on, so just 6 for now. I am planning on revising many parts of the codebase, though, to simplify things where possible now that we have a much more modern single target.
Yes.
Sure, this was just FYI :)
It’s really good to know!!
Prerequisites
Description
The 4-level nested for loop in 3DMoments is very expensive; the innermost loop executes almost 1.1M times. Inside it we call the GetPaletteIndex() method, which does a bunch of calculations using the index variables of all 4 loops.
ImageSharp/src/ImageSharp/Processing/Processors/Quantization/WuQuantizer{TPixel}.cs
Lines 418 to 444 in 255226b
There is potential to hoist some of the loop-invariant calculations out of the inner loops. I ran the benchmark and see around a 2-3% improvement.
Here is the assembly difference: https://www.diffchecker.com/I4uqCtBO
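For readers who want to see the general shape of the hoisting described above, here is a minimal, self-contained sketch. It is not the ImageSharp code: `HoistDemo`, `FlatIndex`, the constant `N`, and the flat 4D layout are all hypothetical stand-ins chosen for illustration. The only point is that each partial index term is computed at the outermost loop level where its inputs are already known.

```csharp
using System;

public static class HoistDemo
{
    // Hypothetical side length for the illustration; not taken from the ImageSharp source.
    private const int N = 33;

    // Hypothetical stand-in for an index helper that mixes all four loop variables.
    private static int FlatIndex(int r, int g, int b, int a)
        => ((((r * N) + g) * N) + b) * N + a;

    // Baseline: the full index is recomputed in the innermost loop.
    public static long Naive(int[] data)
    {
        long sum = 0;
        for (int r = 1; r < N; r++)
            for (int g = 1; g < N; g++)
                for (int b = 1; b < N; b++)
                    for (int a = 1; a < N; a++)
                        sum += data[FlatIndex(r, g, b, a)];
        return sum;
    }

    // Hoisted: each partial term is computed once per iteration of the loop that owns it.
    public static long Hoisted(int[] data)
    {
        long sum = 0;
        for (int r = 1; r < N; r++)
        {
            int baseR = r * N;                      // depends only on r
            for (int g = 1; g < N; g++)
            {
                int baseRG = (baseR + g) * N;       // depends only on r and g
                for (int b = 1; b < N; b++)
                {
                    int baseRGB = (baseRG + b) * N; // depends only on r, g and b
                    for (int a = 1; a < N; a++)
                        sum += data[baseRGB + a];   // only one addition left in the hot loop
                }
            }
        }
        return sum;
    }

    public static void Main()
    {
        var data = new int[N * N * N * N];
        for (int i = 0; i < data.Length; i++) data[i] = i % 7;

        // Both traversals visit the same elements, so the sums should match.
        Console.WriteLine(Naive(data) == Hoisted(data));
    }
}
```

Whether a manual rewrite like this pays off depends on what the JIT already does for the loop in question, so any gain should be confirmed with a benchmark, as was done for this change.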