1D convolution optimization and general codegen tweaks #1477
Codecov Report
@@ Coverage Diff @@
## master #1477 +/- ##
==========================================
- Coverage 83.55% 83.48% -0.08%
==========================================
Files 741 740 -1
Lines 32462 32559 +97
Branches 3648 3652 +4
==========================================
+ Hits 27125 27181 +56
- Misses 4625 4665 +40
- Partials 712 713 +1
@@ -90,4 +94,4 @@ public static void Compress(ref Vector4 vector)
[MethodImpl(InliningOptions.ShortMethod)]
public static float Compress(float channel) => channel <= 0.0031308F ? 12.92F * channel : (1.055F * MathF.Pow(channel, 0.416666666666667F)) - 0.055F;
If we ever figure out how to do an accurate SIMD-enabled approximation of this, we would be laughing.
pow(channel, 0.416666666666667F) => exp(0.416666666666667F * log(channel))
So...
public static void Compress(ref Vector4 vector)
{
    // Sketch only: skips the linear segment (channel <= 0.0031308F) and transforms all four lanes, including W.
    var channels = Unsafe.As<Vector4, Vector128<float>>(ref vector);
    var exponent = Vector128.Create(0.416666666666667F);
    channels = Log(channels); // Not a SIMD intrinsic
    channels = Sse.Multiply(channels, exponent);
    channels = Exp(channels); // Not a SIMD intrinsic
    if (Fma.IsSupported)
    {
        // Computes (1.055 * channels) + (-0.055) as one fused operation
        channels = Fma.MultiplyAdd(Vector128.Create(1.055F), channels, Vector128.Create(-0.055F));
    }
    else
    {
        channels = Sse.Add(Sse.Multiply(Vector128.Create(1.055F), channels), Vector128.Create(-0.055F));
    }
    Unsafe.As<Vector4, Vector128<float>>(ref vector) = channels;
}
But Log and Exp aren't SIMD intrinsics; however, you could approximate them with the sequences from sse_mathfun or avx_mathfun?
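For reference, the scalar `Compress(float channel)` shown in the diff above can be written as a piecewise function, and its power term can be rewritten using the standard identity for positive bases. The symbols $c$ (the channel value) and $a$ (the exponent) are notation introduced here for illustration, and reading the constant 0.416666666666667 as $5/12$ is an assumption based on its decimal expansion:

```latex
\operatorname{Compress}(c) =
\begin{cases}
  12.92\,c, & c \le 0.0031308 \\
  1.055\,c^{a} - 0.055, & c > 0.0031308
\end{cases}
\qquad a = 0.416666666666667 \approx \tfrac{5}{12}

c^{a} = e^{a \ln c} \quad \text{for } c > 0
```

Under this rewrite, a vectorized version would need approximations for both $\ln$ and $\exp$ across all lanes, plus a way to select the linear branch for small $c$.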
src/ImageSharp/Processing/Processors/Convolution/Convolution2PassProcessor{TPixel}.cs
Very, very nice! 🚀
Prerequisites
Description
This PR does a few things:
Benchmarks
Here's a preview of the current improvements for the Gaussian blur benchmark:
And here are some more bokeh blur optimizations compared to master, after #1475 got merged: