1D convolution optimization and general codegen tweaks #1477
Merged
12 commits
8e67153 Port horizontal convolution processor, remove Y loop (Sergio0694)
a618b76 Port vertical convolution processor, remove X loop (Sergio0694)
f52802d Remove unnecessary inner loop coordinate sampling (Sergio0694)
a9c1652 Switch to shared sampling map for convolution passes (Sergio0694)
e60827f Remove convolution state, more optimizations (Sergio0694)
e574232 Remove transposed 1D kernels, switch to float[] type (Sergio0694)
5a38307 Remove leftover ConvolutionRowOperation<TPixel> type (Sergio0694)
e11adc6 Minor code tweaks (Sergio0694)
cb5c868 More performance improvements to 2 pass convolution (Sergio0694)
979baf7 More codegen improvements to bokeh blur (Sergio0694)
1a3e1e7 More codegen improvements to shared methods (Sergio0694)
5601559 Codegen improvements to Numerics.Clamp (Sergio0694)
If we ever figure out how to do an accurate, SIMD-enabled approximation of this, we would be laughing.
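pow(channel, 0.416666666666667F) => exp(0.416666666666667F * log(channel))
(log(0.416666666666667F) == -0.875468737353899935628f, but that constant only applies to exp(channel * log(0.416666666666667F)), which is pow(0.416666666666667F, channel).)
So...
But Exp isn't a SIMD intrinsic, and neither is Log; however, you can approximate both with the sequences in sse_mathfun or avx_mathfun?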