[conv] Optimize performance of GWSS #1300
…space is not necessary.
Please review.
LGTM
Hey, I also need a green mark ;)
My bad, misclicked.
Test results added. @junliume Is the positive effect worth the slightly increased complexity of the code? What do you think?
As discussed, every perf optimization is good :)
Avoids IsApplicable() calls when workspace is not necessary for a Solver. MayNeedWorkspace() is introduced in the Solver API for that.
Average acceleration of GWSS calls
miopenConvolutionForwardGetSolutionWorkspaceSize etc.: 15...115%
miopenConvolutionForwardGetWorkSpaceSize etc.: 2...17%
However, the overall effect on the total Aux Wall Time (which includes GWSS time) is relatively small and does not exceed 2%.
Side effects
The total GWSS time for the use case described in #1297 improves from ~24 ms to ~7 ms.
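For readers unfamiliar with the idea, here is a minimal sketch of how a cheap MayNeedWorkspace() hint can short-circuit the more expensive IsApplicable() check when computing a workspace size. The class layout, the `int problem` stand-in, the call counter, and the `MaxWorkspaceSize` helper are all hypothetical and simplified for illustration; they are not the actual MIOpen types or signatures.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical counter, used only to show how many IsApplicable() calls happen.
static int g_is_applicable_calls = 0;

// Hypothetical, simplified solver interface (not MIOpen's real class layout).
struct Solver {
    virtual ~Solver() = default;
    // Cheap static hint: false means this solver never uses a workspace.
    virtual bool MayNeedWorkspace() const { return false; }
    // Potentially expensive applicability check for a given problem.
    virtual bool IsApplicable(int problem) const = 0;
    // Workspace size in bytes; zero by default.
    virtual std::size_t GetWorkspaceSize(int) const { return 0; }
};

// A solver that never needs a workspace; keeps the default hint (false).
struct DirectSolver : Solver {
    bool IsApplicable(int) const override {
        ++g_is_applicable_calls;
        return true;
    }
};

// A solver that may need a workspace, so it overrides the hint.
struct GemmLikeSolver : Solver {
    bool MayNeedWorkspace() const override { return true; }
    bool IsApplicable(int problem) const override {
        ++g_is_applicable_calls;
        return problem > 0;
    }
    std::size_t GetWorkspaceSize(int problem) const override {
        return static_cast<std::size_t>(problem) * 1024;
    }
};

// Returns the largest workspace any applicable solver would request.
// Solvers that cannot need a workspace are skipped before IsApplicable().
std::size_t MaxWorkspaceSize(const std::vector<std::unique_ptr<Solver>>& solvers,
                             int problem) {
    std::size_t max_size = 0;
    for (const auto& solver : solvers) {
        if (!solver->MayNeedWorkspace())
            continue; // contributes 0 bytes regardless of applicability
        if (!solver->IsApplicable(problem))
            continue;
        max_size = std::max(max_size, solver->GetWorkspaceSize(problem));
    }
    return max_size;
}
```

In this sketch, a solver that never needs a workspace cannot change the maximum, so skipping its applicability check does not alter the result; only the number of IsApplicable() calls goes down.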