Fast Hybrid mode of iGEMM #326
OBSOLETE INFO
The proposed design has a drawback: it does not affect the result of GWSS for the Solvers that implement the `GetWorkspaceSize()` method, so the library might request an excessive workspace from the user's program. ~- (2) The design is algorithm-based, which does not comply with the [proposal](https://github.com//issues/299#issuecomment-645692843)~
Also, I dislike the idea of lists of solvers for any purpose. Basically, it is the Solver that must know everything about its applicability under certain circumstances. I have a different design in mind.
Low-level design
An attentive reader would note that “my” design does not literally match my own architectural proposal. However, on a closer look, they would see that this design is actually a linear transformation of the proposal.
I have edited my recent comment.
@atamazov Thanks for your feedback. Supporting |
@zjing14 You are welcome. But please don't let words fool you. One location is the Solver. The list is the second location. If we add a new list for each small new feature, where will we end up?
@atamazov Just double-checking that I understand your proposal and confirming the implementation details.
@zjing14 Good questions!
No, to
Yes, the support for
There is no need to create a new list. "OptimizedHybrid" should differ from plain "Hybrid" only in case of find-db miss: it sets
Modified fragment of FindConvFwdAlgorithm() (comments removed to save space):
const miopen::FindMode fm;
bool use_immediate_solution = false;
miopenConvSolution_t sol;
if((fm.IsFast() || fm.IsHybrid()) && !use_winograd_only)
{
    size_t count;
    GetForwardSolutions(handle, wDesc, xDesc, yDesc, 1, &count, &sol);
    use_immediate_solution = (count > 0) && !(fm.IsHybrid() && sol.time < 0);
}
if(use_immediate_solution)
{
    CompileForwardSolution(handle, wDesc, xDesc, yDesc, sol.solution_id);
    const auto id = solver::Id(sol.solution_id);
    perf_db.push_back(
        {id.GetAlgo(conv::Direction::Forward), id.ToString(), sol.time, sol.workspace_size});
}
else
{
    ctx.skip_solutions_that_take_long_time_to_build_and_have_narrow_coverage
        = fm.IsOptimizedHybrid(); /// <<<------------------------------------------ ADDED STATEMENT
    perf_db = UserFindDbRecord::TryLoad(handle, problem, [&](DbRecord& record) {
        DirConvFindCore(handle,
                        xDesc,
                        x,
                        wDesc,
                        w,
                        yDesc,
                        y,
                        workSpace,
                        workSpaceSize,
                        *this,
                        exhaustiveSearch,
                        record,
                        ctx, /// <<<---------------------- PASSING CONTEXT TO SearchForAllSolutions
                        use_winograd_only);
    });
}
Modified fragment of ForwardGetWorkSpaceSize() (comments removed to save space):
auto ctx = ConvolutionContext{xDesc, wDesc, yDesc, *this, conv::Direction::Forward};
...
const miopen::FindMode fm;
while(fm.IsFast() || fm.IsHybrid())
{
    size_t count;
    miopenConvSolution_t sol;
    GetForwardSolutions(handle, wDesc, xDesc, yDesc, 1, &count, &sol);
    if(count < 1 || (fm.IsHybrid() && sol.time < 0))
    {
        ctx.skip_solutions_that_take_long_time_to_build_and_have_narrow_coverage
            = fm.IsOptimizedHybrid(); /// <<<-------------------------------------- ADDED STATEMENT
        break; // Fall down to Normal Find.
    }
    MIOPEN_LOG_I2(sol.workspace_size);
    return sol.workspace_size;
}
ctx.SetupFloats();
ctx.do_search = false;
ctx.disable_perfdb_access = true;
const size_t direct_workspace = ForwardBackwardDataGetWorkSpaceSizeDirect(ctx);
const size_t implicit_gemm_workspace = ForwardBackwardGetWorkSpaceSizeImplicitGemm(ctx);
const size_t workspace_size_scgemm = ForwardBackwardDataGetWorkSpaceSizeSCGemm(handle, ctx);
...
As you can see, only two lines of code are added. Unfortunately, some of the required modifications may involve more complex changes, because there are still some irregularities in the code (not all Solvers/Invokers are ready yet).
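Both fragments hinge on the same predicate: take the immediate solution when one exists, unless a hybrid mode sees a negative `sol.time`, in which case the code falls back to the regular Find. A minimal self-contained sketch of that predicate follows. `Mode`, `Solution`, `UseImmediateSolution`, and `SkipSlowNarrowSolvers` are stand-in names invented here for illustration and are not MIOpen types; the `use_winograd_only` condition is left out for brevity, and reading a negative time as "no find-db entry" is an assumption drawn from the discussion.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in types for illustration only; these are not the MIOpen definitions.
enum class Mode { Normal, Fast, Hybrid, OptimizedHybrid };

struct Solution
{
    float time; // negative when no measured time is available (assumed meaning)
};

// Mirrors fm.IsHybrid() as described in the PR: true for both hybrid flavours.
inline bool IsHybrid(Mode m) { return m == Mode::Hybrid || m == Mode::OptimizedHybrid; }
inline bool IsFast(Mode m) { return m == Mode::Fast; }

// Same predicate as in the fragment:
// use_immediate_solution = (count > 0) && !(fm.IsHybrid() && sol.time < 0)
inline bool UseImmediateSolution(Mode m, std::size_t count, const Solution& sol)
{
    if(!(IsFast(m) || IsHybrid(m)))
        return false; // Normal mode always runs the regular Find
    return count > 0 && !(IsHybrid(m) && sol.time < 0);
}

// The new flag is set only on the fall-back path, and only for OptimizedHybrid.
inline bool SkipSlowNarrowSolvers(Mode m, std::size_t count, const Solution& sol)
{
    return !UseImmediateSolution(m, count, sol) && m == Mode::OptimizedHybrid;
}

// Usage: a hybrid mode with a negative time falls back to Find.
inline void FallbackExample()
{
    const Solution unmeasured{-1.0f};
    assert(UseImmediateSolution(Mode::Fast, 1, unmeasured));
    assert(!UseImmediateSolution(Mode::Hybrid, 1, unmeasured));
    assert(SkipSlowNarrowSolvers(Mode::OptimizedHybrid, 1, unmeasured));
}
```

In this sketch plain Hybrid and OptimizedHybrid take the same branch; they differ only in the value of the flag handed to the fall-back Find.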
Yes. Each Solver that “knows” it has narrow coverage and takes a long time to build should check the new attribute and tell the caller "I am not applicable". Hope this helps.
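The solver-side half of the idea can be sketched as follows. This is only an illustration under stated assumptions: `Context`, `SlowNarrowSolver`, `BroadSolver`, and the `other_checks_pass` field are hypothetical stand-ins, and the real `ConvolutionContext` and solver classes carry far more state. Only the long attribute name is taken from the thread.

```cpp
#include <cassert>

// Hypothetical stand-in for the context; only the attribute name comes from the PR.
struct Context
{
    bool skip_solutions_that_take_long_time_to_build_and_have_narrow_coverage = false;
    bool other_checks_pass = true; // placeholder for the solver's usual applicability checks
};

// A solver that "knows" it builds slowly and covers few configs.
struct SlowNarrowSolver
{
    bool IsApplicable(const Context& ctx) const
    {
        if(ctx.skip_solutions_that_take_long_time_to_build_and_have_narrow_coverage)
            return false; // "I am not applicable"
        return ctx.other_checks_pass;
    }
};

// A solver with broad coverage simply ignores the attribute.
struct BroadSolver
{
    bool IsApplicable(const Context& ctx) const { return ctx.other_checks_pass; }
};

// Usage: setting the attribute removes only the slow, narrow solver.
inline void ApplicabilityExample()
{
    Context ctx;
    assert(SlowNarrowSolver{}.IsApplicable(ctx));
    ctx.skip_solutions_that_take_long_time_to_build_and_have_narrow_coverage = true;
    assert(!SlowNarrowSolver{}.IsApplicable(ctx));
    assert(BroadSolver{}.IsApplicable(ctx));
}
```

The caller never consults a list here; each solver decides for itself, which is the property argued for above.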
It does seem a little hackish to use a context variable to traverse a different list.
Which list do you mean? |
@atamazov Thanks very much for the clarification. Now, I think I understand. Will finish a modification soon. |
5f78ec3
to
e49eb08
Hi all, I modified the code following @atamazov's comments. Please take a look. If the implementation is OK, I will make optimized_hybrid the default.
@@ -447,7 +447,11 @@ std::size_t ConvolutionDescriptor::ForwardGetWorkSpaceSize(Handle& handle,
    miopenConvSolution_t sol;
    GetForwardSolutions(handle, wDesc, xDesc, yDesc, 1, &count, &sol);
    if(count < 1 || (fm.IsHybrid() && sol.time < 0))
    {
        ctx.skip_solutions_that_take_long_time_to_build_and_have_narrow_coverage =
Please shorten this variable name and preferably make it an int to indicate a cost estimate, if @atamazov agrees. If not, at least rename it to something simpler.
A cost estimate (or a build time?) is a good idea that allows for more flexibility. But we also need an indication of coverage to make a decision (skip or use). Also, I am not sure that we need this flexibility right now. Let's keep this idea in mind for the future.
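To make the "cost estimate plus coverage" idea concrete, here is a purely hypothetical sketch. Nothing in it exists in the PR or in MIOpen: `SolverTraits`, `SkipPolicy`, `ShouldSkip`, and the integer scales are all assumptions made up for illustration.

```cpp
#include <cassert>

// Purely hypothetical: neither these fields nor this policy exist in the PR.
struct SolverTraits
{
    int build_cost; // relative build-time estimate, larger means slower to build
    int coverage;   // relative breadth of supported configs, larger means broader
};

struct SkipPolicy
{
    int max_build_cost; // solvers at or below this cost are never skipped
    int min_coverage;   // expensive solvers are kept if coverage reaches this level
};

// Skip only solvers that are both expensive to build and narrow in coverage.
inline bool ShouldSkip(const SolverTraits& s, const SkipPolicy& p)
{
    return s.build_cost > p.max_build_cost && s.coverage < p.min_coverage;
}

// Usage: an expensive solver survives when its coverage is broad enough.
inline void CostPolicyExample()
{
    const SkipPolicy policy{10, 5};
    assert(ShouldSkip(SolverTraits{20, 1}, policy));
    assert(!ShouldSkip(SolverTraits{20, 9}, policy));
    assert(!ShouldSkip(SolverTraits{3, 1}, policy));
}
```

The boolean attribute in this PR corresponds to the special case where each solver hard-codes the outcome of such a check for itself.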
WRT naming, see #326 (comment):
...
`skip_solutions_that_take_long_time_to_build_and_have_narrow_coverage`. This ugly name tells everything (and BTW denotes quirky architecture).
src/find_controls.cpp
Outdated
@@ -194,6 +194,7 @@ const char* ToCString(const FindMode::Values mode)
    case FindMode::Values::Normal: return "NORMAL";
    case FindMode::Values::Fast: return "FAST";
    case FindMode::Values::Hybrid: return "HYBRID";
    case FindMode::Values::OptimizedHybrid: return "OPTIMIZED_HYBRID";
Why add a new mode? This looks like a heuristic that should just be part of the "Hybrid" mode. @atamazov, comments?
This heuristic has negative side effects and looks like a patch from the architectural POV. That is why I would like to keep "plain" Hybrid.
Please also update documentation in doc/src/find_and_immediate.md.
Here are the updated numbers for ResNeXt101, bs=128, with tuning (MIOPEN_FIND_ENFORCE=3). @daniellowell @atamazov @JehandadKhan
Numbers of resnext101, bs128 with different FIND_MODE on Vega20 (MI60).
@zjing14 Merge conflict with develop. |
Numbers of resnext101, bs128, fp32 for different MIOPEN_FIND_MODE on Vega20 (MI60) with empty find-db.
@pfultz2 @atamazov @daniellowell @JehandadKhan Please review.
@zjing14 Just retrigger the build.
Needs review |
I hesitated to approve this PR since I prefer the previous approach used by @zjing14, where we set up a container of solvers to fall back to.
- That approach makes it easier to just look at the code and tell which solvers would be executed as a fallback. The implemented approach requires log collection per config to determine which configs would be using the `hybrid` approach.
- A new variable does not need to be added to the context. The approach based on a context variable requires every solver to be updated (which this PR does), makes it difficult to assess the impact of a change, and burdens all future solvers with setting up the variable properly.
- Finally, the name of the newly added variable is almost the width of a line, which is bad practice in my opinion.
I am approving this PR since this is an important issue; perhaps we can rework this later and update the overall mechanism.
@JehandadKhan Thanks for your opinion (although I have the opposite one). |
LGTM
- `skip_solutions_that_take_long_time_to_build_and_have_narrow_coverage` attribute into `ExecutionContext`
- `IsOptimizedHybrid()` to set the attribute to `true`.
- `IsHybrid()` to return true for both plain `Hybrid` and `OptimizedHybrid` modes
- `IsApplicable` of all hip-based implicitGemm solvers. If the attribute is true, skip the solver
- `OptimizedHybrid` mode as default
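The mode predicates summarized above can be sketched as a small stand-alone class. `FindModeSketch` is a hypothetical name used to avoid suggesting this is the actual `miopen::FindMode` definition; the enumerator names and strings follow the `src/find_controls.cpp` diff shown earlier, while the constructor and the `"<Unknown>"` fallback string are assumptions.

```cpp
#include <cassert>
#include <string>

// Illustrative stand-in for miopen::FindMode; only the pieces discussed in the PR.
class FindModeSketch
{
public:
    enum class Values { Normal, Fast, Hybrid, OptimizedHybrid };

    explicit FindModeSketch(Values v) : value(v) {}

    bool IsFast() const { return value == Values::Fast; }
    // True for both plain Hybrid and OptimizedHybrid, as the summary states.
    bool IsHybrid() const
    {
        return value == Values::Hybrid || value == Values::OptimizedHybrid;
    }
    bool IsOptimizedHybrid() const { return value == Values::OptimizedHybrid; }

private:
    Values value;
};

// String names as shown in the src/find_controls.cpp diff.
inline const char* ToCString(FindModeSketch::Values mode)
{
    switch(mode)
    {
    case FindModeSketch::Values::Normal: return "NORMAL";
    case FindModeSketch::Values::Fast: return "FAST";
    case FindModeSketch::Values::Hybrid: return "HYBRID";
    case FindModeSketch::Values::OptimizedHybrid: return "OPTIMIZED_HYBRID";
    }
    return "<Unknown>"; // assumed fallback for out-of-range values
}

// Usage: OptimizedHybrid counts as a hybrid mode, plain Hybrid is not "optimized".
inline void FindModeExample()
{
    const FindModeSketch optimized(FindModeSketch::Values::OptimizedHybrid);
    const FindModeSketch plain(FindModeSketch::Values::Hybrid);
    assert(optimized.IsHybrid() && optimized.IsOptimizedHybrid());
    assert(plain.IsHybrid() && !plain.IsOptimizedHybrid());
    assert(std::string(ToCString(FindModeSketch::Values::OptimizedHybrid)) == "OPTIMIZED_HYBRID");
}
```

Because `IsHybrid()` covers both modes, existing `fm.IsHybrid()` checks keep working unchanged, and only the places that set the context attribute need `IsOptimizedHybrid()`.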