KDB updates for latest assembler #2996
Manually verified fixes of cases in #2891 (comment)
if(miopen::EndsWith(kern.kernel_file, ".s"))
{
    compile_options +=
        " -mcpu=" +
        miopen::LcOptionTargetStrings{handle.GetTargetProperties()}.targetId;
}
else
{
    compile_options += " -mcpu=" + handle.GetDeviceName();
}
if(miopen::EndsWith(kern.kernel_file, ".s"))
{
    compile_options +=
        " -mcpu=" +
        miopen::LcOptionTargetStrings{handle.GetTargetProperties()}.targetId;
}
else
{
    compile_options += " -mcpu=" + handle.GetDeviceName();
}

compile_options += " -mcpu=";
if(miopen::EndsWith(kern.kernel_file, ".s"))
{
    compile_options +=
        miopen::LcOptionTargetStrings{handle.GetTargetProperties()}.targetId;
}
else
{
    compile_options += handle.GetDeviceName();
}
[R] For me it looks a bit better.
if(miopen::EndsWith(kern.kernel_file, ".s"))
{
    compile_options +=
        " -mcpu=" +
        miopen::LcOptionTargetStrings{handle.GetTargetProperties()}
            .targetId;
}
else
{
    compile_options += " -mcpu=" + handle.GetDeviceName();
}
if(miopen::EndsWith(kern.kernel_file, ".s"))
{
    compile_options +=
        " -mcpu=" +
        miopen::LcOptionTargetStrings{handle.GetTargetProperties()}
            .targetId;
}
else
{
    compile_options += " -mcpu=" + handle.GetDeviceName();
}

compile_options += " -mcpu=";
if(miopen::EndsWith(kern.kernel_file, ".s"))
{
    compile_options +=
        miopen::LcOptionTargetStrings{handle.GetTargetProperties()}
            .targetId;
}
else
{
    compile_options += handle.GetDeviceName();
}
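For reference, the branch under review can be exercised in isolation. The following is only a minimal sketch, not MIOpen code: `EndsWith` here is a local stand-in for `miopen::EndsWith`, `WithMcpu` is a hypothetical helper, and the `target_id` and `device_name` parameters stand in for the values returned by `LcOptionTargetStrings{...}.targetId` and `handle.GetDeviceName()`.

```cpp
#include <string>

// Local stand-in for miopen::EndsWith (assumption: plain suffix comparison).
inline bool EndsWith(const std::string& value, const std::string& suffix)
{
    return value.size() >= suffix.size() &&
           value.compare(value.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper mirroring the suggested layout: append " -mcpu=" once,
// then pick the full target ID for assembly kernels (".s") and the plain
// device name for everything else.
inline std::string WithMcpu(std::string compile_options,
                            const std::string& kernel_file,
                            const std::string& target_id,
                            const std::string& device_name)
{
    compile_options += " -mcpu=";
    compile_options += EndsWith(kernel_file, ".s") ? target_id : device_name;
    return compile_options;
}
```

With these assumptions, an assembly kernel gets the feature-qualified target ID in `-mcpu`, while other kernels get only the device name, which is why the resulting KDB keys differ between the two paths.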
@cderb there are quite a few entries not found for gfx90a, could you help to double check? Thanks!
@junliume @JehandadKhan this is a side effect of the targetid being added to the -mcpu argument.
@atamazov can we revert this change?
I do not think so. More info here: #2309 (comment). I am investigating what needs to be done.
🟢 LGTM
See also #2891 (comment). Note that we don't need to generate DBs for all possible combinations. We need to do this only for the most common combinations.
@junliume if the kernel arguments change is reverted, then the code changes in this PR are no longer necessary. The kdb changes would also be inaccurate. If the assembler changes are causing an issue, then I can generate the kdbs once more. Otherwise we can throw out this PR.
@atamazov actually I have the opposite impression, since we build kernel caches using the offline compiler (not hipRTC), then the code generated without update: discussed in #2891 (comment). @atamazov @JehandadKhan please let me know if the above impression is mistaken. Thanks!
Confirmed that dbsync will pass with no kdb changes if we revert this change: Lines 500 to 503 in 4f5ed42
#2996 (comment) is updated (clarified)
The reason is different, I guess. Offline compiler supports
Well, actually, if xnack is not specified but the GPU has this target feature, then the compiler assumes xnack+. And this is not free from the performance POV. I do not know the actual impact for our kernels, but it seems like the current system KDB has some room for improvement. The same considerations apply to sramecc. If we want to get the best performance for certain targets (I believe we do want this), then we need to
@junliume I am going to open a ticket about "KDB optimization" soon.
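To make the xnack/sramecc point concrete, here is a rough sketch of how a feature-qualified target ID could be composed. It assumes the LLVM-style syntax `processor:feature+` / `processor:feature-`; `MakeTargetId` is a hypothetical helper and is not how `LcOptionTargetStrings` is implemented. An unset feature is simply omitted, which, per the comment above, leaves the choice to the compiler. Requires C++17 for `std::optional`.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: builds a target ID such as "gfx90a:sramecc+:xnack-".
// std::nullopt means "not specified", so the feature is left out of the ID
// and the compiler falls back to its own default for it.
inline std::string MakeTargetId(const std::string& processor,
                                std::optional<bool> sramecc,
                                std::optional<bool> xnack)
{
    std::string id = processor;
    if(sramecc.has_value())
        id += *sramecc ? ":sramecc+" : ":sramecc-";
    if(xnack.has_value())
        id += *xnack ? ":xnack+" : ":xnack-";
    return id;
}
```

Each distinct string produced this way would correspond to a separate set of KDB entries, which is why only the most common combinations are worth generating.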
Discarding this change.
This PR updates db_sync and the KDB files to support the assembler changes and the changes made in #2891.
The kernel argument changes to the LoadProgram function are replicated in the db_sync test.