[gfx11] MIOpen unit tests are failing - possible false alarm? #2079
Comments
LastTest.log
is causing the issue due to We should avoid using such keywords in log printing.
@junliume This is a known problem; you can find a detailed description at #2038 (comment). Please do not use log level 7 (TRACE) until it is fixed. Generally, I do not recommend using log levels > 4 in our CI unless really necessary or explicitly set in the test, because the logs become really huge and this may affect testing performance. I'll change the format of the guilty TRACE message to match the other messages in GenericSearch. Please assign this ticket to me and rename it to something like "Logs of GenericSearch contain "failed" at TRACE level".
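To illustrate why a TRACE message containing "failed" can look like a test failure, here is a minimal sketch. It assumes the check is a plain keyword match on the log, which this thread suspects but does not confirm. The log lines and both function names are made up for illustration and do not reproduce MIOpen's real log format or CI logic.

```python
import re

# Hypothetical log lines, invented for this sketch; the real format may differ.
SAMPLE_LOG = [
    "Info [FindSolution] trying solver A",
    "Trace [GenericSearch] config #3 failed, skipping",
    "Info [FindSolution] selected solver B",
]


def naive_has_failure(lines):
    """Flag the log if any line contains 'failed', ignoring case."""
    return any(re.search(r"failed", line, re.IGNORECASE) for line in lines)


def severity_aware_has_failure(lines):
    """Flag the log only if a line starts with the (assumed) 'Error' level tag."""
    return any(re.match(r"\s*Error\b", line) for line in lines)


if __name__ == "__main__":
    print("keyword match:", naive_has_failure(SAMPLE_LOG))
    print("severity match:", severity_aware_has_failure(SAMPLE_LOG))
```

The keyword match flags the sample log even though no error-level line is present, which is the kind of false alarm discussed here. Rewording the TRACE message, as proposed above, removes the trigger without changing the check.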
LastTest_no_applicable_solver.log
Usually this means that we do not have an applicable solver for this combination of config and platform, but I am curious why.
This problem is different. I'll look into it ASAP and provide a fix for env | grep -E "MIO|AMD|ROC|GPU|DEVICE" | sort |
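For anyone who prefers to collect the same environment information from a script, the `env | grep -E "MIO|AMD|ROC|GPU|DEVICE" | sort` pipeline quoted in this thread can be approximated in Python. This is only a sketch: `filter_env` is a hypothetical helper name, and Python's string sort may order lines slightly differently from a locale-aware `sort`.

```python
import os
import re


def filter_env(env, pattern=r"MIO|AMD|ROC|GPU|DEVICE"):
    """Return sorted NAME=value lines matching the pattern.

    Like grep on the output of `env`, the pattern is tested against the
    whole NAME=value line, so a match in the value also counts.
    """
    lines = [f"{name}={value}" for name, value in env.items()]
    return sorted(line for line in lines if re.search(pattern, line))


if __name__ == "__main__":
    for line in filter_env(os.environ):
        print(line)
```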
@junliume test_conv_igemm_mlir_bwd_wrw is not applicable to gfx11 and should not be run on it (it is not enabled for this target in tests/CMakeLists.txt). Does this problem really happen on gfx11?
@atamazov interesting. and yes, I am on a gfx1100 system:
I guess these tests should be disabled?
but how about these ones:
@junliume All these tests are NOT enabled for gfx11. Something is wrong on your system. I need the output of
I think we should disable these tests on gfx1100, since these solvers have been restricted to existing architectures only.
I have both gfx1030 and gfx1100 on my system (two cards), so maybe that is the reason?
if you would attach the full logs as I requested, then I would be able to answer this question 😄
Your system must have ROCR_VISIBLE_DEVICES set to 0 or 1, so that rocminfo shows only gfx1100 and ignores gfx1030. More info at ROCm/ROCm#841 (comment)
@atamazov I think we need to modify CMakeLists.txt in the test folder, so that ROCMINFO shows Device - 0 only.
@junliume This won't help because the tests (executables) will likely use both devices. Please use ROCR_VISIBLE_DEVICES as suggested in my previous comment. Currently MIOpen does not support non-uniform GPU configurations.
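As a sketch of the ROCR_VISIBLE_DEVICES suggestion, the variable can be set only for the test process instead of the whole shell. The helper names `single_gpu_env` and `run_with_single_gpu` are hypothetical, and which index (0 or 1) corresponds to the gfx1100 card depends on the system, so that value is an assumption to verify with rocminfo.

```python
import os
import subprocess
import sys


def single_gpu_env(device_index, base_env=None):
    """Return a copy of the environment with ROCR_VISIBLE_DEVICES set."""
    env = dict(os.environ if base_env is None else base_env)
    env["ROCR_VISIBLE_DEVICES"] = str(device_index)
    return env


def run_with_single_gpu(command, device_index):
    """Run a command so that it sees only the selected device index."""
    return subprocess.run(
        command,
        env=single_gpu_env(device_index),
        capture_output=True,
        text=True,
        check=False,
    )


if __name__ == "__main__":
    # Stand-in child process that echoes the variable; replace the command
    # with the real test invocation on a ROCm system.
    child = [sys.executable, "-c",
             "import os; print(os.environ['ROCR_VISIBLE_DEVICES'])"]
    print(run_with_single_gpu(child, 0).stdout.strip())
```

Setting the variable per process leaves the rest of the session untouched, which makes it easy to run the same tests once per card on a mixed gfx1030/gfx1100 machine.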
@atamazov thanks! yep
[Symptom]:
The following unit tests from MIOpen are failing on gfx1100:
[Analysis]:
However, the log shows that
So it likely failed because the log contains strings like "FAILED", although here that only means that no applicable kernel is available.
@muralinr could you check if this is a false alarm? Thanks!