[HOTFIX] Remove duplicated group divide when calculation k per group (SWDEV-291574) #989
LGTM!
Regression test is missing. I am not going to delay merging this PR (and cherry-picking it to 4.3), for the sake of speed. But if this PR does not contain a regression test, then I'll open a high priority ticket for this.
Conditionally approved
This resolves the SWDEV-291574 issue by removing the duplicated `group` divide while calculating `k_per_block`, since `k` is already divided by `group` at L796 above.
After this fix, the original problematic config
./bin/MIOpenDriver convfp16 -n 64 -c 256 -H 56 -W 56 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 32 -F 2 -t 1
should not be valid for this asm solver, since `c_per_block = 256/32 = 8` and `k_per_block = 256/32 = 8`, which is too small for the NCHW asm igemm to support (NCHW fp16 does not support packing c or k).
Besides, if the group size is reduced to, e.g., 2, 4, 8, or 16, the solver can successfully calculate the result:
./bin/MIOpenDriver convfp16 -n 64 -c 256 -H 56 -W 56 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 4 -F 2 -t 1
./bin/MIOpenDriver convfp16 -n 64 -c 256 -H 56 -W 56 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 8 -F 2 -t 1
./bin/MIOpenDriver convfp16 -n 64 -c 256 -H 56 -W 56 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 16 -F 2 -t 1
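The arithmetic behind the description above can be sketched as follows. This is only an illustration, not the MIOpen solver code: the function names `per_group_sizes` and `k_per_group_double_divide` are hypothetical, and the sketch assumes the bug amounted to dividing `k` by `group` a second time after it had already been divided once.

```python
def per_group_sizes(c: int, k: int, group: int) -> tuple[int, int]:
    """Return (c_per_group, k_per_group) for a grouped convolution."""
    if c % group or k % group:
        raise ValueError("c and k must both be divisible by group")
    return c // group, k // group


def k_per_group_double_divide(k: int, group: int) -> int:
    """Hypothetical reproduction of the bug: k is divided by group twice."""
    k_already_per_group = k // group      # first divide, done earlier
    return k_already_per_group // group   # duplicated divide (assumed bug)


# c = k = 256, as in the MIOpenDriver configs above.
for g in (4, 8, 16, 32):
    c_pg, k_pg = per_group_sizes(256, 256, g)
    buggy = k_per_group_double_divide(256, g)
    print(f"group={g}: c_per_group={c_pg}, k_per_group={k_pg}, "
          f"double-divided k={buggy}")
```

With `-g 32` the correct per-group sizes are both 8, matching the values quoted in the description, while the double-divided value under the assumption above collapses to 0. The sketch does not model the solver's applicability threshold, which the description does not state.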