[MI100][Fp16] ConvAsmImplicitGemmGTCDynamicWrwXdlops fails validation #996

Closed
atamazov opened this issue Jun 21, 2021 · 4 comments · Fixed by #1000

Comments

@atamazov
Contributor

atamazov commented Jun 21, 2021

gfx908, any ROCm:

Two cases from test_conv_igemm_dynamic_xdlops_wrw:

  • -n 400 -c 256 -H 7 -W 7 -k 1024 -y 7 -x 7 -p 0 -q 0
$ MIOPEN_FIND_MODE=normal MIOPEN_DEBUG_FIND_ONLY_SOLVER=ConvAsmImplicitGemmGTCDynamicWrwXdlops \
./bin/test_conv2d --half --cmode conv --pmode default --group-count 1 --disable-forward --disable-backward-data \
--input 400, 256, 7, 7 --weights 1024, 256, 7, 7 --pads_strides_dilations 0 0 1 1 1 1 --trans_output_pads 0 0
...
FAILED: 0.14228
...
# Same config
$ MIOPEN_FIND_MODE=normal MIOPEN_DEBUG_FIND_ONLY_SOLVER=ConvAsmImplicitGemmGTCDynamicWrwXdlops \
./bin/MIOpenDriver convfp16 -w 1 -t 1 -i 1 -V 1 \
-n 400 -c 256 -H 7 -W 7 -k 1024 -y 7 -x 7 -p 0 -q 0 -F 4
...
Backward Convolution Weights Failed: 0.588887 > 0.082
  • -n 400 -c 256 -H 1 -W 1 -k 1024 -y 1 -x 1 -p 0 -q 0
$ MIOPEN_FIND_MODE=normal MIOPEN_DEBUG_FIND_ONLY_SOLVER=ConvAsmImplicitGemmGTCDynamicWrwXdlops \
./bin/test_conv2d --half --cmode conv --pmode default --group-count 1 --disable-forward --disable-backward-data \
--input 400, 256, 1, 1 --weights 1024, 256, 1, 1 --pads_strides_dilations 0 0 1 1 1 1 --trans_output_pads 0 0
...
FAILED: 0.18679
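
Both failing WrW configurations share a property that is easy to miss: with no padding and a filter as large as the input, the output is a single pixel, so Ho x Wo = 1. The sketch below checks this with the standard convolution output-size formula. The helper name `conv_out_size` is hypothetical and not part of MIOpen; stride 1 and dilation 1 are read from the `--pads_strides_dilations 0 0 1 1 1 1` argument.

```python
def conv_out_size(in_size, filter_size, pad=0, stride=1, dilation=1):
    """Standard convolution output size along one spatial dimension.

    Hypothetical helper for illustration only; not an MIOpen API.
    """
    effective_filter = dilation * (filter_size - 1) + 1
    return (in_size + 2 * pad - effective_filter) // stride + 1


# (input H = W, filter y = x) for the two failing cases; pad 0, stride 1, dilation 1.
cases = {
    "-H 7 -W 7 -y 7 -x 7": (7, 7),
    "-H 1 -W 1 -y 1 -x 1": (1, 1),
}

for label, (in_size, filter_size) in cases.items():
    ho = wo = conv_out_size(in_size, filter_size)
    print(f"{label}: Ho x Wo = {ho * wo}")
```

Both cases give Ho x Wo = 1. This may be relevant to the fix commit referenced later in this thread, whose message mentions "hoxwo less than b_padding", although the thread itself does not spell out the root cause.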

What needs to be done:

This is a release-critical issue!

@shurale-nkn
Contributor

ConvAsmImplicitGemmGTCDynamicFwdXdlops may also fail under test_conv2d

FAILED: 0.109962
Iteration: 0
Forward convolution: ConvAsmImplicitGemmGTCDynamicFwdXdlops
Input tensor: 8, 32, 14, 14
Weights tensor: 32, 32, 1, 1
Output tensor: 8, 32, 16, 16
Filter: conv2d, miopenConvolution, miopenPaddingDefault, {1, 1}, {1, 1}, {1, 1}, 
Max diff: 87
Mismatch at 16: 0 != -24
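
To locate the reported mismatch, the flat index can be mapped back to tensor coordinates. The sketch below assumes a contiguous NCHW layout for the 8, 32, 16, 16 output tensor and assumes that the three `{1, 1}` pairs in the `Filter:` line are padding, stride and dilation; neither is stated in the report. The helper names are hypothetical.

```python
def unravel_nchw(flat_index, shape):
    """Map a flat index to (n, c, h, w), assuming contiguous NCHW storage."""
    _, c, h, w = shape
    rest, wi = divmod(flat_index, w)
    rest, hi = divmod(rest, h)
    ni, ci = divmod(rest, c)
    return ni, ci, hi, wi


def reads_only_padding(ho, wo, in_h, in_w, pad):
    """For a 1x1 filter with stride 1 and dilation 1, output (ho, wo)
    reads the single input position (ho - pad, wo - pad)."""
    hi, wi = ho - pad, wo - pad
    return not (0 <= hi < in_h and 0 <= wi < in_w)


n, k, ho, wo = unravel_nchw(16, (8, 32, 16, 16))
print((n, k, ho, wo))                          # (0, 0, 1, 0)
print(reads_only_padding(ho, wo, 14, 14, 1))   # True
```

Under these assumptions, index 16 is output position (h=1, w=0) of the first plane. With a 1x1 filter and padding 1 that position sees only padded zeros, so 0 would be the correct value in `0 != -24`. The report does not say which side of the comparison is the reference.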

atamazov added a commit that referenced this issue Jun 21, 2021
@atamazov
Contributor Author

@shurale-nkn This ticket is about WrW.

@shaojiewang
Contributor

The root cause has been found.

@atamazov
Contributor Author

Good!

atamazov added a commit that referenced this issue Jun 22, 2021
…t_regression_half_vega. W/A for #995 and #996 (#991)

- Added: Workarounds for #995 and #996
- Fixes the following issues in tests:
  - Issue: `test_conv_igemm_dynamic_xdlops_bwd` does not test the HALF type.
  - Issue: `test_conv_igemm_dynamic_xdlops_fwd` does not test the HALF type.
  - Issue: `test_conv_igemm_dynamic_xdlops_wrw` does not test the HALF type.
  - Issue: `test_regression_half_vega` does nothing.
- [Jenkinsfile] Added dedicated build param for FP16/BF16/INT8 Smoke tests
- [NFC] Removed some useless if's.
atamazov pushed a commit that referenced this issue Jun 24, 2021
- Fix asm igemm wrw hoxwo less than b_padding bug
- [TESTS] Revert W/A for issue #996
atamazov added a commit that referenced this issue Jul 22, 2021
…t_regression_half_vega. W/A for #995 and #996 (#991)

- Added: Workarounds for #995 and #996
- Fixes the following issues in tests:
  - Issue: `test_conv_igemm_dynamic_xdlops_bwd` does not test the HALF type.
  - Issue: `test_conv_igemm_dynamic_xdlops_fwd` does not test the HALF type.
  - Issue: `test_conv_igemm_dynamic_xdlops_wrw` does not test the HALF type.
  - Issue: `test_regression_half_vega` does nothing.
- [Jenkinsfile] Added dedicated build param for FP16/BF16/INT8 Smoke tests
- [NFC] Removed some useless if's.
atamazov pushed a commit that referenced this issue Jul 22, 2021
- Fix asm igemm wrw hoxwo less than b_padding bug
- [TESTS] Revert W/A for issue #996