[TESTS][Fp16][MI100] Incorrect test case in test_conv_igemm_dynamic_xdlops_bwd and wrw #995

Open
atamazov opened this issue Jun 21, 2021 · 13 comments

Comments

@atamazov
Contributor

atamazov commented Jun 21, 2021

BWD:

$ MIOPEN_FIND_MODE=normal MIOPEN_DEBUG_FIND_ONLY_SOLVER=ConvAsmImplicitGemmGTCDynamicBwdXdlops \
./bin/test_conv2d --half --cmode conv --pmode default --group-count 1 --disable-forward --disable-backward-weights \
--input 4, 512, 128, 128 --weights 12, 512, 1, 1 --pads_strides_dilations 0 0 1 1 1 1 --trans_output_pads 0 0
...
FAILED: /dockerx/github/miopenx01/src/ocl/convolutionocl.cpp:1216: Backward Data Convolution cannot be executed due to incorrect params

WRW:

CONSOLE LOGS
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [GetFindModeValueImpl] MIOPEN_FIND_MODE = NORMAL(1)
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [get_device_name] Raw device name: gfx908
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [Handle] stream: 0x28210d0, device_id: 1
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [HipCompilerVersionImpl] 3.7.20315
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [AmdRocmMetadataVersionDetect] ROCm MD version AMDHSA_COv3, MIOpen version 2.13.0.8492-9ebbe9fde
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [BackwardDataGetWorkSpaceSize] 
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [GetEnvFindOnlySolverImpl] 72
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [ForwardGetWorkSpaceSize] 
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [BackwardWeightsGetWorkSpaceSize] 
[2021-06-21T16:19:29.911Z] /var/jenkins/workspace/LLibs_MIOpen_fix-tests-asm-igemm/build/bin/test_conv2d --float --cmode conv --pmode default --group-count 1 --disable-forward --disable-backward-data --input 1, 3, 32, 32 --weights 1, 3, 11, 11 --pads_strides_dilations 1 1 2 2 2 1 --trans_output_pads 0 0 --in_layout NCHW --fil_layout NCHW --out_layout NCHW 
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [BackwardWeightsGetWorkSpaceSize] 
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [FindConvBwdWeightsAlgorithm] requestAlgoCount = 1, workspace = 0
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [TryLoad] Find-db regenerating.
[2021-06-21T16:19:29.911Z] /var/jenkins/workspace/LLibs_MIOpen_fix-tests-asm-igemm/build/bin/test_conv2d --float --cmode conv --pmode default --group-count 1 --disable-forward --disable-backward-data --input 1, 3, 32, 32 --weights 1, 3, 11, 11 --pads_strides_dilations 1 1 2 2 2 1 --trans_output_pads 0 0 --in_layout NCHW --fil_layout NCHW --out_layout NCHW 
[2021-06-21T16:19:29.911Z] FAILED: /var/jenkins/workspace/LLibs_MIOpen_fix-tests-asm-igemm/src/ocl/convolutionocl.cpp:1561: Backward Weights Convolution cannot be executed due to incorrect params
[2021-06-21T16:19:29.911Z] Backward weights convolution: 
[2021-06-21T16:19:29.911Z] Input tensor: 1, 3, 32, 32
[2021-06-21T16:19:29.911Z] Weights tensor: 1, 3, 11, 11
[2021-06-21T16:19:29.911Z] Output tensor: 1, 1, 7, 12
[2021-06-21T16:19:29.911Z] Filter: conv2d, miopenConvolution, miopenPaddingDefault, {1, 1}, {2, 2}, {2, 1}, 
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [GetFindModeValueImpl] MIOPEN_FIND_MODE = NORMAL(1)
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [get_device_name] Raw device name: gfx908
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [Handle] stream: 0x2280840, device_id: 0
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [HipCompilerVersionImpl] 3.7.20315
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [AmdRocmMetadataVersionDetect] ROCm MD version AMDHSA_COv3, MIOpen version 2.13.0.8492-9ebbe9fde
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [BackwardDataGetWorkSpaceSize] 
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [GetEnvFindOnlySolverImpl] 72
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [ForwardGetWorkSpaceSize] 
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [BackwardWeightsGetWorkSpaceSize] 
[2021-06-21T16:19:29.911Z] /var/jenkins/workspace/LLibs_MIOpen_fix-tests-asm-igemm/build/bin/test_conv2d --float --cmode conv --pmode default --group-count 1 --disable-forward --disable-backward-data --input 1, 3, 224, 224 --weights 1, 3, 3, 3 --pads_strides_dilations 0 0 1 1 2 2 --trans_output_pads 0 0 --in_layout NCHW --fil_layout NCHW --out_layout NCHW 
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [BackwardWeightsGetWorkSpaceSize] 
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [FindConvBwdWeightsAlgorithm] requestAlgoCount = 1, workspace = 0
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [TryLoad] Find-db regenerating.
[2021-06-21T16:19:29.911Z] /var/jenkins/workspace/LLibs_MIOpen_fix-tests-asm-igemm/build/bin/test_conv2d --float --cmode conv --pmode default --group-count 1 --disable-forward --disable-backward-data --input 1, 3, 224, 224 --weights 1, 3, 3, 3 --pads_strides_dilations 0 0 1 1 2 2 --trans_output_pads 0 0 --in_layout NCHW --fil_layout NCHW --out_layout NCHW 
[2021-06-21T16:19:29.911Z] FAILED: /var/jenkins/workspace/LLibs_MIOpen_fix-tests-asm-igemm/src/ocl/convolutionocl.cpp:1561: Backward Weights Convolution cannot be executed due to incorrect params
[2021-06-21T16:19:29.911Z] Backward weights convolution: 
[2021-06-21T16:19:29.911Z] Input tensor: 1, 3, 224, 224
[2021-06-21T16:19:29.911Z] Weights tensor: 1, 3, 3, 3
[2021-06-21T16:19:29.911Z] Output tensor: 1, 1, 220, 220
[2021-06-21T16:19:29.911Z] Filter: conv2d, miopenConvolution, miopenPaddingDefault, {0, 0}, {1, 1}, {2, 2}, 
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [GetFindModeValueImpl] MIOPEN_FIND_MODE = NORMAL(1)
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [get_device_name] Raw device name: gfx908
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [Handle] stream: 0x1d64910, device_id: 0
[2021-06-21T16:19:29.911Z] MIOpen(HIP): Info [HipCompilerVersionImpl] 3.7.20315
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [AmdRocmMetadataVersionDetect] ROCm MD version AMDHSA_COv3, MIOpen version 2.13.0.8492-9ebbe9fde
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [BackwardDataGetWorkSpaceSize] 
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [GetEnvFindOnlySolverImpl] 72
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [ForwardGetWorkSpaceSize] 
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [BackwardWeightsGetWorkSpaceSize] 
[2021-06-21T16:19:29.912Z] /var/jenkins/workspace/LLibs_MIOpen_fix-tests-asm-igemm/build/bin/test_conv2d --float --cmode conv --pmode default --group-count 1 --disable-forward --disable-backward-data --input 1, 1, 8, 8 --weights 1, 1, 2, 2 --pads_strides_dilations 0 0 1 1 2 2 --trans_output_pads 0 0 --in_layout NCHW --fil_layout NCHW --out_layout NCHW 
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [BackwardWeightsGetWorkSpaceSize] 
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [FindConvBwdWeightsAlgorithm] requestAlgoCount = 1, workspace = 0
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [TryLoad] Find-db regenerating.
[2021-06-21T16:19:29.912Z] /var/jenkins/workspace/LLibs_MIOpen_fix-tests-asm-igemm/build/bin/test_conv2d --float --cmode conv --pmode default --group-count 1 --disable-forward --disable-backward-data --input 1, 1, 8, 8 --weights 1, 1, 2, 2 --pads_strides_dilations 0 0 1 1 2 2 --trans_output_pads 0 0 --in_layout NCHW --fil_layout NCHW --out_layout NCHW 
[2021-06-21T16:19:29.912Z] FAILED: /var/jenkins/workspace/LLibs_MIOpen_fix-tests-asm-igemm/src/ocl/convolutionocl.cpp:1561: Backward Weights Convolution cannot be executed due to incorrect params
[2021-06-21T16:19:29.912Z] Backward weights convolution: 
[2021-06-21T16:19:29.912Z] Input tensor: 1, 1, 8, 8
[2021-06-21T16:19:29.912Z] Weights tensor: 1, 1, 2, 2
[2021-06-21T16:19:29.912Z] Output tensor: 1, 1, 6, 6
[2021-06-21T16:19:29.912Z] Filter: conv2d, miopenConvolution, miopenPaddingDefault, {0, 0}, {1, 1}, {2, 2}, 
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [GetFindModeValueImpl] MIOPEN_FIND_MODE = NORMAL(1)
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [get_device_name] Raw device name: gfx908
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [Handle] stream: 0x18e0650, device_id: 1
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [HipCompilerVersionImpl] 3.7.20315
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [AmdRocmMetadataVersionDetect] ROCm MD version AMDHSA_COv3, MIOpen version 2.13.0.8492-9ebbe9fde
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [BackwardDataGetWorkSpaceSize] 
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [GetEnvFindOnlySolverImpl] 72
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [ForwardGetWorkSpaceSize] 
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [BackwardWeightsGetWorkSpaceSize] 
[2021-06-21T16:19:29.912Z] /var/jenkins/workspace/LLibs_MIOpen_fix-tests-asm-igemm/build/bin/test_conv2d --float --cmode conv --pmode default --group-count 1 --disable-forward --disable-backward-data --input 1, 128, 56, 56 --weights 1, 128, 5, 5 --pads_strides_dilations 0 0 2 2 1 1 --trans_output_pads 0 0 --in_layout NCHW --fil_layout NCHW --out_layout NCHW 
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [BackwardWeightsGetWorkSpaceSize] 
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [FindConvBwdWeightsAlgorithm] requestAlgoCount = 1, workspace = 0
[2021-06-21T16:19:29.912Z] MIOpen(HIP): Info [TryLoad] Find-db regenerating.
[2021-06-21T16:19:29.912Z] /var/jenkins/workspace/LLibs_MIOpen_fix-tests-asm-igemm/build/bin/test_conv2d --float --cmode conv --pmode default --group-count 1 --disable-forward --disable-backward-data --input 1, 128, 56, 56 --weights 1, 128, 5, 5 --pads_strides_dilations 0 0 2 2 1 1 --trans_output_pads 0 0 --in_layout NCHW --fil_layout NCHW --out_layout NCHW 
[2021-06-21T16:19:29.912Z] FAILED: /var/jenkins/workspace/LLibs_MIOpen_fix-tests-asm-igemm/src/ocl/convolutionocl.cpp:1561: Backward Weights Convolution cannot be executed due to incorrect params
[2021-06-21T16:19:29.912Z] Backward weights convolution: 
[2021-06-21T16:19:29.912Z] Input tensor: 1, 128, 56, 56
[2021-06-21T16:19:29.912Z] Weights tensor: 1, 128, 5, 5
[2021-06-21T16:19:29.912Z] Output tensor: 1, 1, 26, 26
[2021-06-21T16:19:29.912Z] Filter: conv2d, miopenConvolution, miopenPaddingDefault, {0, 0}, {2, 2}, {1, 1},
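
As a sanity check on the four WrW configs above, the output tensor shapes printed in the log can be reproduced with the usual convolution output-size formula, `floor((in + 2*pad - dilation*(kernel - 1) - 1) / stride) + 1`. The sketch below is plain Python, not MIOpen code; the helper names `conv_out_dim` and `conv_out_shape` are hypothetical, and it assumes the `--pads_strides_dilations` values are ordered height first, then width. It only shows that the logged shapes are self-consistent; it does not explain why the library rejects the parameters.

```python
def conv_out_dim(in_size, kernel, pad, stride, dilation):
    """Output size along one spatial axis (standard convolution formula)."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + 2 * pad - effective_kernel) // stride + 1


def conv_out_shape(input_nchw, weights_kcyx, pads, strides, dilations):
    """Output shape (N, K, H_out, W_out) for an NCHW input and KCYX weights."""
    n, _, h, w = input_nchw
    k, _, y, x = weights_kcyx
    out_h = conv_out_dim(h, y, pads[0], strides[0], dilations[0])
    out_w = conv_out_dim(w, x, pads[1], strides[1], dilations[1])
    return (n, k, out_h, out_w)


# (input, weights, pads, strides, dilations, output shape printed in the log)
CASES = [
    ((1, 3, 32, 32), (1, 3, 11, 11), (1, 1), (2, 2), (2, 1), (1, 1, 7, 12)),
    ((1, 3, 224, 224), (1, 3, 3, 3), (0, 0), (1, 1), (2, 2), (1, 1, 220, 220)),
    ((1, 1, 8, 8), (1, 1, 2, 2), (0, 0), (1, 1), (2, 2), (1, 1, 6, 6)),
    ((1, 128, 56, 56), (1, 128, 5, 5), (0, 0), (2, 2), (1, 1), (1, 1, 26, 26)),
]

for inp, wei, pads, strides, dilations, logged in CASES:
    computed = conv_out_shape(inp, wei, pads, strides, dilations)
    print(inp, wei, "->", computed, "matches log:", computed == logged)
```

All four computed shapes agree with the `Output tensor` lines in the log, so the reported failures do not come from a mismatch between the command-line arguments and the printed tensor shapes.
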
@atamazov
Contributor Author

Similar to #954

@atamazov
Contributor Author

I'll provide a W/A soon.

@atamazov atamazov changed the title [TESTS] Incorrect test case in test_conv_igemm_dynamic_xdlops_fwd [TESTS][Fp16][MI100] Incorrect test case in test_conv_igemm_dynamic_xdlops_fwd Jun 21, 2021
@atamazov atamazov changed the title [TESTS][Fp16][MI100] Incorrect test case in test_conv_igemm_dynamic_xdlops_fwd [TESTS][Fp16][MI100] Incorrect test case in test_conv_igemm_dynamic_xdlops_bwd Jun 21, 2021
atamazov added a commit that referenced this issue Jun 21, 2021
@atamazov atamazov changed the title [TESTS][Fp16][MI100] Incorrect test case in test_conv_igemm_dynamic_xdlops_bwd [TESTS][Fp16][MI100] Incorrect test case in test_conv_igemm_dynamic_xdlops_bwd and wrw Jun 21, 2021
@atamazov
Contributor Author

4 WrW configs added to the topmost comment

atamazov added a commit that referenced this issue Jun 21, 2021
@shaojiewang
Contributor

The WrW failures are due to incorrect parameters.
Can we make these n==1 cases run only for fp16 in ctest?

@atamazov
Contributor Author

@shaojiewang Of course, you can.

If you are interested in how to do that: create a separate custom test (something like test_conv_igemm_dynamic_xdlops_wrw_half), disable fp32 for it (FLOAT_DISABLE), and move these cases there.

atamazov added a commit that referenced this issue Jun 22, 2021
…t_regression_half_vega. W/A for #995 and #996 (#991)

- Added: Workarounds for #995 and #996
- Fixes the following issues in tests:
  - Issue: `test_conv_igemm_dynamic_xdlops_bwd` does not test the HALF type.
  - Issue: `test_conv_igemm_dynamic_xdlops_fwd` does not test the HALF type.
  - Issue: `test_conv_igemm_dynamic_xdlops_wrw` does not test the HALF type.
  - Issue: `test_regression_half_vega` does nothing.
- [Jenkinsfile] Added dedicated build param for FP16/BF16/INT8 Smoke tests
- [NFC] Removed some useless if's.
@shaojiewang
Contributor

> @shaojiewang Of course, you can.
>
> If you are interested in how to do that: create a separate custom test (something like test_conv_igemm_dynamic_xdlops_wrw_half), disable fp32 for it (FLOAT_DISABLE), and move these cases there.

OK. Thanks.

@atamazov
Contributor Author

atamazov commented Jul 9, 2021

Any update?

atamazov added a commit that referenced this issue Jul 22, 2021
@atamazov
Contributor Author

The W/A simply disables many test cases, thus making the library vulnerable to future bugs. Let's put the tests in order, thanks.

/cc @aserio @junliume

@shaojiewang
Contributor

shaojiewang commented Jul 30, 2021

> The W/A simply disables many test cases, thus making the library vulnerable to future bugs. Let's put the tests in order, thanks.
>
> /cc @aserio @junliume

OK, I will work on this. I expect to open a pull request this week.

@atamazov
Contributor Author

@shaojiewang Good, thanks.

@atamazov
Contributor Author

Reopened due to #1068 (comment)

@ppanchad-amd

@atamazov Is this ticket still relevant? Thanks!

@atamazov
Contributor Author

atamazov commented Apr 5, 2024

@ppanchad-amd Yes, details below (follow the link).

Reopened due to #1068 (comment)

I recommend assigning this to @xinlipn or to @shaojiewang

/cc @junliume
