[API] Get/Set "ALT" ConvolutionAttribute. [Core] Pass attribute to Invokers. [FP16][NHWC][gfx90a][asm igemm] Build & run appropriate kernel. #1226

carlushuang · 2021-10-16T14:27:30Z

This is a prerequisite for #1227. No functional changes, except for new API calls (these calls do not change the behavior of the library for now).

MIOpen API: Get/Set ConvolutionAttribute in convolution descriptor.
Env var: MIOPEN_DEBUG_CONVOLUTION_ATTRIB_FP16_ALT_IMPL.
The attribute is passed to Fwd/Bwd/Wrw/Fused InvokeParams.
In ConvAsmImplicitGemmGTCDynamicFwdXdlopsNHWC, ConvAsmImplicitGemmGTCDynamicBwdXdlopsNHWC,
ConvAsmImplicitGemmGTCDynamicWrwXdlopsNHWC:
- Additional kernels are produced when necessary.
- Invokers select the appropriate kernel according to the new attribute.
By-products:
- Removed useless initializers in Fwd/Bwd/Wrw InvokeParams
- [clang-tidy] Fixed some warnings for ROCm 4.5.
- [clang-tidy] Disabled altera-unroll-loops (ROCm 4.5).
- [clang-tidy] Sorted list of disabled warnings.
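
For readers new to the design, here is a minimal sketch of the flow described above: an attribute stored with the descriptor, a debug env var that can override it, and the resolved value copied into per-call invoke parameters. All type and function names below (`Fp16AltAttributeSketch`, `EffectiveValue`, `InvokeParamsSketch`, `MakeInvokeParams`) are hypothetical, and the `-1`/`0`/`1` convention and the "env var wins" precedence are assumptions made for illustration, not a copy of the code in this PR.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for the attribute stored in the convolution descriptor.
// Assumed convention for this sketch: -1 = not set, 0 = off, 1 = on.
struct Fp16AltAttributeSketch
{
    int value = -1;

    void Set(int v)
    {
        assert(v >= -1 && v <= 1); // the valid range is an assumption of this sketch
        value = v;
    }
};

// Returns the value an invoker should see. `env_value` is the raw text of the
// debug env var (nullptr when unset); it is assumed to override the attribute.
inline int EffectiveValue(const Fp16AltAttributeSketch& attr, const char* env_value)
{
    if(env_value != nullptr && *env_value != '\0')
        return std::atoi(env_value);
    return attr.value;
}

// Hypothetical per-call parameters handed to an invoker.
struct InvokeParamsSketch
{
    bool gfx90aFp16alt = false;
};

// Copies the resolved attribute into the per-call parameters.
inline InvokeParamsSketch MakeInvokeParams(const Fp16AltAttributeSketch& attr,
                                           const char* env_value)
{
    InvokeParamsSketch params;
    params.gfx90aFp16alt = (EffectiveValue(attr, env_value) == 1);
    return params;
}
```

In this sketch a caller would pass `std::getenv("MIOPEN_DEBUG_CONVOLUTION_ATTRIB_FP16_ALT_IMPL")` as `env_value`; keeping the lookup outside the function makes the precedence rule easy to test.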

-- @atamazov

atamazov · 2021-10-16T20:32:50Z

@carlushuang please see mail.

carlushuang · 2021-10-18T16:00:13Z

@atamazov @JehandadKhan this PR is workable, and with kernel from PR:#1227, can switch between fp16 alt impl and native fp16 impl.

carlushuang · 2021-10-21T02:19:55Z

@atamazov this PR and #1227 are also needed for release 5.0. Please help review these PRs, and if you think anything needs to be reorganized, we may need to hurry up...

junliume · 2021-10-21T15:11:05Z

> @atamazov this PR and #1227 are also needed for release 5.0. Please help review these PRs, and if you think anything needs to be reorganized, we may need to hurry up...

Working on getting #1226 through CI; #1227 has passed CI. Gentle ping @atamazov for review :)

atamazov · 2021-10-22T22:05:47Z

I am working on this. Changes will most likely be required in #1227 (and I will take care of those).

atamazov · 2021-10-23T00:01:46Z

@carlushuang Please look at changes in the WrW Solver, and ask questions about design, if any. Thanks.

atamazov · 2021-10-25T19:12:53Z

@carlushuang Please also review this, thanks.

atamazov

LGTM. I'm going to merge this myself, as soon as it passes the CI.

junliume · 2021-10-26T16:09:38Z

> LGTM. I'm going to merge this myself, as soon as it passes the CI.

The most recent commit has not yet been picked up by CI, and the whole set of tests will take a long time.
Can we select gfx90a tests only?

atamazov · 2021-11-05T22:15:04Z

src/solver/conv_asm_implicit_gemm_gtc_wrw_nhwc.cpp

            return [=](const Handle& handle, const AnyInvokeParams& primitive_parameters) mutable {
                decltype(auto) wrw_invoke_params =
                    primitive_parameters.CastTo<conv::WrWInvokeParams>();
-                const auto& tensors       = wrw_invoke_params.tensors;
-                const auto k              = handle.Run(kernels[0]);
+                const auto& tensors = wrw_invoke_params.tensors;
+                const auto k        = handle.Run(
+                    kernels[(isGfx90aFp16altSupport && wrw_invoke_params.gfx90aFp16alt) ? 1 : 0]);
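
The selection rule in the hunk above can be isolated into a small standalone sketch. The assumption here is that index 0 holds the regular kernel and index 1 holds the ALT kernel, and that the ALT kernel exists only when the solver built it (`isGfx90aFp16altSupport`). `WrWInvokeParamsSketch`, `SelectKernelIndex` and `SelectKernel` are hypothetical names, and kernels are modeled as plain strings rather than real kernel objects.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-call parameters; only the field used in the diff is modeled.
struct WrWInvokeParamsSketch
{
    bool gfx90aFp16alt = false;
};

// Mirrors the ternary in the diff: the ALT kernel (index 1) is chosen only when
// it was built AND the caller asked for it; otherwise the regular kernel (index 0).
inline std::size_t SelectKernelIndex(bool isGfx90aFp16altSupport,
                                     const WrWInvokeParamsSketch& params)
{
    return (isGfx90aFp16altSupport && params.gfx90aFp16alt) ? 1 : 0;
}

// Kernels are represented by their names to keep the sketch self-contained.
inline const std::string& SelectKernel(const std::vector<std::string>& kernels,
                                       bool isGfx90aFp16altSupport,
                                       const WrWInvokeParamsSketch& params)
{
    const std::size_t idx = SelectKernelIndex(isGfx90aFp16altSupport, params);
    assert(idx < kernels.size()); // the ALT slot must exist whenever it is selected
    return kernels[idx];
}
```

Checking the support flag first means a request for ALT on a configuration without the extra kernel falls back to index 0 instead of indexing past the end of the list.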


* Follows the design of #1226:
  - FP16 ALT kernels in GEMM obey the existing `MIOPEN_DEBUG_CONVOLUTION_ATTRIB_FP16_ALT_IMPL`.
  - [Informative] There is also a dedicated env control in rocBLAS, `ROCBLAS_INTERNAL_FP16_ALT_IMPL`.
* Short explanation:
  - GEMM Invokers read the gfx90aFp16alt attribute from the instance of AnyInvokeParams passed to them.
  - This attribute is then forwarded down to calls of `rocblas_gemm_ex` and `rocblas_gemm_strided_batched_ex`.
  - rocBLAS will make the determination at runtime and ignore the attribute when not executing on gfx90a.
  - ⚠️ We expect rocBLAS to ignore the attribute for non-FP16 kernels.
* This PR has no effect for GEMM backends other than rocBLAS.
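
As a rough illustration of the forwarding step, the sketch below turns the per-call attribute into a flags word and hands it to a stand-in backend. It assumes the attribute travels as one bit of a flags bitmask; the constant values, `GemmInvokeParamsSketch`, `MakeGemmFlags`, `FakeGemmBackend` and `RunGemm` are all hypothetical and do not reproduce the real rocBLAS signatures.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits; the real rocBLAS constants are not reproduced here.
constexpr std::uint32_t kSketchFlagNone    = 0x0;
constexpr std::uint32_t kSketchFlagFp16Alt = 0x4;

// Hypothetical per-call parameters read by a GEMM invoker.
struct GemmInvokeParamsSketch
{
    bool gfx90aFp16alt = false;
};

// Translates the attribute into the flags word passed to the backend.
inline std::uint32_t MakeGemmFlags(const GemmInvokeParamsSketch& params)
{
    return params.gfx90aFp16alt ? kSketchFlagFp16Alt : kSketchFlagNone;
}

// Stand-in for the GEMM backend: it only records the flags it was given.
struct FakeGemmBackend
{
    std::uint32_t last_flags = kSketchFlagNone;

    void GemmEx(std::uint32_t flags)
    {
        assert((flags & ~kSketchFlagFp16Alt) == 0); // this sketch knows one bit only
        last_flags = flags;
    }
};

// The invoker forwards the attribute without inspecting the device itself;
// whether the flag has any effect is left to the backend.
inline void RunGemm(FakeGemmBackend& backend, const GemmInvokeParamsSketch& params)
{
    backend.GemmEx(MakeGemmFlags(params));
}
```

Leaving the device check to the backend matches the note above that rocBLAS decides at runtime whether the attribute applies.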

carlushuang added 2 commits October 16, 2021 09:25

implement set/get attribute API, and add MIOPEN_CONVOLUTION_ATTRIB_FP16_ALT_IMPL to control MIOPEN_DEBUG_FP16_ALT_IMP attribute

4136bbb

Merge remote-tracking branch 'origin/develop' into gfx90a_fp16_alt_impl

e1d563e

get attribute in asm igemm nhwc solver, and conditionally set symbol based on attribute MIOPEN_CONVOLUTION_ATTRIB_FP16_ALT_IMPL value

0210e86

carlushuang mentioned this pull request Oct 18, 2021

[FP16][NHWC][gfx90a][gfx908][asm igemm] Support for "ALT" computations. Engage V_PK_ATOMIC_ADD_FP16 and workspace. #1227

Merged

carlushuang requested a review from atamazov October 18, 2021 14:45

carlushuang changed the title ~~fp16 alt implementation in gfx90a~~ Alt implementation of iGEMM kernel Oct 19, 2021

carlushuang changed the title ~~Alt implementation of iGEMM kernel~~ Implementation of Convolution Attributes Oct 19, 2021

carlushuang marked this pull request as ready for review October 21, 2021 02:18

junliume added this to the ROCm 5.0 milestone Oct 21, 2021

junliume added urgency_high value_high labels Oct 21, 2021

junliume added the TESTING_CI_PASSED label Oct 21, 2021

junliume reviewed Oct 22, 2021

src/include/miopen/convolution.hpp (outdated, resolved)

atamazov added 2 commits October 22, 2021 19:54

Merge branch 'develop' into gfx90a_fp16_alt_impl

533f954

gfx90a_fp16_alt_impl(01) Add constness to the API. Allow resetting the attribute. Some error handling. Comments.

0ba350c

atamazov reviewed Oct 22, 2021

src/solver/conv_asm_implicit_gemm_gtc_wrw_nhwc.cpp (outdated, resolved)

atamazov self-assigned this Oct 22, 2021

gfx90a_fp16_alt_impl(02) WrW: Pass ALT attribute via InvokeParams. ConvAsmImplicitGemmGTCDynamicWrwXdlopsNHWC: Update Solver and Invokers.

1020032

atamazov marked this pull request as draft October 22, 2021 23:59

atamazov reviewed Oct 23, 2021

src/ocl/convolutionocl.cpp (outdated, resolved)

atamazov reviewed Oct 23, 2021

src/convolution.cpp (outdated, resolved)

carlushuang commented Oct 23, 2021

src/solver/conv_asm_implicit_gemm_gtc_wrw_nhwc.cpp (resolved)

gfx90a_fp16_alt_impl(03) Accelerate access to attribute. Error handling. MIOPEN_DEBUG_FP16_ALT_IMP -> MIOPEN_DEBUG_CONVOLUTION_ATTRIB_FP16_ALT_IMPL.

35c81c0

atamazov added 11 commits October 24, 2021 19:19

gfx90a_fp16_alt_impl(07) [fin] Fix build error

a50e3b1

gfx90a_fp16_alt_impl(08) [TEMP][CI] Disable all but static checks

5bb733f

gfx90a_fp16_alt_impl(09) Remove useless initializers. Fwd/Bwd: Pass ALT attribute via InvokeParams.

7649295

gfx90a_fp16_alt_impl(10) [clang-tidy] Disable altera-unroll-loops (ROCm 4.5). Sort list of disabled warnings.

4043347

gfx90a_fp16_alt_impl(11) [clang-tidy] Fix some warnings for ROCm 4.5.

65aa044

gfx90a_fp16_alt_impl(12) Disable tidy checks at couple of lines for the sake of clarity of the source code.

a32035a

gfx90a_fp16_alt_impl(13) Less clarity, but no more cppcheck or tidy issues

3f57ccf

gfx90a_fp16_alt_impl(14) ConvAsmImplicitGemmGTCDynamicWrwXdlopsNHWC: small refactor

7abd5aa

gfx90a_fp16_alt_impl(15) ConvAsmImplicitGemmGTCDynamicFwdXdlopsNHWC: Update Solver and Invoker.

f57f66e

gfx90a_fp16_alt_impl(16) ConvAsmImplicitGemmGTCDynamicBwdXdlopsNHWC: Update Solver and Invokers

ba44525

Revert "gfx90a_fp16_alt_impl(08) [TEMP][CI] Disable all but static checks" This reverts commit 5bb733f.

7d4bbf4

atamazov marked this pull request as ready for review October 25, 2021 18:16

atamazov requested a review from DrizztDoUrden October 25, 2021 18:17

atamazov added urgency_blocker and removed TESTING_CI_PASSED urgency_high labels Oct 25, 2021

atamazov changed the title ~~Implementation of Convolution Attributes~~ [API] Get/Set ConvolutionAttribute. [Core] Pass attribute to Invokers. [asm igemm][gfx90a][FP16] Build & run appropriate kernel. Oct 25, 2021

carlushuang commented Oct 26, 2021

src/solver/conv_asm_implicit_gemm_gtc_bwd_nhwc.cpp (outdated, resolved)

carlushuang and others added 2 commits October 26, 2021 05:24

fix ostringstream constructor with string problem, by adding eta to 2nd arg

a529457

[ci-skip][Quality] openmode is a member of ios_base, not ostringstream.

a376136

atamazov approved these changes Oct 26, 2021

atamazov merged commit 3ddeaaa into develop Oct 27, 2021

atamazov reviewed Nov 5, 2021

atamazov mentioned this pull request Nov 6, 2021

[gfx90a][FP16][rocBLAS] Enable Gemm solver for Alt Impl #1261

Merged

carlushuang deleted the gfx90a_fp16_alt_impl branch December 6, 2021 13:48
