Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove target ids from kdb args #2309

Merged
merged 1 commit into from
Aug 18, 2023
Merged

Remove target ids from kdb args #2309

merged 1 commit into from
Aug 18, 2023

Conversation

cderb
Copy link
Contributor

@cderb cderb commented Aug 8, 2023

Target ids are removed from the -mcpu argument for the kernel object entries. This argument now only contains the architecture string.
This resolves a bug where MIOpen is unable to find these objects while querying the kdb.

@cderb cderb requested review from JehandadKhan and junliume August 8, 2023 22:57
@atamazov
Copy link
Contributor

atamazov commented Aug 9, 2023

@cderb AFAIU there should be corresponding changes in the library, right? Could you provide a link to the relevant PR? Thanks!

Another question is about other targets - gfx906/900, gfx103X/110X, gfx904X. Do we need the same changes in kdbs for them?

@cderb
Copy link
Contributor Author

cderb commented Aug 10, 2023

@cderb AFAIU there should be corresponding changes in the library, right? Could you provide a link to the relevant PR? Thanks!

@atamazov no corresponding MIOpen change this time, @JehandadKhan discovered that some of our kernels haven't been loading.

Another question is about other targets - gfx906/900, gfx103X/110X, gfx904X. Do we need the same changes in kdbs for them?

It looks like gfx906 could require the change as well, but gfx1030 and gfx900 are safe

@atamazov
Copy link
Contributor

@JehandadKhan @cderb

Target ids are removed from the -mcpu argument for the kernel object entries.

If you mean target features like xnack and sramecc then we must be careful. These features affect code generation in the compiler. Simplified:

  • If a GPU reports XNACK feature then the code generated without xnack will likely work incorrectly.
  • If a GPU does not report XNACK then the code generated with xnack may work slower than it potentially can.
  • If a GPU reports SRAMECC feature then the code generated without sramecc will likely work incorrectly.
  • If a GPU does not report SRAMECC then the code generated with sramecc may work slower than it potentially can.

Therefore the kernels should be build with correctly specified target features and the target features should be used for fetching kernels from kdb.

IIRC ROC runtime prevents loading of kernels that do not have correct target feature attributes.

This is just to remind -- maybe you are aware of all this stuff.

@cderb
Copy link
Contributor Author

cderb commented Aug 14, 2023

@atamazov
Tuna was adding for example '-mcpu=gfx90a:sram-ecc+:xnack-' onto the end of kernel argument lists for exported dbs. And this was based purely off the arch string.
MIOpen would search for the kernel arguments explicitly and it expected '-mcpu=gfx90a'.
The targetid isn't normally incorporated in this string, so it was added in error. The string in these dbs has been reduced to '-mcpu=gfx90a' etc.

@atamazov
Copy link
Contributor

@cderb Thanks for feedback. I've looked into MIOpen code and see the following:

  • Devices with different XNACK or SRAMECC attributes are assumed to have different ISA and therefore MIOpen uses different KDB files. Some part of KDB filename is taken TargetProperties::DbId().
  • As you are saying, MIOpen does not add target features to the KDB queries. That is correct because we have different KDBs for different combinations of target features.

I think that this PR is correct. I only wondering how system KDB worked before ;)

@junliume junliume merged commit 96c6934 into develop Aug 18, 2023
@junliume junliume deleted the cderb/gold18_kdb_patch branch August 28, 2023 23:40
cderb added a commit that referenced this pull request Sep 18, 2023
* 3D group forward convolution solver (#2286)

* [HotFix] Fix build issue after #2286 (#2328)

* Fix ConvCkIgemmFwdV6r1DlopsNchw solver to reflect that it's not dynamic (#2325)

* Remove target IDs from kdb entries (#2309)

* Dropout: make seed and states_num kernel arguments (#2277)

* [MI250] Adding missing kernel objects (#2329)

* Post-merge fixups: Replace environment variable check with problem config check and reduce lambda capture for Invoker obj (#2305)

* [HotFix][CI] fix HIP tidy issue from #2277 (#2335)

* [HotFix] Update requirements.txt MLIR ignore PATH for Python conda LLD (#2324)

* [NFC] Replace long integers with int64_t and size_t for better compatibility with Windows (#2323)

* Windows compatibility: replace long integers with int64_t and size_t, replace uint with unsigned int, replace long with long long for numbers, proper casting

* Fix formatting

* Fix 3d group forward convolution

* Resolve review comments

* Fix formatting

---------

Co-authored-by: Daming Feng <[email protected]>
Co-authored-by: Evgenii Averin <[email protected]>
Co-authored-by: JD <[email protected]>
Co-authored-by: Tal Ben-Nun <[email protected]>
Co-authored-by: amberhassaan <[email protected]>
Co-authored-by: Jun Liu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants