Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NNlibAMDGPUExt] Load MIOpen module only if it is available #483

Merged
merged 7 commits into from
Apr 6, 2023
Merged

[NNlibAMDGPUExt] Load MIOpen module only if it is available #483

merged 7 commits into from
Apr 6, 2023

Conversation

pxl-th
Copy link
Member

@pxl-th pxl-th commented Mar 17, 2023

Since MIOpen artifact is only available on Julia 1.9 it is not included in AMDGPU by default.
Therefore we need to load it only when it is available.

Otherwise, it causes crashes for users that do not have MIOpen installed.
For example in AMDGPU tests, where we compare CPU with AMDGPU implementations and don't rely on NNlib AMDGPU support.

  • Migrate to separate Project.toml for tests.

PR Checklist

  • Tests are added
  • Documentation, if applicable

@pxl-th pxl-th changed the title [NNlibAMDGPUExt] Load MIOpen module only if MIOpen is available [NNlibAMDGPUExt] Load MIOpen module only if it is available Mar 17, 2023
@ToucheSir
Copy link
Member

1.6 CUDA failure is probably spurious, but the AMDGPU one does not seem to be.

@pxl-th
Copy link
Member Author

pxl-th commented Mar 18, 2023

Segfault in CI, because it uses global ROCm installation which is of incompatible version.
I've switched to using artifacts, but since MIOpen is not installed it skips tests.

@ToucheSir
Copy link
Member

Can we install the MIOpen_jll on CI?

@@ -76,18 +76,23 @@ steps:
push!(conf[\"targets\"][\"test\"], \"AMDGPU\");
open(io -> TOML.print(io, conf), \"Project.toml\", \"w\");
"""
- julia --project=. -e """
using Pkg;
Pkg.develop(\"AMDGPU\");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another reason to switch to test/Project.toml sooner than later?

Copy link
Member

@ToucheSir ToucheSir Mar 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...or now, thanks!

@pxl-th
Copy link
Member Author

pxl-th commented Mar 31, 2023

AMDGPU CI is probably down...

@ToucheSir ToucheSir closed this Apr 1, 2023
@ToucheSir ToucheSir reopened this Apr 1, 2023
.buildkite/pipeline.yml Outdated Show resolved Hide resolved
@pxl-th
Copy link
Member Author

pxl-th commented Apr 5, 2023

Should we merge it? Even though AMDGPU CI is down, should be safe to do so.

Copy link
Member

@ToucheSir ToucheSir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I thought the CUDA jobs were also failing but must've not checked recently enough.

@ToucheSir ToucheSir merged commit 335ed1c into FluxML:master Apr 6, 2023
@pxl-th pxl-th deleted the conditional-miopen branch April 11, 2023 10:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants