[BUG] "import cudf" has changed the device ID #11386

Closed
wbo4958 opened this issue Jul 28, 2022 · 7 comments
Labels: 0 - Blocked (Cannot progress due to external reasons); bug (Something isn't working)

Comments

Contributor

wbo4958 commented Jul 28, 2022

It looks like "import cudf" resets the current CUDA device ID to 0 in the 22.06 release.

rapids 22.06

conda create -n rapids-22.06 -c rapidsai -c nvidia -c conda-forge  \
    rapids=22.06 python=3.9 cudatoolkit=11.5

repro

In [1]: import cupy

In [2]: cupy.cuda.runtime.getDevice()
Out[2]: 0

In [3]: cupy.cuda.runtime.setDevice(1)

In [4]: cupy.cuda.runtime.getDevice()
Out[4]: 1

In [5]: import cudf

In [6]: cupy.cuda.runtime.getDevice()
Out[6]: 0
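
For anyone who wants to script this check rather than type it into IPython, here is a minimal sketch. It is not part of the original report: device_after_action is a hypothetical helper name, and the getter and setter are passed in as parameters on the assumption that they behave like cupy.cuda.runtime.getDevice and cupy.cuda.runtime.setDevice.

```python
def device_after_action(action, get_device, set_device, target=1):
    """Select `target`, run `action`, and report the device before and after.

    Hypothetical helper for illustration only. `get_device` and `set_device`
    are assumed to behave like cupy.cuda.runtime.getDevice / setDevice.
    """
    set_device(target)
    before = get_device()
    action()  # e.g. lambda: importlib.import_module("cudf")
    after = get_device()
    return before, after


# Intended use on a machine with at least two GPUs (not executed here):
#
#   import importlib
#   import cupy
#   before, after = device_after_action(
#       lambda: importlib.import_module("cudf"),
#       cupy.cuda.runtime.getDevice,
#       cupy.cuda.runtime.setDevice,
#   )
#   print(before, after)  # differing values indicate the bug reported here
```

Passing the runtime functions in as arguments keeps the helper independent of any particular GPU library, so the same check can be pointed at other imports or calls.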
wbo4958 added the bug and Needs Triage labels on Jul 28, 2022
Contributor

shwina commented Jul 28, 2022

Hi @wbo4958 - thanks for reporting! This looks like an issue with CUDA Python:

In [7]: import cuda.cudart

In [8]: import cupy

In [9]: cupy.cuda.runtime.setDevice(1)

In [10]: cupy.cuda.runtime.getDevice()
Out[10]: 1

In [11]: cuda.cudart.cudaGetDeviceCount()
Out[11]: (<cudaError_t.cudaSuccess: 0>, 2)

In [12]: cupy.cuda.runtime.getDevice()
Out[12]: 0

While we investigate further, perhaps you could use the environment variable CUDA_VISIBLE_DEVICES instead to control which GPU to use?
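
A minimal sketch of that suggestion, assuming the variable is set before any CUDA library initializes in the process. The helper name restrict_to_gpu is hypothetical. With only one GPU visible, the selected physical device is exposed to the process as device 0, so a reset to device 0 no longer changes which GPU is used.

```python
import os


def restrict_to_gpu(index):
    """Expose only one physical GPU to CUDA libraries loaded afterwards.

    Hypothetical helper: it only sets the environment variable, so it must
    run before cupy, cudf, or any other CUDA library is imported.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


restrict_to_gpu(1)

# Import GPU libraries only after the variable is set, for example:
#   import cupy
#   import cudf
```

The same effect can be had by exporting CUDA_VISIBLE_DEVICES in the shell before starting Python, which avoids any dependence on import order inside the script.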

Contributor

shwina commented Jul 29, 2022

One way to work around this bug is to import cudf before cupy.

If you cannot do that, another workaround is to call cuda.cudart.cudaGetDevice() before importing cudf:

In [1]: import cupy

In [2]: import cuda.cudart

In [3]: cuda.cudart.cudaGetDevice()
Out[3]: (<cudaError_t.cudaSuccess: 0>, 0)

In [4]: cupy.cuda.runtime.getDevice()
Out[4]: 0

In [5]: cupy.cuda.runtime.setDevice(1)

In [6]: cupy.cuda.runtime.getDevice()
Out[6]: 1

In [7]: import cudf

In [8]: cupy.cuda.runtime.getDevice()
Out[8]: 1
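
If neither ordering is convenient, a more defensive option is to save the current device before the import and restore it afterwards. This is a sketch of a general pattern, not a workaround proposed in this thread. preserve_device is a hypothetical helper, and it takes the getter and setter as arguments on the assumption that they behave like cupy.cuda.runtime.getDevice and cupy.cuda.runtime.setDevice.

```python
from contextlib import contextmanager


@contextmanager
def preserve_device(get_device, set_device):
    """Restore the current device if the wrapped block changes it.

    Hypothetical helper for illustration only.
    """
    saved = get_device()
    try:
        yield saved
    finally:
        # Only call the setter when something actually moved the device.
        if get_device() != saved:
            set_device(saved)


# Intended use (not executed here):
#
#   import cupy
#   cupy.cuda.runtime.setDevice(1)
#   with preserve_device(cupy.cuda.runtime.getDevice,
#                        cupy.cuda.runtime.setDevice):
#       import cudf
```

This treats the symptom rather than the cause, so it is best seen as a stopgap until the underlying fix is available.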

Contributor Author

wbo4958 commented Aug 2, 2022

Thx @shwina


github-actions bot commented Sep 1, 2022

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

Contributor

vyasr commented Oct 20, 2022

@shwina is this something we need to follow up with the cuda-python team on?

GregoryKimball added the 0 - Blocked label and removed the Needs Triage label on Oct 21, 2022
Contributor

wence- commented Nov 22, 2022

This is fixed by the cuda-python changes that resolved NVIDIA/cuda-python#24, which are included in v11.8, so it should be resolved once we move to that version.
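
To check whether an environment already has the fix, one option is to compare the installed cuda-python version against 11.8. The following is a minimal sketch: the function names are hypothetical, and it assumes the package is installed under the distribution name "cuda-python".

```python
import re
from importlib import metadata

FIXED_IN = (11, 8)  # per the comment above: the fix ships in cuda-python v11.8


def has_device_fix(version_string):
    """Return True if a version string is at or above FIXED_IN."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version: {version_string!r}")
    return (int(match.group(1)), int(match.group(2))) >= FIXED_IN


def installed_cuda_python_has_fix():
    """Check the installed package; return None if it is not installed.

    Assumes the distribution name is "cuda-python".
    """
    try:
        return has_device_fix(metadata.version("cuda-python"))
    except metadata.PackageNotFoundError:
        return None
```

Only the major and minor components are compared, which is enough to tell 11.7.x from 11.8 and later.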

vyasr removed the cuda label on Feb 23, 2024
vyasr closed this as completed on May 14, 2024
Contributor

vyasr commented May 14, 2024

Verified that this no longer occurs:


In [1]: import cupy

In [2]: cupy.cuda.runtime.getDevice()
Out[2]: 0

In [3]: cupy.cuda.runtime.setDevice(1)

In [4]: cupy.cuda.runtime.getDevice()
Out[4]: 1

In [5]: import cudf

In [6]: cupy.cuda.runtime.getDevice()
Out[6]: 1
