[FEA] Add support for CUDA 11 #5369

Closed
harrism opened this issue Jun 3, 2020 · 13 comments · Fixed by #5398

@harrism
Member

harrism commented Jun 3, 2020

Is your feature request related to a problem? Please describe.

This issue is for tracking issues building and running cuDF with CUDA 11, and any fixes required.

Currently testing with a prerelease of CUDA 11.0. Issues:

  1. CUB is now part of the CUDA toolkit, and CMake finds CUB there before the submodule in thirdparty. The newer version in the toolkit changed the API of cub::ShuffleIndex, which is used in join_kernels.cuh. The fix is easy but depends on #5315 ([REVIEW] fetch thrust/cub from github): if we update to the latest CUB, we need to change join_kernels.cuh; if we do not, we can reorder paths in CMakeLists.txt to favor thirdparty over the toolkit. Edit: this was all resolved by #5315.

  2. Conda does not yet have a cudatoolkit=11.0 package (naturally). This is not a problem when just building libcudf, but may cause problems with testing Python. Edit: Python testing seems to work with conda cudatoolkit=10.2, but wheels are in motion to produce an 11.0 metapackage.

  3. On first build and test (after fixing item 1 above locally), I get these gtest failures.

The following tests FAILED:
	  5 - GROUPBY_TEST (Failed)
	 21 - BINARY_TEST (Failed)
	 22 - TRANSFORM_TEST (Failed)
	 40 - ROLLING_TEST (SEGFAULT)
	 41 - GROUPED_ROLLING_TEST (SEGFAULT)

Edit: this will be resolved by merging the latest from jitify into the rapidsai/jitify fork.

harrism added the feature request, cuda, libcudf, and Python labels Jun 3, 2020
harrism self-assigned this Jun 3, 2020
@jrhemstad
Contributor

@rongou has already fixed item 1 in #5315.

@harrism
Member Author

harrism commented Jun 4, 2020

Regarding 3) above: this is fixed by getting a more recent jitify. We need to update rapidsai/jitify to the latest from NVIDIA/jitify.

The next issue I ran into came when building the libcudf Python bindings, where I get errors like this:

/usr/local/cuda-11.0/include/thrust/system/cuda/config.h:78:2: error: #error The version of CUB in your include path is not compatible with this release of Thrust. CUB is now included in the CUDA Toolkit, so you no longer need to use your own checkout of CUB. Define THRUST_IGNORE_CUB_VERSION_CHECK to ignore this.

This is because the build looks for CUB in thirdparty before the toolkit. If we fetch both CUB and Thrust together in #5315, we should make sure they are always compatible.
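
The ordering problem can be illustrated with a small sketch. It is not cuDF's build code: the function name and directory layout are hypothetical, and it only models the general rule that a compiler searches include directories in the order they are listed and uses the first match.

```python
import os
import tempfile


def first_dir_providing(header, include_dirs):
    """Return the first directory in include_dirs that contains header.

    Models how a compiler resolves an #include: directories are searched
    in the order given and the first hit wins.
    """
    for directory in include_dirs:
        if os.path.isfile(os.path.join(directory, header)):
            return directory
    return None


# Hypothetical layout: a vendored checkout and the toolkit both ship cub/cub.cuh.
with tempfile.TemporaryDirectory() as root:
    thirdparty = os.path.join(root, "thirdparty", "cub")
    toolkit = os.path.join(root, "cuda-11.0", "include")
    for base in (thirdparty, toolkit):
        os.makedirs(os.path.join(base, "cub"))
        open(os.path.join(base, "cub", "cub.cuh"), "w").close()

    header = os.path.join("cub", "cub.cuh")
    # thirdparty listed first: the vendored CUB shadows the toolkit's copy.
    picked_vendored = first_dir_providing(header, [thirdparty, toolkit]) == thirdparty
    # Dropping the thirdparty entry leaves only the toolkit's CUB.
    picked_toolkit = first_dir_providing(header, [toolkit]) == toolkit
```

With the thirdparty entry listed first, the vendored CUB is paired with the toolkit's Thrust, which is the mismatch the version check reports. Removing or reordering that entry lets the toolkit's own CUB be found instead.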

@harrism
Member Author

harrism commented Jun 4, 2020

I fixed the CUB/Thrust incompatibility error locally by commenting out this line:

"../../thirdparty/cub",

With this change, the cuDF Cython code compiles and all pytests pass.

And @rongou has added the fix for this with a723379

@kkraus14
Collaborator

kkraus14 commented Jun 4, 2020

Numba related fixes are here: numba/numba#5819

@kkraus14
Collaborator

kkraus14 commented Jun 8, 2020

CuPy CUDA 11 PR: cupy/cupy#3405

kkraus14 reopened this Jun 9, 2020
@kkraus14
Collaborator

kkraus14 commented Jun 9, 2020

Leaving this open until the corresponding Numba / CuPy PRs are merged.

@harrism
Member Author

harrism commented Aug 10, 2020

@kkraus14 Numba and CuPy PRs are merged. Closing. Reopen if there are still blockers. But I believe cuDF works with CUDA 11 now.

@pentschev
Member

CuPy 8 won't be released before RAPIDS 0.15. Work is under way to backport CUDA 11.0 support to CuPy 7.8.0, but there are still issues that we're trying to track down and fix.

@pentschev
Member

The expected release date for CuPy 7.8.0 is August 19th, a week before RAPIDS 0.15.

@Extrodox

Any idea about the support of CUDA 11.1 ?

@kkraus14
Collaborator

kkraus14 commented Oct 30, 2020

> Any idea about the support of CUDA 11.1 ?

We are currently exploring our options in supporting CUDA 11.1 and greater with CUDA Enhanced Compatibility (https://docs.nvidia.com/deploy/cuda-compatibility/index.html#enhanced-compat-minor-releases).

There are no immediate plans to directly support CUDA 11.1.

@Extrodox

Extrodox commented Oct 30, 2020

> We are currently exploring our options in supporting CUDA 11.1 and greater with CUDA Enhanced Compatibility (https://docs.nvidia.com/deploy/cuda-compatibility/index.html#enhanced-compat-minor-releases).
>
> There are no immediate plans to directly support CUDA 11.1.

Is there any workaround to make this run on NVIDIA RTX 3090?
As ampere architecture requires me to use CUDA 11.1 @kkraus14

I tried linking libnvrtc.so.11.0 to libnvrtc.so.11.1, but I still get the following error:

java.lang.UnsatisfiedLinkError: /disk1/tmp/cudf<>.so: /usr/local/cuda/targets/x86_64-linux/lib/libnvrtc.so.11.0: version `libnvrtc.so.11.0' not found (required by /disk1/tmp/cudf<>.so)

@kkraus14
Collaborator

> Is there any workaround to make this run on NVIDIA RTX 3090?
> As ampere architecture requires me to use CUDA 11.1 @kkraus14

You can use CUDA 11.0 on an RTX 3090. Code compiled for the sm_80 architecture is binary compatible with RTX 30 series cards, so it does not require PTX JIT compilation in order to run.

So you need a newer driver in order to support the RTX 3090, but you should be fine with the CUDA 11.0 toolkit / libraries.

If that doesn't work please report back.
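
The compatibility rule behind this answer can be sketched as follows. This is a minimal illustration, not part of cuDF: the function name is hypothetical, and it encodes only the rule from the CUDA documentation that a binary built for compute capability X.y runs on devices of compute capability X.z with z ≥ y. The RTX 3090 has compute capability 8.6.

```python
def cubin_runs_on(built_for, device):
    """Return True if a binary built for compute capability built_for can
    run on a device of compute capability device without PTX JIT.

    Both arguments are (major, minor) tuples, e.g. (8, 0) for sm_80.
    """
    built_major, built_minor = built_for
    dev_major, dev_minor = device
    # Same major version, and the device minor version is at least as new.
    return built_major == dev_major and dev_minor >= built_minor


sm_80 = (8, 0)     # Ampere target available in the CUDA 11.0 toolkit
rtx_3090 = (8, 6)  # compute capability of the RTX 3090

sm80_on_3090 = cubin_runs_on(sm_80, rtx_3090)   # same major, newer minor
sm75_on_3090 = cubin_runs_on((7, 5), rtx_3090)  # different major version
```

Under this rule an sm_80 binary from the CUDA 11.0 toolkit runs directly on the card, while the installed driver still has to be recent enough to recognize the GPU.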
