[BUG] JNI GPU memory leak when assertions are enabled #13225

Closed
abellina opened this issue Apr 26, 2023 · 0 comments · Fixed by #13262
Assignees
Labels
bug Something isn't working Java Affects Java cuDF API.

Comments

@abellina
Contributor

This line: https://github.com/rapidsai/cudf/blob/branch-23.06/java/src/main/java/ai/rapids/cudf/ColumnVector.java#L153 creates a new OffHeapState from a DeviceMemoryBuffer (contiguousBuffer). The OffHeapState owns the contiguous buffer, so it increments the buffer's reference count (incRefCount).

Since #13071, constructing a ColumnView can throw, because ColumnView.hasNonEmptyNulls performs GPU memory allocations and we could get an OOM right there. Following the chain of super calls from https://github.com/rapidsai/cudf/blob/branch-23.06/java/src/main/java/ai/rapids/cudf/ColumnVector.java#L153, the very next thing that happens is the call to ColumnView's constructor, which receives the contiguous buffer whose reference count has already been incremented. If the exception is thrown there, nothing releases that extra reference.

Besides an OOM, you could also get an AssertionError at this point, with the same effect: a dangling contiguous buffer.

We need to make sure that the ColumnView constructors clean this up. I am not 100% sure about the ownership of OffHeapState with respect to ColumnView, e.g. should the constructor close the OffHeapState blindly when it fails?
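
To make the failure mode concrete, here is a minimal sketch of the pattern, not the actual cuDF code. All class names below (RefCountedBuffer, OwningState, View, LeakDemo) are hypothetical stand-ins, and the sketch assumes the view constructor is allowed to close the state it was handed when it fails, which is exactly the ownership question raised above.

```java
// Hypothetical stand-in for a reference-counted device buffer (NOT cuDF's DeviceMemoryBuffer).
final class RefCountedBuffer implements AutoCloseable {
    private int refCount = 1; // the creator holds the first reference

    void incRefCount() { refCount++; }

    int getRefCount() { return refCount; }

    @Override
    public void close() {
        if (refCount > 0) {
            refCount--; // a real buffer would free GPU memory when this reaches 0
        }
    }
}

// Hypothetical stand-in for OffHeapState: takes its own reference on the buffer.
final class OwningState implements AutoCloseable {
    private final RefCountedBuffer buffer;

    OwningState(RefCountedBuffer buffer) {
        this.buffer = buffer;
        buffer.incRefCount();
    }

    @Override
    public void close() { buffer.close(); }
}

// Hypothetical stand-in for ColumnView: its constructor may throw after the
// state has already incremented the buffer's reference count.
final class View {
    View(OwningState state, boolean cleanUpOnFailure) {
        try {
            validate();
        } catch (Throwable t) {
            if (cleanUpOnFailure) {
                state.close(); // assumption: the view owns the state on failure
            }
            throw t; // rethrow the original OOM / AssertionError unchanged
        }
    }

    // Simulates a check that allocates and fails (an OOM or AssertionError in practice).
    private static void validate() {
        throw new IllegalStateException("simulated failure during validation");
    }
}

final class LeakDemo {
    // Returns the buffer's reference count after a failed construction and
    // after the caller has released its own reference.
    static int refCountAfterFailedConstruction(boolean cleanUpOnFailure) {
        RefCountedBuffer buffer = new RefCountedBuffer();
        OwningState state = new OwningState(buffer);
        try {
            new View(state, cleanUpOnFailure);
        } catch (IllegalStateException expected) {
            // construction failed; no object exists that could close the state later
        }
        buffer.close(); // the caller drops its own reference
        return buffer.getRefCount();
    }

    public static void main(String[] args) {
        System.out.println(refCountAfterFailedConstruction(false)); // 1: one reference leaked
        System.out.println(refCountAfterFailedConstruction(true));  // 0: fully released
    }
}
```

Without the cleanup in the catch block, the reference taken by the state is never dropped, so the buffer cannot be freed. Whether the constructor should close the state unconditionally, or only when it created the state itself, is still the open question.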

@abellina abellina added bug Something isn't working Needs Triage Need team to review and classify Java Affects Java cuDF API. labels Apr 26, 2023
@mattahrens mattahrens removed the Needs Triage Need team to review and classify label May 12, 2023
3 participants