[FEA][JNI] Throw a specific java exception for OOM #11970
Labels: 0 - Backlog, feature request, Java, Spark
Currently we catch `rmm::out_of_memory` here: https://github.com/rapidsai/cudf/blob/branch-22.12/java/src/main/native/include/jni_utils.hpp#L853, and then we throw `java.lang.OutOfMemoryError`.

The problem is that this exception type is normally used for the Java heap, but we are using it for GPU memory as well, as pointed out by @gerashegalov here: NVIDIA/spark-rapids#6810 (comment).
We'd like to fix this by throwing a cuDF-JNI specific exception for GPU OOM (e.g. `RapidsGpuOutOfMemory`), and likely adding other exceptions for HOST. We would like to include the amount of memory that we attempted to allocate (ideally from the RMM exception, rapidsai/rmm#1134), but we may have enough info in cuDF to work around it.

This issue would also cover catching the `java.lang.OutOfMemoryError` thrown by `UNSAFE.allocateMemory` here: https://github.com/rapidsai/cudf/blob/branch-22.12/java/src/main/java/ai/rapids/cudf/UnsafeMemoryAccessor.java#L79, and throwing something like `RapidsHostOutOfMemory` instead.
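To make the proposal concrete, here is a rough sketch of what the Java side could look like. Only the names `RapidsGpuOutOfMemory` and `RapidsHostOutOfMemory` come from this issue. The shared base class `RapidsOutOfMemory`, the `getRequestedBytes()` accessor, the choice to extend `RuntimeException`, and the `allocateHost` helper (which takes the raw allocator as a parameter in place of `UNSAFE.allocateMemory`) are all assumptions made for illustration, not existing cuDF API.

```java
import java.util.function.LongUnaryOperator;

public final class OomSketch {
    private OomSketch() {}

    /** Hypothetical base type: an off-heap allocation failure that records the requested size. */
    public static class RapidsOutOfMemory extends RuntimeException {
        private final long requestedBytes;

        public RapidsOutOfMemory(String kind, long requestedBytes, Throwable cause) {
            super("Could not allocate " + requestedBytes + " bytes of " + kind + " memory", cause);
            this.requestedBytes = requestedBytes;
        }

        /** Number of bytes the failed allocation asked for. */
        public long getRequestedBytes() {
            return requestedBytes;
        }
    }

    /** Name taken from the issue: what JNI would throw for rmm::out_of_memory. */
    public static final class RapidsGpuOutOfMemory extends RapidsOutOfMemory {
        public RapidsGpuOutOfMemory(long requestedBytes, Throwable cause) {
            super("GPU", requestedBytes, cause);
        }
    }

    /** Name taken from the issue: thrown for failed off-heap host allocations. */
    public static final class RapidsHostOutOfMemory extends RapidsOutOfMemory {
        public RapidsHostOutOfMemory(long requestedBytes, Throwable cause) {
            super("host", requestedBytes, cause);
        }
    }

    /**
     * Hypothetical wrapper around a raw host allocator. It translates the JVM's
     * OutOfMemoryError into the host-specific exception and keeps the original as the cause.
     */
    public static long allocateHost(long bytes, LongUnaryOperator rawAllocator) {
        try {
            return rawAllocator.applyAsLong(bytes);
        } catch (OutOfMemoryError e) {
            throw new RapidsHostOutOfMemory(bytes, e);
        }
    }
}
```

With a shape like this, callers such as the Spark plugin could catch `RapidsGpuOutOfMemory` and `RapidsHostOutOfMemory` separately from a genuine Java heap `OutOfMemoryError`, and read the requested size when deciding whether to spill or retry.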