[QST] GPU Memory is completely consumed in AWS-EMR #10827
Comments
The issue was due to configuring the following parameter.
@akmalmasud96 just making sure I understand: do you want to run the spark-rapids plugin? If so, the spark.plugins flag is required. Thanks!
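For reference, here is a minimal PySpark sketch of what enabling the plugin looks like. The plugin class com.nvidia.spark.SQLPlugin is the documented entry point for the RAPIDS Accelerator; the resource amounts are illustrative assumptions, and on EMR these settings are usually supplied through spark-defaults or the cluster configuration JSON rather than in code.

```python
# Minimal sketch: enabling the RAPIDS Accelerator for Apache Spark from PySpark.
# Values here are illustrative placeholders, not tuned recommendations.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("rapids-example")
    # Required: load the RAPIDS Accelerator plugin on the driver and executors.
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
    # Typical setup: one GPU per executor, one GPU per task (assumed values).
    .config("spark.executor.resource.gpu.amount", "1")
    .config("spark.task.resource.gpu.amount", "1")
    .getOrCreate()
)
```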
The RAPIDS Accelerator consumes almost all of the GPU memory by default; it does not expect to share the GPU with another process. You can configure the RAPIDS Accelerator to consume much less GPU memory, although that often has a detrimental effect on performance because of the extra spilling needed to fit within the smaller amount of GPU memory. See the RAPIDS Accelerator configuration documentation for the relevant settings.
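As a concrete illustration, the fraction of GPU memory the plugin pools at startup can be lowered via spark.rapids.memory.gpu.allocFraction. The 0.5 value below is only an example, not a recommendation for any particular workload.

```python
# Minimal sketch: letting the RAPIDS Accelerator pool only part of the GPU
# memory so another process (e.g. a Python worker) can use the rest.
# The 0.5 fraction is an illustrative placeholder; a smaller pool usually
# means more spilling and slower queries.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
    # Pool roughly half of the GPU memory at startup instead of almost all of it.
    .config("spark.rapids.memory.gpu.allocFraction", "0.5")
    .getOrCreate()
)
```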
Apologies for the delay, I accidentally missed this. Working with RAPIDS RAFT should be possible, but the details will depend on how you plan to use it from Spark. I assume this is via a UDF, so are you planning on a Python UDF calling the RAPIDS RAFT Python APIs, or something lower-level like the C++ RAFT APIs via a Java UDF that allows the data to stay in the JVM process? A Python UDF will require sharing the GPU across processes, which is a bit tricky, as mentioned above.
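If the Python UDF route is chosen, the rough sketch below shows the shape it could take. The pandas UDF, the use of CuPy, and the column name are all assumptions for illustration; the actual RAFT computation is left as a placeholder, and the plugin would need to be configured to leave GPU memory free for the Python worker (see the allocFraction sketch above).

```python
# Rough sketch (assumptions, not a tested recipe): a pandas UDF whose Python
# worker process shares the GPU with the RAPIDS Accelerator's JVM process.
import pandas as pd
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import DoubleType

@pandas_udf(DoubleType())
def gpu_score(values: pd.Series) -> pd.Series:
    # Hypothetical body: move the batch to the GPU, call a RAPIDS RAFT /
    # pylibraft routine here, then bring the result back as a pandas Series.
    import cupy as cp
    arr = cp.asarray(values.to_numpy())
    result = arr * 2.0  # placeholder for the real RAFT computation
    return pd.Series(cp.asnumpy(result))

# Hypothetical usage on a DataFrame with a "feature" column:
# df.withColumn("score", gpu_score(df["feature"]))
```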
I am trying to run Spark-RAPIDS on AWS EMR. When processing starts, the GPU memory is completely consumed, leaving no GPU memory available for the actual processing.
Attached is an image showing this behavior.
The AWS EMR version is 7.1.0.
How can I solve this problem?