This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
[QST] - Spark3 question #4828
Comments
With the GPU being mostly idle, I'm wondering about two possibilities:
To answer the first question, you could run with the config
If the query is dominated by filesystem access, then running the query with less than half of the CPU cores (10 vs. 24) could significantly slow down the GPU run. Fetching the raw data (as opposed to decoding the data) is still processed by the CPU, so this could be a significant contributor to the slowdown in comparison. To help answer this question, you could try running with more CPU cores for your GPU-configured setup and see how it impacts the query.
Separately, you could use the Spark SQL web UI to examine the graphical query plan and see if the
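The suggestion to retry the GPU run with more CPU cores can be sketched as below. This is only an illustration: `gpu_shell_cmd` is a hypothetical helper, the core counts 16 and 24 are example values, and every other option is copied unchanged from the command in the question. The script prints the commands rather than launching them.

```shell
# Print the spark-shell command from the question with a different number of
# local cores. Only the value inside local[...] changes between runs.
gpu_shell_cmd() {
  n="$1"
  cat <<EOF
\$SPARK_HOME/bin/spark-shell --master "local[${n}]" --driver-memory 50g --conf spark.locality.wait=0s --conf spark.rapids.memory.pinnedPool.size=30G --conf spark.sql.files.maxPartitionBytes=256m --conf spark.rapids.sql.concurrentGpuTasks=2 --conf spark.plugins=com.nvidia.spark.SQLPlugin --jars \${SPARK_CUDF_JAR},\${SPARK_RAPIDS_PLUGIN_JAR}
EOF
}

# Example core counts (assumed values): the original 10, plus 16 and 24.
for cores in 10 16 24; do
  gpu_shell_cmd "$cores"
done
```

Running the same query under each printed command and comparing wall-clock time would indicate whether the CPU-side file fetching is the limiting factor.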
To help with the first possibility mentioned by @jlowe, we have a workload qualification doc here: After setting
Wow, that's a ton of information. Thank you both!
@viadea Thanks for the input, very helpful :) I've added the following as per the comments in the explain output
As far as I can tell, these are the remaining issues preventing the query from running entirely on the GPU:
Is there anything further I can try to make it run on the GPU? Thanks
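The exact settings discussed in this exchange are not preserved in this copy of the thread. As a hedged sketch of how explain output of this kind is usually requested, the RAPIDS Accelerator documents a `spark.rapids.sql.explain` config whose `NOT_ON_GPU` value logs why operators stay on the CPU. Whether this is the setting used above is an assumption, and `EXPLAIN_CONF` is a hypothetical variable name.

```shell
# Assumption: spark.rapids.sql.explain=NOT_ON_GPU is the setting of interest.
# It asks the plugin to log the parts of the plan that did not move to the GPU.
EXPLAIN_CONF="--conf spark.rapids.sql.explain=NOT_ON_GPU"

# Print the extra option; append it to the spark-shell command from the question.
echo "$EXPLAIN_CONF"
```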
Are there further questions for this issue, or is it covered by the other issues?
I'm trying to run some queries on big data. I've taken a portion of our data (only 43GB) and tested a query with 15 fields in two scenarios:
24 CPU cores with 200 files, up to 400MB per file
X CPU cores with one V100 GPU with 10 files, each about 4+GB as per the tuning guide suggestions.
The GPU is mostly idle and the GPU run is much slower than the CPU run. Running Spark on the GPU with the 400MB files is slow as well.
I'm using the following command to run the GPU code:
$SPARK_HOME/bin/spark-shell --master "local[10]" --driver-memory 50g \
  --conf spark.locality.wait=0s \
  --conf spark.rapids.memory.pinnedPool.size=30G \
  --conf spark.sql.files.maxPartitionBytes=256m \
  --conf spark.rapids.sql.concurrentGpuTasks=2 \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --jars ${SPARK_CUDF_JAR},${SPARK_RAPIDS_PLUGIN_JAR}
Changing maxPartitionBytes, concurrentGpuTasks, or any other parameter doesn't seem to have any effect.
As far as I can see, most of the time neither the network I/O nor the GPU is busy.
Any ideas would be highly appreciated.
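For a sense of scale when varying `spark.sql.files.maxPartitionBytes`, the number of input splits can be estimated from the figures in the question (43GB of data, 256m partitions, 10 local cores). This is a rough sketch only: it assumes the files are splittable and ignores Spark's per-file open cost and per-file rounding. `estimate_splits` and `estimate_waves` are hypothetical helper names.

```shell
# Integer ceiling of total_mb / max_partition_mb: approximate number of input splits.
estimate_splits() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Integer ceiling of splits / cores: approximate number of task waves.
estimate_waves() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

total_mb=$((43 * 1024))                      # 43GB expressed in MB
splits=$(estimate_splits "$total_mb" 256)    # maxPartitionBytes=256m
waves=$(estimate_waves "$splits" 10)         # local[10]

echo "splits=$splits waves=$waves"           # prints: splits=172 waves=18
```

Under the same assumptions, doubling the partition size to 512m halves the estimate to 86 splits, so the setting does change how the work is divided even when the overall runtime appears unchanged.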