I ran performance tests of a basic arithmetic operation (+) on two DataFrames of shape (4303, 3766), i.e. 4303 rows and 3766 columns, using pandas, numpy, polars, and cuDF. Here are the results (a sketch of the benchmark setup follows the numbers):
Local Environment (i7-12, Windows)
- pandas: 30.4 ms
- numpy: 30.9 ms
- polars: 8.18 ms
- cuDF: 1.18 s

Colab Environment
- pandas: 50.8 ms
- numpy: 49.4 ms
- polars: 84.5 ms
- cuDF: 1.09 s
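For reference, the timings above could be collected with a setup roughly like the following. This is only a minimal sketch, not the exact benchmark code: it assumes float64 random data and IPython's `%timeit` in a Jupyter/Colab cell, and the variable names are illustrative.

```python
# Minimal sketch of the benchmark (assumed: float64 random data,
# %timeit run in a Jupyter/Colab cell; names are illustrative).
import numpy as np
import pandas as pd
import polars as pl
import cudf

rows, cols = 4303, 3766
a = np.random.rand(rows, cols)
b = np.random.rand(rows, cols)

pdf_a, pdf_b = pd.DataFrame(a), pd.DataFrame(b)
plf_a, plf_b = pl.from_numpy(a), pl.from_numpy(b)
gdf_a, gdf_b = cudf.from_pandas(pdf_a), cudf.from_pandas(pdf_b)

%timeit a + b            # numpy
%timeit pdf_a + pdf_b    # pandas
%timeit plf_a + plf_b    # polars
%timeit gdf_a + gdf_b    # cuDF
```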
Analysis of cuDF Performance
Despite the expectation that cuDF (GPU-accelerated) should outperform the other libraries, it showed the slowest performance. Here are the potential reasons for this outcome:
- **Apache Arrow columnar memory format:** cuDF stores data in the Apache Arrow columnar memory format, so each of the 3766 columns is handled as a separate buffer, which adds per-column overhead when a frame is this wide.
- **Small row count:** The dataset has only about 4300 rows. GPU acceleration is generally more beneficial for larger datasets, where the overhead of transferring data to the GPU and of initializing computations is amortized over far more work per column.
- **Potential bottlenecks:**
  - **Data transfer overhead:** Copying data from CPU memory to GPU memory can take a significant share of the total time for small datasets.
  - **Insufficient row count:** With so few rows per column, the GPU's parallelism is not fully utilized.
  - **Column-oriented processing:** With 3766 columns, the work is split into thousands of small per-column operations, and the overhead of managing such a wide frame is not offset by the GPU's parallelism (see the sketch after this list).
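One way to test the "too many columns" hypothesis is to keep the total element count fixed and compare the wide layout against a single long column. This is a hypothetical experiment, not part of the benchmark above; the column names and the use of `%timeit` are illustrative.

```python
# Hypothetical check: same total number of elements, wide vs. tall layout.
import numpy as np
import cudf

rows, cols = 4303, 3766

# Wide: the shape from the benchmark (many columns, few rows).
# Constructing it column by column is itself slow; only the + is timed.
wide = cudf.DataFrame({f"c{i}": np.random.rand(rows) for i in range(cols)})

# Tall: one long column holding the same total number of elements.
tall = cudf.DataFrame({"c0": np.random.rand(rows * cols)})

%timeit wide + wide   # thousands of small per-column operations
%timeit tall + tall   # one large operation over the same amount of data
```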
Conclusion
Given these factors, it is reasonable to conclude that cuDF may not perform optimally for datasets with a small number of rows and a large number of columns in simple arithmetic operations. The overhead associated with data transfer and initialization on the GPU, combined with the columnar processing model, can outweigh the benefits of GPU acceleration in this specific context.
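If one-time GPU initialization is suspected of inflating the measurement, a quick sanity check (a sketch, reusing the `gdf_a`/`gdf_b` names from the setup above) is to run the operation once before timing it:

```python
# Exclude one-time costs (CUDA context creation, module/kernel loading)
# from the measurement by warming up first.
_ = gdf_a + gdf_b        # warm-up, result discarded
%timeit gdf_a + gdf_b    # steady-state timing
```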
Is it reasonable to summarize the reasons cuDF is slow in this scenario as I have above? I suspect something may be wrong or missing, so I would really appreciate any corrections.
Your analysis is correct. Wide dataframes (many columns, few rows) are not what cuDF is optimized for. That is a fundamental property of the Arrow format, and that property is accentuated on GPUs because of the performance characteristics of memory accesses on GPUs relative to CPUs. #14548 has a lot of good discussion on this topic, so I'd have a look there and see if that discussion matches your expectations. Feel free to follow up here if you have more questions.