[QST] Is cuDF slow if there are many columns? (pandas, numpy, polars, cuDF Comparison) #16065

Closed
Ginger-Tec opened this issue Jun 24, 2024 · 2 comments
Labels
question Further information is requested

Comments

@Ginger-Tec

I conducted performance tests on basic arithmetic operations (+) with two DataFrames of size (4303, 3766) using different libraries: pandas, numpy, polars, and cuDF. Here are the results:

Local Environment (i7-12 Windows)

  • pandas: 30.4ms
  • numpy: 30.9ms
  • polars: 8.18ms
  • cuDF: 1.18s

Colab Environment

  • pandas: 50.8ms
  • numpy: 49.4ms
  • polars: 84.5ms
  • cuDF: 1.09s
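
The benchmark code itself is not shown in this post, so the following is only a minimal sketch of how such a comparison might be set up, not the code that produced the numbers above. The helper names `best_time` and `compare_add` are hypothetical. The sketch assumes NumPy and pandas are installed and treats cuDF as optional; it builds the GPU frames before timing so that the host-to-device copy is not counted in the addition.

```python
import timeit


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time (seconds) of one call to fn()."""
    return min(timeit.repeat(fn, number=1, repeat=repeats))


def compare_add(n_rows=4303, n_cols=3766, repeats=5):
    """Time `a + b` in each installed library (hypothetical helper)."""
    # Imported lazily so that best_time() works without these packages.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    a = rng.random((n_rows, n_cols))
    b = rng.random((n_rows, n_cols))
    results = {"numpy": best_time(lambda: a + b, repeats)}

    pdf_a, pdf_b = pd.DataFrame(a), pd.DataFrame(b)
    results["pandas"] = best_time(lambda: pdf_a + pdf_b, repeats)

    try:
        import cudf  # needs an NVIDIA GPU and a RAPIDS install
    except ImportError:
        return results

    # Copy to the GPU outside the timed region so only the addition is timed.
    gdf_a, gdf_b = cudf.from_pandas(pdf_a), cudf.from_pandas(pdf_b)
    results["cudf"] = best_time(lambda: gdf_a + gdf_b, repeats)
    return results
```

Calling `compare_add()` with its defaults uses the shape from this post. Absolute timings depend on hardware and library versions, so they should not be expected to match the figures listed here.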

Analysis of cuDF Performance

Despite the expectation that cuDF (GPU-accelerated) should outperform the other libraries, it showed the slowest performance. Here are the potential reasons for this outcome:

  1. Apache Arrow Columnar Memory Format:

    • cuDF operates based on the Apache Arrow columnar memory format, which might introduce overhead in certain scenarios.
  2. Small Row Count:

    • The dataset used has only about 4300 rows. GPU acceleration is generally more beneficial for larger datasets, where the overhead of transferring data to the GPU and initializing computations is amortized over a larger number of operations.
  3. Potential Bottlenecks:

    • Data Transfer Overhead: The time taken to transfer data from CPU memory to GPU memory can be significant, especially for smaller datasets.
    • Insufficient Row Count: With fewer rows, the advantages of parallel processing on the GPU are not fully realized.
    • Column-Oriented Processing: The dataset has a high number of columns (3766), and the overhead of managing such a wide dataset may not be effectively offset by the parallelism of the GPU.
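
To put the column-count argument in numbers, here is a rough back-of-the-envelope calculation using only the local-environment timings reported above. It assumes, as an upper bound, that the entire cuDF time is fixed per-column overhead; that split is an assumption of this sketch, not something measured.

```python
# Figures reported above (local environment); everything else is derived.
n_rows, n_cols = 4303, 3766
cudf_seconds = 1.18
pandas_seconds = 30.4e-3

n_elements = n_rows * n_cols  # values per DataFrame

# Upper bound: attribute the whole cuDF time to per-column overhead.
cudf_us_per_column = cudf_seconds / n_cols * 1e6

# For comparison, spread the pandas time over every element.
pandas_ns_per_element = pandas_seconds / n_elements * 1e9

print(f"{n_elements:,} elements per frame")
print(f"cuDF:   ~{cudf_us_per_column:.0f} us per column")
print(f"pandas: ~{pandas_ns_per_element:.2f} ns per element")
```

Under that assumption the cuDF run costs about 313 µs per column, while pandas averages under 2 ns per element across roughly 16.2 million elements. A few hundred microseconds per column is small on its own but adds up over 3766 columns.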

Conclusion

Given these factors, it is reasonable to conclude that cuDF may not perform optimally for datasets with a small number of rows and a large number of columns in simple arithmetic operations. The overhead associated with data transfer and initialization on the GPU, combined with the columnar processing model, can outweigh the benefits of GPU acceleration in this specific context.


Is this a reasonable summary of why cuDF is slow in the scenario above?
I suspect parts of it may be wrong or incomplete, so I would really appreciate any corrections.

@Ginger-Tec Ginger-Tec added the "question" label (Further information is requested) Jun 24, 2024
@Ginger-Tec Ginger-Tec changed the title from "[QST] cudf many column is slow? (pandas, numpy, polars, cuDF Comparison)" to "[QST] Is cuDF slow if there are many columns? (pandas, numpy, polars, cuDF Comparison)" Jun 24, 2024
@vyasr
Contributor

vyasr commented Jun 24, 2024

Your analysis is correct. Wide dataframes (many columns, few rows) are not what cuDF is optimized for. That is a fundamental property of the Arrow format, and that property is accentuated on GPUs because of the performance characteristics of memory accesses on GPUs relative to CPUs. #14548 has a lot of good discussion on this topic, so I'd have a look there and see if that discussion matches your expectations. Feel free to follow up here if you have more questions.
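
As a toy illustration of the columnar point (plain Python, not cuDF's actual implementation): in a column-oriented layout each column is a separate buffer, so an element-wise add runs as one operation per column. The `add_frames` and `make_frame` helpers are hypothetical, and the idea that each per-column operation carries a fixed cost is an assumption of this sketch.

```python
def add_frames(left, right):
    """Add two column-oriented frames (dict of column name -> list).

    Returns the summed frame and the number of per-column operations.
    """
    result = {}
    column_ops = 0
    for name, values in left.items():
        # Each column is its own buffer, so it is processed separately.
        result[name] = [x + y for x, y in zip(values, right[name])]
        column_ops += 1
    return result, column_ops


def make_frame(n_rows, n_cols, fill):
    """Build a frame with n_cols columns of n_rows identical values."""
    return {f"c{j}": [fill] * n_rows for j in range(n_cols)}


# The same 12 values stored in two layouts.
wide_sum, wide_ops = add_frames(make_frame(2, 6, 1), make_frame(2, 6, 10))
long_sum, long_ops = add_frames(make_frame(12, 1, 1), make_frame(12, 1, 10))

print(wide_ops, long_ops)  # 6 1
```

Both layouts add the same 12 values, but the wide one needs six per-column operations and the long one needs one. If every such operation has a fixed cost, that cost grows with the number of columns rather than with the amount of data.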

@Ginger-Tec
Author

Thank you, @vyasr , for your quick and clear response. Thanks to you, I am confident in introducing cuDF in my presentation today!

rapids-bot bot pushed a commit that referenced this issue Sep 5, 2024
This PR adds a section with performance tips to the `cudf.pandas` FAQ.

I based this section on some common user questions about performance, to make it clearer that `cudf.pandas` is designed for optimal performance with large data sizes and provide some alternatives for common needs where `cudf` or `cudf.pandas` aren't the best fit. See these links for examples:

- #14548 (comment)
- #16065
- https://stackoverflow.com/questions/78626099/cudf-is-very-slow

Authors:
  - Bradley Dice (https://github.com/bdice)
  - Matthew Murray (https://github.com/Matt711)

Approvers:
  - Matthew Roeschke (https://github.com/mroeschke)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #16693
2 participants