[QST] How to improve performance of read_parquet and url_decode further? #7571
Comments
I think this is a duplicate of #7545. We know there's room to improve the …
Agree this is mostly a duplicate of #7545, but spending 600+ msec on a url_decode isn't ideal either. We may need to see if there are additional optimizations that can be done there. @chenrui17, what version (or git commit hash) of cudf is this trace based upon?
Reference #7571. This improves the performance of `cudf::make_strings_column` for long strings. It uses an approach similar to `cudf::strings::detail::gather` and also uses thresholding as in the optimized `cudf::strings::replace`. This may not be the right solution for optimizing #7571 overall, but it may be helpful in other places where long strings are used to create a strings column in libcudf. This PR also includes a gbenchmark to help measure the performance of this factory function. The benchmark results show that longer strings (roughly >64 bytes on average) get about a 10x improvement. I can post benchmark results here if needed. The character-parallel algorithm was slower for shorter strings, so the existing algorithm is used based on a threshold calculation. I also added an additional gtest with a mixture of nulls and empty strings to make sure the new algorithm handles these correctly.

Authors:
- David (@davidwendt)

Approvers:
- Jason Lowe (@jlowe)
- Nghia Truong (@ttnghia)
- Jake Hemstad (@jrhemstad)

URL: #7576
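For context, here is a minimal sketch of the thresholding idea described above. This is not libcudf's actual implementation; the function name is hypothetical, and the threshold constant is taken from the ~64-byte figure mentioned in the benchmark results, as an assumption:

```cpp
#include <cstddef>

// Illustrative threshold (bytes); the exact value and form libcudf uses
// internally may differ.
constexpr std::size_t AVG_CHAR_BYTES_THRESHOLD = 64;

// Choose between the two copy strategies: for short strings, one thread
// per row is cheap; for long strings, a character-parallel copy (as in
// cudf::strings::detail::gather) keeps all threads busy instead of having
// one thread walk an entire long string.
bool use_char_parallel(std::size_t total_bytes, std::size_t num_rows)
{
  return num_rows > 0 &&
         (total_bytes / num_rows) >= AVG_CHAR_BYTES_THRESHOLD;
}
```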
@jlowe is this addressed now based on the merged PR above?
@chenrui17 please verify whether the new performance of Parquet load and url_decode meets your needs. If there's additional work needed for just one, I propose tracking that with updated trace/metrics in a new feature request.
I will propose a new feature request to improve url_decode further.
What is your question?
@jlowe As discussed at the meeting yesterday, from this nsight trace we can see that `make_strings_column` costs almost half of the time; the rest is `url_decode` and `replace`. By the way, my input parquet records are long URLs, with lengths of about 500~1500 characters. So this problem depends on #7545, right?
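For reference, a hedged sketch of the kind of pipeline being profiled here, assuming the URL strings sit in column 0 of the Parquet file (an assumption; the actual workload and column layout are not shown in the issue). It uses public libcudf APIs only:

```cpp
#include <cudf/io/parquet.hpp>
#include <cudf/strings/convert/convert_urls.hpp>
#include <cudf/strings/strings_column_view.hpp>
#include <cudf/table/table.hpp>

#include <memory>
#include <string>

std::unique_ptr<cudf::column> load_and_decode(std::string const& path)
{
  // read_parquet materializes the strings column, which is where
  // make_strings_column shows up in the trace.
  auto options =
    cudf::io::parquet_reader_options::builder(cudf::io::source_info{path})
      .build();
  auto result = cudf::io::read_parquet(options);

  // url_decode is the other large block in the trace (600+ msec above).
  auto urls = cudf::strings_column_view(result.tbl->view().column(0));
  return cudf::strings::url_decode(urls);
}
```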