
[QST] How to improve performance of read_parquet and url_decode further? #7571

Closed
chenrui17 opened this issue Mar 11, 2021 · 6 comments
Labels
libcudf Affects libcudf (C++/CUDA) code. question Further information is requested

Comments

@chenrui17
Contributor

What is your question?
[screenshot: Nsight trace of the Parquet read + url_decode pipeline]
@jlowe As discussed at the meeting yesterday, this Nsight trace shows that make_strings_column costs almost half of the total time; the rest is spent in url_decode and replace.
By the way, my input Parquet records are long URLs, roughly 500~1500 characters each. So this problem depends on #7545, right?
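For readers unfamiliar with the workload being profiled: url_decode performs percent-decoding on every row of a strings column. A minimal CPU reference in pure Python (using the standard library's `urllib.parse.unquote`; the names `url_decode_column` and `sample` are illustrative, not part of cudf) sketches what the GPU kernel computes per row:

```python
from urllib.parse import unquote

def url_decode_column(urls):
    """CPU reference for the per-row percent-decoding that a GPU
    url_decode kernel performs on a strings column."""
    return [unquote(u) for u in urls]

# Long URLs (~500-1500 chars in the reported workload) dominate the cost:
# each row's output length depends on how many %XX escapes it contains,
# which is why long-string handling matters so much here.
sample = [
    "https://example.com/search?q=hello%20world",
    "https://example.com/a%2Fb%3Fq%3D1",
]
print(url_decode_column(sample))
```

Because output row lengths are data-dependent, a GPU implementation must first compute per-row output sizes, then scan them into offsets, then decode; that size/offsets step is the part tied to make_strings_column.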

@chenrui17 chenrui17 added Needs Triage Need team to review and classify question Further information is requested labels Mar 11, 2021
@jrhemstad
Contributor

I think this is a duplicate of #7545. We know there's room to improve the make_strings_column implementation.

@jlowe
Member

jlowe commented Mar 11, 2021

Agreed, this is mostly a duplicate of #7545, but spending 600+ msec on url_decode isn't ideal either. We may need to see if there are additional optimizations that can be done there.

@chenrui17 what version (or git commit hash) of cudf is this trace based upon?

@chenrui17
Contributor Author

> Agree mostly this is a duplicate of #7545 but spending 600+ msec on a urldecode isn't ideal either. We may need to see if there's additional optimizations that can be done there.
>
> @chenrui17 what version (or git commit hash) of cudf is this trace based upon?

cudf 0.19, commit: c76949e

rapids-bot bot pushed a commit that referenced this issue Mar 18, 2021
Reference #7571 
This improves the performance of `cudf::make_strings_column` for long strings. It uses an approach similar to `cudf::strings::detail::gather` and also uses thresholding, as in the optimized `cudf::strings::replace`.
This may not be the right solution for fully addressing #7571, but it may be helpful in other places where long strings are used to create a strings column in libcudf.
This PR also includes a gbenchmark to help measure the performance of this factory function. The benchmark results show that longer strings (~ >64 bytes on average) see roughly a 10x improvement. I can post benchmark results here if needed. The character-parallel algorithm was slower for shorter strings, so the existing algorithm is retained and selected based on a threshold calculation.
I also added an additional gtest with a mixture of nulls and empty strings to make sure the new algorithm handles these correctly.

Authors:
  - David (@davidwendt)

Approvers:
  - Jason Lowe (@jlowe)
  - Nghia Truong (@ttnghia)
  - Jake Hemstad (@jrhemstad)

URL: #7576
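The thresholding idea described in the PR above can be sketched in a few lines. This is a hypothetical illustration, not libcudf code: the function name `choose_gather_strategy` and the exact 64-byte cutoff are assumptions taken from the "~ >64 bytes on average" figure in the benchmark notes.

```python
# Illustrative cutoff: character-parallel copying only paid off for
# columns averaging more than ~64 bytes per row in the PR's benchmarks.
AVG_BYTES_THRESHOLD = 64

def choose_gather_strategy(offsets):
    """Pick a copy strategy from a strings column's offsets array
    (cumulative byte offsets, length = rows + 1).

    Short strings: one thread per row is enough ("row-parallel").
    Long strings: one thread per character avoids idle threads
    ("char-parallel").
    """
    rows = len(offsets) - 1
    if rows <= 0:
        return "row-parallel"
    avg_bytes = (offsets[-1] - offsets[0]) / rows
    return "char-parallel" if avg_bytes > AVG_BYTES_THRESHOLD else "row-parallel"
```

For the ~500-1500 character URLs in this issue, the average row length sits far above any such threshold, so the character-parallel path is the one that matters for this workload.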
@kkraus14
Collaborator

@jlowe is this addressed now based on the make_strings_column optimizations?

@kkraus14 kkraus14 added libcudf Affects libcudf (C++/CUDA) code. and removed Needs Triage Need team to review and classify labels Mar 26, 2021
@jlowe
Member

jlowe commented Mar 26, 2021

make_strings_column and url_decode are significantly faster on long strings, but we may need some further optimizations for url_decode in some cases. I'm not sure if the trace at the top of this description was captured before or after the recent url_decode optimizations.

@chenrui17 please verify whether the new performance of Parquet load and url_decode meets your needs. If there's additional work needed for just one, I propose tracking that with updated trace/metrics in a new feature request.

@chenrui17
Contributor Author

I will propose a new feature request to improve url_decode further.
