
Fix lifespan of the temporary directory that holds cuFile configuration file #10403

Merged

Conversation

vuule
Contributor

@vuule vuule commented Mar 9, 2022

The temporary directory is created when we modify the cuFile JSON file to control the compat mode. This directory is a local variable in the modify_cufile_json function, so the modified JSON file is deleted as soon as the function returns.
Because of this, cuFile is not able to read the modified JSON config and falls back to the default config file.

This PR makes the temporary directory a static variable so that the cuFile config file is available until the end of the program.
This fixes the compat mode override that we do via the LIBCUDF_CUFILE_POLICY environment variable.

@vuule vuule added bug Something isn't working cuIO cuIO issue non-breaking Non-breaking change labels Mar 9, 2022
@vuule vuule self-assigned this Mar 9, 2022
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Mar 9, 2022
@vuule vuule marked this pull request as ready for review March 9, 2022 02:17
@vuule vuule requested a review from a team as a code owner March 9, 2022 02:17
@codecov

codecov bot commented Mar 9, 2022

Codecov Report

Merging #10403 (7b530ef) into branch-22.04 (b3dc9d6) will increase coverage by 0.03%.
The diff coverage is n/a.


@@               Coverage Diff                @@
##           branch-22.04   #10403      +/-   ##
================================================
+ Coverage         86.12%   86.16%   +0.03%     
================================================
  Files               139      139              
  Lines             22450    22462      +12     
================================================
+ Hits              19336    19354      +18     
+ Misses             3114     3108       -6     
Impacted Files Coverage Δ
python/cudf/cudf/core/single_column_frame.py 97.03% <0.00%> (-0.03%) ⬇️
python/cudf/cudf/core/series.py 95.16% <0.00%> (-0.01%) ⬇️
python/cudf/cudf/io/csv.py 91.80% <0.00%> (ø)
python/cudf/cudf/io/text.py 100.00% <0.00%> (ø)
python/cudf/cudf/io/parquet.py 92.70% <0.00%> (ø)
python/cudf/cudf/core/dataframe.py 93.56% <0.00%> (ø)
python/cudf/cudf/core/udf/utils.py 98.63% <0.00%> (ø)
python/dask_cudf/dask_cudf/core.py 73.36% <0.00%> (ø)
python/cudf/cudf/core/multiindex.py 92.13% <0.00%> (ø)
python/dask_cudf/dask_cudf/groupby.py 96.66% <0.00%> (ø)
... and 11 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@vuule vuule requested a review from davidwendt March 9, 2022 20:54
@vuule
Contributor Author

vuule commented Mar 9, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit c08a97e into rapidsai:branch-22.04 Mar 9, 2022
@karthikeyann
Contributor

@vuule This PR is increasing cuIO benchmark runtimes.

./benchmarks/ORC_READER_BENCH --benchmark_filter=integral_file_input
Commit c08a97e (This PR)

------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                    Time             CPU   Iterations UserCounters...
------------------------------------------------------------------------------------------------------------------------------
OrcRead/integral_file_input/30/0/1/1/0/iterations:1/manual_time            107 ms          107 ms            1 bytes_per_second=4.67743G/s encoded_file_size=394.698M peak_memory_usage=1.12115G
OrcRead/integral_file_input/30/1000/1/1/0/iterations:1/manual_time         128 ms          128 ms            1 bytes_per_second=3.91999G/s encoded_file_size=327.952M peak_memory_usage=1.1814G
OrcRead/integral_file_input/30/0/32/1/0/iterations:1/manual_time          49.7 ms         49.7 ms            1 bytes_per_second=10.07G/s encoded_file_size=22.2569M peak_memory_usage=683.337M
OrcRead/integral_file_input/30/1000/32/1/0/iterations:1/manual_time       48.3 ms         48.3 ms            1 bytes_per_second=10.3588G/s encoded_file_size=20.2656M peak_memory_usage=683.321M
OrcRead/integral_file_input/30/0/1/0/0/iterations:1/manual_time           97.5 ms         97.5 ms            1 bytes_per_second=5.12949G/s encoded_file_size=396.373M peak_memory_usage=951.615M
OrcRead/integral_file_input/30/1000/1/0/0/iterations:1/manual_time         101 ms          101 ms            1 bytes_per_second=4.94368G/s encoded_file_size=396.189M peak_memory_usage=951.43M
OrcRead/integral_file_input/30/0/32/0/0/iterations:1/manual_time          44.8 ms         45.0 ms            1 bytes_per_second=11.1543G/s encoded_file_size=24.706M peak_memory_usage=579.948M
OrcRead/integral_file_input/30/1000/32/0/0/iterations:1/manual_time       45.6 ms         45.6 ms            1 bytes_per_second=10.9714G/s encoded_file_size=24.6618M peak_memory_usage=579.904M

vs
Commit 45acfc4 (before this PR).

------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                    Time             CPU   Iterations UserCounters...
------------------------------------------------------------------------------------------------------------------------------
OrcRead/integral_file_input/30/0/1/1/0/iterations:1/manual_time           89.1 ms         33.8 ms            1 bytes_per_second=5.61115G/s encoded_file_size=394.698M peak_memory_usage=1.12115G
OrcRead/integral_file_input/30/1000/1/1/0/iterations:1/manual_time        89.0 ms         60.6 ms            1 bytes_per_second=5.61593G/s encoded_file_size=327.952M peak_memory_usage=1.1814G
OrcRead/integral_file_input/30/0/32/1/0/iterations:1/manual_time          47.1 ms         44.6 ms            1 bytes_per_second=10.6159G/s encoded_file_size=22.2569M peak_memory_usage=683.337M
OrcRead/integral_file_input/30/1000/32/1/0/iterations:1/manual_time       46.8 ms         44.6 ms            1 bytes_per_second=10.6765G/s encoded_file_size=20.2656M peak_memory_usage=683.321M
OrcRead/integral_file_input/30/0/1/0/0/iterations:1/manual_time           66.8 ms         32.6 ms            1 bytes_per_second=7.48906G/s encoded_file_size=396.373M peak_memory_usage=951.615M
OrcRead/integral_file_input/30/1000/1/0/0/iterations:1/manual_time        65.9 ms         32.0 ms            1 bytes_per_second=7.58566G/s encoded_file_size=396.189M peak_memory_usage=951.43M
OrcRead/integral_file_input/30/0/32/0/0/iterations:1/manual_time          53.4 ms         50.7 ms            1 bytes_per_second=9.35778G/s encoded_file_size=24.706M peak_memory_usage=579.948M
OrcRead/integral_file_input/30/1000/32/0/0/iterations:1/manual_time       48.9 ms         46.3 ms            1 bytes_per_second=10.2189G/s encoded_file_size=24.6618M peak_memory_usage=579.904M

@vuule
Contributor Author

vuule commented Mar 11, 2022

The effect of this PR should be the same as changing the default LIBCUDF_CUFILE_POLICY from "ALWAYS" to "GDS". I would expect this to cause the benchmarks to use our own IO instead of GDS's compatibility mode. I have noticed that GDS compat mode outperforms our IO when used in our benchmarks.

One way to check would be to change the line https://github.com/rapidsai/cudf/blob/branch-22.04/cpp/src/io/utilities/config_utils.cpp#L45 to set the default to "ALWAYS". This should revert the performance changes.
@karthikeyann is it possible to make this change and rerun the benchmarks on the same system?

@karthikeyann
Contributor

karthikeyann commented Mar 11, 2022

Yes. That solves the issue.
Before
OrcRead/integral_file_input/30/0/1/1/0/manual_time 99.3 ms 99.2 ms 7 bytes_per_second=5.03772G/s encoded_file_size=394.698M peak_memory_usage=1.12115G
After setting the default to "ALWAYS":
OrcRead/integral_file_input/30/0/1/1/0/manual_time 67.7 ms 33.4 ms 9 bytes_per_second=7.38259G/s encoded_file_size=394.698M peak_memory_usage=1.12115G

You can make the change if required.

@karthikeyann
Contributor

karthikeyann commented Mar 11, 2022

With the suggested change, it ran fine on bare metal. In the rapids-compose environment, the cuIO benchmarks crashed:

terminate called after throwing an instance of 'cudf::logic_error'
  what():  cuDF failure at: /home/karthikeyan/dev/rapids/cudf/cpp/src/io/utilities/file_io_utilities.cpp:164: Cannot register file handle with cuFile

@vuule
Contributor Author

vuule commented Mar 11, 2022

To clarify: the current behavior is correct; we chose to disable compatibility mode by default. I don't think we should treat this as a perf regression. However, the regression does indicate that the GDS default and the fallback logic could be improved.
Opened issue #10415

@vuule vuule deleted the bug-cufile-json-tmp-dir-lifespan branch August 10, 2023 03:27