
Add peak memory usage tracking to cuIO benchmarks #7770

Merged
devavret merged 14 commits into rapidsai:branch-21.08 on Jul 19, 2021

Conversation

@devavret (Contributor) commented Mar 30, 2021

Uses rmm::mr::statistics_resource_adaptor to track peak memory usage in cuIO benchmarks.

- Adds sample use in the parquet writer benchmark.
- Separates memory_tracking_resource into its own header and removes its association with the benchmark fixture.
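A minimal sketch of the resulting benchmark pattern, pieced together from the diff snippets quoted in the review discussion below; the function name, data_size, and the call to set_current_device_resource are illustrative assumptions, not code copied from this PR:

```cpp
#include <benchmark/benchmark.h>
#include <rmm/mr/device/per_device_resource.hpp>
#include <cstdint>
// memory_tracking_resource lives under cpp/benchmarks/common/ in this PR
#include "memory_tracking_resource.hpp"

// Sketch only: wrap the current device resource, run the cuIO API under test,
// then report the high-water mark as a benchmark counter.
void BM_cuio_api(benchmark::State& state)  // hypothetical benchmark
{
  rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource();
  cudf::memory_tracking_resource<rmm::mr::device_memory_resource> tracking_mr(mr);
  rmm::mr::set_current_device_resource(&tracking_mr);  // route allocations through the tracker

  int64_t data_size = 0;  // placeholder: bytes processed per iteration
  for (auto _ : state) {
    // call the benchmarked cuIO API here; its device allocations are now counted
  }

  rmm::mr::set_current_device_resource(mr);  // restore the original resource
  state.SetBytesProcessed(data_size * state.iterations());
  state.counters["peak_memory_usage"] = tracking_mr.max_allocated_size();
}
```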
@devavret devavret requested a review from a team as a code owner March 30, 2021 22:10
@github-actions github-actions bot added the libcudf label Mar 30, 2021
@devavret devavret requested a review from vuule March 30, 2021 22:11
@devavret devavret added the 3 - Ready for Review, cuIO, non-breaking, and improvement labels Mar 30, 2021
@davidwendt (Contributor) left a comment


This is way cool.

cpp/benchmarks/common/memory_tracking_resource.hpp (two resolved review threads, outdated)
@vuule (Contributor) left a comment


Great work. Mostly got questions :)
(requesting changes just for the copyright years)

cpp/benchmarks/fixture/benchmark_fixture.hpp (resolved review thread)
cpp/benchmarks/io/parquet/parquet_writer_benchmark.cpp (resolved review thread, outdated)
Comment on lines 86 to 87
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource();
cudf::memory_tracking_resource<rmm::mr::device_memory_resource> tracking_mr(mr);
Contributor

We need to make changes to all benchmarks. Do you want to add this in a separate PR?

@devavret (Contributor, Author)

No, I just wanted to get reviews on the method first, before going ahead and changing all the benchmarks. In the first commit, I made the tracking resource part of the base fixture, which automatically enabled it for every benchmark. But a problem with that was that it also tracked memory usage for the setup.

@devavret (Contributor, Author)

Added in 4664f9a


state.SetBytesProcessed(data_size * state.iterations());
state.counters["peak_memory_usage"] = tracking_mr.max_allocated_size();
Contributor

Could we use this to catch memory leaks in tests (just query the current value instead of the max on exit)?
Edit: leaks are pretty unlikely given libcudf's coding style, so we probably don't need to invest time into this check.

@devavret (Contributor, Author)

It would be pretty easy. But since this is more of a test (the current value should always be 0 at the end) rather than a benchmark, should we make this opt-in? If anyone suspects a leak, they can add a check.

Contributor

I was thinking of adding this check to the unit tests, rather than benchmarks.

@devavret (Contributor, Author)

Well sure. I can do that in a subsequent PR.

@devavret (Contributor, Author)

I just realized that any leak we have would probably be due to memory allocated without the rmm resource. Anything with rmm is likely using some RAII object to hold the memory. And this doesn't track non-rmm allocated memory.
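A sketch of the opt-in leak check discussed above, applied to a unit test rather than a benchmark: query the current allocated bytes (not the peak) once the test body is done. This assumes rmm::mr::statistics_resource_adaptor and its get_bytes_counter() accessor, and, as noted, only allocations routed through RMM are visible to it:

```cpp
#include <gtest/gtest.h>
#include <rmm/mr/device/per_device_resource.hpp>
#include <rmm/mr/device/statistics_resource_adaptor.hpp>

// Sketch only: wrap the current device resource for the duration of the test
// and expect the current allocation count to return to zero afterwards.
TEST(CuioLeakCheck, CurrentBytesReturnToZero)
{
  auto* upstream = rmm::mr::get_current_device_resource();
  rmm::mr::statistics_resource_adaptor<rmm::mr::device_memory_resource> stats_mr(upstream);
  rmm::mr::set_current_device_resource(&stats_mr);

  {
    // exercise the API under suspicion here; all of its RMM allocations are counted
  }

  rmm::mr::set_current_device_resource(upstream);
  EXPECT_EQ(stats_mr.get_bytes_counter().value, 0);  // anything left allocated is a leak
}
```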


state.SetBytesProcessed(data_size * state.iterations());
state.counters["peak_memory_usage"] = tracking_mr.max_allocated_size();
Contributor

Are you getting consistent numbers between runs? @kaatish mentioned that the peak usage varies when running the ORC writer benchmark.

@devavret (Contributor, Author)

I just tested; it reports the same number for corresponding benchmarks across runs for ORC. @kaatish, what method did you use to measure peak memory usage? If you're using nsys, you need to turn off the pool allocator.

Contributor

I used the script referenced in issue #7661, which calls pynvml methods.

@devavret devavret requested a review from harrism March 31, 2021 08:25
@codecov bot commented Mar 31, 2021

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.08@3ed87f3). The diff coverage is n/a.

@@               Coverage Diff               @@
##             branch-21.08    #7770   +/-   ##
===============================================
  Coverage                ?   10.53%           
===============================================
  Files                   ?      116           
  Lines                   ?    18916           
  Branches                ?        0           
===============================================
  Hits                    ?     1993           
  Misses                  ?    16923           
  Partials                ?        0           

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3ed87f3...baa4150.

@davidwendt (Contributor) commented

Should this be going into 0.19? I thought we were in burndown.

@devavret devavret changed the base branch from branch-0.19 to branch-0.20 March 31, 2021 13:22
@devavret devavret requested review from a team as code owners March 31, 2021 13:22
@devavret devavret requested review from cwharris and isVoid and removed request for cwharris March 31, 2021 13:22
@github-actions github-actions bot added the conda, Java, and Python labels Jul 1, 2021
@devavret devavret changed the base branch from branch-21.06 to branch-21.08 July 1, 2021 18:57
@devavret devavret requested a review from harrism July 1, 2021 19:03
@harrism (Member) commented Jul 6, 2021

Why not make this apply to all benchmarks by putting it in the base fixture?

@devavret (Contributor, Author) commented Jul 6, 2021

> Why not make this apply to all benchmarks by putting it in the base fixture?

I did that earlier in this PR (c34b32b) but changed it to per-benchmark in b9147f4, because adding it to the base fixture would also include allocations made during the setup stage. We only care about the allocations happening in the API being benchmarked.

@harrism (Member) commented Jul 7, 2021

What allocations are there in the setup stage? Can't those be moved before where you create the statistics tracking resource and set it as the default?

@devavret (Contributor, Author) commented Jul 7, 2021

> What allocations are there in the setup stage? Can't those be moved before where you create the statistics tracking resource and set it as the default?

Setup happens in most benchmarks' primary function, e.g. BM_parq_read_varying_input, where a table is created and written with write_parquet before read_parquet is profiled in the for (auto _ : state) loop. This entire function is called after cudf::benchmark::SetUp, so we want to activate the statistics resource right before the for (auto _ : state) loop.
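For context, a compressed skeleton of that layout; the setup bodies are placeholders and the reader options shown ("input.parquet") are assumptions rather than this PR's code:

```cpp
#include <benchmark/benchmark.h>
#include <cudf/io/parquet.hpp>

// Skeleton only: shows where setup ends and the measured region begins in a
// cuIO read benchmark. The real BM_parq_read_varying_input generates a random
// table and writes it with cudf::io::write_parquet in the setup part below.
void BM_parq_read_skeleton(benchmark::State& state)
{
  // Setup: runs after cudf::benchmark::SetUp but before the measured loop, so a
  // fixture-wide tracker would count these allocations too.
  // ... create the input table, cudf::io::write_parquet(...) ...

  // A tracking/statistics resource should be activated here, right before the loop.

  auto const read_opts = cudf::io::parquet_reader_options::builder(
                           cudf::io::source_info{"input.parquet"})
                           .build();

  for (auto _ : state) {  // only this region exercises the API under test
    cudf::io::read_parquet(read_opts);
  }
}
```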

@harrism (Member) commented Jul 8, 2021

I see. Perhaps we can provide another function in the fixture that all benchmarks can use to initiate statistics logging...

@github-actions github-actions bot removed the CMake, Python, gpuCI, and Java labels Jul 9, 2021
@devavret (Contributor, Author) commented Jul 9, 2021

> I see. Perhaps we can provide another function in the fixture that all benchmarks can use to initiate statistics logging...

How about an RAII object that does this: 8d85fa9.
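One shape such an RAII helper could take, as a sketch only: it assumes rmm::mr::statistics_resource_adaptor and its get_bytes_counter() accessor, and the class and member names are chosen for illustration rather than taken from commit 8d85fa9:

```cpp
#include <rmm/mr/device/per_device_resource.hpp>
#include <rmm/mr/device/statistics_resource_adaptor.hpp>
#include <cstddef>

// Sketch only: constructing the logger swaps in a statistics adaptor as the
// current device resource; destroying it restores the original resource.
class memory_stats_logger {
 public:
  memory_stats_logger()
    : existing_mr(rmm::mr::get_current_device_resource()), statistics_mr(existing_mr)
  {
    rmm::mr::set_current_device_resource(&statistics_mr);
  }

  ~memory_stats_logger() { rmm::mr::set_current_device_resource(existing_mr); }

  [[nodiscard]] std::size_t peak_memory_usage()
  {
    return static_cast<std::size_t>(statistics_mr.get_bytes_counter().peak);
  }

 private:
  rmm::mr::device_memory_resource* existing_mr;
  rmm::mr::statistics_resource_adaptor<rmm::mr::device_memory_resource> statistics_mr;
};

// Usage inside a benchmark, immediately before the measured loop:
//   auto mem_stats_logger = memory_stats_logger();
//   for (auto _ : state) { /* API under test */ }
//   state.counters["peak_memory_usage"] = mem_stats_logger.peak_memory_usage();
```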

@harrism (Member) commented Jul 13, 2021

I like it. Easy to add to other benchmarks.

@devavret (Contributor, Author) commented

rerun tests

@vuule (Contributor) left a comment


Looks good, just got an optional nitpick.

cpp/benchmarks/fixture/benchmark_fixture.hpp (resolved review thread, outdated)
@devavret (Contributor, Author) commented

@gpucibot merge

@rapids-bot rapids-bot bot merged commit fdf4901 into rapidsai:branch-21.08 Jul 19, 2021
Labels
2 - In Progress, cuIO, improvement, libcudf, non-breaking
Projects
None yet
5 participants