Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update benchmarking guide to use NVBench. #10093

Merged
merged 3 commits into from
Jan 24, 2022

Conversation

bdice
Copy link
Contributor

@bdice bdice commented Jan 20, 2022

This PR relates to #7960 and indicates that new benchmarks should be written using NVBench. (This PR does not resolve that issue because it does not transition any existing benchmarks to nvbench, it is only an update in the developer guide.)

@bdice bdice requested a review from a team as a code owner January 20, 2022 18:27
@bdice bdice requested review from trxcllnt and davidwendt January 20, 2022 18:27
@bdice bdice self-assigned this Jan 20, 2022
@bdice bdice added the doc Documentation label Jan 20, 2022
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Jan 20, 2022
@bdice bdice requested a review from jrhemstad January 20, 2022 18:27
cpp/docs/BENCHMARKING.md Outdated Show resolved Hide resolved
@bdice bdice changed the title Update benchmarking guide to use nvbench. Update benchmarking guide to use NVBench. Jan 20, 2022
@bdice bdice added the non-breaking Non-breaking change label Jan 20, 2022
`cudf/cpp/src/copying/scatter.cu` has benchmarks in
`cudf/cpp/benchmarks/copying/scatter_benchmark.cu`.
`cpp/src/copying/scatter.cu` has benchmarks in
`cpp/benchmarks/copying/scatter_benchmark.cu`.
Copy link
Member

@PointKernel PointKernel Jan 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been naming nvbench files with _nvbench.cu/cpp suffix as opposed to _benchmark.cu/cpp in gbench and executable with <feature>_NVBENCH (instead of <feature>_BENCH) to distinguish them from gbench. Not sure it's the right way to go for all nvbench benchmarks though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer no suffix on the source files. As nvbench becomes the primary benchmark it becomes just noise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertmaynard Does "no suffix" mean "remove _benchmark" as well?

I find _benchmark helpful for finding files by name, but I agree that _nvbench is too specific.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recall I decided to do so because of CMake:

  • ConfigureBench to build _benchmark.cu/cpp
  • ConfigureNVBench to build _nvbench.cu/cpp

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertmaynard Does "no suffix" mean "remove _benchmark" as well?

I find _benchmark helpful for finding files by name, but I agree that _nvbench is too specific.

I was thinking of tool specific suffix. So benchmark seems okay to me. The purist in me says that we have benchmarks in the path to the file already so no future distinction is required ( we don't use scatter_header.h ).

I recall I decided to do so because of CMake:

  • ConfigureBench to build _benchmark.cu/cpp
  • ConfigureNVBench to build _nvbench.cu/cpp

ConfigureBench and ConfigureNVBench are given a set of souce files to compile, and don't care how those files are named. So they would have no problem with scatter_badly_named.cu :)

The primary issue is that we have join_nvbench.cu and join_benchmark.cu, where the nvbench version tests the left_join, full_join, and inner_join algorithms, and google benchmarks tests the semi and anti joins. If we had gone with file names that matched the algorithms ( anti_join_benchmark.cu, left_join_benchmark.cu ) we wouldn't need to have anything end with _nvbench.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So do we agree that no change is needed in the guide? (Perhaps some changes are needed to rename _nvbench.cu/cpp files.)

Copy link
Contributor

@karthikeyann karthikeyann Jan 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to ideally lose suffixes.
Once our benchmarks become large enough to need to split across many files, we might need to add suffixes based on context of that benchmark. At that point, we would prefer to lose _nvbench, _benchmark suffixes because of long filenames if we had it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@karthikeyann I can see the argument to remove _benchmark. However, that wouldn't align with the current state of the benchmarks. I'd like to keep this PR focused on the change to "use nvbench" and that change can be done at a later point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just make sure when someone goes through and removes _benchmark from every benchmark source file, they also update the guide to not use the suffix. (If it were me, I would change the example in the guide now...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harrism I'll merge this PR and do that now. 👍

@bdice
Copy link
Contributor Author

bdice commented Jan 21, 2022

rerun tests

@codecov
Copy link

codecov bot commented Jan 23, 2022

Codecov Report

Merging #10093 (ebd33fb) into branch-22.04 (e24fa8f) will increase coverage by 0.04%.
The diff coverage is 3.18%.

Impacted file tree graph

@@               Coverage Diff                @@
##           branch-22.04   #10093      +/-   ##
================================================
+ Coverage         10.37%   10.42%   +0.04%     
================================================
  Files               119      119              
  Lines             20149    20607     +458     
================================================
+ Hits               2091     2148      +57     
- Misses            18058    18459     +401     
Impacted Files Coverage Δ
python/cudf/cudf/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/io.py 0.00% <0.00%> (ø)
python/cudf/cudf/_fuzz_testing/orc.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/parquet.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/utils.py 0.00% <0.00%> (ø)
python/cudf/cudf/api/types.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/_base_index.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/core/column/column.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/decimal.py 0.00% <0.00%> (ø)
... and 81 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f1e0bb6...ebd33fb. Read the comment docs.

@bdice
Copy link
Contributor Author

bdice commented Jan 24, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 01b7c83 into rapidsai:branch-22.04 Jan 24, 2022
@bdice bdice mentioned this pull request Jan 24, 2022
@bdice
Copy link
Contributor Author

bdice commented Jan 24, 2022

I have addressed follow-up comments about removing _benchmark from the file names in a separate PR: #10112

rapids-bot bot pushed a commit that referenced this pull request Jan 25, 2022
This PR addresses suggestions from @robertmaynard, @karthikeyann, and @harrism in #10093 to remove the suffix `_benchmark` from the name of the benchmark files. The developer guide has been updated to reflect this change.

I left a few shared files with the word "benchmark" in them, such as [cpp/benchmarks/fixture/benchmark_fixture.hpp](https://github.com/rapidsai/cudf/blob/cfb6cbe6a8e4b01d7ee28fbe3212b5f2eb275167/cpp/benchmarks/fixture/benchmark_fixture.hpp) and [cpp/benchmarks/fixture/templated_benchmark_fixture.hpp](https://github.com/rapidsai/cudf/blob/cfb6cbe6a8e4b01d7ee28fbe3212b5f2eb275167/cpp/benchmarks/fixture/templated_benchmark_fixture.hpp).

I built the benchmarks locally to ensure that nothing was broken (I'm noting this because I recall that our CI does not build the benchmarks).

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)
  - Karthikeyan (https://github.com/karthikeyann)
  - Nghia Truong (https://github.com/ttnghia)

URL: #10112
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc Documentation libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants