C++ benchmarking for MG PageRank #1755

Merged
merged 11 commits into from
Aug 5, 2021

seunghwak
Contributor

@seunghwak seunghwak commented Aug 4, 2021

  • Update Google Test-based tests to take additional command line arguments (to report performance measurements and to control the R-mat graph size from the command line)
  • Store the parsed command line arguments in globally accessible variables
  • Update MG PageRank test code to behave based on the globally accessible variables storing command line inputs.

Exemplar benchmark scripts

// 32 bit vertex & edge IDs
mpirun -n 2 --tag-output --output-filename log ./tests/MG_PAGERANK_TEST --gtest_filter=rmat_large_tests/Tests_MGPageRank_Rmat.CheckInt32Int32* --perf --rmat_scale=25 --rmat_edge_factor=16
// 32 bit vertex IDs & 64 bit edge IDs
mpirun -n 2 --tag-output --output-filename log ./tests/MG_PAGERANK_TEST --gtest_filter=rmat_large_tests/Tests_MGPageRank_Rmat.CheckInt32Int64* --perf --rmat_scale=25 --rmat_edge_factor=16
// 64 bit vertex & edge IDs
mpirun -n 2 --tag-output --output-filename log ./tests/MG_PAGERANK_TEST --gtest_filter=rmat_large_tests/Tests_MGPageRank_Rmat.CheckInt64Int64* --perf --rmat_scale=25 --rmat_edge_factor=16
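
For readers unfamiliar with the approach, the flags in the commands above could be parsed into globally accessible variables roughly as sketched below. This is only an illustration under stated assumptions: `BenchmarkOptions`, `g_options`, `parse_value`, and `parse_benchmark_options` are hypothetical names, not the identifiers used in this PR.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical container for the parsed flags; the names in the actual PR may differ.
struct BenchmarkOptions {
  bool perf{false};                               // set by --perf
  std::optional<std::size_t> rmat_scale{};        // set by --rmat_scale=<n>
  std::optional<std::size_t> rmat_edge_factor{};  // set by --rmat_edge_factor=<n>
};

// Globally accessible instance that test code can read after parsing.
inline BenchmarkOptions g_options{};

// If `arg` starts with `prefix`, return the remainder parsed as an unsigned integer.
inline std::optional<std::size_t> parse_value(std::string const& arg, std::string const& prefix)
{
  if (arg.compare(0, prefix.size(), prefix) != 0) { return std::nullopt; }
  return static_cast<std::size_t>(std::stoull(arg.substr(prefix.size())));
}

// Scan argv and record the benchmark flags; unrecognized arguments are ignored.
inline void parse_benchmark_options(int argc, char const* const* argv)
{
  for (int i = 1; i < argc; ++i) {
    std::string const arg{argv[i]};
    if (arg == "--perf") {
      g_options.perf = true;
    } else if (auto scale = parse_value(arg, "--rmat_scale=")) {
      g_options.rmat_scale = scale;
    } else if (auto edge_factor = parse_value(arg, "--rmat_edge_factor=")) {
      g_options.rmat_edge_factor = edge_factor;
    }
  }
}
```

In a test binary, a function like this would presumably be called from `main` alongside the Google Test initialization, so that the values are set before any test body runs.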

@seunghwak seunghwak requested a review from a team as a code owner August 4, 2021 13:59
@seunghwak seunghwak changed the title [WIP] C++ benchmarking for MG PageRank [WIP][skip-ci] C++ benchmarking for MG PageRank Aug 4, 2021
@seunghwak seunghwak added 2 - In Progress improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Aug 4, 2021
@seunghwak seunghwak changed the title [WIP][skip-ci] C++ benchmarking for MG PageRank C++ benchmarking for MG PageRank Aug 4, 2021
@seunghwak seunghwak self-assigned this Aug 4, 2021
@seunghwak
Contributor Author

@rlratzel @ChuckHastings Please review this based on our previous discussion about C++ benchmarking. Once we agree on the methodology, I will apply this to other tests as well (in a separate PR).

@kaatish You may also take a look as MG graph primitive tests should follow this.

@codecov-commenter

codecov-commenter commented Aug 4, 2021

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.10@f1dffc4). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head 23b4b66 differs from pull request most recent head 09fcdee. Consider uploading reports for the commit 09fcdee to get more accurate results

@@               Coverage Diff               @@
##             branch-21.10    #1755   +/-   ##
===============================================
  Coverage                ?   59.72%           
===============================================
  Files                   ?       77           
  Lines                   ?     3521           
  Branches                ?        0           
===============================================
  Hits                    ?     2103           
  Misses                  ?     1418           
  Partials                ?        0           

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f1dffc4...09fcdee. Read the comment docs.

@seunghwak
Contributor Author

rerun tests

auto param = GetParam();
run_current_test<int32_t, int32_t, float, float>(std::get<0>(param), std::get<1>(param));
auto param = GetParam();
auto pagerank_usecase = std::get<0>(param);
Collaborator

This approach will override all rmat tests, right? In this case we have a small rmat use case that has the correctness test enabled and a large rmat use case that has the correctness test disabled. I believe this change would consequently run the test twice for each combination (32/32, 32/64, 64/64), once with and once without validation. Not sure that's what we want.

Collaborator

Perhaps we create a separate test to address this. I'm bad at naming things, so imagine better names:

  1. Create a test suite called Benchmark_MGPageRank_Rmat. Disable validation in that test suite and provide a single rmat test (something small so that when we're doing unit tests in CI it won't take long)
  2. Add this logic to that unit test, but not the basic unit tests
  3. When you want to run with these overriding parameters you can run: mpirun -np 2 tests/MG_PAGERANK_TEST --gtest_filter=Benchmark_MGPageRank_Rmat* and append whatever overriding parameters are desired

That would allow you to get a clean run without the extra runs.

Contributor Author

This approach will override all rmat tests, right?

Yes without --gtest_filter; no with the filter. This was intended for use in benchmarking mode with something like --gtest_filter=rmat_large_tests/Tests_MGPageRank_Rmat.CheckInt32Int32*.

Contributor Author

See the "Exemplar benchmark scripts" part of this PR description.

Contributor Author

@seunghwak seunghwak Aug 4, 2021

With the filter set to run only rmat_large_tests, I think the current code can achieve what we could do with Benchmark_MGPageRank_Rmat without actually adding additional code for that (and with minimal disruption to our C++ testing workflow).

Collaborator

OK. So I should read all of the description when reviewing the PR, just not jump to the code :-)

It would be nice to encapsulate this logic somehow rather than having to repeat it in every test. Could we create a function that updates the rmat_usecase and is reusable across all of the tests (SG or MG)? Perhaps it could be a method in the rmat_usecase class?
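
A reusable helper along the lines suggested above might look roughly like the following sketch. It assumes a simplified stand-in for the use case type; `RmatUsecaseSketch`, `RmatOverrides`, and `apply_overrides` are hypothetical names chosen for illustration and are not the test suite's actual classes or functions.

```cpp
#include <cstddef>
#include <optional>

// Simplified stand-in for the test suite's R-mat use case; the real class has more fields.
struct RmatUsecaseSketch {
  std::size_t scale{10};
  std::size_t edge_factor{16};
  bool check_correctness{true};
};

// Hypothetical holder for values parsed from the command line.
// An empty optional means "no override was given for this field".
struct RmatOverrides {
  std::optional<std::size_t> scale{};
  std::optional<std::size_t> edge_factor{};
};

// Return a copy of `usecase` with any command-line overrides applied.
// Fields without an override keep the values from the test's parameter list.
inline RmatUsecaseSketch apply_overrides(RmatUsecaseSketch usecase, RmatOverrides const& overrides)
{
  if (overrides.scale) { usecase.scale = *overrides.scale; }
  if (overrides.edge_factor) { usecase.edge_factor = *overrides.edge_factor; }
  return usecase;
}
```

Taking the use case by value and returning the modified copy keeps the parameter list held by the test framework untouched, so any SG or MG test could wrap its use case in a single call.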

Collaborator

Also, did we want to allow for specifying other input files (an analog for the file_usecase)? That might require some refactoring of the tests since most of our tests don't have a single file test without validation.

Contributor Author

It would be nice to encapsulate this logic somehow rather than having to repeat it in every test.

Agreed, I will push this update in the next commit.

Also, did we want to allow for specifying other input files (an analog for the file_usecase)?

Yeah... eventually, but let's think about this later (I guess we may need to add something like file_large_tests... but let's worry about this when we have a specific usecase).

@BradReesWork BradReesWork added this to the 21.10 milestone Aug 5, 2021
Contributor

@rlratzel rlratzel left a comment

Approving, but I do have some feedback that would be nice to address if you can, though I think it might be out of scope for this PR.

@@ -316,7 +311,22 @@ TEST_P(Tests_MGPageRank_File, CheckInt32Int32FloatFloat)
 TEST_P(Tests_MGPageRank_Rmat, CheckInt32Int32FloatFloat)
 {
   auto param = GetParam();
-  run_current_test<int32_t, int32_t, float, float>(std::get<0>(param), std::get<1>(param));
+  run_current_test<int32_t, int32_t, float, float>(
+    std::get<0>(param), override_Rmat_Usecase_with_cmd_line_arguments(std::get<1>(param)));
Contributor

A downside to this approach is that gtest doesn't know the override is happening, and as a result it may end up producing redundant combinations of input params (since each param is being overridden the same way) if there is more than one rmat usecase instance in the param list. I'm guessing --gtest_list_tests may not print the correct list of tests when one of the rmat options is passed either (e.g. TestSuite/TestCase/2 will be associated with the default params and not the overridden ones), but that's relatively minor.

Unfortunately I don't have a better suggestion that's in the scope of this PR though.
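
The redundancy described above can be illustrated with a small standalone sketch. It assumes each use case is reduced to a (scale, edge factor) pair and that both values are overridden from the command line; `RmatParams` and `count_distinct_after_override` are hypothetical names, and Google Test itself is not involved.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// (scale, edge_factor) pairs standing in for Rmat_Usecase instances in a parameter list.
using RmatParams = std::pair<std::size_t, std::size_t>;

// Count how many distinct graph configurations remain after every entry in the
// parameter list is overridden with the same command-line values.
inline std::size_t count_distinct_after_override(std::vector<RmatParams> const& param_list,
                                                 std::size_t scale_override,
                                                 std::size_t edge_factor_override)
{
  std::set<RmatParams> distinct{};
  for (auto const& param : param_list) {
    (void)param;  // the original values are discarded by the override
    distinct.emplace(scale_override, edge_factor_override);
  }
  return distinct.size();
}
```

With two different entries in the list, the function reports a single distinct configuration, which means the framework would still run two parameterized tests on identical graphs unless a filter narrows the selection.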

Contributor Author

Yeah... maybe it might be better to rename rmat_large_tests to something like rmat_benchmarks (@ChuckHastings also mentioned creating Benchmark_MGPageRank_Rmat) and require rmat_benchmarks to have only one Rmat_Usecase (and in benchmark mode, just select rmat_benchmarks using --gtest_filter).

@ChuckHastings
Collaborator

@gpucibot merge

@rapids-bot rapids-bot bot merged commit c6974d7 into rapidsai:branch-21.10 Aug 5, 2021
rapids-bot bot pushed a commit that referenced this pull request Aug 6, 2021
Apply the updates in PR #1755 (for MG PageRank) to

MG: Katz Centrality, BFS, SSSP, WCC, and primitive tests
SG: PageRank, Katz Centrality, BFS, SSSP, and WCC tests

@ChuckHastings I will defer Louvain test updates to you :-) once Louvain tests get updated to take R-mat graphs.

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: #1762
@seunghwak seunghwak deleted the fea_cpp_benchmark branch October 20, 2021 17:26