device input generation in join bench #10277
Codecov Report
@@ Coverage Diff @@
## branch-22.04 #10277 +/- ##
================================================
- Coverage 10.67% 10.67% -0.01%
================================================
Files 122 122
Lines 20873 20878 +5
================================================
Hits 2228 2228
- Misses 18645 18650 +5
cpp/benchmarks/join/join_common.hpp
@@ -20,9 +20,14 @@
#include <nvbench/nvbench.cuh>
According to https://github.com/rapidsai/cudf/blob/branch-22.04/cpp/docs/DEVELOPER_GUIDE.md#includes, includes should be ordered from nearest to farthest.
One minor suggestion for you to consider.
LGTM 👍
rerun tests
@gpucibot merge
To speed up benchmark input generation, move all data generation to device.

To address #5773 (comment), this PR moves the random input generation to device. The rest of the original work in this PR was split into multiple PRs and merged: #10277 #10278 #10279 #10280 #10281 #10300

With all of these changes, a single iteration of all benchmarks runs in under 1000 seconds (down from 3067s to 964s). Running more iterations would show a larger benefit, because the benchmark is restarted several times during a run, and each restart calls the benchmark input generation code again.

closes #9857

Authors:
- Karthikeyan (https://github.com/karthikeyann)

Approvers:
- Vyas Ramasubramani (https://github.com/vyasr)
- Vukasin Milovanovic (https://github.com/vuule)
- David Wendt (https://github.com/davidwendt)

URL: #10109
Use device functions to move input generation to device in join benchmark.
Splitting PR #10109 for review
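For readers unfamiliar with the idea, here is a minimal sketch of what "input generation on device" usually relies on: each element is computed from its index and a seed alone, with no shared generator state, so one GPU thread can produce one element. This is not the code from this PR. The names `mix64`, `key_at`, `generate_keys`, and the `HOST_DEVICE` macro are hypothetical, and the mixer is a SplitMix64-style finalizer chosen only for illustration. The block compiles as plain C++ and the per-element function is also marked for device use when built with nvcc.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical macro: marks functions as callable from device code when
// compiled with nvcc, and expands to nothing for a plain C++ compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// SplitMix64-style mixer (illustrative choice, not taken from cuDF).
// It is a pure function of its argument, so it needs no generator state.
HOST_DEVICE inline std::uint64_t mix64(std::uint64_t x)
{
  x += 0x9E3779B97F4A7C15ULL;
  x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
  x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
  return x ^ (x >> 31);
}

// Pseudo-random join key in [0, max_key] for row `idx`.
// Depends only on (seed, idx), so rows can be generated in any order
// and in parallel, e.g. one GPU thread per row.
HOST_DEVICE inline std::int32_t key_at(std::uint64_t seed, std::size_t idx, std::int32_t max_key)
{
  std::uint64_t const range = static_cast<std::uint64_t>(max_key) + 1;
  return static_cast<std::int32_t>(mix64(seed ^ mix64(idx)) % range);
}

// Host-side reference loop. On device, the same per-row function could be
// applied once per index (for example through something like thrust::tabulate)
// so that the column never has to be built on the host and copied over.
std::vector<std::int32_t> generate_keys(std::uint64_t seed, std::size_t n, std::int32_t max_key)
{
  std::vector<std::int32_t> keys(n);
  for (std::size_t i = 0; i < n; ++i) {
    keys[i] = key_at(seed, i, max_key);
  }
  return keys;
}
```

Calling `generate_keys(42, 1000, 99)` twice yields the same 1000 keys, all in `[0, 99]`. That reproducibility from a seed is what lets a benchmark regenerate identical inputs on each restart without a host-to-device copy.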