Add device create_sequence_table for benchmarks #10300

karthikeyann · 2022-02-15T18:04:49Z

Addresses parts of #5773.

Add `create_sequence_table`, which creates sequences on the device (only numeric types are supported), with or without nulls.
Add `create_random_null_mask` to create a random null mask with a given null probability (0.0 to 1.0).
~~- add gnu++17 to generate_input.cu (temporarily for int128 STL support).~~
Renamed `repeat_dtypes` to `cycle_dtypes` and moved it out of the `create_*` methods.
Updated the AST, search, scatter, and binary ops benchmarks.
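As a rough illustration of what moving `cycle_dtypes` out of the `create_*` methods means for callers, the sketch below cycles a short list of types round-robin until a requested number of columns is reached. It is a host-only approximation: the `type_id` enum here is a hypothetical stand-in for `cudf::type_id`, and the signature is an assumption for illustration, not the exact declaration in `generate_input.hpp`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cudf::type_id, used only for this illustration.
enum class type_id { INT32, FLOAT64, BOOL8 };

// Repeat `dtype_ids` round-robin until `num_cols` entries have been produced.
// The signature is assumed; the real helper may differ.
std::vector<type_id> cycle_dtypes(std::vector<type_id> const& dtype_ids, std::size_t num_cols)
{
  assert(!dtype_ids.empty() && "need at least one type to cycle through");
  std::vector<type_id> out;
  out.reserve(num_cols);
  for (std::size_t i = 0; i < num_cols; ++i) {
    out.push_back(dtype_ids[i % dtype_ids.size()]);
  }
  return out;
}
```

With the cycling done by the caller, a table-creation function only needs to accept the final list of column types, one entry per column.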

Splitting PR #10109 for review


codecov · 2022-02-15T20:41:42Z

Codecov Report

Merging #10300 (788b859) into branch-22.04 (c163886) will decrease coverage by 0.00%.
The diff coverage is 0.00%.

❗ Current head 788b859 differs from pull request most recent head fbd5708. Consider uploading reports for the commit fbd5708 to get more accurate results

@@               Coverage Diff                @@
##           branch-22.04   #10300      +/-   ##
================================================
- Coverage         10.62%   10.62%   -0.01%     
================================================
  Files               122      122              
  Lines             20961    20973      +12     
================================================
  Hits               2228     2228              
- Misses            18733    18745      +12

| Impacted Files | Coverage Δ |
|---|---|
| python/cudf/cudf/core/_base_index.py | `0.00% <0.00%> (ø)` |
| python/cudf/cudf/core/_internals/where.py | `0.00% <0.00%> (ø)` |
| python/cudf/cudf/core/column/categorical.py | `0.00% <0.00%> (ø)` |
| python/cudf/cudf/core/column/column.py | `0.00% <0.00%> (ø)` |
| python/cudf/cudf/core/column/decimal.py | `0.00% <0.00%> (ø)` |
| python/cudf/cudf/core/column/lists.py | `0.00% <0.00%> (ø)` |
| python/cudf/cudf/core/column/string.py | `0.00% <0.00%> (ø)` |
| python/cudf/cudf/core/dataframe.py | `0.00% <0.00%> (ø)` |
| python/cudf/cudf/core/df_protocol.py | `0.00% <0.00%> (ø)` |
| python/cudf/cudf/core/frame.py | `0.00% <0.00%> (ø)` |

... and 7 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c163886...fbd5708.

karthikeyann · 2022-02-16T04:45:10Z

rerun tests


robertmaynard

CMake and refactoring to avoid build flag changes look great. Thank you.


bdice · 2022-02-23T18:08:17Z

cpp/benchmarks/common/generate_nullmask.cu

+#include <thrust/random.h>
+
+/**
+ * @brief valid bit generator with given probability [0.0 - 1.0]


I'd change the docs / naming here. There's no such thing as a "valid" bit -- only set (1) or unset (0) bits. This is layered with our interpretation of a set of bits as a null mask, but this function as written only deals with generating bools. (Moreover, this generates a byte-size bool and not a "bit".)

This mixing of semantics was a large part of what I untangled in #9588. In particular, there were issues with how that interpretation of valid/set and null/unset results in higher-order semantics when a null mask is nullptr. It's not possible to count "set" bits in the data pointed to by nullptr, but we have a specific interpretation of that in the context of validity / cudf null masks.

I want to emphasize this distinction and keep our lowest-level primitives (such as random generators) free of the valid/null interpretations of bits/bools. Including those semantics in functions like create_random_null_mask is fine, because the interpretation is explicitly intended and communicated in the function name / docstring.
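The layering described above can be sketched as follows. This is a host-only illustration using the C++ standard library rather than Thrust, and every name in it (`bool_generator`, `null_mask_result`, `make_random_null_mask`) is hypothetical. The low-level generator only produces bools with a given probability; the valid/null interpretation appears only in the higher-level function, which packs those bools into 32-bit words (assuming a set bit means a valid row and bits are stored least-significant first).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Low-level primitive: returns true with the given probability.
// It knows nothing about validity or nulls.
struct bool_generator {
  std::mt19937 engine;
  std::bernoulli_distribution dist;

  bool_generator(unsigned seed, double probability_true) : engine(seed), dist(probability_true) {}

  bool operator()() { return dist(engine); }
};

// Hypothetical result type: packed mask words plus the number of unset bits.
struct null_mask_result {
  std::vector<std::uint32_t> words;
  std::size_t null_count;
};

// Higher-level function: this is where "set bit == valid row" is introduced.
null_mask_result make_random_null_mask(std::size_t size, double null_probability, unsigned seed)
{
  assert(null_probability >= 0.0 && null_probability <= 1.0);
  bool_generator is_set{seed, 1.0 - null_probability};
  null_mask_result result{std::vector<std::uint32_t>((size + 31) / 32, 0u), 0};
  for (std::size_t i = 0; i < size; ++i) {
    if (is_set()) {
      result.words[i / 32] |= (std::uint32_t{1} << (i % 32));  // mark row i as valid
    } else {
      ++result.null_count;  // an unset bit is interpreted as a null row
    }
  }
  return result;
}
```

Keeping the generator free of null semantics means it can be reused for any boolean data, while the null count is computed only where the mask interpretation is explicit.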


bdice

I have a few comments above - requesting changes for those. (I would have submitted them as a proper review rather than individual comments, but I didn't expect to add more than 1-2 small comments. Sorry about that.)

karthikeyann · 2022-02-24T18:39:22Z

All review comments addressed.
The number of files changed is high due to the `cycle_dtypes` change.


karthikeyann · 2022-02-25T02:18:43Z

@gpucibot merge

Fixes `BINARYOP_BENCH` which is throwing an error for non-numeric types:

```
terminate called after throwing an instance of 'cudf::logic_error'
  what(): cuDF failure at: /cudf/cpp/src/filling/sequence.cu:139: init scalar type must be numeric
```

The `compiled_binaryop.cpp` was recently changed in #10300 to create test columns using the benchmark utility `create_sequence_table` which internally calls `cudf::sequence` API. Unfortunately, [only `numeric` types can be used with this API](https://github.com/rapidsai/cudf/blob/a9b6cb113bcacecd0752d2957971c0d417cf719e/cpp/src/filling/sequence.cu#L139) which throws an error for types like `timestamp, duration, and decimal` which are being measured in this file.

https://docs.rapids.ai/api/libcudf/stable/group__transformation__fill.html#gaeda630c9dcdd152eeecf0a1b636244ac

The fix replaces the `create_sequence_table` call with `create_random_table` to generate the source columns instead.

Authors:
- David Wendt (https://github.com/davidwendt)

Approvers:
- Karthikeyan (https://github.com/karthikeyann)
- Nghia Truong (https://github.com/ttnghia)

URL: #10398

To speed up benchmark input generation, move all data generation to device. To address #5773 (comment), this PR moves the random input generation to device. The rest of the original work in this PR was split into multiple PRs and merged: #10277 #10278 #10279 #10280 #10281 #10300

With all of these changes, a single iteration of all benchmarks runs in <1000 seconds (from 3067s to 964s). Running more iterations would see a higher benefit too, because the benchmark is restarted several times during a run, which again calls the benchmark input generation code.

closes #9857

Authors:
- Karthikeyan (https://github.com/karthikeyann)

Approvers:
- Vyas Ramasubramani (https://github.com/vyasr)
- Vukasin Milovanovic (https://github.com/vuule)
- David Wendt (https://github.com/davidwendt)

URL: #10109

karthikeyann added 11 commits February 14, 2022 21:48

rename generate_input.cpp to generate_input.cu

916ce00

add create_sequence_table, create_random_null_mask

d7f0f29

fix includes, seed

bb74cc7

use cuda::std to include int128

0ea4f60

use -std=gnu++17 for generate_input.cu for int128 support

a25241e

go back to using BENCHMARK_TEMPLATE_DEFINE_F

dfd33f2

use create_sequence_table in ast bench

f9f3eec

use create_sequence_table in binops bench

81ac53a

use create_sequence_table, thrust::shuffle in scatter bench

6c659d4

use cudf::sequence, create_random_null_mask in search bench

9f5c5ba

update copyright year

6758095

karthikeyann added this to the C++ Benchmark Runtime Improvements milestone Feb 15, 2022

karthikeyann requested a review from a team as a code owner February 15, 2022 18:04

karthikeyann self-assigned this Feb 15, 2022

karthikeyann requested review from mythrocks and ttnghia February 15, 2022 18:04

github-actions bot added the CMake label (CMake build issue) Feb 15, 2022

karthikeyann added 2 commits February 15, 2022 23:57

style fix clang format

718e269

Merge branch 'branch-22.04' of github.com:rapidsai/cudf into fea-benchmark_speedup_2.6

704bb72

bdice requested changes Feb 16, 2022


karthikeyann and others added 2 commits February 22, 2022 23:11

Merge branch 'branch-22.04' into fea-benchmark_speedup_2.6

bdbdf49

Revert "rename generate_input.cpp to generate_input.cu"

02ef0d2

separate nullmask generator code. This reverts commit 916ce00.

karthikeyann requested review from robertmaynard and mythrocks February 22, 2022 20:06

robertmaynard approved these changes Feb 22, 2022


mythrocks approved these changes Feb 23, 2022


bdice reviewed Feb 23, 2022

cpp/benchmarks/common/generate_input.hpp (resolved)

bdice reviewed Feb 23, 2022

cpp/benchmarks/common/generate_nullmask.cu (outdated, resolved)

bdice reviewed Feb 23, 2022

cpp/benchmarks/copying/scatter.cu (outdated, resolved)

bdice reviewed Feb 23, 2022

cpp/benchmarks/search/search.cpp (outdated, resolved)

bdice requested changes Feb 23, 2022


karthikeyann added 6 commits February 24, 2022 21:10

rename generator functor

820b417

simplify create null mask

9028a80

rename repeat_dtypes to cycle_dtypes

4f1f3e8

move cycle_dtypes out for create_sequence_table

b31de3a

move cycle_dtypes out of create_random_table

1d4d57a

fix null mask null_count

581e4b8

karthikeyann requested a review from bdice February 24, 2022 18:38

bdice approved these changes Feb 24, 2022

cpp/benchmarks/common/generate_nullmask.cu (outdated, resolved)

cpp/benchmarks/common/generate_nullmask.cu (outdated, resolved)

address review comments

fbd5708

rapids-bot bot merged commit eaae94b into rapidsai:branch-22.04 Feb 25, 2022

karthikeyann mentioned this pull request Mar 7, 2022

generate benchmark input in device #10109

Merged

davidwendt mentioned this pull request Mar 8, 2022

Fix error thrown in compiled-binaryop benchmark #10398

Merged
