Add implicit stream benchmarking support #76
Conversation
We can simplify this quite a bit with rule-of-0 using a unique_ptr
with custom deleter.
Thanks for the PR! I'll take a look at this once I get the Thrust/CUB 1.16 release candidate prepped. Should be later this week if all goes well.
@PointKernel Could you add an example for this feature in …
Also update to use `nvbench::make_cuda_stream_view`.
I pushed a couple of commits that update the documentation and add an `nvbench::make_cuda_stream_view` function. I think that's a clean solution to the non-owning API issue.
This looks good to me -- thanks for working on this @PointKernel and @jrhemstad!
`docs/benchmarks.md` (Outdated)
> NVBench records the elapsed time of work on a CUDA stream for each iteration of a benchmark. By default, NVBench creates and provides an explicit stream via `launch::get_stream()` to pass to every stream-ordered operation.
This isn't quite true -- for the isolated/cold measurements, each iteration is recorded separately, but for the batch/hot measurements, several iterations are lumped together under a single timer.
I'd also move this down into its own section -- this section is meant to give an extremely brief overview of a minimal benchmark specification and introduce key concepts. Using an explicit stream is an advanced use case that should have its own section.
I'll push a commit to this branch that restructures this a bit, since I'm pretty picky about these docs 😅
`nvbench/detail/measure_cupti.cu` (Outdated)
    try : m_state{exec_state}
        , m_launch{m_state.get_cuda_stream()}
        , m_cupti{*m_state.get_device(), add_metrics(m_state)}
Heh. Understandably, clang-format is not a fan of initializer-scope try statements. I'll clean this up a bit in my follow-up patch.
Closes #13
This PR adds support for benchmarking functions that do not expose stream parameters.