Use opt-in shared memory carveout for FIL #3759
Conversation
rerun tests
cpp/src/fil/infer.cu
Outdated
@@ -576,21 +576,53 @@ __global__ void infer_k(storage_type forest, predict_params params) {
   }
 }

 void set_carveout(void* kernel, int footprint, int max_shm) {
   CUDA_CHECK(
     cudaFuncSetAttribute(kernel, cudaFuncAttributePreferredSharedMemoryCarveout,
Is this actually needed?
Yes. I've added comments to clarify.
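For readers following the thread, here is a minimal, self-contained sketch of the technique under review. It is not the PR's exact code: the real infer_k takes a forest and predict_params, FIL wraps runtime calls in a CUDA_CHECK macro, and the helper parameters here are simplified. The carveout attribute is a hint to prefer shared memory over L1, and the max-dynamic-shared-memory attribute is the opt-in that raises the per-block cap above the default 48 KiB.

```cuda
#include <cuda_runtime.h>

// Trivial stand-in for the FIL inference kernel under discussion.
__global__ void infer_k() {}

// Sketch of a set_carveout()-style helper (simplified signature).
void set_carveout(const void* kernel, int carveout_percent, int max_shm_bytes) {
  // Hint: prefer this percentage of the unified L1/shared storage as shared
  // memory. The driver may round or ignore the hint.
  cudaFuncSetAttribute(kernel, cudaFuncAttributePreferredSharedMemoryCarveout,
                       carveout_percent);
  // Opt in: raise this kernel's dynamic shared memory cap above the default.
  cudaFuncSetAttribute(kernel, cudaFuncAttributeMaxDynamicSharedMemorySize,
                       max_shm_bytes);
}

int main() {
  int dev = 0, opt_in_shm = 0;
  cudaGetDevice(&dev);
  // Largest per-block shared memory this device allows after opting in.
  cudaDeviceGetAttribute(&opt_in_shm, cudaDevAttrMaxSharedMemoryPerBlockOptin,
                         dev);
  set_carveout(reinterpret_cast<const void*>(infer_k), 100, opt_in_shm);
  // Launch with the opted-in amount of dynamic shared memory.
  infer_k<<<1, 32, opt_in_shm>>>();
  return cudaDeviceSynchronize() == cudaSuccess ? 0 : 1;
}
```

Without the second attribute, a launch requesting more than the default dynamic shared memory limit fails, which is why the helper sets both.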
rerun tests
…e_type>(predict_params, ...)
Co-authored-by: Andy Adinets <[email protected]>
rerun tests
rerun tests
rerun tests
rerun tests
Codecov Report
@@            Coverage Diff             @@
##           branch-21.12    #3759   +/- ##
===============================================
  Coverage              ?   85.92%
===============================================
  Files                 ?      231
  Lines                 ?    18587
  Branches              ?        0
===============================================
  Hits                  ?    15971
  Misses                ?     2616
  Partials              ?        0
===============================================
@gpucibot merge
To speed up inference on GPUs where extra shared memory can be opted in (and when this enables inference out of shared memory), take advantage.

Authors:
  - Levs Dolgovs (https://github.com/levsnv)

Approvers:
  - Andy Adinets (https://github.com/canonizer)
  - William Hicks (https://github.com/wphicks)

URL: rapidsai#3759
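To illustrate the decision the merge description refers to, here is a hypothetical sketch (the footprint value is made up, and FIL's actual logic lives in its params/shared-memory-footprint computation) of comparing a forest's shared memory footprint against the default and opt-in per-block limits:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

int main() {
  int dev = 0, default_shm = 0, opt_in_shm = 0;
  cudaGetDevice(&dev);
  // Default per-block shared memory limit (typically 48 KiB).
  cudaDeviceGetAttribute(&default_shm, cudaDevAttrMaxSharedMemoryPerBlock, dev);
  // Larger limit available only after cudaFuncSetAttribute opt-in.
  cudaDeviceGetAttribute(&opt_in_shm, cudaDevAttrMaxSharedMemoryPerBlockOptin,
                         dev);

  int footprint = 60 * 1024;  // example forest footprint in bytes (made up)
  if (footprint <= default_shm) {
    printf("forest fits in default shared memory\n");
  } else if (footprint <= opt_in_shm) {
    printf("forest fits only after opting in to extra shared memory\n");
  } else {
    printf("forest must stay in global memory\n");
  }
  return 0;
}
```

On architectures where the opt-in limit exceeds the default (e.g. Volta and later), the middle branch is the case this PR unlocks: forests that previously fell back to global memory can now be served from shared memory.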