Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use opt-in shared memory carveout for FIL #3759

Merged
merged 24 commits into from
Nov 12, 2021

Conversation

levsnv
Copy link
Contributor

@levsnv levsnv commented Apr 17, 2021

To speed up inference on GPUs where extra shared memory can be opted in (and when this enables inference out of shared memory), take advantage.

@levsnv levsnv added 3 - Ready for Review Ready for review by team CUDA / C++ CUDA issue non-breaking Non-breaking change Perf Related to runtime performance of the underlying code labels Apr 18, 2021
@levsnv levsnv changed the title [WIP] Use opt-in shared memory carveout for FIL Use opt-in shared memory carveout for FIL Apr 18, 2021
@levsnv levsnv marked this pull request as ready for review April 18, 2021 00:08
@levsnv levsnv requested a review from a team as a code owner April 18, 2021 00:08
@levsnv levsnv requested a review from canonizer April 18, 2021 00:08
@levsnv levsnv added the feature request New feature or request label Apr 18, 2021
@levsnv
Copy link
Contributor Author

levsnv commented Apr 18, 2021

rerun tests
HTTP 504 GATEWAY TIME-OUT for url https://conda.anaconda.org/conda-forge/linux-64/glog-0.4.0-h49b9bf7_3.tar.bz2

cpp/src/fil/common.cuh Outdated Show resolved Hide resolved
@@ -576,21 +576,53 @@ __global__ void infer_k(storage_type forest, predict_params params) {
}
}

void set_carveout(void* kernel, int footprint, int max_shm) {
CUDA_CHECK(
cudaFuncSetAttribute(kernel, cudaFuncAttributePreferredSharedMemoryCarveout,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. I've added the comments to clarify

cpp/src/fil/infer.cu Outdated Show resolved Hide resolved
cpp/src/fil/infer.cu Outdated Show resolved Hide resolved
cpp/test/sg/fil_test.cu Outdated Show resolved Hide resolved
cpp/src/fil/infer.cu Outdated Show resolved Hide resolved
cpp/src/fil/infer.cu Outdated Show resolved Hide resolved
cpp/src/fil/infer.cu Outdated Show resolved Hide resolved
@dantegd dantegd added 4 - Waiting on Author Waiting for author to respond to review and removed 3 - Ready for Review Ready for review by team labels Apr 24, 2021
@levsnv levsnv requested a review from canonizer May 26, 2021 02:15
@levsnv levsnv added 4 - Waiting on Reviewer Waiting for reviewer to review or respond and removed 4 - Waiting on Author Waiting for author to respond to review labels May 26, 2021
@levsnv
Copy link
Contributor Author

levsnv commented May 26, 2021

rerun tests:
[ FAILED ] RFBatchedRegTests/RFBatchedRegTestD.Fit/0, where GetParam() = 64-byte object <05-00 00-00 01-00 00-00 01-00 00-00 00-00 80-3F 00-00 80-3F 01-00 00-00 FF-FF FF-FF 00-00 00-00 05-00 00-00 01-00 00-00 01-00 00-00 02-00 00-00 00-00 00-00 01-00 00-00 02-00 00-00 00-00 A0-C0

@levsnv levsnv added 3 - Ready for Review Ready for review by team and removed 4 - Waiting on Author Waiting for author to respond to review labels Nov 9, 2021
@levsnv levsnv removed the request for review from hcho3 November 9, 2021 07:25
@dantegd
Copy link
Member

dantegd commented Nov 9, 2021

rerun tests

cpp/src/fil/fil.cu Outdated Show resolved Hide resolved
@levsnv levsnv added 4 - Waiting on Reviewer Waiting for reviewer to review or respond 3 - Ready for Review Ready for review by team and removed 3 - Ready for Review Ready for review by team 4 - Waiting on Reviewer Waiting for reviewer to review or respond labels Nov 10, 2021
@dantegd
Copy link
Member

dantegd commented Nov 10, 2021

rerun tests

@levsnv levsnv requested a review from wphicks November 11, 2021 08:29
@dantegd
Copy link
Member

dantegd commented Nov 11, 2021

rerun tests

@levsnv levsnv removed the 3 - Ready for Review Ready for review by team label Nov 11, 2021
@dantegd
Copy link
Member

dantegd commented Nov 12, 2021

rerun tests

@codecov-commenter
Copy link

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.12@6c99d64). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##             branch-21.12    #3759   +/-   ##
===============================================
  Coverage                ?   85.92%           
===============================================
  Files                   ?      231           
  Lines                   ?    18587           
  Branches                ?        0           
===============================================
  Hits                    ?    15971           
  Misses                  ?     2616           
  Partials                ?        0           
Flag Coverage Δ
dask 46.64% <0.00%> (?)
non-dask 78.56% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6c99d64...27bdced. Read the comment docs.

@levsnv levsnv added the 5 - Ready to Merge Testing and reviews complete, ready to merge label Nov 12, 2021
@wphicks
Copy link
Contributor

wphicks commented Nov 12, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 4527cd4 into rapidsai:branch-21.12 Nov 12, 2021
vimarsh6739 pushed a commit to vimarsh6739/cuml that referenced this pull request Oct 9, 2023
To speed up inference on GPUs where extra shared memory can be opted in (and when this enables inference out of shared memory), take advantage.

Authors:
  - Levs Dolgovs (https://github.com/levsnv)

Approvers:
  - Andy Adinets (https://github.com/canonizer)
  - William Hicks (https://github.com/wphicks)

URL: rapidsai#3759
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge CUDA / C++ CUDA issue CUDA/C++ feature request New feature or request non-breaking Non-breaking change Perf Related to runtime performance of the underlying code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants