Use opt-in shared memory carveout for FIL #3759
Conversation
rerun tests
cpp/src/fil/infer.cu
Outdated
@@ -576,21 +576,53 @@ __global__ void infer_k(storage_type forest, predict_params params) {
   }
 }

 void set_carveout(void* kernel, int footprint, int max_shm) {
   CUDA_CHECK(
     cudaFuncSetAttribute(kernel, cudaFuncAttributePreferredSharedMemoryCarveout,
Is this actually needed?
Yes. I've added comments to clarify.
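For readers following the thread, here is a minimal, self-contained sketch of the technique under review. It is not the PR's exact code: the real infer_k takes a forest and predict_params, FIL wraps runtime calls in a CUDA_CHECK macro, and the helper parameters here are simplified. The carveout attribute is a hint to prefer shared memory over L1, and the max-dynamic-shared-memory attribute is the opt-in that raises the per-block cap above the default 48 KiB.

```cuda
#include <cuda_runtime.h>

// Trivial stand-in for the FIL inference kernel under discussion.
__global__ void infer_k() {}

// Sketch of a set_carveout()-style helper (simplified signature).
void set_carveout(const void* kernel, int carveout_percent, int max_shm_bytes) {
  // Hint: prefer this percentage of the unified L1/shared storage as shared
  // memory. The driver may round or ignore the hint.
  cudaFuncSetAttribute(kernel, cudaFuncAttributePreferredSharedMemoryCarveout,
                       carveout_percent);
  // Opt in: raise this kernel's dynamic shared memory cap above the default.
  cudaFuncSetAttribute(kernel, cudaFuncAttributeMaxDynamicSharedMemorySize,
                       max_shm_bytes);
}

int main() {
  int dev = 0, opt_in_shm = 0;
  cudaGetDevice(&dev);
  // Largest per-block shared memory this device allows after opting in.
  cudaDeviceGetAttribute(&opt_in_shm, cudaDevAttrMaxSharedMemoryPerBlockOptin,
                         dev);
  set_carveout(reinterpret_cast<const void*>(infer_k), 100, opt_in_shm);
  // Launch with the opted-in amount of dynamic shared memory.
  infer_k<<<1, 32, opt_in_shm>>>();
  return cudaDeviceSynchronize() == cudaSuccess ? 0 : 1;
}
```

Without the second attribute, a launch requesting more than the default dynamic shared memory limit fails, which is why the helper sets both.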
rerun tests
…e_type>(predict_params, ...)
Co-authored-by: Andy Adinets <[email protected]>
rerun tests
rerun tests
rerun tests
rerun tests
Codecov Report
@@            Coverage Diff             @@
##           branch-21.12    #3759   +/- ##
===============================================
  Coverage              ?   85.92%
===============================================
  Files                 ?      231
  Lines                 ?    18587
  Branches              ?        0
===============================================
  Hits                  ?    15971
  Misses                ?     2616
  Partials              ?        0
===============================================
@gpucibot merge
To speed up inference on GPUs where extra shared memory can be opted in (and when this enables inference out of shared memory), take advantage.

Authors:
  - Levs Dolgovs (https://github.com/levsnv)

Approvers:
  - Andy Adinets (https://github.com/canonizer)
  - William Hicks (https://github.com/wphicks)

URL: rapidsai#3759
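To illustrate the decision the merge description refers to, here is a hypothetical sketch (the footprint value is made up, and FIL's actual logic lives in its params/shared-memory-footprint computation) of comparing a forest's shared memory footprint against the default and opt-in per-block limits:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

int main() {
  int dev = 0, default_shm = 0, opt_in_shm = 0;
  cudaGetDevice(&dev);
  // Default per-block shared memory limit (typically 48 KiB).
  cudaDeviceGetAttribute(&default_shm, cudaDevAttrMaxSharedMemoryPerBlock, dev);
  // Larger limit available only after cudaFuncSetAttribute opt-in.
  cudaDeviceGetAttribute(&opt_in_shm, cudaDevAttrMaxSharedMemoryPerBlockOptin,
                         dev);

  int footprint = 60 * 1024;  // example forest footprint in bytes (made up)
  if (footprint <= default_shm) {
    printf("forest fits in default shared memory\n");
  } else if (footprint <= opt_in_shm) {
    printf("forest fits only after opting in to extra shared memory\n");
  } else {
    printf("forest must stay in global memory\n");
  }
  return 0;
}
```

On architectures where the opt-in limit exceeds the default (e.g. Volta and later), the middle branch is the case this PR unlocks: forests that previously fell back to global memory can now be served from shared memory.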