
Preallocate imfilter kernel memory #45

Merged

Conversation

@ClaroHenrique (Contributor) commented Aug 24, 2022

This PR optimizes the kernel padding in the GPU imfilter. The idea is to preallocate a CuArray with the same size as TI and copy the kernel data into it, avoiding a full padded-kernel allocation on every imfilter call.
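A minimal sketch of the preallocation idea, assuming CUDA.jl (the function names and the zero-then-copy strategy are illustrative assumptions, not the actual PR code):

```julia
using CUDA

# Hypothetical sketch: allocate the padded-kernel buffer once, sized like the
# training image TI, instead of allocating it inside every imfilter call.
function preallocate_padded(TI::CuArray{T}) where {T}
  CUDA.zeros(T, size(TI))
end

# Reuse `padded` across imfilter calls: zero it out, then copy the (small)
# kernel into its leading corner.
function pad_kernel!(padded::CuArray{T}, krn::AbstractArray{T}) where {T}
  fill!(padded, zero(T))
  copyto!(view(padded, axes(krn)...), CuArray(krn))
  return padded
end
```

The point of the change is that the buffer allocation moves out of the hot loop; only the cheap `fill!` and `copyto!` remain per call.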

@codecov-commenter commented Aug 24, 2022

Codecov Report

Merging #45 (1d935d2) into master (3eebf4b) will decrease coverage by 0.62%.
The diff coverage is 0.00%.

❗ Current head 1d935d2 differs from pull request most recent head ce926cc. Consider uploading reports for the commit ce926cc to get more accurate results

@@            Coverage Diff             @@
##           master      #45      +/-   ##
==========================================
- Coverage   85.71%   85.08%   -0.63%     
==========================================
  Files           9        9              
  Lines         420      456      +36     
==========================================
+ Hits          360      388      +28     
- Misses         60       68       +8     
Impacted Files         Coverage            Δ
src/imfilter.jl         11.76%  <0.00%>  (-4.91%) ⬇️
src/relaxation.jl       93.10%  <0.00%>  (-3.20%) ⬇️
src/iqsim.jl            96.85%  <0.00%>  (-0.63%) ⬇️
src/geostats.jl        100.00%  <0.00%>  (ø)
src/graphcut.jl        100.00%  <0.00%>  (ø)
src/taumodel.jl        100.00%  <0.00%>  (ø)
src/plotrecipes.jl       0.00%  <0.00%>  (ø)
src/utils.jl            94.87%  <0.00%>  (+0.42%) ⬆️


@ClaroHenrique ClaroHenrique marked this pull request as ready for review August 24, 2022 11:56
@juliohm (Member) commented Aug 24, 2022

@ClaroHenrique I was looking into the code, and perhaps we can improve the situation by refactoring the imfilter_kernel and fastdistance functions to accept the img and the krn with the same size already. Instead of adding a third argument to all functions, we could simply pass the two arrays with the same size (krn padded) and, if necessary, a third argument with the cartesian indices. That will be more intuitive to maintain. Do you understand what I mean? Can you work on that?
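A sketch of the proposed signature (CPU-only, using FFTW for illustration; the function name and the FFT-based cross-correlation body are assumptions, not the actual PR code):

```julia
using FFTW

# Hypothetical sketch of the refactor: both arrays arrive with the same size,
# because the caller has already zero-padded krn to match img. No third
# "padded buffer" argument is needed.
function imfilter_kernel(img::AbstractArray, krn::AbstractArray)
  @assert size(img) == size(krn) "caller must pass krn already padded to size(img)"
  # frequency-domain cross-correlation of two equally sized arrays
  real.(ifft(fft(img) .* conj.(fft(krn))))
end
```

With this shape, the preallocated padded kernel lives at the call site, and the filtering functions stay agnostic about how the padding was produced.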

@ClaroHenrique (Contributor, Author) commented Aug 24, 2022

> @ClaroHenrique I was looking into the code and perhaps we can improve the situation by refactoring the imfilter_kernel and fastdistance functions to accept the img and the krn with the same size already. So instead of adding a third argument to all functions we could simply pass the two arrays with the same size (krn padded) and a third argument with the cartesian indices if necessary. That will be more intuitive to maintain. Do you understand what I mean? Can you work on that?

Working on that.

@juliohm (Member) left a review comment


Hi @ClaroHenrique I think we are taking too long to address these optimizations in the code. Can we set a meeting to discuss a plan that is effective and quick to execute?

src/imfilter.jl — 3 review threads (outdated, resolved)
@ClaroHenrique ClaroHenrique requested a review from juliohm August 25, 2022 22:17
@juliohm juliohm merged commit d55da6c into JuliaEarth:master Aug 26, 2022
@juliohm (Member) commented Aug 26, 2022

@ClaroHenrique I merged it assuming that you saw good speedups with the change. I will test it locally now to see how it goes on my NVIDIA GPU.

3 participants