[WIP] Expose cumlHandle (+ other goodies!) into cython world #331
Conversation
…een Stream and Handle classes
…scikit-esq ML classes
…a-ext-cython-cumlhandle
…keywords inside sgd.pyx
I can't really provide feedback on the python side of this. I hope my comments are still useful.
…le of cython build errors
Can one of the admins verify this patch?
@jirikraus I'd like to particularly bring your attention to my previous commit #000176b. I have migrated the …
Absolutely not. I thought that was the intention of moving the abstract interface of ml-prims. P.S. FYI: I would like to take a more detailed look at the C++ things you added, but I am not sure if I will find time for it this week while I am at GTC. One thought I had after giving this a quick view: I think it would make sense to separate out the basic infrastructure changes from applying them to the pca and tsvd algorithms, in different PRs. I understand that you need an algorithm to try this on, but I think the review would be more efficient if it is done separately. Does that make sense to you?
Not sure if you got my point. If tsvd depends on pca: why not use PCA only for the first PR and, after that is merged, target tsvd? Is there also a dependency of PCA on TSVD? Or perhaps I do not see the reason to have two examples in the initial PR.
My bad. I actually meant "pca depends on tsvd". Never mind my previous comments. I found a work-around to confine the changes to PCA alone.
truncCompExpVars(handle, cov.data(), components, explained_var,
                 explained_var_ratio, prms);
math_t scalar = (prms.n_rows - 1);
Matrix::seqRoot(explained_var, singular_vals, scalar, prms.n_components, true);
No stream or handle?
explained_var_ratio, prms);
math_t scalar = (prms.n_rows - 1);
Matrix::seqRoot(explained_var, singular_vals, scalar, prms.n_components, true);
Stats::meanAdd(input, input, mu, prms.n_cols, prms.n_rows, false, true);
No stream or handle?
pcaFit(handle, input, components, explained_var, explained_var_ratio, singular_vals,
       mu, noise_vars, prms);
pcaTransform(handle, input, components, trans_input, singular_vals, mu, prms);
signFlip(trans_input, prms.n_rows, prms.n_components, components,
No stream or handle passed in. I see more instances of this below but will not call them all out.
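The review point above can be illustrated with a toy sketch (names like `Handle` and `sign_flip` are hypothetical stand-ins, not cuML's actual API): device-side helpers should take the caller's handle (or its stream) explicitly, so work lands on the caller's stream rather than an implicit default.

```python
# Hypothetical illustration of the review point: helpers should take the
# handle/stream explicitly instead of using an implicit default stream.
# All names here are illustrative, not cuML's real API.

class Handle:
    """Toy stand-in for cumlHandle: carries a stream identifier."""
    def __init__(self, stream):
        self.stream = stream

def sign_flip(handle, matrix):
    """Preferred style: the stream comes from the caller's handle."""
    return ("enqueued on", handle.stream)

def sign_flip_implicit(matrix):
    """Style flagged in review: silently uses the default stream."""
    return ("enqueued on", "default-stream")

h = Handle(stream="user-stream-7")
print(sign_flip(h, matrix=[[1.0]]))        # ('enqueued on', 'user-stream-7')
print(sign_flip_implicit(matrix=[[1.0]]))  # ('enqueued on', 'default-stream')
```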
Stats::sum(total_vars.data(), vars.data(), 1, prms.n_cols, false);

math_t total_vars_h;
updateHost(&total_vars_h, total_vars.data(), 1);
The synchronous updateHost does not take a stream. This needs to be
updateHostAsync(..., stream);
cudaStreamSynchronize(stream);
if I am not missing anything.
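The ordering hazard behind this comment can be simulated without a GPU. In this toy model (all names hypothetical; a "stream" is just a FIFO of pending device writes), a stream-unaware synchronous read can observe stale data, while the async-copy-then-synchronize pattern always sees the enqueued result:

```python
# Toy simulation of CUDA stream ordering (no GPU required). A Stream is a
# FIFO of pending device writes; update_host_async + synchronize is the safe
# pattern, while a stream-unaware synchronous read can return stale data.
from collections import deque

class Stream:
    def __init__(self):
        self.pending = deque()          # device-side writes not yet visible

    def enqueue_write(self, mem, key, value):
        self.pending.append((mem, key, value))

    def synchronize(self):              # analogous to cudaStreamSynchronize
        while self.pending:
            mem, key, value = self.pending.popleft()
            mem[key] = value

def update_host(device_mem, key):
    """Stream-unaware synchronous read: may miss in-flight work."""
    return device_mem.get(key)

def update_host_async(device_mem, key, stream):
    """Deferred read; correct only after the stream is synchronized."""
    return lambda: device_mem.get(key)

stream, dev = Stream(), {}
stream.enqueue_write(dev, "total_vars", 42.0)

print(update_host(dev, "total_vars"))   # None -> stale, write still in flight
read = update_host_async(dev, "total_vars", stream)
stream.synchronize()                    # drain the stream before reading
print(read())                           # 42.0 -> correctly ordered
```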
CUDA_CHECK(cudaGetLastError());

int dev_info;
-updateHost(&dev_info, d_dev_info, 1);
+updateHost(&dev_info, d_dev_info.data(), 1);
To ensure correct stream ordering, this needs to be updateHostAsync + cudaStreamSynchronize, if I am not missing anything.
…a-ext-cython-cumlhandle
…nt ml-prims calls to pass stream. Still need to update Matrix:: namespace to accept stream and updateHost to be async
…reprocess.h accordingly
Thanks @kkraus14
Hi all,
@dantegd you'll have to (re-)review the python changes once more in PR #435! Sorry about the extra work this has created for you.
Based on this comment:
@teju85 @jirikraus Can we close this PR?
@teju85 has the final call, but I think yes, this PR is outdated.
There's only one item pending from my previously commented list above, and that is the corresponding … Will close this PR once that is filed. Give me some more time.
PR #482 filed for the pca + tsvd related code cleanup and cumlHandle exposure. This PR is no longer needed. Closing. |
This PR exposes `cumlHandle` into the cython world, over-and-above @jirikraus's work in PR #247. Additionally, this also proposes a `Base` class to be inherited by all ML algos. Such a base class will go a long way in reducing code duplication across the cuML python interface. I've used PCA as an example to demonstrate the `cumlHandle` + `Base` class related updates. The hope is that all of us can use this as an example to update other algos too (including the new ones).

Things addressed in this PR:
- `cumlHandle` and the `setAllocator` methods for the `cumlHandle` class exposed into the cython world
- `deviceAllocator` used instead of the existing `DeviceAllocator` class
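The proposed `Base` class pattern from the description could look roughly like this (a minimal sketch; everything except the names `cumlHandle` and `Base` is an assumption, not the PR's actual code): the base owns the handle and common plumbing, and each algorithm inherits it instead of re-declaring those fields.

```python
# Minimal sketch of the proposed Base class pattern. Names other than
# cumlHandle/Base are hypothetical, not the PR's actual implementation.

class CumlHandle:
    """Toy stand-in for the C++ cumlHandle exposed via cython."""
    def __init__(self):
        self.allocator = "default-device-allocator"

    def set_allocator(self, alloc):
        self.allocator = alloc

class Base:
    """Shared plumbing that every ML algo would otherwise duplicate."""
    def __init__(self, handle=None, verbose=False):
        # Create a default handle when the caller does not supply one.
        self.handle = handle if handle is not None else CumlHandle()
        self.verbose = verbose

class PCA(Base):
    """Example algo: inherits handle management instead of redefining it."""
    def __init__(self, n_components=2, handle=None, verbose=False):
        super().__init__(handle=handle, verbose=verbose)
        self.n_components = n_components

pca = PCA(n_components=3)
print(type(pca.handle).__name__, pca.n_components)  # CumlHandle 3
```

A shared handle can also be passed to several estimators, which is one of the motivations for exposing it: `PCA(handle=h)` and another algo constructed with the same `h` would then share allocator settings.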