
[FEA] cuML to expose a "proper" CUDA API #92

Closed
teju85 opened this issue Jan 14, 2019 · 13 comments
Labels
feature request

Comments

@teju85
Member

teju85 commented Jan 14, 2019

Is your feature request related to a problem? Please describe.
We currently do not expose the following from our C/C++ API:

  1. cudaStream_t
  2. cublasHandle_t and cusolverDnHandle_t
  3. custom memory allocators

The advantages of exposing these are:

  1. performance
  2. tighter control over job scheduling and resource allocation from the wrapping library itself
  3. consistency with other CUDA libraries, meaning a shallower ramp-up curve for our users

Describe the solution you'd like
One solution would be to:

  1. expose a cumlHandle_t structure (just like cudnn/cublas/cufft/cusolver)
  2. give users the ability to set and get the above handles/streams/allocators
  3. make all of cuML's exposed methods accept this object
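For illustration, the call pattern would mirror the other CUDA libraries. A minimal sketch (all names here are hypothetical; none of these symbols exist in cuML yet):

```cpp
// Hypothetical sketch of the proposed call pattern, following
// cuBLAS/cuDNN conventions; not an actual cuML interface.
cumlHandle_t handle;
cumlCreateHandle(&handle);      // create the library context
cumlSetStream(handle, stream);  // attach a user-owned cudaStream_t
// every cuML entry point would then take the handle first, e.g.:
// cumlDbscanFit(handle, ...);
cumlDestroyHandle(handle);
```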

Describe alternatives you've considered
There are no alternatives currently.

Additional context
None.

Note
Just like #77, I'm mostly filing this issue so that it doesn't slip away. Please feel free to set its priority accordingly, @datametrician @dantegd.

@teju85 added the "? - Needs Triage" and "feature request" labels Jan 14, 2019
@teju85
Member Author

teju85 commented Feb 5, 2019

Some more concrete descriptions of what needs to be done...

  1. Define a cumlError_t enum and a corresponding cumlGetErrorString method.

  2. Define a cumlHandle_t struct.
    Users are expected NOT to depend on the internals of this structure! An example definition could be:

```cpp
struct cumlHandle {
  cublasHandle_t cublas;
  cusolverDnHandle_t cusolverDn;
  DeviceAllocator alloc;
};
typedef struct cumlHandle* cumlHandle_t;
```
  3. Define interfaces on this struct, the usual ones just like the other CUDA libraries (they must be "extern C"'d, obviously). A combined usage sketch follows this list.

```cpp
cumlError_t cumlCreateHandle(cumlHandle_t*);
cumlError_t cumlDestroyHandle(cumlHandle_t);
cumlError_t cumlSetStream(cumlHandle_t, cudaStream_t);  // should set the same stream for both cublas and cusolverDn, at least for now!
cumlError_t cumlGetStream(cumlHandle_t, cudaStream_t*);
cumlError_t cumlSetAllocator(cumlHandle_t, AllocFunctor, DeallocFunctor);
// this allocator is to be used internally in cuml and/or ml-prims for working with temporary workspaces
```
  4. Expose these interfaces in the Cython world as well.

  5. As a POC, pick one of the existing algos (pca, tsvd, or dbscan) and update its interface to accept cumlHandle_t.
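Putting the pieces above together, a hedged end-to-end sketch (illustrative names and error codes only; nothing here is final):

```cpp
#include <stdio.h>
#include <cuda_runtime.h>

// Illustrative error codes; the real enum would be richer.
typedef enum { CUML_SUCCESS = 0, CUML_ERROR_UNKNOWN = 1 } cumlError_t;

// Proposed API from the list above, repeated here so the sketch is
// self-contained.
typedef struct cumlHandle* cumlHandle_t;
const char* cumlGetErrorString(cumlError_t);
cumlError_t cumlCreateHandle(cumlHandle_t*);
cumlError_t cumlDestroyHandle(cumlHandle_t);
cumlError_t cumlSetStream(cumlHandle_t, cudaStream_t);

// Hypothetical caller-side flow:
void example(cudaStream_t stream) {
  cumlHandle_t handle;
  cumlError_t err = cumlCreateHandle(&handle);
  if (err != CUML_SUCCESS) {
    fprintf(stderr, "cuml error: %s\n", cumlGetErrorString(err));
    return;
  }
  cumlSetStream(handle, stream);  // same stream for cublas and cusolverDn
  // ... run pca/tsvd/dbscan etc. with this handle ...
  cumlDestroyHandle(handle);
}
```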

Note: the 'Allocator' part can only work after PR #167 is done!

Vinay Deshpande (@vinaydes) will be picking this up as a "starter" task for his onboarding onto cuML.

@oyilmaz-nvidia
Contributor

@teju85 Exposing library handles, streams, etc. is a pretty good idea. We should definitely include this in the coming versions.

@oyilmaz-nvidia
Contributor

@teju85 We might not need cumlAlloc or its variations, because we use cuDF for these kinds of operations.

@teju85
Member Author

teju85 commented Feb 6, 2019

I see your point regarding the alloc/free functions. Those were added for use by cuml and/or ml-prims, wherever a function needs temporary allocations.

Your point makes sense. Let's not expose these two methods; instead, the custom allocator will be used purely internally inside cuml and ml-prims.
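To make the "internal only" idea concrete, here is a minimal sketch of what such an allocator interface could look like (hypothetical; not actual cuml/ml-prims code):

```cpp
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical internal-only allocator interface for temporary
// workspaces; stream-aware so allocations can be tied to the
// handle's stream.
struct DeviceAllocator {
  virtual void* allocate(std::size_t bytes, cudaStream_t stream) = 0;
  virtual void deallocate(void* ptr, std::size_t bytes, cudaStream_t stream) = 0;
  virtual ~DeviceAllocator() = default;
};

// Default fallback using plain cudaMalloc/cudaFree.
struct CudaDeviceAllocator : DeviceAllocator {
  void* allocate(std::size_t bytes, cudaStream_t) override {
    void* ptr = nullptr;
    cudaMalloc(&ptr, bytes);
    return ptr;
  }
  void deallocate(void* ptr, std::size_t, cudaStream_t) override {
    cudaFree(ptr);
  }
};
```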

@teju85
Member Author

teju85 commented Feb 6, 2019

@oyilmaz-nvidia @vinaydes I've updated the interface proposal above based on this feedback.

@cjnolet
Member

cjnolet commented Feb 6, 2019

I was going to recommend some global handle for tracking workspace allocations. I'm 100% for this.

@teju85
Member Author

teju85 commented Feb 6, 2019

@cjnolet Let's try to avoid such global vars as much as possible.

@cjnolet
Member

cjnolet commented Feb 6, 2019

I misread the description on this issue. I was thinking along the lines of #186, which would also be good to standardize in the CUDA API.

@teju85
Member Author

teju85 commented Feb 6, 2019

Agreed, @cjnolet. I just tagged you on that issue and mentioned the same thing. The allocator being discussed here should simplify workspace allocation logic by a lot.

@dantegd
Member

dantegd commented Feb 6, 2019

@teju85 Regarding the allocation, the idea works perfectly, and it also allows us to start using RMM for the allocations that cuDF doesn't handle. @oyilmaz-nvidia In fact we don't always depend on cuDF for allocation: our Python APIs accept host NumPy arrays that are then transferred to the GPU with Numba, which will change to use RMM.

I also absolutely love the proposed cuml_handle. Regarding error handling: since Python is our main end-user interface for the time being, exceptions will give us a more robust error-reporting infrastructure there. In general we are not limiting cuML to a C API (the C API can be a wrapper around the C++ one, which it currently mostly is), so at least at the C++ level we can raise exceptions, which is what RMM and cuDF are moving to soon. (Hope I was clear in that explanation, it's early here.)
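A minimal sketch of that layering, with hypothetical names (the C entry point catches at the language boundary and returns a code, while C++/Python callers can rely on exceptions):

```cpp
// Hypothetical error enum, matching the proposal above.
typedef enum { CUML_SUCCESS = 0, CUML_ERROR_UNKNOWN = 1 } cumlError_t;

namespace ML {
// Stand-in for a C++ algorithm entry point that reports failures by
// throwing exceptions.
void dbscanFit() { /* ... actual work ... */ }
}  // namespace ML

// The C API is a thin wrapper around the C++ one: translate exceptions
// into error codes at the boundary (exceptions must not cross extern "C").
extern "C" cumlError_t cumlDbscanFit(void) {
  try {
    ML::dbscanFit();
    return CUML_SUCCESS;
  } catch (...) {
    return CUML_ERROR_UNKNOWN;
  }
}
```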

@dantegd
Member

dantegd commented Feb 6, 2019

@teju85 Now that it's a little later in the day, I just wanted to clarify: in my comment above I am trying to say that I favor both having the error codes and throwing exceptions. Languages that can link to C++ libraries get the benefit of exceptions; languages without C++ support only get the cumlError_t support.

@teju85
Member Author

teju85 commented Feb 12, 2019

Fair enough, makes sense @dantegd.

@jirikraus and @vinaydes, there's a request in issue #207 to add a random_state as a way to reproduce ML algo results. I was thinking of adding a cumlError_t cumlSetSeed(uint64_t); method to the proposed list as well. What do you folks think?
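A hedged sketch of the idea (hypothetical; whether the seed lives on the handle, and the exact signature, are open questions):

```cpp
// Hypothetical: storing the seed on the handle would let every RNG used
// by the algorithms be seeded, making runs reproducible.
void reproducible_run() {
  cumlHandle_t handle;
  cumlCreateHandle(&handle);
  cumlSetSeed(handle, 1234ULL);  // assumes a handle-taking variant of
                                 // the proposed cumlSetSeed
  // two identical fit() calls through this handle should now produce
  // identical results
  cumlDestroyHandle(handle);
}
```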

@jirikraus
Contributor

Just explicitly adding a reference to the PR addressing this: #247.

jirikraus added a commit to jirikraus/cuml that referenced this issue Mar 8, 2019
dantegd added a commit that referenced this issue Mar 12, 2019
[REVIEW] Proposal for "proper" C/C++ API (issue #92)
@dantegd dantegd closed this as completed May 10, 2019
Salonijain27 pushed a commit to Salonijain27/cuml that referenced this issue Jan 22, 2020
[REVIEW] No random dataset for tsvd and pca algorithm