Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Stream-preserving constructors for scalar class and its derived classes #14333

Open
shrshi opened this issue Oct 25, 2023 · 6 comments
Open
Labels
2 - In Progress Currently a work in progress feature request New feature or request libcudf Affects libcudf (C++/CUDA) code.

Comments

@shrshi
Copy link
Contributor

shrshi commented Oct 25, 2023

Is your feature request related to a problem? Please describe.
When the class constructor is implicitly called (when the object is passed-by-value, for instance), the constructor is called with the default stream, and not the CUDA stream passed by the user on which the algorithm is running.
string_scalar(string_scalar const& other, rmm::cuda_stream_view stream = cudf::get_default_stream(), rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

Describe the solution you'd like
Consider requiring stream to be passed to constructor by making it a non-default argument.
string_scalar(string_scalar const& other, rmm::cuda_stream_view stream , rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

Describe alternatives you've considered
None

Additional context
None

@shrshi shrshi added feature request New feature or request Needs Triage Need team to review and classify labels Oct 25, 2023
@GregoryKimball
Copy link
Contributor

I believe #14354 will address this

@GregoryKimball GregoryKimball added 2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code. and removed Needs Triage Need team to review and classify labels Nov 9, 2023
@karthikeyann
Copy link
Contributor

#14354 replaces all such occurences but it does not remove the default argument.
Removing default value for stream in scalars, is good suggestion. column also has similar copy constructor with defaulted stream argument. should similar removal be done for it too?

@davidwendt
Copy link
Contributor

These are public interfaces and all public APIs currently have defaulted stream parameters.
Wouldn't removing the default break many upstream callers?
Specifically, the cuDF Cython and Python layers would be required to manufacture a stream to create a scalar?
I don't think we should remove the default from any specific public interfaces unless we consider removing them from all perhaps.

@karthikeyann
Copy link
Contributor

karthikeyann commented Nov 14, 2023

Specifically, the cuDF Cython and Python layers would be required to manufacture a stream to create a scalar?

Right. We can't remove stream from scalar methods or any other public interfaces.

cudf data types does not store streams (scalar, column, table, groupby). https://docs.rapids.ai/api/libcudf/nightly/developer_guide (libcudf API and Implementation → Streams). streams are passed to libcudf APIs. So, storing them inside data structure doesn't align well, and also execution should use stream (including memory allocation), IMO, data structures should not hold streams (although rmm differs from this). The constructors are called with streams because memory allocation and copy will execute on streams.

@karthikeyann
Copy link
Contributor

It's made clear than the default arguments on public APIs can't be removed without a breaking change.
Clarification on the issue title:

Stream-preserving constructors for scalar class and its derived classes

Does this mean to propose storing streams inside scalar class?

@vyasr
Copy link
Contributor

vyasr commented Nov 15, 2023

Could we take a step back here and ask why

When the class constructor is implicitly called (when the object is passed-by-value, for instance)

is happening at all? cudf APIs are generally designed so that APIs should accept views rather than values. We don't have a scalar_view type (for reasons) but we typically handle this by instead passing references to scalar objects. Are there places where we really should be passing scalars by value? If so, why? Perhaps I'm missing something obvious, but this seems like potentially a design smell to me. If a scalar is passed by value and a copy constructor is called that implies the construction of a new scalar and a new device allocation, which isn't usually desirable.

IOW I don't know if there's a good reason for the copy constructor to not be explicit in the long run. I made the same point in a comment on the corresponding PR as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2 - In Progress Currently a work in progress feature request New feature or request libcudf Affects libcudf (C++/CUDA) code.
Projects
None yet
Development

No branches or pull requests

5 participants