-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] External libcudf APIs should expose CUDA streams #925
Comments
I woudl argue that we should take it one step further and say that every exposed api should provide the capacity at some point to allow stream to be defined with the most priority ones you mention which do allocation which can block execution , super gross. What other options would you want to pass it? An allocator? (besides stream) Creating / Destroying C vs C++ api Exposing in python If we are compiling with --default-stream per-thread then this is slightly less of a concern in the more immediate future. |
Sure, what I really meant is that every API that it makes sense to have execute on a stream should accept a stream argument. Some functions obviously never need a stream, e.g., https://github.com/rapidsai/cudf/blob/branch-0.6/cpp/include/cudf/functions.h#L72
Maybe a device ID? Device properties? I was thinking of the ModernGPU |
I like the approach that was taken in cuML here using a You can bundle lots of library specific resources in there such as:
|
Drawing inspiration from various other Python libraries it seems that some include a While othres use context managers, which are nice at managing global state sensibly. Here is an example from CuPy n = 10
zs = []
map_streams = []
stop_events = []
reduce_stream = cupy.cuda.stream.Stream()
for i in range(n):
map_streams.append(cupy.cuda.stream.Stream())
start_time = time.time()
# Map
for stream in map_streams:
with stream:
x = rand.normal(size=(1, 1024 * 256))
y = rand.normal(size=(1024 * 256, 1))
z = cupy.matmul(x, y)
zs.append(z)
stop_event = stream.record()
stop_events.append(stop_event) Personally I like the cupy solution, just because it doesn't require touching all of the API (I acknowledge that this response may be outside of the original scope of libcudf. My apologies if so). |
Hey @jrhemstad checking in about the status of this. This is an issue for the PyTorch and TF dataloaders that we're building. I know there are potentially other workarounds being explored. |
It's still an active point of conversation, but we don't have any current plans to add streams to public APIs in the near future. |
Contributes to #925. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) - MithunR (https://github.com/mythrocks) - David Wendt (https://github.com/davidwendt) URL: #13629
Contributes to #925 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Mark Harris (https://github.com/harrism) - Divye Gala (https://github.com/divyegala) URL: #13987
Contributes to #925 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Nghia Truong (https://github.com/ttnghia) - Bradley Dice (https://github.com/bdice) URL: #13990
Contributes to #925 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) - MithunR (https://github.com/mythrocks) - David Wendt (https://github.com/davidwendt) URL: #14010
Contributes to #925 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - David Wendt (https://github.com/davidwendt) - Nghia Truong (https://github.com/ttnghia) URL: #14034
Contributes to #925 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) - David Wendt (https://github.com/davidwendt) URL: #14146
This PR adds overloads of `from_arrow` and `to_arrow` for scalars to enable interoperability on par with Arrow Arrays. The new public APIs accept streams, and for consistency streams have also been added to the corresponding column APIs, so this PR contributes to #925. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - David Wendt (https://github.com/davidwendt) - Bradley Dice (https://github.com/bdice) URL: #14121
Contributes to #925 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Nghia Truong (https://github.com/ttnghia) - Karthikeyan (https://github.com/karthikeyann) URL: #14187
Contributes to #925 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Nghia Truong (https://github.com/ttnghia) - Bradley Dice (https://github.com/bdice) - David Wendt (https://github.com/davidwendt) URL: #14263
Contributes to #925 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Mark Harris (https://github.com/harrism) - Nghia Truong (https://github.com/ttnghia) URL: #14342
Closing in favor of #13744 |
Contributes to #925. Introduces cuda_stream parameter for downstream users to provide for `labeling_bins` Authors: - Danial Javady (https://github.com/ZelboK) - Bradley Dice (https://github.com/bdice) - Nghia Truong (https://github.com/ttnghia) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Nghia Truong (https://github.com/ttnghia) - Bradley Dice (https://github.com/bdice) URL: #14401
Is your feature request related to a problem? Please describe.
In order to maximize efficiency and utilization of the GPU, you need to be able to execute kernels and memory copies on independent streams. However, no libcudf API currently exposes streams to the end user.
Describe the solution you'd like
Any API that executes a GPU kernel or allocates/copies memory should provide the end user with the option to specify a stream.
Open questions:
cudaStream_t*
? Or bundled inside of anoptions
struct?I suspect we'll all agree this does need to be done, the question is rather one of how and when.
The text was updated successfully, but these errors were encountered: