[FEA] Add a poll method to cuStreamz for single messages #13600
Comments
Thank you @chinmaychandak for reporting this. Would you please share a bit more background? Have you been relying on the |
Hey @GregoryKimball! We're resuming dev on This worked well for versions around Here's a smaller MRE:
Since the |
@chinmaychandak, @randerzander, & @GregoryKimball I just started looking into this. I wanted to make everyone aware that the "poll" method only returns a single Kafka message, as described in the Confluent Kafka documentation. That means cuDF would be given only one message each time "poll" is invoked in custreamz. cuDF gains its speed advantage by operating on bulk reads of data, so getting a single message at a time via custreamz is likely to be slower than just using the Confluent Kafka Python library directly ("ck_consumer.poll"), as in the MRE above. I'm willing to add the function, but want to make everyone aware that it will not be fast and might in fact be slower. My personal recommendation is to use the existing Confluent Kafka Python library for this operation. I will wait for feedback from others before I continue.
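To make the single-message point concrete, here is a minimal sketch. `FakeConsumer` is a hypothetical in-memory stand-in, not part of confluent-kafka or custreamz; it only mimics the shape of two real Consumer calls, `poll` (one message or `None` per call) and `consume` (a list of up to `num_messages`). It does not measure GPU performance; it only counts how many round trips each style needs for the same data.

```python
from collections import deque


class FakeConsumer:
    """Hypothetical in-memory stand-in that mimics the shape of two
    confluent-kafka Consumer calls. It never talks to a broker."""

    def __init__(self, payloads):
        self._queue = deque(payloads)
        self.calls = 0  # how many times the consumer was asked for data

    def poll(self, timeout=None):
        # Like Consumer.poll: at most one message per call, or None.
        self.calls += 1
        return self._queue.popleft() if self._queue else None

    def consume(self, num_messages=1, timeout=-1):
        # Like Consumer.consume: a list of up to num_messages messages.
        self.calls += 1
        count = min(num_messages, len(self._queue))
        return [self._queue.popleft() for _ in range(count)]


def drain_with_poll(consumer):
    """Collect messages one at a time until poll returns None."""
    out = []
    while (msg := consumer.poll(0)) is not None:
        out.append(msg)
    return out


payloads = [f'{{"id": {i}}}'.encode() for i in range(1000)]

one_by_one = FakeConsumer(payloads)
polled = drain_with_poll(one_by_one)

bulk = FakeConsumer(payloads)
batch = bulk.consume(num_messages=1000, timeout=1.0)

print(one_by_one.calls, bulk.calls)
```

Both paths end up with the same payloads, but the poll loop asks the consumer once per message, while the bulk call hands over the whole batch at once, which is the shape a columnar library can parse in one go.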
Bulk reads are still the way to go; no changes there: https://github.com/python-streamz/streamz/blob/master/streamz/sources.py#L746-L758
That makes sense. So, given that what about if I created a function in |
Yes, that sounds reasonable. Thanks Jeremy!
Streamz has updated its codebase to call the Confluent Kafka Consumer library function 'poll', which custreamz does not currently provide. This PR adds a 'poll' function to custreamz that simply proxies the call to the underlying Confluent Kafka library, so that streamz is no longer broken for end users. Without this function, end users cannot use custreamz with newer versions of the streamz library.
This closes: #13600
Authors:
- Jeremy Dyer (https://github.com/jdye64)
Approvers:
- Bradley Dice (https://github.com/bdice)
URL: #13782
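The proxy idea described in this PR can be sketched roughly as below. This is not the code from the PR: `ProxyConsumer`, `InnerConsumer`, and the attribute name `ck` are hypothetical names chosen for illustration, and the inner class is a stand-in so the example runs without a Kafka broker.

```python
class InnerConsumer:
    """Hypothetical stand-in for the wrapped Confluent Kafka consumer."""

    def __init__(self, messages):
        self._messages = list(messages)

    def poll(self, *args, **kwargs):
        # Return one message per call, or None when nothing is left.
        return self._messages.pop(0) if self._messages else None


class ProxyConsumer:
    """Hypothetical wrapper; the attribute name `ck` is an assumption."""

    def __init__(self, ck_consumer):
        self.ck = ck_consumer

    def poll(self, *args, **kwargs):
        # Forward the call unchanged so that timeout handling and the
        # return value are decided entirely by the wrapped consumer.
        return self.ck.poll(*args, **kwargs)


consumer = ProxyConsumer(InnerConsumer([b"a", b"b"]))
first = consumer.poll(timeout=1.0)
second = consumer.poll(1.0)
third = consumer.poll()
```

Forwarding `*args` and `**kwargs` rather than redeclaring the parameters keeps the wrapper from drifting out of step with the wrapped library's signature, at the cost of a less self-documenting method.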
I am using cuStreamz with cudf 23.06.
When using the Confluent Kafka engine, things work as expected. When using the accelerated cudf_kafka datasource, I see the below error:
@jdye64 / @randerzander - Can you please help take a look?