-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support to ArrowDataSource in SourceInfo #16050
Add support to ArrowDataSource in SourceInfo #16050
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, rather than doing this I think we should go ahead and deprecate the native file support. @rjzamora any objections?
If we want to port all the I/O readers to pylibcudf, we'll have to add native file support here (at least until the deprecation is executed). I suppose we could read all the bytes from the Arrow NativeFile as an alternative, but that's probably not a good idea. Is it possible to put this in for now, given the diff is pretty small? I also do a couple other things in this PR in addition to the Arrow support. |
Fair enough, I agree that if we're planning a deprecation cycle we don't want to gate the progress here. I'll review as is. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to do the deprecation in this PR or a follow-up, I'm approving either way.
@@ -84,6 +86,13 @@ cdef class SourceInfo: | |||
|
|||
self.c_obj = move(source_info(c_files)) | |||
return | |||
elif isinstance(sources[0], Datasource): | |||
for csrc in sources: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's go ahead and insert the deprecation warning as part of this PR, either here or in the NativeFileDatasource constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll take care of this in a followup, since I have another PR stacked on top of this one.
Thanks for the review!
/merge |
Description
ArrowDataSources weren't previously supported in SourceInfo.
(since we didn't need it for Avro).
Adding it now so we can pass tests for orc reader and co.
(even though ArrowDataSource may potentially be removed in the future)
Checklist