Remove Arrow dependency from the datasource.hpp public header #13698

Merged
merged 18 commits into from
Aug 10, 2023

Conversation

vuule
Contributor

@vuule vuule commented Jul 14, 2023

Description

Remove the Arrow dependency from datasource.hpp.

Breaking only because users of arrow_io_source now need to include the new arrow_io_source.hpp header instead of datasource.hpp.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vuule vuule added cuIO cuIO issue improvement Improvement / enhancement to an existing function breaking Breaking change labels Jul 14, 2023
@vuule vuule self-assigned this Jul 14, 2023
@github-actions github-actions bot added libcudf Affects libcudf (C++/CUDA) code. conda labels Jul 14, 2023
@github-actions github-actions bot added the Python Affects Python cuDF API. label Jul 17, 2023
rapids-bot bot pushed a commit that referenced this pull request Jul 18, 2023
Adds the `cudf::io::datasource` source (header) to the doxygen IO Data Sources group/modules:
https://docs.rapids.ai/api/libcudf/stable/group__io__datasources.html

Found while working on #13698

Also created a Data Sinks group and added the `cudf::io::data_sink` source to it.

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Vukasin Milovanovic (https://github.com/vuule)

URL: #13718
@vuule
Contributor Author

vuule commented Jul 21, 2023

Will re-target for 23.10 once changes are done; this one is not worth the risk to merge late into the release cycle.

@vuule vuule changed the base branch from branch-23.08 to branch-23.10 August 1, 2023 18:04
@vuule vuule marked this pull request as ready for review August 2, 2023 23:32
@vuule vuule requested review from a team as code owners August 2, 2023 23:32
@vuule vuule changed the title from "Remove arrow dependency from the datasource.hpp public header" to "Remove Arrow dependency from the datasource.hpp public header" Aug 2, 2023
@galipremsagar
Contributor

Thanks @vuule ! LGTM

Member

@ajschmidt8 ajschmidt8 left a comment

Approving ops-codeowner file changes

@@ -11,5 +11,5 @@ cdef extern from "cudf/io/arrow_io_source.hpp" \
         namespace "cudf::io" nogil:

     cdef cppclass arrow_io_source(cudf_io_datasource.datasource):
-        arrow_io_source(string arrow_uri) except +
+        arrow_io_source(const string& arrow_uri) except +
Contributor

east const.

Contributor Author

For some reason I only see "const string" in pxd files so I kept this consistent with other instances. I'm not sure if our east const guideline reached Cython.

Contributor

Cython doesn't support east const.
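
For readers unfamiliar with the term: "east const" places the `const` qualifier after the type it applies to, while "west const" places it before. The sketch below is an illustration only, not code from this PR; the helper names `length_west` and `length_east` and the sample URI are hypothetical. It shows that in C++ the two spellings denote the same type, so the guideline is purely stylistic.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// "West const": the qualifier is written before the type.
using west_ref = const std::string&;
// "East const": the qualifier is written after the type it applies to.
using east_ref = std::string const&;

// Both spellings name exactly the same type.
static_assert(std::is_same<west_ref, east_ref>::value,
              "west and east const are the same type");

// Hypothetical helpers, one per spelling; they accept the same arguments.
inline std::size_t length_west(const std::string& uri) { return uri.size(); }
inline std::size_t length_east(std::string const& uri) { return uri.size(); }

// Small self-check: both helpers agree on the same input.
inline void self_check()
{
  std::string const uri = "s3://bucket/key";  // illustrative URI, not from the PR
  assert(length_west(uri) == length_east(uri));
}
```

Because the types are identical, a C++ header written with `std::string const&` can be declared as `const string&` on the Cython side without any mismatch.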

Member

@PointKernel PointKernel left a comment

Just one small software engineering comment. LGTM

*
* @param arrow_uri Apache Arrow Filesystem URI
*/
explicit arrow_io_source(std::string const& arrow_uri)
Member

Same for all other member functions. Maybe consider separating the class declaration from its definition.

Contributor Author

I want to keep this class separate from libcudf, but we don't have to take that route.
@davidwendt thoughts on keeping the implementation in libcudf?

Contributor

I'm not sure I understand what you mean by keeping this separate from libcudf.
I think Yunsong means moving the implementation to cpp/src/io/arrow_io_source.cpp and keeping only the declarations in this header file. I agree with that approach.

Contributor Author

I meant that I will need to compile cpp/src/io/arrow_io_source.cpp as part of libcudf. Not an issue, just different from what I imagined in the beginning.
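
The declaration/definition split discussed in this thread can be sketched roughly as follows. This is a minimal single-file illustration under stated assumptions, not the code merged in this PR: the class name `uri_source`, its members, and the nested `impl` type are all hypothetical, and the use of a pointer-to-implementation is just one common way to keep a third-party dependency out of a public header. The thread does not say which mechanism the final change used.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// ---- Part that would live in a public header (hypothetical uri_source.hpp) ----
// Only declarations appear here, so including this header pulls in no
// third-party headers.
class uri_source {
 public:
  explicit uri_source(std::string const& uri);
  ~uri_source();  // defined in the source part, where impl is a complete type

  std::size_t size() const;
  std::string const& uri() const;

 private:
  struct impl;                  // forward declaration only
  std::unique_ptr<impl> _impl;  // owns the hidden implementation
};

// ---- Part that would live in a source file (hypothetical uri_source.cpp) ----
// A real implementation would include the heavy dependency's headers here,
// and this file would be compiled as part of the library.
struct uri_source::impl {
  std::string uri;  // stand-in for the state a real source would hold
};

uri_source::uri_source(std::string const& uri)
  : _impl{std::make_unique<impl>(impl{uri})}
{
}

uri_source::~uri_source() = default;

std::size_t uri_source::size() const { return _impl->uri.size(); }

std::string const& uri_source::uri() const { return _impl->uri; }
```

The trade-off matches the comment above: the source part has to be compiled into the library, but users who include only the header no longer need the dependency's headers on their include path.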

@vuule vuule requested a review from a team as a code owner August 10, 2023 07:10
@github-actions github-actions bot added the CMake CMake build issue label Aug 10, 2023
@vuule
Contributor Author

vuule commented Aug 10, 2023

/merge

@rapids-bot rapids-bot bot merged commit 2801a27 into rapidsai:branch-23.10 Aug 10, 2023
@vuule vuule deleted the impr-separate-arrow-source branch August 10, 2023 16:12
Labels
breaking Breaking change CMake CMake build issue cuIO cuIO issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. Python Affects Python cuDF API.
9 participants