Add json APIs to pylibcudf #17025

Merged
merged 5 commits into from
Oct 10, 2024

Conversation

mroeschke
Contributor

Description

Contributes to #15162

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@mroeschke mroeschke added the improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change), and pylibcudf (Issues specific to the pylibcudf package) labels Oct 8, 2024
@mroeschke mroeschke requested a review from a team as a code owner October 8, 2024 23:41
@mroeschke mroeschke requested review from wence- and vyasr October 8, 2024 23:41
@github-actions github-actions bot added the Python (Affects Python cuDF API.) and CMake (CMake build issue) labels Oct 8, 2024
@@ -12,6 +12,7 @@ strings
 find_multiple
 findall
 padding
+json
Contributor

The get_json_object and related APIs are not part of the cudf::strings namespace in libcudf.
https://github.com/rapidsai/cudf/blob/branch-24.12/cpp/include/cudf/json/json.hpp
Should pylibcudf match libcudf more closely than cudf?

Contributor

Yes, pylibcudf should match libcudf more closely than cudf Python does. So I think we should move json.pxd and json.pyx out of strings.

But a broader point: I think pylibcudf is first and foremost "Python bindings for libcudf", so we should match libcudf as closely as possible. But pylibcudf is also a Python library, so I think it makes sense over time to make it more "Pythonic" (e.g. easier to use with other Python libraries).
cc @vyasr

Contributor Author

Thanks for the catch. I moved this out of the strings namespace.

Contributor

@vyasr vyasr Oct 9, 2024

I agree with not putting this in the strings namespace.

@Matt711 I don't understand the relevance of your second point. I agree that we'll be making some cosmetic changes to make it more Pythonic than a raw translation of libcudf API calls, but does that have any particular bearing on what's being discussed here? Or were you just responding to David's question with a statement that while pylibcudf APIs should closely match libcudf APIs, they may not match exactly if there is a more Pythonic choice for the pylibcudf API that still faithfully reflects the semantics of the C++ functions?

Contributor

> Or were you just responding to David's question with a statement that while pylibcudf APIs should closely match libcudf APIs, they may not match exactly if there is a more Pythonic choice for the pylibcudf API that still faithfully reflects the semantics of the C++ functions?

That's right. I @'d you because I wanted to get your thoughts, but I realized I forgot to add "Is that right?" before the mention.

@mroeschke mroeschke requested a review from a team as a code owner October 9, 2024 22:10
@mroeschke mroeschke changed the title from "Add string.json APIs to pylibcudf" to "Add json APIs to pylibcudf" Oct 10, 2024
Member

@jameslamb jameslamb left a comment

Giving you a packaging-codeowners approval. Left one packaging-related comment for your consideration. I don't need to re-review unless the packaging-related stuff changes. Will defer to other reviewers on the API changes in this PR.

@@ -97,7 +97,8 @@ skip = [
 ]

 [tool.pytest.ini_options]
-addopts = "--tb=native --strict-config --strict-markers"
+# --import-mode=importlib because two test_json.py exists and tests directory is not a structured module
+addopts = "--tb=native --strict-config --strict-markers --import-mode=importlib"
Member

According to https://docs.pytest.org/en/7.1.x/explanation/pythonpath.html, --import-mode=importlib was added in pytest 6.0.

So we probably need a floor here:

- pytest<8

like:

- pytest>=6.0,<8
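
For illustration, the suggested range `pytest>=6.0,<8` can be expressed as a small version check. This is a minimal sketch using only the standard library; the helper names `parse_release` and `satisfies_pytest_pin` are hypothetical, and the parsing deliberately ignores pre-release and post-release suffixes, so it is not a substitute for a real PEP 440 resolver.

```python
import re


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric release components, e.g. '7.4.4' -> (7, 4, 4).

    Simplification (assumption): anything after the numeric release segment,
    such as 'rc1' or '.post1', is ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"not a version string: {version!r}")
    release = tuple(int(part) for part in match.group().split("."))
    # Pad to at least major.minor so that '6' compares equal to '6.0'.
    return release + (0,) * (2 - len(release))


def satisfies_pytest_pin(version: str) -> bool:
    """True when the version falls inside the range pytest>=6.0,<8."""
    release = parse_release(version)
    return (6, 0) <= release < (8,)
```

With this sketch, `satisfies_pytest_pin("7.4.4")` is true, while `"5.4.3"` (below the floor) and `"8.0.0"` (at the ceiling) are rejected.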

Contributor

This is technically true, but I think we have other reasons why we wouldn't be compatible with lower pytest versions either and we're just generally underspecified. I don't think that we need to address this here and should instead look into this as part of rapidsai/build-planning#105.

Member

Sure, definitely not worth another run of CI for.
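
As background on the `--import-mode=importlib` change in this thread: with pytest's default `prepend` mode, test files that are not inside packages are imported by basename, so two files named `test_json.py` in different directories need unique names or `__init__.py` files. The sketch below imitates the alternative idea, importing each file by path under its own module name, using only the standard library. It is not pytest's implementation; the directory names `io` and `strings`, the module names, and the helpers `load_by_path` and `demo` are hypothetical.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_by_path(path: Path, unique_name: str):
    """Import a file under an explicit module name, without touching sys.path."""
    spec = importlib.util.spec_from_file_location(unique_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def demo() -> tuple[str, str]:
    """Load two same-named test files from sibling directories without a clash."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Two hypothetical test directories, neither containing __init__.py.
        for sub in ("io", "strings"):
            (root / sub).mkdir()
            (root / sub / "test_json.py").write_text(f"WHERE = {sub!r}\n")
        # Each file gets its own module object because it is loaded by path.
        first = load_by_path(root / "io" / "test_json.py", "tests_io_test_json")
        second = load_by_path(
            root / "strings" / "test_json.py", "tests_strings_test_json"
        )
        return first.WHERE, second.WHERE
```

Calling `demo()` loads both files as distinct modules even though they share a basename, which is the property the `addopts` comment relies on.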

@vyasr vyasr requested a review from Matt711 October 10, 2024 21:18
@mroeschke
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 7d49df7 into rapidsai:branch-24.12 Oct 10, 2024
110 checks passed
@mroeschke mroeschke deleted the pylibcudf/strings/json branch October 10, 2024 21:52
Labels
CMake (CMake build issue), improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change), pylibcudf (Issues specific to the pylibcudf package), Python (Affects Python cuDF API.)
Projects
Status: Done

5 participants