Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: Adapt to NumPy 2 promotion changes #16141

Merged
merged 9 commits into from
Jul 15, 2024

Conversation

seberg
Copy link
Contributor

@seberg seberg commented Jul 1, 2024

Splitting out the non API changes from gh-15897, the Scalar API change is required for the tests to pass with NumPy 2, but almost all changes should be relatively straight forward here on their own.

(I will add inline comments.)


This PR does not fix integer comparisons, there are currently no tests that run into these.

xref: rapidsai/build-planning#38

@seberg seberg requested a review from a team as a code owner July 1, 2024 10:50
@seberg seberg requested review from vyasr and mroeschke July 1, 2024 10:50
Copy link

copy-pr-bot bot commented Jul 1, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions bot added the Python Affects Python cuDF API. label Jul 1, 2024
try:
is_safe = source_dtype.type(other) == other
except OverflowError:
is_safe = False
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just adds the OverflowError check, because NumPy is now strict, otherwise the error type would change here.

# Go via NumPy to get the value
other = np.array(other)
if other.dtype.kind in "ifc":
other = other.item()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The biggest change, on NumPy 2, we need to pass Python scalars to align with Pandas (not NumPy, which pandas does not align with here).
On NumPy 1, this doesn't really make a difference.

To get those Python scalars, .item() is one reasonable way.

@@ -341,7 +341,6 @@ def test_dtype(in_dtype, expect):
np.complex128,
complex,
"S",
"a",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just avoids an error "a" has been removed, and seems not particularly important here.

@mroeschke mroeschke added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jul 1, 2024
@mroeschke
Copy link
Contributor

/okay to test

1 similar comment
@mroeschke
Copy link
Contributor

/okay to test

def prinoptions(cls):
# TODO: NumPy now prints scalars as `np.int8(1)`, etc. this should
# be adapted evantually.
if np.lib.NumpyVersion(np.__version__) >= "2.0.0rc1":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another suggestion to use packaging.version.parse if you recommend against np.lib usage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it is used in many tests, adopted now (and rebased, but all except last commits are unchanged).

Copy link
Contributor

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seberg added 8 commits July 12, 2024 19:14
Note that pandas `where` seems to promote the Series based on the
value even with NumPy 2.  This was never copied by cudf (i.e. an
outstanding issue)
Pandas keeps using weak promotion even for strongly typed
"scalars" (i.e. 0-d objects).
This tries to (mostly) match that, but there may be better ways to
do it.  I am having difficulty to think of the best way though.
@seberg seberg force-pushed the numpy2-compat-not-full branch from 66efccf to 927904c Compare July 12, 2024 17:15
@seberg seberg force-pushed the numpy2-compat-not-full branch from 927904c to 6af0dec Compare July 12, 2024 17:16
@mroeschke
Copy link
Contributor

/okay to test

@seberg
Copy link
Contributor Author

seberg commented Jul 15, 2024

Friendly ping: I think this should be pretty safe to push through. There is the follow-up needed to ensure integer comparisons don't misbehave (the test suite currently doesn't catch this problem).
I'll open an issue for that once merged.

@mroeschke
Copy link
Contributor

/merge

@rapids-bot rapids-bot bot merged commit 1889c7c into rapidsai:branch-24.08 Jul 15, 2024
79 checks passed
@seberg seberg deleted the numpy2-compat-not-full branch July 15, 2024 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants