MAINT: Adapt to NumPy 2 promotion changes #16141

seberg · 2024-07-01T10:50:26Z

This splits out the non-API changes from gh-15897. The Scalar API change is required for the tests to pass with NumPy 2, but almost all of the changes here should be relatively straightforward on their own.

(I will add inline comments.)

This PR does not fix integer comparisons; there are currently no tests that run into them.

xref: rapidsai/build-planning#38

copy-pr-bot · 2024-07-01T10:50:30Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

seberg · 2024-07-01T10:51:21Z

python/cudf/cudf/core/_internals/where.py

+            try:
+                is_safe = source_dtype.type(other) == other
+            except OverflowError:
+                is_safe = False


This just adds the OverflowError check. NumPy is now strict about out-of-range Python integers, so without the check the error type raised here would change.
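For illustration, a minimal standalone sketch of the same pattern. The helper name `is_safe_scalar` is hypothetical and not part of cudf; it only mirrors the try/except shown in the diff above.

```python
import numpy as np


def is_safe_scalar(value, dtype):
    """Return True if `value` survives a round trip through `dtype` unchanged."""
    dtype = np.dtype(dtype)
    try:
        # NumPy 2 raises OverflowError for out-of-range Python integers,
        # e.g. np.uint8(-1); older versions wrapped around instead.
        return bool(dtype.type(value) == value)
    except OverflowError:
        return False


print(is_safe_scalar(1, "int8"))    # True
print(is_safe_scalar(-1, "uint8"))  # False
print(is_safe_scalar(300, "int8"))  # False
```

Treating the overflow as "not safe" should keep the result the same on both NumPy major versions, since a wrapped-around value would not compare equal to the original either.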

seberg · 2024-07-01T10:53:58Z

python/cudf/cudf/core/column/numerical.py

+            # Go via NumPy to get the value
+            other = np.array(other)
+            if other.dtype.kind in "ifc":
+                other = other.item()


This is the biggest change: on NumPy 2, we need to pass Python scalars to align with pandas (not with NumPy, which pandas does not follow here).
On NumPy 1, this doesn't really make a difference.

To get those Python scalars, .item() is one reasonable way.
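A small sketch of the idea, using plain NumPy arrays as a stand-in for cudf columns. The helper name `as_python_scalar` is hypothetical; it mirrors the diff above, including the `"ifc"` kind check.

```python
import numpy as np


def as_python_scalar(other):
    """Unwrap numeric NumPy scalars / 0-d arrays into plain Python scalars."""
    arr = np.array(other)
    if arr.dtype.kind in "ifc":  # signed int, float, complex (as in the diff)
        return arr.item()
    return arr


col = np.ones(3, dtype=np.float32)
strong = np.float64(2.0)          # strongly typed NumPy scalar
weak = as_python_scalar(strong)   # plain Python float

print(type(weak))          # <class 'float'>
print((col + weak).dtype)  # float32
```

With the Python scalar, the array dtype wins on both NumPy 1 and NumPy 2. Passing `strong` directly would instead promote the result to float64 under NumPy 2's promotion rules, which is the mismatch with pandas that this change avoids.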

seberg · 2024-07-01T10:54:54Z

python/cudf/cudf/tests/test_dtypes.py

@@ -341,7 +341,6 @@ def test_dtype(in_dtype, expect):
        np.complex128,
        complex,
        "S",
-        "a",


This just avoids an error: NumPy is deprecating "a" as a dtype, and testing it does not seem particularly important here.

mroeschke · 2024-07-01T19:24:46Z

/okay to test

mroeschke · 2024-07-02T19:55:32Z

/okay to test

mroeschke · 2024-07-10T01:06:04Z

python/cudf/cudf/tests/test_doctests.py

+    def prinoptions(cls):
+        # TODO: NumPy now prints scalars as `np.int8(1)`, etc. this should
+        #       be adapted eventually.
+        if np.lib.NumpyVersion(np.__version__) >= "2.0.0rc1":


Another suggestion: use packaging.version.parse if you recommend against np.lib usage.

seberg · 2024-07-12

Since it is used in many tests, I have adopted it now (and rebased, but all commits except the last are unchanged).
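A minimal sketch of what the packaging-based check could look like outside the test class. The names `NUMPY_GE_2` and `legacy_scalar_printing` are hypothetical, and this is not necessarily the exact code that was merged.

```python
import contextlib

import numpy as np
from packaging import version

# True on NumPy 2.0.0rc1 and later, where scalars print as `np.int8(1)`.
NUMPY_GE_2 = version.parse(np.__version__) >= version.parse("2.0.0rc1")


@contextlib.contextmanager
def legacy_scalar_printing():
    """Keep the pre-NumPy-2 scalar repr so existing doctest output still matches."""
    if NUMPY_GE_2:
        with np.printoptions(legacy="1.25"):
            yield
    else:
        # Older NumPy already prints scalars the old way; nothing to do.
        yield


with legacy_scalar_printing():
    print(repr(np.int8(1)))  # 1
```

The version guard matters because the `legacy="1.25"` print option is only understood by NumPy 2.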

mroeschke

One optional fix (https://github.com/rapidsai/cudf/pull/16141/files#r1671422229), but overall this looks good.

mroeschke · 2024-07-12T17:27:17Z

/okay to test

seberg · 2024-07-15T09:39:26Z

Friendly ping: I think this should be pretty safe to push through. A follow-up is still needed to ensure integer comparisons don't misbehave (the test suite currently doesn't catch this problem).
I'll open an issue for that once this is merged.

mroeschke · 2024-07-15T18:32:10Z

/merge

seberg requested a review from a team as a code owner July 1, 2024 10:50

seberg requested review from vyasr and mroeschke July 1, 2024 10:50

github-actions bot added the Python label (Affects Python cuDF API) Jul 1, 2024

seberg commented Jul 1, 2024

mroeschke added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Jul 1, 2024

mroeschke reviewed Jul 10, 2024

mroeschke approved these changes Jul 10, 2024

seberg added 8 commits July 12, 2024 19:14

TST: NumPy now prints np.int8(1), keep using old printing in tests

b1d7498

TST: NumPy is deprecating "a" as a dtype, just skip it

dd8ff3f

MAINT: Use int8(-1) as default categorical as np.uint8(-1) fails

4d1bea5

MAINT: Adapt to e.g. uint8(-1) failing now. Mostly in where.

dbd38ba

Note that pandas `where` seems to promote the Series based on the value even with NumPy 2. This was never copied by cudf (i.e. it is an outstanding issue).

MAINT: avoid can_cast(pyscalar, dtype) NumPy 2 refuses it (for now)

b125041

MAINT: Adapt numerical promotion to NumPy 2 and Pandas 2.2

9e2f334

Pandas keeps using weak promotion even for strongly typed "scalars" (i.e. 0-d objects). This tries to (mostly) match that, but there may be better ways to do it. I am having difficulty thinking of the best way, though.

MAINT: Adapt to NumPy weak-promotion in comparisons

ddc91b7

STY: Fix trailing comma style

d0dee3e

seberg force-pushed the numpy2-compat-not-full branch from 66efccf to 927904c July 12, 2024 17:15

Apply suggestion to use packaging.version

6af0dec

seberg force-pushed the numpy2-compat-not-full branch from 927904c to 6af0dec July 12, 2024 17:16

rapids-bot bot merged commit 1889c7c into rapidsai:branch-24.08 Jul 15, 2024
79 checks passed

seberg deleted the numpy2-compat-not-full branch July 15, 2024 18:42

seberg commented Jul 1, 2024 (edited by jakirkham)
