COMPAT/TST: Ensure that numpy 2 string dtype converts to object array in pandas 2.2.x #58104
Comments
Wouldn’t not-converting be nicer behavior?
I was hoping we could get @ngoldbaum's numpy string array feature in, in some form, for 3.0, but that's probably too big a change to get in for 2.2.2. The main thing I want to avoid is a weird error happening somewhere in our internals if a numpy string array is used.
One big issue with just accepting stringdtype arrays is that stringdtype can’t support the buffer protocol, so a lot of cython code in pandas that expects an array type that supports the buffer protocol won’t work. My fork of pandas adding stringdtype support avoids low-level cython operations by calling numpy directly.
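To illustrate the buffer-protocol point, here is a minimal sketch, assuming NumPy is installed. The helper name `supports_buffer_protocol` is hypothetical and not part of NumPy or pandas; it simply tries to take a `memoryview` of an array, which is roughly what typed-memoryview code in Cython relies on.

```python
import numpy as np

# StringDType only exists in NumPy >= 2.0; fall back to None on older versions.
StringDType = getattr(getattr(np, "dtypes", None), "StringDType", None)


def supports_buffer_protocol(arr):
    """Hypothetical helper: True if `arr` can be exported as a memoryview."""
    try:
        memoryview(arr)
    except (ValueError, BufferError, TypeError):
        # NumPy refuses to export dtypes it cannot describe as a buffer format.
        return False
    return True


# Fixed-width numeric arrays export a buffer without trouble.
int_ok = supports_buffer_protocol(np.array([1, 2, 3], dtype=np.int64))

# For StringDType the comment above implies this should come back False,
# but the result is computed at runtime rather than assumed here.
if StringDType is not None:
    str_ok = supports_buffer_protocol(np.array(["a", "b"], dtype=StringDType()))
else:
    str_ok = None  # NumPy < 2.0: StringDType is not available
```

A check along these lines could let calling code fall back to plain NumPy operations, or to an object array, before handing data to a buffer-based routine.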
I would also support not coercing to object by default if possible (IMO it's OK to postpone 2.2.2 a little longer to wait for compatibility). Just curious, in numpy 2.0, will
No, someone needs to explicitly pass e.g.

If there's interest in helping me out with getting my stringdtype changes upstreamed, I'd really appreciate it. My current latest work is here. Currently it's based on the pandas 3.0 branch but backporting it wouldn't be hard. I was hoping to have a PR open already but between getting pulled into other projects and helping out with shipping numpy 2.0 I didn't quite make it in time.
I don't think we should backport this; that change is too large, imo.
We should add a test for this, given that numpy 2.0 support is happening for pandas 2.2.2.
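A rough sketch of the array-level coercion such a test might check, assuming NumPy 2.x where `np.dtypes.StringDType` is available. The names `coerce_numpy_string_to_object` and `check_stringdtype_coerces_to_object` are hypothetical and are not pandas APIs; the sketch only uses `ndarray.astype(object)`. An actual pandas test would presumably construct a `Series` or `DataFrame` from the array and assert on the resulting dtype.

```python
import numpy as np

# StringDType only exists in NumPy >= 2.0; fall back to None on older versions.
StringDType = getattr(getattr(np, "dtypes", None), "StringDType", None)


def coerce_numpy_string_to_object(arr):
    """Hypothetical helper: return an object array for StringDType input."""
    if StringDType is not None and isinstance(arr.dtype, StringDType):
        return arr.astype(object)
    # Any other dtype is passed through unchanged.
    return arr


def check_stringdtype_coerces_to_object():
    """Return True if the check passed, None if StringDType is unavailable."""
    if StringDType is None:
        return None  # NumPy < 2.0: nothing to check
    arr = np.array(["a", "b", "c"], dtype=StringDType())
    result = coerce_numpy_string_to_object(arr)
    assert result.dtype == np.dtype(object)
    assert result.tolist() == ["a", "b", "c"]
    return True
```

Returning `None` on older NumPy keeps the check usable in a test matrix that still includes NumPy 1.x.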