DataArray.where() can truncate strings with <U dtypes #9180
Comments
Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
This is because the data type of the array is I think that's really confusing behavior. Does anyone know whether this has always been the case? I admittedly don't use strings that much...
@max-sixty thanks a lot for your quick reply! I can confirm that it worked at least until 2024.3.0. (I didn't update in the meantime, but I could do that) EDIT: a colleague told me it probably worked until 2024.5.0, but I haven't tried that.
not sure whether this used to work (it could have), but the new string dtype in
OK, if it works on
note that at the moment you still get the old character-based string dtypes by default, so you have to explicitly opt into the new string dtype (using
Ah OK. So maybe we don't deprioritize :)
I just had a better look at this issue, and I believe it relates to us preferring explicit dtypes over implicit dtypes. What happens within

```python
np.result_type(np.dtype("<U1"), type("<="))  # `str` does not have a length, so the explicit dtype is taken
```

To work around that, we can pass a 0d array to

```python
sign_3.where(sign_3 != "=", np.array("<="))
```

but I'm not sure how to best fix this in general. In theory, we could special-case pre-

```python
# instead of `preprocess_scalar_types`
def preprocess_types(t):
    if isinstance(t, str | bytes):
        return type(t)
    elif isinstance(dtype := getattr(t, "dtype", t), np.dtypes.StrDType | np.dtypes.BytesDType):
        return dtype.type
    return t
```

Edit: though the best way would be to have
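The promotion step described in this comment can be illustrated with NumPy alone. This is a minimal sketch, not xarray's actual code path: the names `data`, `fill`, `from_type`, and `from_array` are made up for illustration, and the `astype` call only stands in for whatever cast happens once a shared dtype has been chosen.

```python
import numpy as np

# Hypothetical stand-ins for the array and the replacement value.
data = np.array(["=", "="])  # one-character strings, so the dtype is <U1
fill = "<="                  # two characters

# Promoting against the Python type `str`: the type carries no length,
# so the explicit <U1 dtype of the array is taken.
from_type = np.result_type(data.dtype, type(fill))

# Promoting against a 0d array: its dtype (<U2) carries the length.
from_array = np.result_type(data.dtype, np.array(fill).dtype)

# Casting the replacement value to each shared dtype shows the difference.
truncated = np.array(fill).astype(from_type)   # second character is dropped
preserved = np.array(fill).astype(from_array)  # full string is kept

print(from_type, from_array)
print(truncated, preserved)
```

This matches the workaround in the comment: wrapping the replacement in `np.array("<=")` before passing it to `where` gives the promotion a sized dtype to work with.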
What happened?
I want to replace all "=" occurrences in an xr.DataArray called sign with "<=". The resulting DataArray then does not contain "<=" though, but "<". This only happens if sign only has "=" entries.
What did you expect to happen?
That all "=" occurrences in sign are replaced with "<=".
Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output
Anything else we need to know?
No response
Environment