Unicode strings unexpectedly transformed to byte strings upon open_dataset
#1638
Comments
Also, how did all the
Which backend are you using to save the data? Try explicitly setting
This looks like a bug of some sort -- not sure how that happened!
Reading over Unidata/netcdf-c#402, it seems like we should probably copy the handling of _Encoding from netcdf4-python (Unidata/netcdf4-python#665) to our scipy interface. That would solve our problem of faithfully round-tripping data.
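To make the idea concrete, here is a rough sketch of what honoring an _Encoding attribute on read could look like. It is plain Python, not the actual netcdf4-python or xarray code; `decode_with_encoding` and the attribute dictionary passed to it are hypothetical names used only for illustration.

```python
def decode_with_encoding(values, attrs):
    """Decode byte strings to str when the variable declares an _Encoding.

    Hypothetical helper: `values` is a list of labels as read from disk,
    `attrs` is the variable's attribute dictionary.
    """
    encoding = attrs.get("_Encoding")
    if encoding is None:
        # No declared encoding: return the raw values unchanged.
        return list(values)
    # Decode only bytes; values that are already str pass through.
    return [v.decode(encoding) if isinstance(v, bytes) else v for v in values]


raw = [b"Ins1", b"Gcg"]
decoded = decode_with_encoding(raw, {"_Encoding": "utf-8"})
untouched = decode_with_encoding(raw, {})
```

With the attribute present the labels come back as str; without it they stay as bytes, so a backend that never writes or reads the attribute would have no way to tell which one the user originally stored.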
Thank you for looking into this! I used the default engine to save, which looks like it was
Not sure if it matters, but one detail is that I created ~250 individual datasets (each sized at ~300 samples x 20,000 features) and then used
Hmm. I'm not sure why h5netcdf was so much slower. I suspect the default engine you used might have been scipy, which we've noticed can be significantly faster in some cases. If you have time, I would be curious how well my branch in #1648 works using
Please file a separate issue if you can put together example code that reproduces the lost coordinate issue. I would like to dig into this.
Using v0.9.6 with
Using #1648:
Posted the lost coordinate issue here: #1680
When I first create the dataset, all the metadata is stored as unicode strings (yay!):
but then when I save using to_netcdf with the default arguments, then call xr.open_dataset on the same dataset with default arguments, all of them get converted to byte strings:
So then things I expect to work, like selecting on gene, e.g. ds.sel(gene="Ins1"), don't work unless they're byte strings, i.e. ds.sel(gene=b"Ins1") works just fine.
Do you know why this may be happening?
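For anyone hitting the same thing: the lookup fails because Python 3 never treats str and bytes as equal, so a label stored as b"Ins1" cannot match "Ins1". Below is a minimal sketch in plain Python (no xarray involved; the `genes` list is made up for illustration) showing the mismatch, plus one possible stopgap of decoding the labels after loading, assuming they are UTF-8.

```python
# Labels as they might come back after the round trip (illustrative data).
genes = [b"Ins1", b"Gcg"]

# str and bytes never compare equal in Python 3, so the str lookup misses.
found_as_str = "Ins1" in genes
found_as_bytes = b"Ins1" in genes

# Possible stopgap: decode the labels back to str, assuming UTF-8.
decoded_genes = [g.decode("utf-8") for g in genes]
found_after_decode = "Ins1" in decoded_genes
```

This only papers over the symptom on the reading side; it does not explain why the strings were written or read back as bytes in the first place.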