missing_value is a string for numeric NetCDF array (getindex fail with v0.12.0) #166
Comments
Huh that's odd, there is a missing_value with a string value in this dataset.
It looks like that case is not handled by this commit: 083f23f#diff-014bcddfc04a56e29ce6e91a3f2ab177d429989fd4709895908b2937ada9feafR372, where it tries to convert the value to the eltype of the array.
Good point! Not sure why they did that. Also, I don't understand why it is "-1.e+34f" rather than "-1.e+34". I sent a PR with a rudimentary fix in #167.
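To make the failure mode concrete, here is a minimal sketch in plain Julia (no NCDatasets involved) of what happens when a string such as "-1.e+34f" is converted or parsed to Float64. The variable names are illustrative only, and this is not the code from the commit referenced above.

```julia
raw = "-1.e+34f"   # attribute value reported in this issue

# A String has no conversion to Float64, so `convert` throws a MethodError.
convert_failed = try
    convert(Float64, raw)
    false
catch err
    err isa MethodError
end

# Parsing does not help either: the trailing "f" is not part of a valid
# floating-point number, so `tryparse` returns `nothing`.
parsed_with_suffix = tryparse(Float64, raw)

# Without the suffix, the same text parses as an ordinary Float64.
parsed_without_suffix = tryparse(Float64, "-1.e+34")

@show convert_failed parsed_with_suffix parsed_without_suffix
```

This suggests the "f" may come from a C-style single-precision literal written into the attribute as text, but that is a guess about how the file was produced.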
NCDatasets 0.12 is indeed the first release which parses
@Gael can you try the branch "not_convert_missing_values"?
By the way, this is what I get in Python's netCDF4:
Maybe it would be good to contact the author of the dataset.
The branch Note that we do not try to parse the strings. The string missing value is effectively ignored (for numeric NetCDF variables). If you want to use the missing value you can use the new function cfvariable:
    nv = cfvariable(ds,"longitude", missing_value = -1.e+34); # instead of nv = ds["longitude"];
    lon = nv[:]
I can also read your file
Thanks for the fix. I had sent PR #167, which kept the type conversion when the missing value is not a text string and skipped it otherwise. That seemed useful to me, but maybe you felt this was still dangerous? Thanks for trying this out on Python's netCDF4 too. Good to know this is an issue for them as well. Tagging @selipot, who might know who to pass the information on to at NOAA (the file is from https://www.aoml.noaa.gov/phod/gdp/hourly_data.php).
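For readers following along, the idea of converting numeric missing values while skipping string ones can be sketched as below. The helper `numeric_missing_values` is hypothetical: it is neither the code from PR #167 nor part of NCDatasets, and it only illustrates the logic described above.

```julia
# Hypothetical helper (not an NCDatasets function): convert a raw
# `missing_value` attribute to element type T, skipping non-numeric entries.
function numeric_missing_values(::Type{T}, attr) where {T<:Number}
    # Treat a scalar attribute (number or string) as a one-element collection.
    values = attr isa Union{Number,AbstractString} ? (attr,) : attr
    kept = T[]
    for v in values
        if v isa Number
            push!(kept, convert(T, v))
        else
            # A string cannot be a fill value of a numeric array: warn and skip.
            @warn "ignoring non-numeric missing_value $(repr(v)) for element type $T"
        end
    end
    return kept
end

from_number = numeric_missing_values(Float64, -1.0f34)     # converted to Float64
from_string = numeric_missing_values(Float64, "-1.e+34f")  # skipped with a warning
```

Note that `convert` can still throw for numeric attributes that do not fit the target type (for example a fractional value with an integer element type), so skipping strings alone does not rule out every conversion error.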
Well, the PR solves your issue, but this related issue:
would still produce an error if the missing values were converted.
I see. We could do a
@philippemiron may have some insight
Thanks for your help Gael, but your PR is no longer needed. For a numeric array, the missing value should also be numeric (not a string). NCDatasets can now read the file (but it doesn't attempt to parse the string attributes, which would not be possible anyway in this case).
Has someone contacted the dataset owner and told them that they are serving a malformed dataset? Speaking from the experience of an xarray dev, you can potentially go down many rabbit holes trying to deal with malformed datasets--special cases and workarounds in your code, etc. This may be worthwhile in some cases. But it's also important to push back on the data providers who serve these datasets. By silently ignoring the
I am in touch with the GDP DAC to get this fixed. Bear with me folks!
In fact, in the current version (v0.12.2) of NCDatasets, there is no code that actively ignores string Luckily,
The consistency of In any case, I agree that a warning would be nice. For the file above you will see a warning like this:
Thanks a lot to @selipot for your quick response!
Thanks a lot for the clarification. Makes sense.
Describe the bug
Indexing a Dataset variable by name, in a case that used to work until at least v0.11.18, now returns an error with v0.12.0. Could have to do with the introduction of SymbolOrString but unclear.
To Reproduce
Expected behavior
ds["longitude"] should work. The example quoted here has been part of the test suite for OceanRobots.jl for a while.
Environment
Full output