Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

decode variable with mismatched coordinate attribute #8195

Merged
merged 19 commits into from
Sep 25, 2023

Conversation

kmuehlbauer
Copy link
Contributor

@kmuehlbauer kmuehlbauer commented Sep 16, 2023

  • decode variable with mismatched coordinate attribute
  • propagate coordinates to encoding in any case

@kmuehlbauer
Copy link
Contributor Author

This is still draft, because I wanted to get more insight on the approach first.

For now this will properly decode and propagate coordinates to encoding, but the missing coordinate is removed from that string. Additionally a warning is raised. I've added a "coordinates_strict" to the available options for decode_coords to enable a workflow which raises in the case of mismatched coordinates.

Some questions:

Any comments appreciated!

xarray/conventions.py Outdated Show resolved Hide resolved
xarray/conventions.py Outdated Show resolved Hide resolved
xarray/conventions.py Outdated Show resolved Hide resolved
@kmuehlbauer kmuehlbauer marked this pull request as ready for review September 18, 2023 08:22
@kmuehlbauer
Copy link
Contributor Author

This solves the problem in #1809.

But I'm not really happy with the naming of decode_coords-strings.

@kmuehlbauer
Copy link
Contributor Author

OK, thanks to the test suite we now have a contradicting issue here:

#308

@shoyer It would be great if you can share your thoughts on how to handle mismatched coordinates here. In #313 it silently does not handle mismatched coordinates and keeps mismatched coordinates attributes on the variable. Should we just raise on mismatch (no good idea!) or keep current behaviour but issue warning (might be overlooked or disabled)? Or is there some other solution, beside the proposed by this PR?

@dcherian
Copy link
Contributor

I think we shouldn't add more options and just silently ignore the names of any missing variables. This attribute gets out of sync quite easily.

Should we just not propagate coordinate to encoding in case of mismatches?

Let's propagate if decode_coords is not False

Should we also check if coordinates are matching on write

We could warn if the coordinates attribute refers to absent variables at write time. But I suspect it would just be annoying.

@kmuehlbauer
Copy link
Contributor Author

Thanks @dcherian, that greatly simplifies the code.

As a compromise with #308 coordinates are now only propagated to encoding if at least one coordinate name matches a variable.

@kmuehlbauer
Copy link
Contributor Author

As this now does not emit any warning, should I revert the change using warnings.warn -> emit_user_level_warning or is this OK to proceed?

@@ -488,6 +488,10 @@ def open_dataset(
as coordinate variables.
- "all": Set variables referred to in ``'grid_mapping'``, ``'bounds'`` and
other attributes as coordinate variables.

Only existing variables can be set as coordinates. Missing variables
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope that this is short enough but explains how missing variables are treated. No native speaker here, so please suggest better wording/grammar.

doc/whats-new.rst Outdated Show resolved Hide resolved
xarray/backends/api.py Outdated Show resolved Hide resolved
xarray/backends/api.py Outdated Show resolved Hide resolved
xarray/conventions.py Outdated Show resolved Hide resolved
xarray/conventions.py Outdated Show resolved Hide resolved
xarray/conventions.py Outdated Show resolved Hide resolved
xarray/conventions.py Outdated Show resolved Hide resolved
xarray/conventions.py Outdated Show resolved Hide resolved
xarray/tests/test_conventions.py Outdated Show resolved Hide resolved
xarray/tests/test_conventions.py Outdated Show resolved Hide resolved
Copy link
Contributor

@dcherian dcherian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :) but could use a pass from someone else to double check our decisions

@kmuehlbauer
Copy link
Contributor Author

LGTM :) but could use a pass from someone else to double check our decisions

Current behaviour:

coordinates are propagated to encoding if all coordinates are in variables

New behaviour:

coordinates are propagated to encoding if at least one coordinate is in variables

@kmuehlbauer kmuehlbauer added the plan to merge Final call for comments label Sep 21, 2023
xarray/conventions.py Outdated Show resolved Hide resolved
@kmuehlbauer kmuehlbauer mentioned this pull request Sep 24, 2023
5 tasks
@kmuehlbauer
Copy link
Contributor Author

Current behaviour (if decode_coords is not False):

coordinates are only propagated to encoding if all coordinates are in variables

New behaviour (if decode_coords is not False):

coordinates are propagated to encoding in any case, but only coordinates in variables are added as coords

@kmuehlbauer kmuehlbauer merged commit da647b0 into pydata:main Sep 25, 2023
24 checks passed
@kmuehlbauer kmuehlbauer deleted the coordinates branch September 25, 2023 11:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

to_netcdf is not idempotent when stacking rename and set_coords WRF output : cannot serialize variable
3 participants