non-deterministic ordering within "coordinates" attribute written by .to_netcdf #8026

Closed
itcarroll opened this issue Jul 27, 2023 · 2 comments · Fixed by #8034

Comments

@itcarroll
Contributor

itcarroll commented Jul 27, 2023

What is your issue?

Under the assumption that deterministic output is preferred whenever feasible, I'd like to point out that the variable names written into "coordinates" attributes with .to_netcdf are not ordered deterministically. For pipelines that depend on file hashes to validate dependencies, this can be a real headache.

Consider the dataset xarray.Dataset({"x": ((), 0)}, coords={"a": 0, "b": 0}). The NetCDF file that xarray writes will include either:

variables:
	int64 x ;
		x:coordinates = "a b" ;
	int64 a ;
	int64 b ;

or

variables:
	int64 x ;
		x:coordinates = "b a" ;
	int64 a ;
	int64 b ;
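
For readers who want to see where the run-to-run variation can come from without writing any NetCDF files, here is a minimal sketch. It assumes the names end up in a plain Python set of strings, as the issue goes on to suggest, and that the variation comes from Python's string hash randomization (controlled by the PYTHONHASHSEED environment variable). The helper name joined_order is hypothetical and not part of xarray.

```python
import os
import subprocess
import sys

# One-liner run in a fresh interpreter: join a set of two names with a space,
# mimicking how a "coordinates" attribute string might be assembled.
SNIPPET = "print(' '.join({'a', 'b'}))"


def joined_order(seed):
    """Hypothetical helper: run SNIPPET under a fixed PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Collect the joined string for a handful of hash seeds. Each value is either
# "a b" or "b a"; which one a given seed produces is an implementation detail.
orders = {seed: joined_order(seed) for seed in range(8)}
for seed, order in orders.items():
    print(seed, repr(order))
```

With no PYTHONHASHSEED set, each interpreter start picks a random seed, which would be consistent with the same script producing either of the two files shown above.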

My review of _encode_coordinates leads me to think the behavior results from collecting names in a set. I'd be happy to offer a PR to make the coordinates attribute deterministic. I am not aware of a CF convention regarding any ordering, but I would research it and follow one if it exists. If not, then I would probably sort at L701 and L722.
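
A minimal sketch of the proposed fix, for illustration only: sort the collected names before joining them into the attribute string. The function join_coordinate_names is a hypothetical stand-in, not the actual code in _encode_coordinates, and the str conversion is an assumption made so that the sort has a well-defined key.

```python
def join_coordinate_names(names):
    """Hypothetical helper: build a "coordinates" attribute string in a
    deterministic order from any iterable of names (for example, a set)."""
    # sorted() returns a list in lexicographic order regardless of the
    # iteration order of the input, so the result no longer depends on hashing.
    return " ".join(sorted(map(str, names)))


coords = {"b", "a"}

# Joining the set directly may give "a b" or "b a" depending on the run.
unsorted_attr = " ".join(coords)

# Sorting first gives the same string on every run.
sorted_attr = join_coordinate_names(coords)
print(sorted_attr)
```

Lexicographic order is only one possible choice; any fixed ordering would make the file contents reproducible.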

@itcarroll itcarroll added the needs triage Issue that has not been reviewed by xarray team member label Jul 27, 2023
@dcherian
Contributor

Your proposal sounds good to me. A PR would be welcome!

@dcherian dcherian added topic-CF conventions and removed needs triage Issue that has not been reviewed by xarray team member labels Jul 27, 2023
@trexfeathers

By coincidence I was just looking at this corner of the Iris code. Here is how we achieve ordering (link):

            for element in sorted(
                coordlike_elements, key=lambda element: element.name()
            ):
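
To make the fragment above easier to try out, here is a self-contained sketch of the same pattern: sorting objects by a name() method via a key function. The Element class is a hypothetical stand-in for an Iris coordinate-like object and is not taken from the Iris code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for a coordinate-like object exposing name()."""

    var_name: str

    def name(self):
        return self.var_name


# A set of objects has no reliable iteration order across runs.
coordlike_elements = {Element("b"), Element("a"), Element("c")}

# The objects define no ordering of their own, so sorted() needs a key;
# here the key is the string returned by name().
ordered = sorted(coordlike_elements, key=lambda element: element.name())
names = [element.name() for element in ordered]
print(names)
```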
