Backward compatibility of raw-converted dataset after xarray.DataTree migration #1420

Open
2 tasks
leewujung opened this issue Dec 27, 2024 · 2 comments

@leewujung
Member

leewujung commented Dec 27, 2024

Context

Part of the xarray-contrib/datatree to xarray.DataTree migration requires strict inheritance of the coordinates:
https://github.com/pydata/xarray/blob/main/DATATREE_MIGRATION_GUIDE.md#data-model-changes

There should just be 1 case of this in the current echopype-adapted SONAR-netCDF4 structure, under Platform/NMEA group, since the Platform/time1 coordinate is a subset of the Platform/NMEA/time1 coordinate -- Thanks @ctuguinay for finding this out!

One workaround is to remove the indexes associated with the coordinate, but that may introduce confusion down the road, so we should probably go with renaming the Platform/NMEA/time1 coordinate (in the work in #1419).

Issue

However, once this is fixed, we have a larger problem: previously raw-converted zarr datasets cannot be opened by the new xarray.open_datatree function.

Tasks

  • Use xarray.open_groups to circumvent the inheritance problem for users to open previously stored raw-converted data
  • Create a mechanism for users to optionally convert their data to the new version
@oftfrfbf
Contributor

I will be taking a look at this issue. I am currently getting up to speed with xarray.DataTree and its use throughout the project. Will be starting out by getting caught up on the tickets leading up to this and then will circle back here and gather more requirements.

I am thinking it would be nice to know what domain of versions of echopype will need to be upgradable, e.g. v7.0 to v9.0 or v8.0 to v9.0, and if there is any specific test data you would like me to work with.

@oftfrfbf oftfrfbf self-assigned this Dec 29, 2024
@leewujung
Member Author

Thanks @oftfrfbf !

I am thinking it would be nice to know what domain of versions of echopype will need to be upgradable, e.g. v7.0 to v9.0 or v8.0 to v9.0, and if there is any specific test data you would like me to work with.

I don't know exactly who has large volumes of data converted using previous echopype versions, but my guess would be that the current format (the one v0.9.0 generates) is the most widely used. To simplify things and also so that we don't get swamped by data versioning just yet, I think we can focus on something that can take v0.9.X-generated data (which uses the older xarray-contrib/datatree) and open it with echopype >=v0.10.0 (which uses the new xarray.DataTree).

I'll take a look at when the last format change was so that we can put an accurate statement in the documentation.
