Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More pint + cf-xarray usage in examples #325

Merged
merged 8 commits into from
Mar 22, 2024
Merged

More pint + cf-xarray usage in examples #325

merged 8 commits into from
Mar 22, 2024

Conversation

navidcy
Copy link
Collaborator

@navidcy navidcy commented Feb 27, 2024

No description provided.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link

review-notebook-app bot commented Feb 27, 2024

View / edit / reply to this conversation on ReviewNB

navidcy commented on 2024-02-27T10:41:56Z
----------------------------------------------------------------

What's all these warnings?? Should we pay attention to them?? Seems like xarray is trying to tell us something


anton-seaice commented on 2024-02-27T22:24:50Z
----------------------------------------------------------------

This warning means that the dask chunks are smaller than the chunks in the source netcdf file. This is inefficient because it means the same chunk in the source file needs to be read multiple times.

I think, the workaround is to manually set chunks in the getvar : e.g. chunks={'time':'auto', 'xt_ocean':-1, 'yt_ocean':-1})

The fix is to update the cookbook, but I don't think anyone is working on the cookbook anymore.

navidcy commented on 2024-02-28T09:54:18Z
----------------------------------------------------------------

yeap, let's leave this for the cookbook to deal with.

Copy link
Collaborator

This warning means that the dask chunks are smaller than the chunks in the source netcdf file. This is inefficient because it means the same chunk in the source file needs to be read multiple times.

I think, the workaround is to manually set chunks in the getvar : e.g. chunks={'time':'auto', 'xt_ocean':-1, 'yt_ocean':-1})

The fix is to update the cookbook, but I don't think anyone is working on the cookbook anymore.


View entire conversation on ReviewNB

Copy link

review-notebook-app bot commented Feb 27, 2024

View / edit / reply to this conversation on ReviewNB

anton-seaice commented on 2024-02-27T22:56:23Z
----------------------------------------------------------------

Line #25.        u = u.pint.quantify()

Maybe a note to say what this means is good?


navidcy commented on 2024-02-28T09:54:05Z
----------------------------------------------------------------

ok, done

Copy link

review-notebook-app bot commented Feb 27, 2024

View / edit / reply to this conversation on ReviewNB

anton-seaice commented on 2024-02-27T22:56:24Z
----------------------------------------------------------------

'We xarray to index using dimension, e.g., to get the final timestamp.'

I don't understand what is being said here?


navidcy commented on 2024-02-28T09:53:58Z
----------------------------------------------------------------

Indeed it was convoluted. I rephrased.

Copy link

review-notebook-app bot commented Feb 27, 2024

View / edit / reply to this conversation on ReviewNB

anton-seaice commented on 2024-02-27T22:56:24Z
----------------------------------------------------------------

I think we should do something about the tripole ... at least acknowledge the plots are wrong above 65N, or fix the plots


navidcy commented on 2024-02-28T09:53:31Z
----------------------------------------------------------------

The plots looks alright, even north of 65N. I believe I took care of that by using x='longitude', y='latitude' . Isn't it?

anton-seaice commented on 2024-02-28T22:54:37Z
----------------------------------------------------------------

There is data missing north of Western Russia and Eastern Canada/Greenland but it blends in well due to the color scale. And some data is hidden under the land too.

Labelling xu_ocean as longitutude, and yu_ocean as latitude is a bit of a shortcut, because on the tripole grid, at North of 65N these grids no longer follow normal lat/lon.

I think for our cf labels of latitude and longitude to be strictly correct, we would need to load geolat and geolon, and assign them as coordinates and add the cf attributes to those. But as geolon and geolat are 2D, plotting then becomes more complicated too ... you either need to do the re-grid, or use contourf, and both those options are much slower.

The plot being wrong in a couple of spots doesn't really impact the notebook and what it is demonstrating, so I think a note is fine.

(I realise now there are several/(many?) places in the recipes where we ignore the tripole!)

anton-seaice commented on 2024-02-28T22:55:23Z
----------------------------------------------------------------

The example using contourf is in https://cosima-recipes.readthedocs.io/en/latest/DocumentedExamples/SeaIce_Plot_Example.html#Making-Maps

EDIT: Thankyou to Aidan for the correction below :)

Copy link

review-notebook-app bot commented Feb 27, 2024

View / edit / reply to this conversation on ReviewNB

anton-seaice commented on 2024-02-27T23:00:59Z
----------------------------------------------------------------

Line #24.    axes[2].set_title('ACCESS-OM-010 regridded on 1 

I think this change was accidental?


navidcy commented on 2024-02-28T09:35:14Z
----------------------------------------------------------------

indeed!

Copy link

review-notebook-app bot commented Feb 27, 2024

View / edit / reply to this conversation on ReviewNB

anton-seaice commented on 2024-02-27T23:01:00Z
----------------------------------------------------------------

Linking to https://cosima-recipes.readthedocs.io/en/latest/DocumentedExamples/SeaIce_Obs_Model_Compare.html#Sea-Ice-Concentration-Anomalies would keep this in-house


navidcy commented on 2024-02-28T09:44:55Z
----------------------------------------------------------------

I added both!

Copy link
Collaborator Author

navidcy commented Feb 28, 2024

indeed!


View entire conversation on ReviewNB

Copy link
Collaborator Author

navidcy commented Feb 28, 2024

I added both!


View entire conversation on ReviewNB

Copy link
Collaborator Author

navidcy commented Feb 28, 2024

The plots looks alright, even north of 65N. I believe I took care of that by using x='longitude', y='latitude' . Isn't it?


View entire conversation on ReviewNB

Copy link
Collaborator Author

navidcy commented Feb 28, 2024

Indeed it was convoluted. I rephrased.


View entire conversation on ReviewNB

Copy link
Collaborator Author

navidcy commented Feb 28, 2024

ok, done


View entire conversation on ReviewNB

Copy link
Collaborator Author

navidcy commented Feb 28, 2024

yeap, let's leave this for the cookbook to deal with.


View entire conversation on ReviewNB

@navidcy
Copy link
Collaborator Author

navidcy commented Feb 28, 2024

Thanks @anton-seaice. I think I dealt with everything you raised. Can you have a second look?

Copy link
Collaborator

anton-seaice commented Feb 28, 2024

There is data missing north of Western Russia and Eastern Canada/Greenland but it blends in well due to the color scale. And some data is hidden under the land too.

Labelling xu_ocean as longitutude, and yu_ocean as latitude is a bit of a shortcut, because on the tripole grid, at North of 65N these grids no longer follow normal lat/lon.

I think for our cf labels of latitude and longitude to be strictly correct, we would need to load geolat and geolon, and assign them as coordinates and add the cf attributes to those. But as geolon and geolat are 2D, plotting then becomes more complicated too ... you either need to do the re-grid, or use contourf, and both those options are much slower.

The plot being wrong in a couple of spots doesn't really impact the notebook and what it is demonstrating, so I think a note is fine.

(I realise now there are several/(many?) places in the recipes where we ignore the tripole!)


View entire conversation on ReviewNB

Copy link
Collaborator

@aidanheerdegen
Copy link
Contributor

Agreed that proper projection is required for plotting the tripole, but as long as there are no missing coordinate values the xarray.DataArray.plot method works ok I believe. Happy to be corrected.

https://cosima-recipes.readthedocs.io/en/latest/Tutorials/Spatial_selection.html

Copy link

review-notebook-app bot commented Feb 29, 2024

View / edit / reply to this conversation on ReviewNB

aidanheerdegen commented on 2024-02-29T07:02:17Z
----------------------------------------------------------------

Line #14.        for coord in [u.xu_ocean, v.xu_ocean]:

Part of the reason to use cf_xarray is so you don't need to refer to coordinates etc by name, instead it should be possible to use the .cf accessor.

In the same way anything accessing time could use the .cf accessor, see https://cosima-recipes.readthedocs.io/en/latest/Tutorials/Model_Agnostic_Analysis.html


navidcy commented on 2024-02-29T07:25:09Z
----------------------------------------------------------------

But I'm doing this, e.g., u.yt_ocean.attrs['standard_name'] = 'latitude' exactly because cf-xarray wasn't able to figure it out via adding .cf ...

We weren't careful enough to use CF metadata when outputting ACCESS-OM2? Is this why?f

Copy link
Collaborator Author

navidcy commented Feb 29, 2024

But I'm doing this, e.g., u.yt_ocean.attrs['standard_name'] = 'latitude' exactly because cf-xarray wasn't able to figure it out...

We weren't careful enough to use CF metadata when outputting ACCESS-OM2? Is this why?


View entire conversation on ReviewNB

@anton-seaice
Copy link
Collaborator

Agreed that proper projection is required for plotting the tripole, but as long as there are no missing coordinate values the xarray.DataArray.plot method works ok I believe. Happy to be corrected.

https://cosima-recipes.readthedocs.io/en/latest/Tutorials/Spatial_selection.html

You are very correct! Good pick up, I was convinced about that for some reason.

@navidcy : I realise this is only half in scope for what you are trying to change ... but I think we should be loading geolat and geolon and using those as the cf coordinates for lat and lon. But there are some other places in the recipes that would have the same issue :)

@aidanheerdegen
Copy link
Contributor

You are very correct! Good pick up, I was convinced about that for some reason.

In lots of cases the ocean files had essentially broken coordinates: they had missing/masked values. When that happens the normal plot stuff will not work.

It is an odd interaction between the layout and io_layout. Depending on their relative values if an output "tile" has absolutely no unmasked tiles it is is skipped entirely in the output. When the mppnccombine step happens it knows there should be a tile there, but has no data, so fills its with missing values, including coordinate data. All it requires is one compute tile in an output tile and the full coordinates for that entire io tile to be written correctly.

So it is difficult to predict, as essentially the same config can produce fine coordinate data, or broken ones, and all it depends on is the io_layout parameter. In general the larger the ratio between io_layout and layout (the more compute tiles for each io tile) the less likely it is to occur.

Once you do encounter the problem though, it is a real pain, and completely understandable that you would develop the intuition that .plot() doesn't work, or at least the other methods are more consistently reliable.

Might even be worth creating a script to check for this condition to avoid people trying stuff that doesn't work, or better yet use it as a check for model configurations so we're producing good quality outputs.

@aidanheerdegen
Copy link
Contributor

But I'm doing this, e.g., u.yt_ocean.attrs['standard_name'] = 'latitude' exactly because cf-xarray wasn't able to figure it out...

Ah, right sorry. My mistake.

Point still stands for the time dim thought. Should be able to use .cf accessor and make the code more portable, e.g. change

    u = u.cf.sel(vertical=depth, method='nearest').sel(time=slice(start_time, end_time))

to

    u = u.cf.sel(vertical=depth, method='nearest').cf.sel(time=slice(start_time, end_time))

We weren't careful enough to use CF metadata when outputting ACCESS-OM2? Is this why?

In a nutshell. Yes.

@anton-seaice
Copy link
Collaborator

anton-seaice commented Mar 4, 2024

You are very correct! Good pick up, I was convinced about that for some reason.

In lots of cases the ocean files had essentially broken coordinates: they had missing/masked values. When that happens the normal plot stuff will not work.

I went back and looked at where I had tried this (e.g. ds.aice_m.isel(time=0).plot(x='geolon_t', y='geolat_t'), and the error I get is :

ValueError: x and y arguments to pcolormesh cannot have non-finite values or be of type numpy.ma.core.MaskedArray with masked values

and it is because the default behaviour (in the intake cat) if you search for 'geolat_t' or 'geolon_t' is to get them from the CICE output files. But in the output files the geolat and geolon have a landmask applied:

Screenshot 2024-03-04 at 12 06 08 pm

If you load the CICE input grid (as done in the Spatial Selection example), and get use the lat/lon fields from there, then it works ok.

p.s. It is a slightly odd behavior in matplotlib ... the mask is the same in the source data and the x and y coords

@aidanheerdegen
Copy link
Contributor

If you load the CICE input grid (as done in the Spatial Selection example), and get use the lat/lon fields from there, then it works ok.

Ah right. Yeah, I think that was the only reliable method because, as you point out, the coordinates are masked in the ice data. My recollection was the ocean data was as I said, where it depended on io_layout and layout. The fact that there are two models with wildly different metadata standards and naming conventions (looking at you x and y) in the same model doesn't help either.

@dougiesquire
Copy link
Collaborator

and it is because the default behaviour (in the intake cat) if you search for 'geolat_t' or 'geolon_t' is to get them from the CICE output files. But in the output files the geolat and geolon have a landmask applied:

I could be misunderstanding what you're saying, but I think you'll only find geolat_t and geolon_t in ACCESS-OM2 MOM output, not CICE. So if you ask the intake catalog for these variables, I think they'll be coming from MOM files. I think the masking that you show in your attached screenshot is the MOM behaviour @aidanheerdegen is describing?

As already agreed, it's probably best to get the ice grid from the ice input grid. Especially since the MOM and CICE grids may not actually be the same.

@anton-seaice
Copy link
Collaborator

and it is because the default behaviour (in the intake cat) if you search for 'geolat_t' or 'geolon_t' is to get them from the CICE output files. But in the output files the geolat and geolon have a landmask applied:

I could be misunderstanding what you're saying, but I think you'll only find geolat_t and geolon_t in ACCESS-OM2 MOM output, not CICE. So if you ask the intake catalog for these variables, I think they'll be coming from MOM files. I think the masking that you show in your attached screenshot is the MOM behaviour @aidanheerdegen is describing?

Yes you are right. (It turns out I am loading these from the output file, not from intake cat directly).

The masking is very similar to the CICE masking, and trying to plot using 'TLON' and 'TLAT' from cice output gives the same error.

@navidcy
Copy link
Collaborator Author

navidcy commented Mar 6, 2024

I've been thinking that 2D unmasked coords should be loaded automatically by getvar; see COSIMA/cosima-cookbook#336

But that's a bit irrelevant for this PR. Could/should we merge this PR and continue the discussion in an issue either in this repo or in the cookbook?

Copy link

review-notebook-app bot commented Mar 7, 2024

View / edit / reply to this conversation on ReviewNB

anton-seaice commented on 2024-03-07T03:14:55Z
----------------------------------------------------------------

Line #15.            coord.attrs['standard_name'] = 'longitude'

'grid_longitude' and 'grid_latitude' would be more appropriate (per https://cfconventions.org/Data/cf-standard-names/current/build/cf-standard-name-table.html)


navidcy commented on 2024-03-14T12:49:07Z
----------------------------------------------------------------

Done

Copy link

review-notebook-app bot commented Mar 7, 2024

View / edit / reply to this conversation on ReviewNB

anton-seaice commented on 2024-03-07T03:14:56Z
----------------------------------------------------------------

Line #16.            coord.attrs['units'] = 'degrees'

Per https://cfconventions.org/Data/cf-standard-names/current/build/cf-standard-name-table.html it should be 'degree' not 'degrees'


navidcy commented on 2024-03-14T12:49:02Z
----------------------------------------------------------------

Done

Copy link
Collaborator Author

navidcy commented Mar 14, 2024

Done


View entire conversation on ReviewNB

Copy link
Collaborator Author

navidcy commented Mar 14, 2024

Done


View entire conversation on ReviewNB

Copy link
Collaborator

@anton-seaice anton-seaice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving, noteing that our handling of the tripole is a bit messy, and would be simplified by loading the geographic grids with every variable :)

@navidcy
Copy link
Collaborator Author

navidcy commented Mar 21, 2024

Actually with the CF-compliances now you can't do

u.cf['latitude']

but you have to do

u.cf['grid_latitude']

which is counter-intuitive and brings back some "trauma" from having coordinate names like yu_ocean and what-not.

Can we resolve this?

@anton-seaice
Copy link
Collaborator

Actually with the CF-compliances now you can't do

u.cf['latitude']

but you have to do

u.cf['grid_latitude']

which is counter-intuitive and brings back some "trauma" from having coordinate names like yu_ocean and what-not.

Can we resolve this?

Yeah - loading the true lat/lon from the ice grid file. See example in https://cosima-recipes.readthedocs.io/en/latest/Tutorials/Spatial_selection.html#Load-data

@navidcy
Copy link
Collaborator Author

navidcy commented Mar 22, 2024

OK, I admit I'm a bit worn out from this PR so I'm merging without resolving this last issue... :(

I think the proper solution is to have the cookbook assign the unmasked 2D lat-lon coordinates on every variable it returns! See COSIMA/cosima-cookbook#336

@navidcy navidcy merged commit 8bae69b into main Mar 22, 2024
3 checks passed
@navidcy navidcy deleted the ncc/regridding branch March 22, 2024 09:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🛸 updating An existing notebook needs to be updated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants