Update plot.py for more recent xarray; also allow dask arrays to be passed to single_panel, six_plot routines #257

Merged 4 commits on Aug 30, 2023

Conversation

yantosca
Contributor

This PR fixes an issue that seems to have been introduced with recent versions of xarray. The following updates were made:

(1) The following code in routine get_extent_for_colors (in gcpy/plot.py):

          return ds_new.where(\
              ds_new[lon_var] >= minlon, drop=True).\
              where(ds_new[lon_var] <= maxlon, drop=True).\
              where(ds_new[lat_var]>= minlat, drop=True).\
              where(ds_new[lat_var] <= maxlat, drop=True)

needed to be changed to

          # Add .compute() to force evaluation of ds_new[lon_var]
          # See https://github.com/geoschem/gcpy/issues/254
          # Also note: This may return as a dask.array.Array object
          return ds_new.where(\
              ds_new[lon_var].compute() >= minlon, drop=True).\
              where(ds_new[lon_var].compute() <= maxlon, drop=True).\
              where(ds_new[lat_var].compute() >= minlat, drop=True).\
              where(ds_new[lat_var].compute() <= maxlat, drop=True)

as calling where with drop=True on an xarray object no longer seems to evaluate lazily loaded data before comparing it. Adding .compute() forces xarray to do the actual computation. This behavior seems to have changed in recent xarray versions (2023.8.0 at the time of writing). For a similar issue, see: hainegroup/oceanspy#332. The object returned also seems to be of type dask.array.Array instead of xarray.DataArray or numpy.ndarray.
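The pattern can be sketched on a small synthetic dataset. This is an illustration under stated assumptions, not the code in gcpy/plot.py: the helper name subset_to_extent, the variable temp, and the grid are made up, and the four conditions are combined into one mask instead of chaining four where calls. The data here are in-memory NumPy arrays, for which .compute() simply returns the loaded values; with dask-backed data it would trigger the evaluation.

```python
import numpy as np
import xarray as xr


def subset_to_extent(dset, lon_var, lat_var, minlon, maxlon, minlat, maxlat):
    """Hypothetical helper: keep only the points inside a lon/lat box."""
    # Load the coordinate values up front so the comparisons below
    # operate on in-memory data rather than lazy arrays.
    lon = dset[lon_var].compute()
    lat = dset[lat_var].compute()
    inside = (
        (lon >= minlon) & (lon <= maxlon) & (lat >= minlat) & (lat <= maxlat)
    )
    # drop=True trims coordinate values that fall entirely outside the box.
    return dset.where(inside, drop=True)


# Synthetic 10-degree grid (an assumption made for this example only).
lon = np.arange(-180.0, 180.0, 10.0)
lat = np.arange(-90.0, 91.0, 10.0)
dset = xr.Dataset(
    {"temp": (("lat", "lon"), np.ones((lat.size, lon.size)))},
    coords={"lat": lat, "lon": lon},
)
subset = subset_to_extent(dset, "lon", "lat", -20.0, 20.0, 0.0, 30.0)
print(dict(subset.sizes))
```

Building a single mask means the coordinates are computed once per call rather than once per where call, but the chained form used in the PR should select the same box.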

(2) We now must add this import statement:

from dask.array import Array as DaskArray

so that DaskArray can be passed in calls to verify_variable_type.

(3) We must now also add DaskArray to the calls to verify_variable_type in six_plot and single_panel in plot.py:

    verify_variable_type(plot_val, (np.ndarray, xr.DataArray, DaskArray))
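To illustrate why the tuple matters, here is a minimal sketch of a type check in the spirit of verify_variable_type. It is not GCPy's implementation: the name check_variable_type and its error message are hypothetical, and built-in types stand in for np.ndarray, xr.DataArray, and DaskArray so the example runs without extra packages. Like isinstance, it accepts either a single type or a tuple of types.

```python
def check_variable_type(var, var_type):
    """Hypothetical stand-in: raise TypeError unless var matches var_type."""
    if isinstance(var, var_type):
        return
    raise TypeError(f"{var!r} is not of type: {var_type}")


# list and tuple play the role of the originally accepted array types;
# range plays the role of the newly accepted dask.array.Array.
old_types = (list, tuple)
new_types = (list, tuple, range)

check_variable_type([1.0, 2.0], old_types)  # accepted before and after
check_variable_type(range(3), new_types)    # accepted once the tuple is extended

try:
    check_variable_type(range(3), old_types)  # rejected by the old tuple
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```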

(4) Update Pydoc headers accordingly:

        """
        ... etc ...

        plot_val: xarray.DataArray, numpy.ndarray, or dask.array.Array
            Single data variable GEOS-Chem output to plot

        ... etc ...
        """

(5) Because these fixes allow benchmark plots to proceed, we can remove the pegged xarray from environment.yml

    #
    # NOTE: The most recent xarray (2023.8.0) seems to break backwards
    # compatibility with the benchmark plotting code.  Peg to 2023.2.0
    # until we can update GCPy for the most recent xarray.
    #  -- Bob Yantosca (29 Aug 2023)
    #
    - xarray==2023.2.0                # Read data from netCDF etc files

and replace it with

    - xarray                          # Read data from netCDF etc files

gcpy/plot.py
- Import the dask.array.Array type definition (as DaskArray)
- Update the calls to verify_variable_type in routines "six_plot"
  and "single_panel" so that allowable input arguments may be of type
  xarray.DataArray, numpy.ndarray, or dask.array.Array.
- In internal routine "get_extent_for_colors" (located within the
  "compare_single_level" routine), we must now use the expression
  ds_new[lon_var].compute() so that Xarray will evaluate the
  "ds_new[lon_var]" expression.  This may return as a dask.array.Array.
- Updated Pydoc comments

We have confirmed that this update works with xarray==2023.8.0.

Signed-off-by: Bob Yantosca <[email protected]>
docs/environment_files/environment.yml
- In the prior commit we have added updates to plot.py that render
  the pegging of xarray to version 2023.2.0 unnecessary.  Restore
  the original code in the GCPy environment.yml file.

Signed-off-by: Bob Yantosca <[email protected]>
plot.py
- Updated Pydoc in "six_plot" to state that plot_val can be
  of type xarray.DataArray, numpy.ndarray, dask.array.Array

CHANGELOG.md
- Updated accordingly

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca added the labels "topic: Benchmark Plots and Tables" and "category: Bug Fix" on Aug 29, 2023
@yantosca added this to the 1.4.0 milestone on Aug 29, 2023
@yantosca requested a review from msulprizio on Aug 29, 2023 at 21:44
@yantosca self-assigned this on Aug 29, 2023
@msulprizio
Contributor

This fix resolves the error reported in #254.

gcpy/plot.py
- "plot_vals" should be "plot_val" in the Pydoc header, as this
  is the name of the argument.

Signed-off-by: Bob Yantosca <[email protected]>
Labels
category: Bug Fix Fixes a bug that was previously reported topic: Benchmark Plots and Tables Issues pertaining to generating plots/tables from benchmark output
Development

Successfully merging this pull request may close these issues.

[BUG/ISSUE] Index error when creating 1-year benchmark plots for GCClassic vs GCHP
2 participants