invalid value encountered in cast #101

Closed
veenstrajelmer opened this issue Jun 19, 2023 · 2 comments

Comments

@veenstrajelmer
Collaborator

veenstrajelmer commented Jun 19, 2023

When opening a dataset with triangles and squares, there are _FillValue values in the face_node_connectivity. This is expected. However, it seems that a warning was introduced: "RuntimeWarning: invalid value encountered in cast". This also occurs when testing with older xugrid versions, so I probably only see it now because of a change in xarray or a related package.

Figures all seem fine, so this issue is just to check whether the warning should be suppressed or removed. I noticed that the face_node_connectivity (the int32 variable mesh2d_face_nodes in this example) is converted to floats by xarray, with the fill values replaced by NaN. The function _prepare_connectivity in xugrid\ugrid\ugridbase.py then tries to cast these values back to int again.
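
The cast itself can be reproduced without xugrid or a netCDF file. The snippet below is a minimal sketch that assumes only NumPy; the array is a hand-written stand-in for one row of mesh2d_face_nodes after xarray has decoded it. As far as I know, NumPy 1.24 and later report a NaN-to-integer cast with this warning, while older versions stay silent, so the list of messages may be empty depending on the installed version.

```python
import warnings

import numpy as np

# Stand-in for one connectivity row after decoding: the fill value is now NaN.
data = np.array([1.0, 5.0, 7.0, np.nan])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Casting NaN to an integer dtype is what can trigger the warning.
    cast = data.astype(np.int32)

messages = [str(w.message) for w in caught]
print(messages)  # may be empty on NumPy versions that do not warn here
print(cast[:3].tolist())  # [1, 5, 7]
```

The integer produced for the NaN entry is not something to rely on, which is why only the first three entries are printed.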

The function searches for the _FillValue attribute in da.attrs, but with xarray this attribute is located in da.encoding instead. is_fill still results in a correct boolean array, because the np.isnan fallback is used. However, the fill values are not replaced by fill_value before the array is converted back to int32. I would suggest adding data[is_fill] = fill_value right before cast = data.astype(dtype, copy=True):

def _prepare_connectivity(
    da: xr.DataArray, fill_value: Union[float, int], dtype: type
) -> xr.DataArray:
    start_index = da.attrs.get("start_index", 0)
    if start_index not in (0, 1):
        raise ValueError(f"start_index should be 0 or 1, received: {start_index}")
    data = da.values
    if "_FillValue" in da.attrs:
        is_fill = data == da.attrs["_FillValue"]
    else:
        is_fill = np.isnan(data)
    cast = data.astype(dtype, copy=True)
    if start_index:
        cast -= start_index
    cast[is_fill] = fill_value
    if (cast[~is_fill] < 0).any():
        raise ValueError("connectivity contains negative values")
    return da.copy(data=cast)

(and probably removing cast[is_fill] = fill_value)
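
A rough sketch of the suggested reordering, written against plain NumPy arrays rather than the actual xugrid function. The name cast_connectivity and its signature are hypothetical. It uses np.where instead of assigning into data in place, so the caller's array is left untouched, and it subtracts start_index only from the non-fill entries. The second point matters if the later cast[is_fill] = fill_value assignment is dropped: with start_index 1, shifting the whole array would turn a fill value of -1 into -2.

```python
import warnings

import numpy as np


def cast_connectivity(data, fill_value=-1, dtype=np.int32, start_index=0):
    """Hypothetical helper: cast float connectivity with NaN fills to integers."""
    if start_index not in (0, 1):
        raise ValueError(f"start_index should be 0 or 1, received: {start_index}")
    is_fill = np.isnan(data)
    # Replace NaN before casting, so the cast never sees an invalid value.
    # np.where returns a new array; the input is not modified.
    filled = np.where(is_fill, fill_value, data)
    cast = filled.astype(dtype)
    if start_index:
        # Shift only the real node indices, not the fill entries.
        cast[~is_fill] -= start_index
    if (cast[~is_fill] < 0).any():
        raise ValueError("connectivity contains negative values")
    return cast


example = np.array([[1.0, 5.0, 7.0, np.nan]])
with warnings.catch_warnings():
    warnings.simplefilter("error")  # any RuntimeWarning would raise here
    result = cast_connectivity(example, start_index=1)
print(result.tolist())  # [[0, 4, 6, -1]]
```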

Either way, the resulting face_node_connectivity seems to be correct (0-based and -1 as fillvalue)

A MWE below:

import warnings
import xugrid as xu
import xarray as xr
from netCDF4 import Dataset

warnings.filterwarnings("error") #set warnings as errors
file_nc = r'c:\DATA\dfm_tools_testdata\DFM_grevelingen_3D\Grevelingen-FM_0003_map.nc'

#original data is int32
print('netcdf4 Dataset')
data_nc = Dataset(file_nc)
print(data_nc.variables['mesh2d_face_nodes'].dtype)
#int32
print(data_nc.variables['mesh2d_face_nodes']._FillValue)
#-999
print(data_nc.variables['mesh2d_face_nodes'][0,:].data)
#[   1    5    7 -999]

#xarray data is converted to floats since it contains nan-values
print('xarray Dataset')
ds = xr.open_dataset(file_nc)
print(ds['mesh2d_face_nodes'].dtype)
#float64
print(ds['mesh2d_face_nodes'].encoding['_FillValue'])
#-999
print(ds['mesh2d_face_nodes'].isel(nmesh2d_face=0).to_numpy())
#[ 1.  5.  7. nan]

#when opening with xugrid, this results in a warning (can be raised as error to get the traceback), but the face_node_connectivity seems to be ok.
print('xugrid Dataset')
uds = xu.open_dataset(file_nc)
#RuntimeWarning: invalid value encountered in cast
print(uds.grid.face_node_connectivity[0,:])
#[ 0  4  6 -1]
@Huite
Collaborator

Huite commented Jun 20, 2023

Great, this is very helpful!

As far as I can tell, the _FillValue should be present in the attributes according to the UGRID conventions: https://ugrid-conventions.github.io/ugrid-conventions/

But I should take another look at the .encoding property.

Moving the replacement up is indeed the better choice and will silence the warning (at the minor cost of creating another copy of the data in memory).

@veenstrajelmer
Collaborator Author

You are right, but xarray moves some of these attributes to the encoding when it uses them for decoding. _FillValue is one of these moved attributes (used to replace fill values with NaN in the arrays), and the same goes for the 'units' attribute of the time variable (used to decode the time variable into datetimes).
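
The move from attrs to encoding can be illustrated without a file by decoding a small in-memory dataset with xr.decode_cf. This is only a sketch: the dataset is made up, the dimension name max_nmesh2d_face_nodes is an assumption, and the last line shows one possible way to look up the fill value in both places, not what xugrid does.

```python
import numpy as np
import xarray as xr

# Made-up, undecoded dataset that mimics how the variable is stored on disk.
raw = xr.Dataset(
    {
        "mesh2d_face_nodes": (
            ("nmesh2d_face", "max_nmesh2d_face_nodes"),  # second name is assumed
            np.array([[1, 5, 7, -999]], dtype=np.int32),
            {"_FillValue": np.int32(-999), "start_index": 1},
        )
    }
)

# CF decoding masks the fill values and moves _FillValue to .encoding.
decoded = xr.decode_cf(raw)
da = decoded["mesh2d_face_nodes"]

print("_FillValue" in da.attrs)      # expected False after decoding
print(da.encoding.get("_FillValue"))
print(da.attrs.get("start_index"))   # attributes not used for decoding stay put

# Possible lookup that works for both decoded and undecoded data.
fill = da.attrs.get("_FillValue", da.encoding.get("_FillValue"))
```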
