Ignore missing variables when concatenating datasets? #508
Comments
Closing as stale; please reopen if still relevant.
I just ran into this issue. While the previous fix seems to handle one case, it doesn't handle all the cases. Before I clean this up and open a new PR, does this look like it's on the right track? (It worked for my issue, where I was concatenating multiple datasets that always had the same dims and coordinates but were sometimes missing variables.) The change starts at line 353 of concat.py.
Thanks for tackling this very important issue, @scottcha!
Instead of creating a DataArray, we only need to create a filled copy of an existing variable. I would instead try:
    variables = []
    # Build a NaN-filled template from the first dataset that has variable k.
    for ds in datasets:
        if k in ds.variables:
            filled = full_like(ds.variables[k], fill_value=np.nan)
            break
    # Use the template wherever k is missing, and the real variable otherwise.
    for ds in datasets:
        if k not in ds.variables:
            variables.append(filled)
        else:
            variables.append(ds.variables[k])
    vars = ensure_common_dims(variables)
Please send in a PR with any progress you make. We are happy to help out. We have some documentation on contributing and testing here: https://xarray.pydata.org/en/stable/contributing.html
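For readers without the surrounding concat.py context, the two-pass idea in the snippet above can be sketched in plain Python. This is only an illustration, not xarray's implementation: dicts of lists stand in for datasets, `fill_missing` is a hypothetical helper name, and the template assumes every dataset shares the same shape (as in the case described above, where dims and coordinates always match).

```python
import math


def fill_missing(datasets, k):
    """Collect variable `k` from every dataset, using NaN where it is absent.

    Hypothetical helper: `datasets` is a list of dicts mapping variable
    names to lists of floats, standing in for xarray Datasets.
    """
    # First pass: build a NaN template shaped like the first occurrence of k.
    filled = None
    for ds in datasets:
        if k in ds:
            filled = [math.nan] * len(ds[k])
            break
    if filled is None:
        raise KeyError(f"variable {k!r} is missing from every dataset")

    # Second pass: keep real values where present, the template elsewhere.
    return [list(ds[k]) if k in ds else list(filled) for ds in datasets]


datasets = [
    {"a": [1.0, 2.0], "b": [3.0, 4.0]},
    {"a": [5.0, 6.0]},  # "b" is missing here
]
pieces = fill_missing(datasets, "b")
```

Here `pieces` holds one entry per dataset, so the result can be concatenated along the shared dimension without dropping `b`.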
OK, got it. I'll take a look and spin up a PR.
Yes, that is correct.
Any plans to support this?
There is another attempt to get this resolved in #7400. Any input is appreciated over there.
Several users (@raj-kesavan, @richardotis, now myself) have wondered about how to concatenate xray Datasets with different variables.
With the current `xray.concat`, you need to awkwardly create dummy variables filled with `NaN` in datasets that don't have them (or drop mismatched variables entirely). Neither of these is a great option: `concat` should have an option (the default?) to take care of this for the user.
This would also be more consistent with `pd.concat`, which takes a more relaxed approach to matching dataframes with different variables (it does an outer join).
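As a small illustration of the `pd.concat` behavior referred to here, the sketch below concatenates two frames whose columns differ. The frame names and values are made up for the example; it assumes pandas is installed and relies on the default `join="outer"`.

```python
import pandas as pd

# Two frames with mismatched columns; the data is purely illustrative.
df1 = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
df2 = pd.DataFrame({"a": [5.0, 6.0]})  # no "b" column

# The default join="outer" keeps the union of columns and fills gaps with NaN.
combined = pd.concat([df1, df2], ignore_index=True)
print(combined)
```

The rows that came from `df2` have `NaN` in column `b`, which is the behavior the issue asks `concat` to offer for Datasets.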