keep_attrs for Dataset.resample and DataArray.resample #825
Comments
@mcgibbon, I would agree that in general attributes should be preserved to maintain provenance of DataArrays or Datasets unless there is a really good reason to drop them.
@mcgibbon - yes, we can add a @pwolfram - we had a lot of discussion early on about what to do with attributes after an object had been manipulated. The consensus was to force the user to maintain the attributes to the extent he/she desired.
Thanks @jhamman. You are correct that this could get challenging without proper notions of units. Do we have a utility to transfer attributes from one Dataset to another? If not, perhaps that is the simplest, short term resolution to this issue that is even more general than addition of a
The
da_resampled = da.resample(...)
da_resampled.attrs = da.attrs
Or you could just copy them over one by one. Either way, I don't think we need much more of a utility than that.
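For anyone wanting to try the workaround above, here is a minimal runnable sketch. It assumes a recent xarray release, where resampling is written as `da.resample(time=...).mean()` rather than the older `how=` form discussed in this thread. The array contents and the `units` attribute are made up for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Illustrative data: four daily values carrying a made-up "units" attribute.
time = pd.date_range("2000-01-01", periods=4, freq="D")
da = xr.DataArray(
    np.arange(4.0),
    dims=["time"],
    coords={"time": time},
    attrs={"units": "K"},
)

# Resample to two-day means, then copy the attributes over by hand.
da_resampled = da.resample(time="2D").mean()
da_resampled.attrs = dict(da.attrs)  # copy, so later edits don't touch da.attrs
```

Copying with `dict(...)` rather than assigning `da.attrs` directly is a small precaution so that the two objects do not end up sharing one attribute dictionary.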
This keeps coming up, but I don't know what the obvious solution is. We certainly could add an option that would change the default for keep_attrs. When merging datasets,
@shoyer the default keep_attrs isn't the problem here, the issue is that there is currently no keep_attrs option at all for resampling. I've implemented a solution, but now test TestDataset.test_resample_and_first is failing. This is because for how="first" and how="last", attributes are currently kept (keep_attrs=True). This may break some code if resample is given a default of keep_attrs=False. Using a default of keep_attrs=True for how in ('first', 'last') results in the test passing. Alternatively I could make it so the default behavior is to not pass any keep_attrs value on to the grouper function, which would keep the current defaults of those groupers. The code would be a bit uglier but it's not hard, and it would prevent breaking scripts. What do we want for the default behavior?
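The "don't pass keep_attrs on unless the user asked" alternative described above could look roughly like the sketch below. This is a toy in plain Python, not xarray's code: `reduce_mean`, `reduce_first`, `REDUCERS` and `resample_sketch` are hypothetical names, and the reducers are stand-ins that each carry their own `keep_attrs` default.

```python
# Toy stand-ins, NOT xarray's real reducers: each has its own keep_attrs default.
def reduce_mean(values, attrs, keep_attrs=False):
    result = sum(values) / len(values)
    return result, (dict(attrs) if keep_attrs else {})


def reduce_first(values, attrs, keep_attrs=True):
    return values[0], (dict(attrs) if keep_attrs else {})


REDUCERS = {"mean": reduce_mean, "first": reduce_first}


def resample_sketch(values, attrs, how, keep_attrs=None):
    """Hypothetical wrapper: keep_attrs=None means 'use the reducer's default'."""
    # Forward keep_attrs only when the caller set it explicitly, so each
    # reducer keeps whatever default it already had.
    kwargs = {} if keep_attrs is None else {"keep_attrs": keep_attrs}
    return REDUCERS[how](values, attrs, **kwargs)


attrs = {"units": "K"}
_, mean_default = resample_sketch([1.0, 3.0], attrs, how="mean")
_, first_default = resample_sketch([1.0, 3.0], attrs, how="first")
_, mean_kept = resample_sketch([1.0, 3.0], attrs, how="mean", keep_attrs=True)
```

With a `None` sentinel, existing scripts that rely on `first` keeping attributes are unaffected, while an explicit `keep_attrs=True` or `keep_attrs=False` still overrides the per-reducer default.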
It turns out that in addition, first and last in ops don't accept keep_attrs as a keyword argument, so right now they always preserve attributes. A side effect of this is that the keep_attrs arguments passed around by _first_and_last and whatnot in groupby actually don't do anything (though their default value, True, reflects what happens).
It turns out the bug was on line 323 of groupby.py: _concat_shortcut silently copies the metadata of the array doing the concatenation to the result. I've removed that line and now the tests are passing.
I think it's best to make
Currently there is no option for preserving attributes when resampling a Dataset or DataArray. Could there be a keep_attrs keyword argument for these methods?