ENH: use dask.array.apply_gufunc in xr.apply_ufunc #4060
…utputs when `dask='parallelized'`, add/fix tests
This would need some docstring changes too. But first I want to check whether I've missed anything vital in the implementation. |
This is ready for review from my side. |
|
Thanks @mathause for your comments and raising those questions. JFTR, I was taking the road from #1815, so my explicit use-case was the multiple (dask) outputs.
I'll try to add some tests for the multiple output using dask.
AFAIK,
That's a good question. If you want me to go the long way, please be aware that I'm a novice in xarray as well as in dask. A complete refactor of |
Ah yes I see (#1815 (comment)).
Indeed - I think it could simplify |
Exactly. It would be nice to remove the use of |
I've given this a try, but this will need some design decisions.
This will have a big impact on the code and tests. I'll come up with an updated pull request shortly, but any thoughts or remarks are very much appreciated. |
Thanks! That would be quite cool!
|
Hello @kmuehlbauer! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-08-19 05:41:15 UTC |
First serve of trying to use Most problematic issue now: |
From looking at the tests, it seems that setting |
Should you keep the |
@mathause Great! Seems like this works better, thanks! Will update the PR after some more tests etc. |
@mathause All tests green, good starting point for review. Please notify other people who should have a look at this. There are still things to discuss:
|
The original motivation for requiring Maybe this is too defensive/surprising, and could be relaxed. We don't really have any guard-rails like this elsewhere in xarray. |
You would remove the For the functions that don't handle dask arrays gracefully, Very cool - good progress.
I'll only be able to look at it properly next week. |
This is probably another good motivation: defaulting to The problem is that we don't have any way to detect ahead of time whether the applied function already supports dask arrays (e.g., if it is built-up out of functions from dask.array). If it does, we don't want to set |
I think we still need all the current options for the
I don't think we should deprecate meta. Not all user functions can deal with zero shaped inputs, so automatically inferring meta need not always work. We've had to add a similar feature for map_blocks (#3575) so I think
Shall we add a new |
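The point about zero-shaped inputs can be illustrated with plain NumPy. The sketch below is a rough stand-in, not dask's actual inference code: `try_infer_meta`, `peak` and `total` are hypothetical names, and the helper simply probes a function with an empty array and reports whether that worked.

```python
import numpy as np


def peak(x):
    # A reduction without an identity: NumPy raises ValueError on empty input.
    return np.max(x, axis=-1)


def total(x):
    # A reduction with an identity (0), so empty input is fine.
    return np.sum(x, axis=-1)


def try_infer_meta(func, dtype=float):
    """Hypothetical helper: probe ``func`` with a zero-length array."""
    try:
        return np.asarray(func(np.empty((0,), dtype=dtype)))
    except ValueError:
        return None  # inference failed; the caller would have to pass meta


print(try_infer_meta(total))  # a 0-d float array
print(try_infer_meta(peak))   # None
```

A function like `peak` is the kind of case where an explicit `meta` remains useful.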
I don't see I only realised the exact distinction between |
good point @mathause. Looks like |
@mathause I'll leave the PR unchanged and catch up with you next week. @shoyer @dcherian Thanks for your comments. Please let me know which tests should be added to check for any possible surprises with this change to
(Note: I'm not a native English speaker, so bear with me if something sounds clunky or doesn't quite match.) For the keywords I think @dcherian 's proposal of something like |
I think all of these can be done in a new PR, we just have to make sure to include them in the next release (which might need to be soon so we regain compatibility with the most recent |
Great, then it looks like it's finally done. 😃 |
While doing a last review I found another small glitch. I'll come back in the next few days to see whether anything needs to be done from the reviewers' side. |
@mathause I've merged the latest master into this PR to hopefully get all tests green. The previous run had problems with a conda error in the MinimumVersions job. Please let me know if there is anything I need to do to get this merged. |
Thanks for your patience here with the slow reviews. Looking this over, I have a suggestion for how to improve the warnings, but otherwise this looks good!
warnings.warn(
    "``meta`` should be given in the ``dask_gufunc_kwargs`` parameter."
    " It will be removed as direct parameter in a future version."
)
Could you please set a class (DeprecationWarning) and stacklevel=2 on these warnings? That results in better messages for users.
Sorry to nitpick - shouldn't that be a FutureWarning so that users actually get to see it?
@mathause At least in the tests the warnings are issued.
What's the actual difference between DeprecationWarning and FutureWarning (update: just found PendingDeprecationWarning)? And when should they be used? Just to know for future contributions.
FutureWarning would be fine, too. We should probably try to come to consensus on a general policy for xarray.
The Python docs have some guidance but the overall recommendation is not really clear to me: https://docs.python.org/3/library/warnings.html#warning-categories
FutureWarning is for users and DeprecationWarning for library authors (https://docs.python.org/3/library/warnings.html#warning-categories). Which is why you see DeprecationWarning in the test but won't when you execute the code. Took me a while to figure this out when I wanted to deprecate some stuff in my package.
import warnings

def test():
    warnings.warn("DeprecationWarning", DeprecationWarning)
    warnings.warn("FutureWarning", FutureWarning)
If you try this in ipython, test() will show both warnings. But if you save it to a file and try
    from test_warnings import test
    test()
only FutureWarning will appear (I did not know this detail either https://www.python.org/dev/peps/pep-0565/).
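The difference in visibility can also be reproduced in one script, without a separate module. This is only a sketch: it installs two filters by hand that approximate how Python treats warnings raised from library code by default, rather than relying on the interpreter's real defaults. `emit_both` and `visible_categories` are illustrative names.

```python
import warnings


def emit_both():
    warnings.warn("deprecated for developers", DeprecationWarning)
    warnings.warn("deprecated for end users", FutureWarning)


def visible_categories():
    """Return the names of the warning categories that get through."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.resetwarnings()
        # Show every warning ...
        warnings.simplefilter("default")
        # ... except DeprecationWarning, roughly what Python does by
        # default for code that is not running in __main__.
        warnings.simplefilter("ignore", DeprecationWarning)
        emit_both()
    return [w.category.__name__ for w in caught]


print(visible_categories())  # ['FutureWarning']
```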
@mathause @shoyer I'll switch to FutureWarning since this seems to be the only user-visible warning. See https://www.python.org/dev/peps/pep-0565/#additional-use-case-for-futurewarning
And, thanks for the pointers and explanations.
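For reference, the outcome of this thread (a FutureWarning with stacklevel=2) might look roughly like the sketch below. The `apply` wrapper is hypothetical and is not xarray's code; only the warning message is taken from the diff under review.

```python
import warnings


def apply(func, *args, meta=None, dask_gufunc_kwargs=None):
    """Hypothetical wrapper, not xarray's code: still accepts the old ``meta``."""
    kwargs = dict(dask_gufunc_kwargs or {})
    if meta is not None:
        warnings.warn(
            "``meta`` should be given in the ``dask_gufunc_kwargs`` parameter."
            " It will be removed as direct parameter in a future version.",
            FutureWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        kwargs.setdefault("meta", meta)
    return func(*args), kwargs


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result, forwarded = apply(abs, -3, meta="placeholder")

print(result, forwarded, caught[0].category.__name__)
```

With stacklevel=2 the reported file and line are those of the code calling `apply`, which is usually what the user needs to find and change.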
Nice! Unless @dcherian has any additional comments I'll merge in a few days |
Looks great! Thanks @kmuehlbauer this is a great improvement!
Co-authored-by: Deepak Cherian <[email protected]>
ok then - let's do this. Thanks a lot @kmuehlbauer |
Thanks to all reviewers! Great job! |
…ternal use of `apply_ufunc` (follow-up to pydata#4060, fixes pydata#4385)
use dask.array.apply_gufunc in xr.apply_ufunc for multiple outputs when dask='parallelized', add/fix tests
isort -rc . && black . && mypy . && flake8
whats-new.rst for all changes and api.rst for new API
Remaining Issues:
dask_gufunc_kwargs
output_core_dims and output_sizes, e.g. xr.apply_ufunc(..., output_core_dims=[{"abc": 2}])
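A minimal sketch of what the change enables, assuming an xarray version that includes this PR and an installed dask. The data and the `min_max` and `min_max_stacked` helpers are made up for illustration. The second call passes `output_sizes` through `dask_gufunc_kwargs`, not through the `output_core_dims` form floated under "Remaining Issues".

```python
import numpy as np
import xarray as xr


def min_max(x):
    # apply_ufunc moves core dimensions to the end, so reduce over axis -1.
    return x.min(axis=-1), x.max(axis=-1)


def min_max_stacked(x):
    # Same values, but as one output with a new trailing dimension of size 2.
    return np.stack([x.min(axis=-1), x.max(axis=-1)], axis=-1)


arr = xr.DataArray(
    np.arange(12.0).reshape(3, 4), dims=("x", "time")
).chunk({"x": 1})  # dask-backed; "time" stays in a single chunk

# Two outputs from one dask-parallelized call.
lo, hi = xr.apply_ufunc(
    min_max,
    arr,
    input_core_dims=[["time"]],
    output_core_dims=[[], []],
    dask="parallelized",
    output_dtypes=[arr.dtype, arr.dtype],
)

# One output with a new core dimension, whose size dask must be told.
stacked = xr.apply_ufunc(
    min_max_stacked,
    arr,
    input_core_dims=[["time"]],
    output_core_dims=[["stat"]],
    dask="parallelized",
    output_dtypes=[arr.dtype],
    dask_gufunc_kwargs={"output_sizes": {"stat": 2}},
)

print(lo.values, hi.values, stacked.shape)
```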