
Optimize ffill, bfill with dask when limit is specified #9771

Merged

Conversation

josephnowak
Contributor

@josephnowak josephnowak commented Nov 12, 2024

This improvement comes from the discussion on PR #9712

  • Closes #xxxx
  • Tests added
  • User visible changes (including notable bug fixes) are documented in whats-new.rst
  • New functions/methods are listed in api.rst

@josephnowak josephnowak changed the title Feature/reduce the number of tasks on the push method when the limit parameter is used Reduce the number of tasks on the push method when the limit parameter is used Nov 12, 2024
@josephnowak
Contributor Author

The tests that are failing look unrelated to the changes in this PR 🤔

@josephnowak
Contributor Author

josephnowak commented Nov 12, 2024

@dcherian do you think it's a good idea to add a keepdims parameter to the first and last functions implemented in the duck_array_ops.py file? If I add that, I could use them directly in the cumreduction with the Blelloch method; if not, I can just create a dummy wrapper like the one I did for the push method, to receive a type parameter that is not used.
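
Something like this rough sketch is what I have in mind (illustrative only; the function name is made up and the actual first/last signatures in duck_array_ops.py may differ):

    import numpy as np

    def _first_keepdims(values, axis, keepdims=False):
        # Hypothetical wrapper: take the first element along `axis`
        # (NaN handling omitted for brevity), optionally keeping the
        # reduced dimension so the result has the shape that the
        # cumreduction (Blelloch) scan expects.
        result = np.take(values, 0, axis=axis)
        if keepdims:
            result = np.expand_dims(result, axis=axis)
        return result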

@dcherian
Contributor

Yes, that sounds good to me.

@dcherian dcherian changed the title Reduce the number of tasks on the push method when the limit parameter is used Optimize ffill, bfill with dask when limit is specified Nov 12, 2024
@dcherian
Contributor

Great! Can you add a note to whats-new please? This is worth advertising.

@dcherian dcherian added the plan to merge Final call for comments label Nov 12, 2024
@josephnowak
Contributor Author

josephnowak commented Nov 13, 2024

Hi @dcherian

I think we could add this special case to try to improve performance when the limit is equal to or smaller than the chunk size along the axis (only one chunk would overlap). Is there any kind of benchmark I can use to see whether there is any real benefit from using map_overlap over cumsum when n is small?

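    # Proposed special case: when the limit fits within the chunk size
    # along the axis, only one neighboring chunk should need to overlap.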
    if n is not None and n <= array.chunksize[axis]:
        return array.map_overlap(_push, depth={axis: (n, 0)}, n=n, axis=axis)

@dcherian
Contributor

dcherian commented Nov 14, 2024

I don't have one at hand. @phofl might have suggestions.

Note that chunks can be non-uniform, so you'll have to check every element of that array.chunks[axis] tuple.
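
For example, something along these lines (a sketch; the helper name is made up):

    def _limit_fits_in_chunks(array, n, axis):
        # Chunks along an axis can be non-uniform, so compare `n` against
        # every chunk size in the tuple, not just the largest one.
        return n is not None and all(n <= c for c in array.chunks[axis])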

@phofl
Contributor

phofl commented Nov 14, 2024

Yes!

The highlevel_api function in https://github.com/coiled/benchmarks/blob/main/tests/geospatial/workloads/climatology.py

@josephnowak
Contributor Author

Thanks, I will try to run those benchmarks before modifying this PR, but from my understanding it should generate fewer tasks using map_overlap with a small N.

@dcherian
Contributor

Yes, I believe that's why @phofl proposed that optimization. We just need to make sure to trigger it in the right circumstances. Please add a test for non-uniform chunk sizes where map_overlap would give you the wrong answer.

@josephnowak
Contributor Author

josephnowak commented Nov 14, 2024

After understanding more of the code that @phofl wrote, I think the implementation works in the general case; the main issue is that it can load the whole array into memory when N is equal to the size of the array (based on my understanding of how map_overlap works). When I saw the PR I thought it was the same implementation as before, but what he did was define the overlap depth based on the limit, while the previous algorithm (the one before the cumreduction implementation) first applied a map_blocks with the push function and later a map_overlap with a single element of overlap instead of N, which caused problems when one or more chunks were completely empty. In summary, I think the best option is to use his algorithm when N is smaller than the chunk size along the axis, so we can avoid loading more than two chunks at the same time.
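
Roughly, the combined approach could look like this (a sketch; `_push` and `_push_via_scan` are placeholders for the actual helpers):

    def push(array, n, axis):
        # If the limit fits inside every chunk, at most one neighboring
        # chunk overlaps, so map_overlap stays memory efficient.
        if n is not None and all(n <= c for c in array.chunks[axis]):
            return array.map_overlap(_push, depth={axis: (n, 0)}, n=n, axis=axis)
        # Otherwise fall back to the associative scan (cumreduction with
        # the Blelloch method), which avoids loading the whole array at once.
        return _push_via_scan(array, n=n, axis=axis)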

@dcherian
Contributor

Doesn't `(n <= array.chunks[axis]).all()` handle that case? It wouldn't trigger if n is greater than the chunksize, so you'd never load the whole thing into memory at once?

In any case, how do you feel about merging this and optimizing later? I believe this is already a massive improvement.

@josephnowak
Contributor Author

Yes, that condition would prevent loading all the chunks; I just wanted to mention that using that algorithm in the general case would be problematic because it would load all the chunks at once, even though it would give the expected result in the end.

I'm okay with merging it without the special-case optimization and adding it in a follow-up PR, but I'm also okay with adding the special case in this PR. I'll leave the final decision to you; I already have it implemented locally.

@dcherian
Contributor

Ah, then why don't you push what you have?

@phofl
Contributor

phofl commented Nov 14, 2024

> Doesn't `(n <= array.chunks[axis]).all()` handle that case? It wouldn't trigger if n is greater than the chunksize, so you'd never load the whole thing into memory at once?

Yes, this is what I wanted to do as well (I got side-tracked over the last few days, unfortunately). That is memory efficient and helps in some cases.

@josephnowak
Contributor Author

I didn't push it because the PR had the "plan to merge" label, so I didn't want to modify it without your approval.

And sorry, @phofl, for not getting the idea behind your original algorithm the first time I saw your PR; the good part is that the algorithm now behaves better in all scenarios.

I'm not sure why the tests are not running. Do the conflicts need to be resolved first for the tests to run?

@dcherian dcherian merged commit 4c8b03b into pydata:main Nov 14, 2024
28 of 29 checks passed
@dcherian
Contributor

Nice work @josephnowak and @phofl

dcherian added a commit to dcherian/xarray that referenced this pull request Nov 16, 2024
* main:
  Add download stats badges (pydata#9786)
  Fix open_mfdataset for list of fsspec files (pydata#9785)
  add 'User-Agent'-header to pooch.retrieve (pydata#9782)
  Optimize `ffill`, `bfill` with dask when `limit` is specified (pydata#9771)
dcherian added a commit that referenced this pull request Nov 19, 2024
* main: (24 commits)
  Bump minimum versions (#9796)
  Namespace-aware `xarray.ufuncs` (#9776)
  Add prettier and pygrep hooks to pre-commit hooks (#9644)
  `rolling.construct`: Add `sliding_window_kwargs` to pipe arguments down to `sliding_window_view` (#9720)
  Bump codecov/codecov-action from 4.6.0 to 5.0.2 in the actions group (#9793)
  Buffer types (#9787)
  Add download stats badges (#9786)
  Fix open_mfdataset for list of fsspec files (#9785)
  add 'User-Agent'-header to pooch.retrieve (#9782)
  Optimize `ffill`, `bfill` with dask when `limit` is specified (#9771)
  fix cf decoding of grid_mapping (#9765)
  Allow wrapping `np.ndarray` subclasses (#9760)
  Optimize polyfit (#9766)
  Use `map_overlap` for rolling reductions with Dask (#9770)
  fix html repr indexes section (#9768)
  Bump pypa/gh-action-pypi-publish from 1.11.0 to 1.12.2 in the actions group (#9763)
  unpin array-api-strict, as issues are resolved upstream (#9762)
  rewrite the `min_deps_check` script (#9754)
  CI runs ruff instead of pep8speaks (#9759)
  Specify copyright holders in main license file (#9756)
  ...
dcherian added a commit to dcherian/xarray that referenced this pull request Nov 19, 2024
* main:
  Bump minimum versions (pydata#9796)
  Namespace-aware `xarray.ufuncs` (pydata#9776)
  Add prettier and pygrep hooks to pre-commit hooks (pydata#9644)
  `rolling.construct`: Add `sliding_window_kwargs` to pipe arguments down to `sliding_window_view` (pydata#9720)
  Bump codecov/codecov-action from 4.6.0 to 5.0.2 in the actions group (pydata#9793)
  Buffer types (pydata#9787)
  Add download stats badges (pydata#9786)
  Fix open_mfdataset for list of fsspec files (pydata#9785)
  add 'User-Agent'-header to pooch.retrieve (pydata#9782)
  Optimize `ffill`, `bfill` with dask when `limit` is specified (pydata#9771)
  fix cf decoding of grid_mapping (pydata#9765)
  Allow wrapping `np.ndarray` subclasses (pydata#9760)
  Optimize polyfit (pydata#9766)
  Use `map_overlap` for rolling reductions with Dask (pydata#9770)
Labels: plan to merge (Final call for comments), topic-dask