Significantly faster cohorts detection. #272

Merged: 10 commits merged into main from faster-cohorts on Oct 11, 2023
Conversation

@dcherian (Collaborator) commented on Oct 10, 2023

Closes #271

I was iterating over array.blocks to figure out the shape of each chunk. Indexing that object creates a dask array per chunk, which is slow for two reasons:

  • dask array construction, which is useless here
  • repeatedly calling np.array() on a chunks tuple in a loop (surprisingly slow!)

I replaced this with a function that calculates each chunk's shape directly. On the ARCO ERA5 dataset with 92044 time chunks, this is a speedup from effectively infinite time to 840 ms.

import dask.array
import pandas as pd
import xarray as xr
from flox.xarray import xarray_reduce

TIME = 92044
da = xr.DataArray(
    dask.array.ones((TIME, 721, 1440), chunks=(1, -1, -1)),
    dims=("time", "lat", "lon"),
    coords=dict(time=pd.date_range("1959-01-01", freq="6H", periods=TIME)),
)
%time xarray_reduce(da, da.time.dt.day, method="cohorts", func="any")
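
For reference, here is a minimal sketch of what a chunk-shape helper like the get_chunk_shape seen in the profile below can look like; this illustrates the idea and is not necessarily the PR's exact code:

def get_chunk_shape(array_chunks, block_index):
    # array_chunks is dask's array.chunks, a per-axis tuple of chunk sizes,
    # e.g. ((1, 1, 1, 1), (721,), (1440,)) for the example above with TIME=4.
    # The shape of the block at block_index is just the matching entry along
    # each axis -- no dask array is ever constructed.
    return tuple(sizes[i] for sizes, i in zip(array_chunks, block_index))

get_chunk_shape(((1, 1, 1, 1), (721,), (1440,)), (2, 0, 0))  # -> (1, 721, 1440)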

@dcherian (Collaborator, Author) commented on Oct 10, 2023

The profile (line_profiler output) looks like:

Line #      Hits         Time  Per Hit   % Time  Line Contents
   225     92045   36723000.0    399.0      5.7      for idx, blockindex in enumerate(np.ndindex(array.numblocks)):
   226     92044  169811000.0   1844.9     26.5          chunkshape = get_chunk_shape(array_chunks, blockindex)
   227     92044  142355000.0   1546.6     22.2          blocks[idx] = np.full(chunkshape, idx)
   228         1  189328000.0    2e+08     29.5      which_chunk = np.block(blocks.reshape(shape).tolist()).reshape(-1)

I strongly suspect we can do better. The tolist call copies blocks, which should be unnecessary.
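
One copy-free alternative, at least for the common case where only a single axis is chunked: each element's block index along that axis can be produced directly with one np.repeat, skipping both the per-block np.full arrays and the np.block/tolist step. A hypothetical sketch, not what this PR ships:

import numpy as np

def which_chunk_1d(chunks):
    # Map each element along a chunked axis to the index of the block
    # holding it: chunk sizes (2, 3, 1) -> array([0, 0, 1, 1, 1, 2]).
    return np.repeat(np.arange(len(chunks)), chunks)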

@dcherian (Collaborator, Author) commented

Down to

546 ms ± 3.68 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

@dcherian (Collaborator, Author) commented

> I strongly suspect we can do better.

Need a way to assign to a nested list. This is why I chose the numpy object array route in the first place :)

Punting to later, since this is already a massive improvement.
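
For the record, assigning into a nested list by a multi-index is possible with a small helper that walks the outer levels via repeated indexing; a hypothetical sketch, not code from this PR:

from functools import reduce
import operator

def assign_nested(nested, index, value):
    # Descend nested[i0][i1]...[i_{k-1}] with repeated __getitem__,
    # then assign at the final position.
    reduce(operator.getitem, index[:-1], nested)[index[-1]] = value

lst = [[0, 0], [0, 0]]
assign_nested(lst, (1, 0), 7)  # lst is now [[0, 0], [7, 0]]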

@dcherian (Collaborator, Author) commented

Benchmark results (a leading "-" marks an improvement):

Change   Before [9f82e19]   After [fa93406]   Ratio   Benchmark (Parameter)
-        192±1ms            172±4ms           0.9     cohorts.ERA5DayOfYear.time_graph_construct
-        186±3ms            158±2ms           0.85    cohorts.ERA5DayOfYearRechunked.time_graph_construct
-        95.5±0.9ms         68.6±0.9ms        0.72    cohorts.ERA5DayOfYearRechunked.time_find_group_cohorts
-        48.0±0.1ms         25.0±0.1ms        0.52    cohorts.ERA5DayOfYear.time_find_group_cohorts
-        44.5±2ms           18.6±0.6ms        0.42    cohorts.ERA5MonthHourRechunked.time_graph_construct
-        42.3±2ms           17.1±0.2ms        0.4     cohorts.ERA5MonthHour.time_graph_construct
-        9.52±0.04ms        3.72±0.01ms       0.39    cohorts.PerfectMonthly.time_graph_construct
-        9.60±0.2ms         3.72±0.02ms       0.39    cohorts.PerfectMonthlyRechunked.time_graph_construct
-        73.5±0.9ms         25.4±0.5ms        0.35    cohorts.time_cohorts_era5_single
-        31.2±0.2ms         7.58±0.1ms        0.24    cohorts.ERA5MonthHour.time_find_group_cohorts
-        34.1±0.1ms         7.80±0.2ms        0.23    cohorts.ERA5MonthHourRechunked.time_find_group_cohorts
-        6.87±0.03ms        1.02±0.04ms       0.15    cohorts.PerfectMonthlyRechunked.time_find_group_cohorts
-        6.95±0.1ms         1.00±0.01ms       0.14    cohorts.PerfectMonthly.time_find_group_cohorts

@dcherian (Collaborator, Author) commented

Benchmarks seem to be broken after the numbagg PR. I'll fix in a new branch.

dcherian merged commit a897034 into main on Oct 11, 2023 (15 of 16 checks passed) and deleted the faster-cohorts branch at 00:58.
dcherian added a commit that referenced this pull request on Nov 3, 2023:
* main: (24 commits)
  Add `packaging` as dependency
  use engine flox for ordered groups (#266)
  Update pyproject.toml: py3.12
  Bump numpy to >=1.22 (#278)
  Cleanups (#276)
  benchmarks updates (#273)
  repo-review comments (#270)
  Significantly faster cohorts detection. (#272)
  Add engine="numbagg" (#72)
  Support quantile, median, mode with method="blockwise". (#269)
  Add multidimensional binning demo (#203)
  [pre-commit.ci] pre-commit autoupdate (#268)
  Drop python 3.8, test python 3.11 (#209)
  tests: move xfail out of functions (#265)
  Bump actions/checkout from 3 to 4 (#267)
  convert datetime: micro-optimizations (#261)
  compatibility with `numpy>=2.0` (#257)
  replace the deprecated `provision-with-micromamba` with `setup-micromamba` (#258)
  Fix some typing errors in asv_bench and tests (#253)
  [pre-commit.ci] pre-commit autoupdate (#250)
  ...
Linked issue: more cohorts optimization when chunksize == 1 (#271)