Commit

Merge remote-tracking branch 'upstream/main' into numbagg
* upstream/main:
  Support quantile, median, mode with method="blockwise". (#269)
  Add multidimensional binning demo (#203)
  [pre-commit.ci] pre-commit autoupdate (#268)
dcherian committed Oct 5, 2023
2 parents e1eda24 + 68b122e commit 412f31f
Showing 14 changed files with 628 additions and 61 deletions.
10 changes: 5 additions & 5 deletions .pre-commit-config.yaml
@@ -4,7 +4,7 @@ ci:
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
-rev: 'v0.0.276'
+rev: 'v0.0.292'
hooks:
- id: ruff
args: ["--fix"]
@@ -18,12 +18,12 @@ repos:
- id: check-docstring-first

- repo: https://github.com/psf/black
-rev: 23.3.0
+rev: 23.9.1
hooks:
- id: black

- repo: https://github.com/executablebooks/mdformat
-rev: 0.7.16
+rev: 0.7.17
hooks:
- id: mdformat
additional_dependencies:
@@ -44,13 +44,13 @@ repos:
args: [--extra-keys=metadata.kernelspec metadata.language_info.version]

- repo: https://github.com/codespell-project/codespell
-rev: v2.2.5
+rev: v2.2.6
hooks:
- id: codespell
additional_dependencies:
- tomli

- repo: https://github.com/abravalheri/validate-pyproject
-rev: v0.13
+rev: v0.14
hooks:
- id: validate-pyproject
1 change: 1 addition & 0 deletions ci/environment.yml
@@ -22,5 +22,6 @@ dependencies:
- pooch
- toolz
- numba
- scipy
- pip:
- git+https://github.com/numbagg/numbagg
7 changes: 5 additions & 2 deletions docs/source/aggregations.md
@@ -11,8 +11,11 @@ the `func` kwarg:
- `"std"`, `"nanstd"`
- `"argmin"`
- `"argmax"`
-- `"first"`
-- `"last"`
+- `"first"`, `"nanfirst"`
+- `"last"`, `"nanlast"`
+- `"median"`, `"nanmedian"`
+- `"mode"`, `"nanmode"`
+- `"quantile"`, `"nanquantile"`

```{tip}
We would like to add support for `cumsum`, `cumprod` ([issue](https://github.com/xarray-contrib/flox/issues/91)). Contributions are welcome!
2 changes: 1 addition & 1 deletion docs/source/implementation.md
@@ -199,7 +199,7 @@ width: 100%
1. Group labels must be known at graph construction time, so this only works for numpy arrays.
1. This does require more tasks and a more complicated graph, but the communication overhead can be significantly lower.
1. The detection of "cohorts" is currently slow but could be improved.
-1. The extra effort of detecting cohorts and mul;tiple copying of intermediate blocks may be worthwhile only if the chunk sizes are small
+1. The extra effort of detecting cohorts and multiple copying of intermediate blocks may be worthwhile only if the chunk sizes are small
relative to the approximate period of group labels, or small relative to the size of spatially localized groups.

### Example : sensitivity to chunking
1 change: 1 addition & 0 deletions docs/source/user-stories.md
@@ -8,4 +8,5 @@
user-stories/climatology.ipynb
user-stories/climatology-hourly.ipynb
user-stories/custom-aggregations.ipynb
+user-stories/nD-bins.ipynb
```
11 changes: 8 additions & 3 deletions docs/source/user-stories/custom-aggregations.ipynb
@@ -15,8 +15,13 @@
">\n",
"> A = da.groupby(['lon_bins', 'lat_bins']).mode()\n",
"\n",
-"This notebook will describe how to accomplish this using a custom `Aggregation`\n",
-"since `mode` and `median` aren't supported by flox yet.\n"
+"This notebook will describe how to accomplish this using a custom `Aggregation`.\n",
+"\n",
+"\n",
+"```{tip}\n",
+"flox now supports `mode`, `nanmode`, `quantile`, `nanquantile`, `median`, `nanmedian` using exactly the same \n",
+"approach as shown below\n",
+"```\n"
]
},
{
@@ -135,7 +140,7 @@
" # The next are for dask inputs and describe how to reduce\n",
" # the data in parallel\n",
" chunk=(\"sum\", \"nanlen\"), # first compute these blockwise : (grouped_sum, grouped_count)\n",
-" combine=(\"sum\", \"sum\"), # reduce intermediate reuslts (sum the sums, sum the counts)\n",
+" combine=(\"sum\", \"sum\"), # reduce intermediate results (sum the sums, sum the counts)\n",
" finalize=lambda sum_, count: sum_ / count, # final mean value (divide sum by count)\n",
"\n",
" fill_value=(0, 0), # fill value for intermediate sums and counts when groups have no members\n",
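
The `chunk` / `combine` / `finalize` stages in the hunk above can be illustrated without flox or dask. The following is a minimal pure-Python sketch, not flox's implementation: the helper names `chunk`, `combine`, and `finalize` and the sample `blocks` are hypothetical, and groups with no members are simply left out instead of being filled with `fill_value`.

```python
from collections import defaultdict


def chunk(values, labels):
    """Blockwise step: per-group (sum, count) for one block, skipping NaNs."""
    sums, counts = defaultdict(float), defaultdict(int)
    for value, group in zip(values, labels):
        if value == value:  # NaN is the only float that is not equal to itself
            sums[group] += value
            counts[group] += 1
    return sums, counts


def combine(intermediates):
    """Reduce intermediate results: sum the sums, sum the counts."""
    total_sums, total_counts = defaultdict(float), defaultdict(int)
    for sums, counts in intermediates:
        for group, s in sums.items():
            total_sums[group] += s
        for group, c in counts.items():
            total_counts[group] += c
    return total_sums, total_counts


def finalize(sums, counts):
    """Final mean value per group: divide sum by count."""
    return {group: sums[group] / counts[group] for group in sums}


# Two hypothetical blocks of (values, group labels), as a chunked array would supply.
blocks = [
    ([1.0, 2.0, 3.0], ["a", "b", "a"]),
    ([4.0, float("nan"), 6.0], ["b", "a", "b"]),
]

result = finalize(*combine(chunk(v, g) for v, g in blocks))
print(result)
```

Because each block only produces a small (sum, count) pair per group, the blocks can be processed independently and merged afterwards, which is the property the `chunk` and `combine` arguments describe for dask inputs.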