Edits to climatology doc (#361)
dcherian authored Apr 26, 2024
1 parent 497e7bc commit 13cb229
Showing 1 changed file with 41 additions and 11 deletions.
52 changes: 41 additions & 11 deletions docs/source/user-stories/climatology.ipynb
@@ -61,7 +61,9 @@
 "source": [
 "To account for Feb-29 being present in some years, we'll construct a time vector to group by as \"mmm-dd\" string.\n",
 "\n",
-"For more options, see https://strftime.org/"
+"```{seealso}\n",
+"For more options, see [this great website](https://strftime.org/).\n",
+"```"
 ]
 },
 {
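As an aside (not part of the commit), the "mmm-dd" grouping key this hunk documents can be sketched in plain pandas; the date range below is invented for illustration:

```python
import pandas as pd

# Build the "mmm-dd" style key the notebook groups by; the
# "%b-%d" strftime format yields strings like "Feb-29".
times = pd.date_range("2019-01-01", "2020-12-31", freq="D")
day = times.strftime("%b-%d")
# 2020 is a leap year, so "Feb-29" appears exactly once here
```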
@@ -80,7 +82,7 @@
 "id": "6",
 "metadata": {},
 "source": [
-"## map-reduce\n",
+"## First, `method=\"map-reduce\"`\n",
 "\n",
 "The default\n",
 "[method=\"map-reduce\"](https://flox.readthedocs.io/en/latest/implementation.html#method-map-reduce)\n",
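For readers skimming the diff, the map-reduce strategy referenced here can be sketched with plain NumPy (a toy illustration, not flox's implementation): each chunk produces partial sums and counts for every group, and the partials are then combined.

```python
import numpy as np

# Toy sketch of the "map-reduce" grouped-mean strategy
labels = np.array([0, 1, 0, 1, 0, 1, 0, 1])
values = np.arange(8, dtype=float)
chunks = [values[:4], values[4:]]
label_chunks = [labels[:4], labels[4:]]

ngroups = 2
sums = np.zeros(ngroups)
counts = np.zeros(ngroups)
for v, l in zip(chunks, label_chunks):  # "map": per-chunk partials
    np.add.at(sums, l, v)
    np.add.at(counts, l, 1)
means = sums / counts                    # "reduce": combine partials
```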
@@ -110,7 +112,7 @@
 "id": "8",
 "metadata": {},
 "source": [
-"## Rechunking for map-reduce\n",
+"### Rechunking for map-reduce\n",
 "\n",
 "We can split each chunk along the `lat`, `lon` dimensions to make sure the\n",
 "output chunk sizes are more reasonable\n"
@@ -139,7 +141,7 @@
 "But what if we didn't want to rechunk the dataset so drastically (note the 10x\n",
 "increase in tasks). For that let's try `method=\"cohorts\"`\n",
 "\n",
-"## method=cohorts\n",
+"## `method=\"cohorts\"`\n",
 "\n",
 "We can take advantage of patterns in the groups here \"day of year\".\n",
 "Specifically:\n",
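The pattern `method="cohorts"` exploits can be sketched outside flox (an illustration with invented chunk sizes, not part of the commit): with daily data chunked by month, each day-of-year label occurs only in a small, repeating subset of chunks.

```python
import numpy as np

# Two non-leap years of daily day-of-year labels, chunked by month
month_lengths = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
labels = np.concatenate([np.arange(1, 366)] * 2)
chunks = month_lengths * 2
offsets = np.cumsum([0] + chunks)
# which labels does each chunk contain?
seen = [frozenset(labels[s:e]) for s, e in zip(offsets[:-1], offsets[1:])]
# days 1..31 live only in the two "January" chunks -- one cohort
jan_chunks = [i for i, s in enumerate(seen) if 1 in s]
```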
@@ -271,7 +273,7 @@
 "id": "21",
 "metadata": {},
 "source": [
-"And now our cohorts contain more than one group\n"
+"And now our cohorts contain more than one group, *and* there is a substantial reduction in number of cohorts **162 -> 12**\n"
 ]
 },
 {
@@ -281,7 +283,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"preferrd_method, new_cohorts = flox.core.find_group_cohorts(\n",
+"preferred_method, new_cohorts = flox.core.find_group_cohorts(\n",
 "    labels=codes,\n",
 "    chunks=(rechunked.chunksizes[\"time\"],),\n",
 ")\n",
@@ -295,13 +297,23 @@
 "id": "23",
 "metadata": {},
 "outputs": [],
 "source": [
+"preferred_method"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "24",
+"metadata": {},
+"outputs": [],
+"source": [
 "new_cohorts.values()"
 ]
 },
 {
 "cell_type": "markdown",
-"id": "24",
+"id": "25",
 "metadata": {},
 "source": [
 "Now the groupby reduction **looks OK** in terms of number of tasks but remember\n",
@@ -311,7 +323,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "25",
+"id": "26",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -320,7 +332,25 @@
 },
 {
 "cell_type": "markdown",
-"id": "26",
+"id": "27",
+"metadata": {},
+"source": [
+"flox's heuristics will choose `\"cohorts\"` automatically!"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"id": "28",
+"metadata": {},
+"outputs": [],
+"source": [
+"flox.xarray.xarray_reduce(rechunked, day, func=\"mean\")"
+]
+},
+{
+"cell_type": "markdown",
+"id": "29",
 "metadata": {},
 "source": [
 "## How about other climatologies?\n",
@@ -331,7 +361,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "27",
+"id": "30",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -340,7 +370,7 @@
 {
 "cell_type": "markdown",
-"id": "28",
+"id": "31",
 "metadata": {},
 "source": [
 "This looks great. Why?\n",
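As a cross-check of what the notebook computes (tiny invented data, plain pandas rather than flox), a "%b-%d" climatology is just a groupby-mean over that key:

```python
import numpy as np
import pandas as pd

times = pd.date_range("2019-01-01", "2020-12-31", freq="D")
data = pd.Series(np.arange(len(times), dtype=float), index=times)
# group by the "mmm-dd" key and average across years
clim = data.groupby(times.strftime("%b-%d")).mean()
# 366 groups: "Feb-29" comes from the leap year 2020 only
```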
