
Major fix to subset_to_blocks #173

Merged: 21 commits into main, Oct 16, 2022
Conversation

dcherian (Collaborator)
Turns out I wasn't actually subsetting in many cases so we were just repeatedly computing over the entire array for every cohort (!)

Incredibly bad regression!
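As a minimal sketch of what subsetting to blocks should look like (an illustration using dask's public `.blocks` accessor, not the actual flox implementation): a cohort touches only a few blocks along the grouped axis, so only those blocks should enter the graph. The bug meant the whole array was used for every cohort instead.

```python
import dask.array as da

arr = da.ones((120,), chunks=(10,))  # 12 blocks along one axis

# A hypothetical "cohort" that touches only blocks 0, 3, 6, 9.
cohort_blocks = [0, 3, 6, 9]

# Subset the array to just those blocks; the result contains
# only the selected chunks, not the full array.
subset = arr.blocks[cohort_blocks]

print(subset.numblocks)  # (4,)  -- only the selected blocks remain
print(subset.shape)      # (40,) -- 4 blocks of 10 elements each
```

Without this subsetting, every cohort's reduction would depend on all 12 blocks, multiplying the task count — consistent with the huge `track_num_tasks` drops in the benchmarks below.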

dcherian (Collaborator, Author)
 -          155086             7009     0.05  cohorts.ERA5.track_num_tasks

!!!

Commit: Apply one iterable, if present, along with slices.
dcherian commented Oct 14, 2022

       before           after         ratio
     [681cf74b]       [5b4ac42a]
     <v0.5.9>         <fix-subset>
+            2473             3015     1.22  cohorts.Era5MonthHour.track_num_tasks
-        619±80ms         462±50ms     0.75  cohorts.ERA5DayOfYear.time_graph_construct
-           11874             5808     0.49  cohorts.NWMMidwest.track_num_tasks
-       2.67±0.2s         660±40ms     0.25  cohorts.NWMMidwest.time_find_group_cohorts
-      8.33±0.07s         997±90ms     0.12  cohorts.NWMMidwest.time_graph_construct

So a small-ish regression for timeseries-type cohorts, though we should still benefit from less communication. I should add rechunked versions of these benchmarks.

dcherian commented Oct 16, 2022

Relative to v0.5.10

       before           after         ratio
     [91b6e193]       [b0dfdb04]
     <v0.5.10>        <fix-subset>
+            3208             4208     1.31  cohorts.ERA5MonthHourRechunked.track_num_tasks_optimized
+             765              939     1.23  cohorts.PerfectMonthly.track_num_tasks
+            2473             3015     1.22  cohorts.ERA5MonthHour.track_num_tasks
+            3645             4232     1.16  cohorts.ERA5MonthHourRechunked.track_num_tasks
-              89               80     0.90  cohorts.ERA5MonthHourRechunked.track_num_layers
-              87               78     0.90  cohorts.ERA5MonthHour.track_num_layers
-              24               21     0.88  cohorts.PerfectMonthly.track_num_layers
-             759              585     0.77  cohorts.PerfectMonthly.track_num_tasks_optimized
-            6858             4896     0.71  cohorts.NWMMidwest.track_num_tasks_optimized
-            1104              741     0.67  cohorts.ERA5DayOfYearRechunked.track_num_layers
-             842              509     0.60  cohorts.NWMMidwest.track_num_layers
-            1828             1101     0.60  cohorts.ERA5DayOfYear.track_num_layers
-           11874             5184     0.44  cohorts.NWMMidwest.track_num_tasks
-      11.1±0.02s       1.31±0.03s     0.12  cohorts.NWMMidwest.time_graph_construct

I think the increased task counts come from reindex_intermediates. I could avoid that, but it would cost some extra graph-construction time. Note that for PerfectMonthly the task count after optimization is smaller... and the number of layers is always smaller.

Relative to main:

       before           after         ratio
     [0bf35e03]       [217935a0]
-         747±2ms          441±1ms     0.59  cohorts.ERA5DayOfYearRechunked.time_graph_construct
-            8339             4896     0.59  cohorts.NWMMidwest.track_num_tasks_optimized
-            8872             5184     0.58  cohorts.NWMMidwest.track_num_tasks
-            2196             1101     0.50  cohorts.ERA5DayOfYear.track_num_layers
-            1224              585     0.48  cohorts.PerfectMonthly.track_num_tasks_optimized
-            1224              585     0.48  cohorts.PerfectMonthlyRechunked.track_num_tasks_optimized
-         1.13±0s          501±3ms     0.45  cohorts.ERA5DayOfYear.time_graph_construct
-           12329             4232     0.34  cohorts.ERA5MonthHourRechunked.track_num_tasks
-           12293             4208     0.34  cohorts.ERA5MonthHourRechunked.track_num_tasks_optimized
-           10488             3015     0.29  cohorts.ERA5MonthHour.track_num_tasks
-           10440             2461     0.24  cohorts.ERA5MonthHour.track_num_tasks_optimized
-          155086             7009     0.05  cohorts.ERA5DayOfYearRechunked.track_num_tasks
-          154720             6570     0.04  cohorts.ERA5DayOfYearRechunked.track_num_tasks_optimized
-          270284             5294     0.02  cohorts.ERA5DayOfYear.track_num_tasks
-          268824             4562     0.02  cohorts.ERA5DayOfYear.track_num_tasks_optimized

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

@dcherian dcherian merged commit 6897240 into main Oct 16, 2022
@dcherian dcherian deleted the fix-subset branch October 16, 2022 19:39
dcherian added a commit that referenced this pull request Oct 17, 2022
* main: (29 commits)
  Major fix to subset_to_blocks (#173)
  Performance improvements for cohorts detection (#172)
  Remove split_out (#170)
  Deprecate resample_reduce (#169)
  More efficient cohorts. (#165)
  Allow specifying output dtype (#131)
  Add a dtype check for numpy arrays in assert_equal (#158)
  Update ci-additional.yaml (#167)
  Refactor before redoing cohorts (#164)
  Fix mypy errors in core.py (#150)
  Add link to numpy_groupies (#160)
  Bump codecov/codecov-action from 3.1.0 to 3.1.1 (#159)
  Use math.prod instead of np.prod (#157)
  Remove None output from _get_expected_groups (#152)
  Fix mypy errors in xarray.py, xrutils.py, cache.py (#144)
  Raise error if multiple by's are used with Ellipsis (#149)
  pre-commit autoupdate (#148)
  Add mypy ignores (#146)
  Get pre commit bot to update (#145)
  Remove duplicate examples headers (#147)
  ...
dcherian commented Oct 25, 2022

Looks like distributed's constant-memory scheduling negates the need for cohorts in the NWM model case.

  1. Cohorts may still benefit downstream pipelines if chunking along the grouped dimension is desired.
  2. It is still faster for this case, though not by much; note this is only 3 months of data.

[image]

EDIT: bigger time savings for 6 months of data:
[image]
