increase minimal xugrid version to speed up xu.open_dataset() #968

Closed
6 tasks done
veenstrajelmer opened this issue Aug 20, 2024 · 3 comments · Fixed by #984
Comments

@veenstrajelmer
Collaborator

veenstrajelmer commented Aug 20, 2024

xugrid was improved so opening datasets is approx. 30% faster and merging partitions is approx. 10% faster. These improvements are mainly relevant for datasets with many variables; the example dataset contains 2410 variables. For datasets with fewer than 100 variables, the difference might not be noticeable.
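The timings below can be reproduced with a small helper around `time.perf_counter` (a generic sketch; the commented usage with `xu.open_dataset` and the file name are placeholders, not actual measurement code from this issue):

```python
import time

def time_call(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds)."""
    t0 = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - t0
    return result, elapsed

# Hypothetical usage:
# uds, seconds = time_call(xu.open_dataset, "DCSM-FM_0_5nm_waq_0000_map.nc")
# print(f"open took {seconds:.2f} sec")
```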

Todo:

  • Improve performance of _get_topology() by not accessing dataArray each time xugrid#285
  • add ds.close() to dfmt.open_partitioned_dataset() (see below) >> won't do, since plotting (or any other action) fills up the memory again, and closing and re-opening all partitions would have some performance impact.
  • replace ds[var] or ds[varn] in dfm_tools with ds.variables[varn], e.g. in decode_default_fillvals() and remove_nan_fillvalue_attrs(), or on their uds counterparts. At least the latter has quite some impact on the dfmt.open_partitioned_dataset() timings
  • increase minimal xugrid version
  • run testbank
  • update whatsnew
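The ds.variables item above matters because indexing `ds[varn]` constructs a full DataArray (including coordinates) on every access, while `ds.variables[varn]` returns the underlying Variable directly. A minimal sketch of an attribute-cleanup loop in the faster style (illustrative only, assuming the attribute lives in `attrs`; this is not the actual dfm_tools implementation):

```python
import numpy as np
import xarray as xr

def remove_nan_fillvalue_attrs(ds: xr.Dataset) -> xr.Dataset:
    """Drop _FillValue attributes that are NaN, iterating over
    ds.variables to avoid constructing a DataArray per variable."""
    for varn in ds.variables:
        var = ds.variables[varn]  # cheap: no DataArray construction
        fv = var.attrs.get("_FillValue")
        if fv is not None and isinstance(fv, (float, np.floating)) and np.isnan(fv):
            var.attrs.pop("_FillValue")
    return ds
```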

Code to run with memory_profiler:

import os
import glob
from time import sleep
import dfm_tools as dfmt

dir_model = r"p:\11210284-011-nose-c-cycling\runs_fine_grid\B05_waq_2012_PCO2_ChlC_NPCratios_DenWat_stats_2023.01\B05_waq_2012_PCO2_ChlC_NPCratios_DenWat_stats_2023.01\DFM_OUTPUT_DCSM-FM_0_5nm_waq"
file_nc_pat = os.path.join(dir_model, "DCSM-FM_0_5nm_waq_0*_map.nc")
file_nc_list_all = glob.glob(file_nc_pat)
file_nc_list = file_nc_list_all[:5]

uds = dfmt.open_partitioned_dataset(file_nc_list, remove_ghost=False)
sleep(2)

Without ds.close() (peaks at around 1150 MB):

With ds.close() (peaks at around 700 MB):

Including ghostcell removal
The same as above, but now including removal of ghost cells.
Without ds.close() and remove_ghost=True (peaks at around 1350 MB):

With ds.close() directly after xu.core.wrap.UgridDataset and remove_ghost=True (peaks at around 1350 MB):

With ds.close() after remove_ghostcells() and remove_ghost=True (peaks at around 900 MB):

including plot
When adding a plot, the memory reduction disappears again:

import datetime as dt

print('>> plot single timestep: ', end='')
dtstart = dt.datetime.now()
uds["mesh2d_tureps1"].isel(time=-1, mesh2d_nInterfaces=-2).ugrid.plot()
print(f'{(dt.datetime.now()-dtstart).total_seconds():.2f} sec')

Without ds.close() (peaks at around 1200 MB):
Timings:

>> xu.open_dataset() with 5 partition(s): 1 2 3 4 5 : 31.73 sec
>> xu.merge_partitions() with 5 partition(s): 19.01 sec
>> dfmt.open_partitioned_dataset() total: 50.74 sec
>> plot single timestep: 0.41 sec

With ds.close() (peaks at around 1200 MB):

Timings:

>> xu.open_dataset() with 5 partition(s): 1 2 3 4 5 : 28.41 sec
>> xu.merge_partitions() with 5 partition(s): 18.28 sec
>> dfmt.open_partitioned_dataset() total: 46.70 sec
>> plot single timestep: 5.68 sec

including plot with h5netcdf
Without ds.close() and with engine="h5netcdf" (peaks at around 750 MB):

Timings:

>> xu.open_dataset() with 5 partition(s): 1 2 3 4 5 : 195.75 sec
>> xu.merge_partitions() with 5 partition(s): 18.21 sec
>> dfmt.open_partitioned_dataset() total: 213.96 sec
>> plot single timestep: 0.52 sec

So this shows far more time spent on opening the dataset (because of the pure-Python implementation of h5netcdf), but the memory usage is significantly lower. This could be very useful once h5netcdf becomes more performant: h5netcdf/h5netcdf#195. Follow-up issue: #484

@veenstrajelmer
Collaborator Author

veenstrajelmer commented Sep 3, 2024

After fixing this issue (including a new xugrid version), not only have the timings improved, but the memory usage has also decreased.

Code to run with memory_profiler via mprof run python memory_profiler.py:

import os
import glob
from time import sleep
import dfm_tools as dfmt
import datetime as dt

dir_model = r"p:\11210284-011-nose-c-cycling\runs_fine_grid\B05_waq_2012_PCO2_ChlC_NPCratios_DenWat_stats_2023.01\B05_waq_2012_PCO2_ChlC_NPCratios_DenWat_stats_2023.01\DFM_OUTPUT_DCSM-FM_0_5nm_waq"
file_nc_pat = os.path.join(dir_model, "DCSM-FM_0_5nm_waq_0*_map.nc")
file_nc_list_all = glob.glob(file_nc_pat)
file_nc_list = file_nc_list_all[:5]

uds = dfmt.open_partitioned_dataset(file_nc_list, remove_ghost=False) #, engine="h5netcdf")

print('>> plot single timestep: ',end='')
dtstart = dt.datetime.now()
uds["mesh2d_tureps1"].isel(time=-1, mesh2d_nInterfaces=-2).ugrid.plot()
print(f'{(dt.datetime.now()-dtstart).total_seconds():.2f} sec')
sleep(2)

Memory usage including plot (peaks at around 900 MB, this was 1200 MB before):

>> xu.open_dataset() with 5 partition(s): 1 2 3 4 5 : 17.65 sec
>> xu.merge_partitions() with 5 partition(s): 16.65 sec
>> dfmt.open_partitioned_dataset() total: 34.30 sec
>> plot single timestep: 0.42 sec

xu.open_dataset() is twice as fast and xu.merge_partitions() takes almost no additional memory anymore. These are great improvements.

Memory usage including plot and engine="h5netcdf" (peaks at around 480 MB, this was 750 MB before):

>> xu.open_dataset() with 5 partition(s): 1 2 3 4 5 : 171.22 sec
>> xu.merge_partitions() with 5 partition(s): 15.89 sec
>> dfmt.open_partitioned_dataset() total: 187.12 sec
>> plot single timestep: 0.44 sec

xu.open_dataset() is >20 seconds faster and xu.merge_partitions() takes almost no additional memory anymore. So these improvements also hold for h5netcdf.
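The minimal-version bump itself can be guarded at runtime with stdlib tooling. A sketch (hypothetical helper, not part of dfm_tools; the "0.12.0" pin is a placeholder, see the linked PR for the actual minimal version):

```python
from importlib.metadata import PackageNotFoundError, version

def _to_tuple(v):
    """Convert a dotted version string like '0.12.0' to (0, 12, 0)."""
    return tuple(int(p) for p in v.split("."))

def meets_minimal_version(installed: str, minimal: str) -> bool:
    """Compare dotted version strings numerically, not lexically."""
    return _to_tuple(installed) >= _to_tuple(minimal)

def check_xugrid(minimal="0.12.0"):  # placeholder minimal version
    try:
        installed = version("xugrid")
    except PackageNotFoundError:
        raise ImportError("xugrid is not installed")
    if not meets_minimal_version(installed, minimal):
        raise ImportError(f"xugrid>={minimal} required, found {installed}")
```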

@Huite

Huite commented Sep 3, 2024

This is almost certainly due to this change: https://docs.xarray.dev/en/stable/whats-new.html#performance

Relevant discussion: pydata/xarray#9058
Relevant PR: https://github.com/pydata/xarray/pull/9067/files

(Funny, the discussion has even worse datasets with 8500 variables!)

Excellent timing!

@veenstrajelmer
Collaborator Author

veenstrajelmer commented Sep 3, 2024

I am not sure whether this is the case: when I downgrade dfm_tools, xugrid, xarray, dask and netcdf4 to versions prior to this fix, I still see no memory increase upon merging the grid. I am a bit puzzled, but won't put additional time into it at the moment.
