Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failure to perform compute_Sv with dask inputs #1212

Closed
lsetiawan opened this issue Nov 7, 2023 · 8 comments
Closed

Failure to perform compute_Sv with dask inputs #1212

lsetiawan opened this issue Nov 7, 2023 · 8 comments
Assignees
Milestone

Comments

@lsetiawan
Copy link
Member

Overview

Currently, compute_Sv does not work with a merged Sv file (raw_converted_combined/2013/x0097_2_wt_20130824_232034_f0031.zarr) when opened as dask arrays. Worker memory get's overwhelmed and even spilling to disk by dask won't cut it. This is the non descriptive dask traceback:

2023-11-06 17:00:16,536 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 11.37 GiB -- Worker memory limit: 16.00 GiB
2023-11-06 17:00:16,878 - distributed.worker.memory - WARNING - Worker is at 80% memory usage. Pausing worker.  Process memory: 12.85 GiB -- Worker memory limit: 16.00 GiB
2023-11-06 17:00:16,922 - distributed.worker.memory - WARNING - Worker is at 79% memory usage. Resuming worker. Process memory: 12.70 GiB -- Worker memory limit: 16.00 GiB
...
RuntimeError: Cannot remove replica of ('transpose-4cd487e9eb8037d494b448b470058650', 0, 0, 0) while ('rechunk-split-rechunk-merge-78880722c4e511b3105b467cdecc2000', 0, 22, 2) in state 'executing'.
RuntimeError: Cannot remove replica of ('transpose-4cd487e9eb8037d494b448b470058650', 0, 0, 0) while ('rechunk-split-rechunk-merge-78880722c4e511b3105b467cdecc2000', 0, 22, 2) in state 'executing'.

The above traceback is not very useful, so I tried to perform compute_Sv without using dask and that seemed to work and I was able to profile the memory usage. It seems like there are a few spots that huge spikes in memory occurs.

Memory Line by Line Profile
Filename: /Users/lsetiawan/Repos/SSEC/echopype/echopype/calibrate/calibrate_ek.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
   137    222.6 MiB    222.6 MiB           1       def __init__(self, echodata: EchoData, env_params, cal_params, ecs_file, **kwargs):
   138    222.6 MiB      0.0 MiB           1           super().__init__(echodata, env_params, cal_params, ecs_file)
   139                                         
   140                                                 # Set sonar_type and waveform/encode mode
   141    222.6 MiB      0.0 MiB           1           self.sonar_type = "EK60"
   142                                         
   143                                                 # Set cal type
   144    222.6 MiB      0.0 MiB           1           self.waveform_mode = "CW"
   145    222.6 MiB      0.0 MiB           1           self.encode_mode = "power"
   146                                         
   147                                                 # Get the right ed_beam_group for CW power samples
   148    222.6 MiB      0.0 MiB           2           self.ed_beam_group = retrieve_correct_beam_group(
   149    222.6 MiB      0.0 MiB           1               echodata=self.echodata, waveform_mode=self.waveform_mode, encode_mode=self.encode_mode
   150                                                 )
   151                                         
   152                                                 # Set the channels to calibrate
   153                                                 # For EK60 this is all channels
   154    222.6 MiB      0.0 MiB           1           self.chan_sel = self.echodata[self.ed_beam_group]["channel"]
   155    222.6 MiB      0.0 MiB           1           beam = self.echodata[self.ed_beam_group]
   156                                         
   157                                                 # Convert env_params and cal_params if self.ecs_file exists
   158                                                 # Note a warning if thrown out in CalibrateBase.__init__
   159                                                 # to let user know cal_params and env_params are ignored if ecs_file is provided
   160    222.6 MiB      0.0 MiB           1           if self.ecs_file is not None:  # also means self.ecs_dict != {}
   161                                                     ds_env_tmp, ds_cal_tmp, _ = ecs_ev2ep(self.ecs_dict, "EK60")
   162                                                     self.cal_params = ecs_ds2dict(
   163                                                         conform_channel_order(ds_cal_tmp, beam["frequency_nominal"])
   164                                                     )
   165                                                     self.env_params = ecs_ds2dict(
   166                                                         conform_channel_order(ds_env_tmp, beam["frequency_nominal"])
   167                                                     )
   168                                         
   169                                                 # Regardless of the source cal and env params,
   170                                                 # go through the same sanitization and organization process
   171    229.0 MiB      6.4 MiB           2           self.env_params = get_env_params_EK(
   172    222.6 MiB      0.0 MiB           1               sonar_type=self.sonar_type,
   173    222.6 MiB      0.0 MiB           1               beam=self.echodata[self.ed_beam_group],
   174    222.6 MiB      0.0 MiB           1               env=self.echodata["Environment"],
   175    222.6 MiB      0.0 MiB           1               user_dict=self.env_params,
   176                                                 )
   177    280.5 MiB     51.5 MiB           2           self.cal_params = get_cal_params_EK(
   178    229.0 MiB      0.0 MiB           1               waveform_mode=self.waveform_mode,
   179    229.0 MiB      0.0 MiB           1               freq_center=beam["frequency_nominal"],
   180    229.0 MiB      0.0 MiB           1               beam=beam,
   181    229.0 MiB      0.0 MiB           1               vend=self.echodata["Vendor_specific"],
   182    229.0 MiB      0.0 MiB           1               user_dict=self.cal_params,
   183    229.0 MiB      0.0 MiB           1               sonar_type=self.sonar_type,
   184                                                 )
   185                                         
   186                                                 # Compute range
   187    929.8 MiB    929.8 MiB           1           self.compute_echo_range()


Filename: /Users/lsetiawan/Repos/SSEC/echopype/echopype/calibrate/range.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    99    280.5 MiB    280.5 MiB           1   def compute_range_EK(
   100                                             echodata: EchoData,
   101                                             env_params: Dict,
   102                                             waveform_mode: str = "CW",
   103                                             encode_mode: str = "power",
   104                                             chan_sel=None,
   105                                         ):
   106                                             """
   107                                             Computes the range (``echo_range``) of EK backscatter data in meters.
   108                                         
   109                                             Parameters
   110                                             ----------
   111                                             echodata : EchoData
   112                                                 An EchoData object holding data from an AZFP echosounder
   113                                             env_params : dict
   114                                                 A dictionary holding environmental parameters needed for computing range
   115                                                 See echopype.calibrate.env_params.get_env_params_EK
   116                                             waveform_mode : {"CW", "BB"}
   117                                                 Type of transmit waveform.
   118                                                 Required only for data from the EK80 echosounder.
   119                                         
   120                                                 - `"CW"` for narrowband transmission,
   121                                                     returned echoes recorded either as complex or power/angle samples
   122                                                 - `"BB"` for broadband transmission,
   123                                                     returned echoes recorded as complex samples
   124                                         
   125                                             encode_mode : {"complex", "power"}
   126                                                 Type of encoded data format.
   127                                                 Required only for data from the EK80 echosounder.
   128                                         
   129                                                 - `"complex"` for complex samples
   130                                                 - `"power"` for power/angle samples, only allowed when
   131                                                   the echosounder is configured for narrowband transmission
   132                                         
   133                                             Returns
   134                                             -------
   135                                             xr.DataArray
   136                                                 The range (``echo_range``) of the data in meters.
   137                                         
   138                                             Notes
   139                                             -----
   140                                             The EK80 echosounder can be configured to transmit
   141                                             either broadband (``waveform_mode="BB"``) or narrowband (``waveform_mode="CW"``) signals.
   142                                             When transmitting in broadband mode, the returned echoes must be
   143                                             encoded as complex samples (``encode_mode="complex"``).
   144                                             When transmitting in narrowband mode, the returned echoes can be encoded
   145                                             either as complex samples (``encode_mode="complex"``)
   146                                             or as power/angle combinations (``encode_mode="power"``) in a format
   147                                             similar to those recorded by EK60 echosounders (the "power/angle" format).
   148                                             """
   149                                             # sound_speed should exist already
   150    280.5 MiB      0.0 MiB           1       if echodata.sonar_model in ("EK60", "ES70"):
   151    280.5 MiB      0.0 MiB           1           ek_str = "EK60"
   152                                             elif echodata.sonar_model in ("EK80", "ES80", "EA640"):
   153                                                 ek_str = "EK80"
   154                                             else:
   155                                                 raise ValueError("The specified sonar_model is not supported!")
   156                                         
   157    280.5 MiB      0.0 MiB           1       if "sound_speed" not in env_params:
   158                                                 raise RuntimeError(
   159                                                     "sounds_speed not included in env_params, "
   160                                                     f"use echopype.calibrate.env_params.get_env_params_{ek_str}() to compute env_params "
   161                                                 )
   162                                             else:
   163    280.5 MiB      0.0 MiB           1           sound_speed = env_params["sound_speed"]
   164                                         
   165                                             # Get the right Sonar/Beam_groupX group according to encode_mode
   166    280.5 MiB      0.0 MiB           1       ed_beam_group = retrieve_correct_beam_group(echodata, waveform_mode, encode_mode)
   167    280.5 MiB      0.0 MiB           1       beam = (
   168    280.5 MiB      0.0 MiB           1           echodata[ed_beam_group]
   169    280.5 MiB      0.0 MiB           1           if chan_sel is None
   170                                                 else echodata[ed_beam_group].sel(channel=chan_sel)
   171                                             )
   172                                         
   173                                             # Harmonize sound_speed time1 and Beam_groupX ping_time
   174    280.5 MiB      0.0 MiB           2       sound_speed = harmonize_env_param_time(
   175    280.5 MiB      0.0 MiB           1           p=sound_speed,
   176    280.5 MiB      0.0 MiB           1           ping_time=beam.ping_time,
   177                                             )
   178                                         
   179                                             # Range in meters, not modified for TVG compensation
   180   1820.9 MiB   1540.4 MiB           1       range_meter = beam["range_sample"] * beam["sample_interval"] * sound_speed / 2
   181                                             # make order of dims conform with the order of backscatter data
   182   1822.0 MiB      1.0 MiB           1       range_meter = range_meter.transpose(*DIMENSION_ORDER)
   183                                         
   184                                             # set entries with NaN backscatter data to NaN
   185   1822.1 MiB      0.2 MiB           1       if "beam" in beam["backscatter_r"].dims:
   186                                                 # Drop beam because echo_range should not have a beam dimension
   187                                                 valid_idx = ~beam["backscatter_r"].isel(beam=0).drop("beam").isnull()
   188                                             else:
   189   6415.5 MiB   4593.4 MiB           1           valid_idx = ~beam["backscatter_r"].isnull()
   190   1511.1 MiB  -4904.3 MiB           1       range_meter = range_meter.where(valid_idx)
   191                                         
   192                                             # remove time1 if exists as a coordinate
   193   1511.5 MiB      0.3 MiB           1       if "time1" in range_meter.coords:
   194                                                 range_meter = range_meter.drop("time1")
   195                                         
   196                                             # add name to facilitate xr.merge
   197   1511.5 MiB      0.0 MiB           1       range_meter.name = "echo_range"
   198                                         
   199   1511.5 MiB      0.0 MiB           1       return range_meter


Filename: /Users/lsetiawan/Repos/SSEC/echopype/echopype/calibrate/calibrate_ek.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    50    930.2 MiB    930.2 MiB           1       def _cal_power_samples(self, cal_type: str) -> xr.Dataset:
    51                                                 """Calibrate power data from EK60 and EK80.
    52                                         
    53                                                 Parameters
    54                                                 ----------
    55                                                 cal_type: str
    56                                                     'Sv' for calculating volume backscattering strength, or
    57                                                     'TS' for calculating target strength
    58                                         
    59                                                 Returns
    60                                                 -------
    61                                                 xr.Dataset
    62                                                     The calibrated dataset containing Sv or TS
    63                                                 """
    64                                                 # Select source of backscatter data
    65    933.4 MiB      3.1 MiB           1           beam = self.echodata[self.ed_beam_group]
    66                                         
    67                                                 # Derived params
    68    946.3 MiB     12.9 MiB           1           wavelength = self.env_params["sound_speed"] / beam["frequency_nominal"]  # wavelength
    69                                                 # range_meter = self.range_meter
    70                                         
    71                                                 # TVG compensation with modified range
    72    946.3 MiB      0.0 MiB           1           sound_speed = self.env_params["sound_speed"]
    73    946.3 MiB      0.0 MiB           1           absorption = self.env_params["sound_absorption"]
    74   7130.0 MiB   7130.0 MiB           2           tvg_mod_range = range_mod_TVG_EK(
    75    946.3 MiB      0.0 MiB           1               self.echodata, self.ed_beam_group, self.range_meter, sound_speed
    76                                                 )
    77    912.2 MiB  -6217.8 MiB           1           tvg_mod_range = tvg_mod_range.where(tvg_mod_range > 0, np.nan)
    78                                         
    79    652.9 MiB   -259.4 MiB           1           spreading_loss = 20 * np.log10(tvg_mod_range)
    80   6107.8 MiB   5454.9 MiB           1           absorption_loss = 2 * absorption * tvg_mod_range
    81                                         
    82   6107.9 MiB      0.1 MiB           1           if cal_type == "Sv":
    83                                                     # Calc gain
    84   6083.9 MiB    -47.3 MiB           1               CSv = (
    85   6131.2 MiB    -25.9 MiB           4                   10 * np.log10(beam["transmit_power"])
    86   6125.3 MiB      2.0 MiB           1                   + 2 * self.cal_params["gain_correction"]
    87   6129.4 MiB      0.0 MiB           1                   + self.cal_params["equivalent_beam_angle"]
    88   6131.2 MiB    -29.2 MiB           2                   + 10
    89   6131.2 MiB    -32.5 MiB           2                   * np.log10(
    90   6138.5 MiB    -15.1 MiB           4                       wavelength**2
    91   6134.9 MiB      0.0 MiB           1                       * beam["transmit_duration_nominal"]
    92   6138.5 MiB      0.0 MiB           1                       * self.env_params["sound_speed"]
    93   6116.4 MiB    -22.1 MiB           1                       / (32 * np.pi**2)
    94                                                         )
    95                                                     )
    96                                         
    97                                                     # Calibration and echo integration
    98    704.2 MiB  -5379.7 MiB           1               out = (
    99   6083.9 MiB  -5384.3 MiB           5                   beam["backscatter_r"]  # has beam dim
   100   6083.9 MiB      0.0 MiB           1                   + spreading_loss
   101   5978.0 MiB   -105.9 MiB           1                   + absorption_loss
   102   2955.6 MiB  -3128.4 MiB           1                   - CSv
   103    725.3 MiB  -5358.6 MiB           1                   - 2 * self.cal_params["sa_correction"]
   104                                                     )
   105    704.2 MiB      0.0 MiB           1               out.name = "Sv"
   106                                         
   107                                                 elif cal_type == "TS":
   108                                                     # Calc gain
   109                                                     CSp = (
   110                                                         10 * np.log10(beam["transmit_power"])
   111                                                         + 2 * self.cal_params["gain_correction"]
   112                                                         + 10 * np.log10(wavelength**2 / (16 * np.pi**2))
   113                                                     )
   114                                         
   115                                                     # Calibration and echo integration
   116                                                     out = beam["backscatter_r"] + spreading_loss * 2 + absorption_loss - CSp
   117                                                     out.name = "TS"
   118                                         
   119                                                 # Attach calculated range (with units meter) into data set
   120    704.9 MiB      0.7 MiB           1           out = out.to_dataset()
   121    708.0 MiB      3.1 MiB           1           out = out.merge(self.range_meter)
   122                                         
   123                                                 # Add frequency_nominal to data set
   124    708.5 MiB      0.5 MiB           1           out["frequency_nominal"] = beam["frequency_nominal"]
   125                                         
   126                                                 # Add env and cal parameters
   127    710.0 MiB      1.5 MiB           1           out = self._add_params_to_output(out)
   128                                         
   129                                                 # Remove time1 if exist as a coordinate
   130    710.0 MiB      0.0 MiB           1           if "time1" in out.coords:
   131                                                     out = out.drop("time1")
   132                                         
   133    710.0 MiB      0.0 MiB           1           return out


Filename: /Users/lsetiawan/Repos/SSEC/echopype/echopype/calibrate/range.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
   202    946.4 MiB    946.4 MiB           1   def range_mod_TVG_EK(
   203                                             echodata: EchoData, ed_beam_group: str, range_meter: xr.DataArray, sound_speed: xr.DataArray
   204                                         ) -> xr.DataArray:
   205                                             """
   206                                             Modify range for TVG calculation.
   207                                         
   208                                             TVG correction factor changes depending when the echo recording starts
   209                                             wrt when the transmit signal is sent out.
   210                                             This depends on whether it is Ex60 or Ex80 style hardware
   211                                             ref: https://github.com/CI-CMG/pyEcholab/blob/RHT-EK80-Svf/echolab2/instruments/EK80.py#L4297-L4308  # noqa
   212                                             """
   213                                         
   214    946.4 MiB      0.0 MiB           2       def mod_Ex60():
   215                                                 # 2-sample shift in the beginning
   216    955.3 MiB      8.9 MiB           1           return 2 * beam["sample_interval"] * sound_speed / 2  # [frequency x range_sample]
   217                                         
   218    946.4 MiB      0.0 MiB           1       def mod_Ex80():
   219                                                 mod = sound_speed * beam["transmit_duration_nominal"] / 4
   220                                                 if isinstance(mod, xr.DataArray) and "time1" in mod.coords:
   221                                                     mod = mod.squeeze().drop("time1")
   222                                                 return mod
   223                                         
   224    946.4 MiB      0.0 MiB           1       beam = echodata[ed_beam_group]
   225    946.4 MiB      0.0 MiB           1       vend = echodata["Vendor_specific"]
   226                                         
   227                                             # If EK60
   228    946.4 MiB      0.0 MiB           1       if echodata.sonar_model in ["EK60", "ES70"]:
   229   7129.7 MiB   6174.4 MiB           1           range_meter = range_meter - mod_Ex60()
   230                                         
   231                                             # If EK80:
   232                                             # - compute range first assuming all channels have Ex80 style hardware
   233                                             # - change range for channels with Ex60 style hardware (GPT)
   234                                             elif echodata.sonar_model in ["EK80", "ES80", "EA640"]:
   235                                                 range_meter = range_meter - mod_Ex80()
   236                                         
   237                                                 # Change range for all channels with GPT
   238                                                 if "GPT" in vend["transceiver_type"]:
   239                                                     ch_GPT = vend["transceiver_type"] == "GPT"
   240                                                     range_meter.loc[dict(channel=ch_GPT)] = range_meter.sel(channel=ch_GPT) - mod_Ex60()
   241                                         
   242   7129.8 MiB      0.1 MiB           1       return range_meter

Analysis

echopype/calibrate/calibrate_ek.py::init

187    929.8 MiB    929.8 MiB           1           self.compute_echo_range()

We can see that during the instantiation of the EK60 Calibrator Class, compute_echo_range has the most increase in memory usage. In this instance, it's 929.8 MiB.

echopype/calibrate/range.py::compute_range_EK

180   1820.9 MiB   1540.4 MiB           1       range_meter = beam["range_sample"] * beam["sample_interval"] * sound_speed / 2

Digging into the compute_echo_range method of the EK60 Calibrator Class, compute_range_EK is actually being called. From the result above, the creation of range_meter caused quite a spike in memory usage of 1540.4 MiB.

A broadcasting operation is happening here as the shapes are:

  • range_sample: (3957,)
  • sample_interval: (5, 46187)
  • sound_speed: (5, 46187)
189   6415.5 MiB   4593.4 MiB           1           valid_idx = ~beam["backscatter_r"].isnull()

Additionally, the operation of getting non-null values spiked the memory usage again! This time, about 4X increase.

190   1511.1 MiB  -4904.3 MiB           1       range_meter = range_meter.where(valid_idx)

This is then cleaned up by a where operation on the range_meter variable calculated above.

echopype/calibrate/calibrate_ek.py::_cal_power_samples

74   7130.0 MiB   7130.0 MiB           2           tvg_mod_range = range_mod_TVG_EK(
75    946.3 MiB      0.0 MiB           1               self.echodata, self.ed_beam_group, self.range_meter, sound_speed
76                                                 )
...
80   6107.8 MiB   5454.9 MiB           1           absorption_loss = 2 * absorption * tvg_mod_range

Once compute_range is finished, the process continues to calling _cal_power_samples to perform the Sv computation. As seen in the snippet above, range_mod_TVG_EK increased the memory usage 7130.0 MiB... that's around 7GB! Then another spike of 5454.9 MiB happens again during the computation above.

echopype/calibrate/range.py::range_mod_TVG_EK

229   7129.7 MiB   6174.4 MiB           1           range_meter = range_meter - mod_Ex60()

Digging into range_mod_TVG_EK function, we can see that the substraction b/w range_meter and mod_Ex60 function result blows up the memory here.

@lsetiawan
Copy link
Member Author

Investigation

After some initial investigate, I found that the first computation of beam["range_sample"] * beam["sample_interval"] * sound_speed / 2 is really problematic. We can re-create a very simple example with just a few scaffolding:

from distributed import Client
import dask.array
import xarray as xr
import numpy as np

# Use 2 workers so each gets more memory
client = Client(n_workers=2)

# Same shape as the actual data
ch, pt, rs = 5, 46187, 3957
sample_interval = xr.DataArray(dask.array.ones((ch, pt)), name="sample_interval", coords={"ch": np.arange(ch), "pt": np.arange(pt)})
sound_speed = xr.DataArray(dask.array.ones((ch, pt)) * 1053, name="sound_speed", coords={"ch": np.arange(ch), "pt": np.arange(pt)})
range_sample = xr.DataArray(dask.array.from_array(np.arange(rs)), name="range_sample", coords={"rs": np.arange(rs)})
res = range_sample * sample_interval * sound_speed / 2
res.compute()

The above code WILL FAIL with similar error and memory spilling and blowup seen in the dask dashboard:

2023-11-06 17:17:59,276 - distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 11.84 GiB -- Worker memory limit: 16.00 GiB
2023-11-06 17:17:59,495 - distributed.worker.memory - WARNING - Worker is at 83% memory usage. Pausing worker.  Process memory: 13.36 GiB -- Worker memory limit: 16.00 GiB
2023-11-06 17:17:59,663 - distributed.worker.memory - WARNING - Worker is at 43% memory usage. Resuming worker. Process memory: 6.91 GiB -- Worker memory limit: 16.00 GiB

If we perform the same snippet of code with the following shape:

ch, pt, rs = 5, 20000, 2000

the computation will finish just fine.

@ctuguinay
Copy link
Collaborator

ctuguinay commented May 24, 2024

So the most obvious problem here comes from range_sample multiplied by either sound_speed or sample_interval. This creates an array that has memory approximately 3975 times bigger than that of both sample_interval and sound_speed.

Also, I suspect that there's some excessive broadcasting going on that may overload worker memory. I'm still looking into this.

@ctuguinay
Copy link
Collaborator

ctuguinay commented Jun 5, 2024

Using two workers doesn't seem to be helpful, since most of the memory will stick to just one worker here

Chunking range_sample into two equal chunks crashed my 8 CPU 30 GiB RAM VM

@leewujung
Copy link
Member

I would suggest chunking along ping_time. range_sample is typically shorter for large dataset, since the depth setting usually does not get fiddled much, but ping_time can continue to grow.

Maybe look at arithmetic things that could simplify the ops would be helpful? like is it possible to avoid the expansion, etc.

@ctuguinay
Copy link
Collaborator

ctuguinay commented Jun 5, 2024

Yeah, I tried just chunking on ping_time, and that seemed to scale well, speed-wise, but there's a large subset of task outputs that are stored in RAM and this builds up over time. So it seems as though something towards the end of the task graph needs a lot of the task outputs to compute, meaning there's a bottleneck.

And yeah, I can also look for stuff that avoids expansion altogether.

@leewujung
Copy link
Member

Yeah, separating out the steps and looking at the graphs built for each step would likely be useful for figuring out exactly where the bottleneck is -- though I think you're already doing it! :)

@ctuguinay
Copy link
Collaborator

I'll put the (potential) fix for this issue into a separate issue since the idea is extensive enough that it needs its own separate thing. To summarize, this issue is caused by the accumulation of task products that build up due to Dask waiting for every chunk to compute before setting the end array of res.compute(). To fix this accumulation problem, we can aim the computation at a Zarr store.

@ctuguinay
Copy link
Collaborator

Closed by merging of #1331

@github-project-automation github-project-automation bot moved this from Todo to Done in Echopype Jul 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

No branches or pull requests

3 participants