Excessive memory consumption of np.nansum
#380
Comments
- the quick fix outlined in #380

Changes to be committed:
  modified: syncopy/shared/computational_routine.py
  modified: syncopy/shared/kwarg_decorators.py
Ok, so the real culprit was this somewhat odd `np.sum([target, res], axis=0)` construction, where first both operands would be put into a list. So I will replace both appearances of that memory hungry operation with a simple `+`.
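For illustration, a minimal sketch of the two constructions being compared here (array shapes are arbitrary and not taken from the actual syncopy code):

```python
import numpy as np

target = np.random.rand(4096, 4096)   # running sum accumulated so far
res = np.random.rand(4096, 4096)      # freshly computed trial result

# old construction: np.sum first stacks both operands into one (2, 4096, 4096)
# array, so the peak memory is roughly 2-3x the size of the operands
summed = np.sum([target, res], axis=0)

# replacement: plain element-wise addition, at most one temporary for the result
summed = target + res
```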
- this old np.sum([target, res], axis=0) construction increased the memory footprint 2-3 fold
- ordinary `+` operator is memory sparing AND can also deal with NaNs
- fixes #380

Changes to be committed:
  modified: syncopy/shared/computational_routine.py
  modified: syncopy/shared/kwarg_decorators.py
Hi! Nice catch! Simply using `+` sounds good. We have tested it (ages ago) but really focused on the parallel computing case (so the above leak in the sequential branch slipped through).
Thanks, I also think that's a nice step forward 🚀
The parallel case was actually no different, the same problematic line was within the IO decorator, which springs into action for the parallel case:

syncopy/syncopy/shared/kwarg_decorators.py, line 659 in 36b7005

So what you probably mean is that only now, due to the cross spectral cFs, this memory blow up became apparent 😉 But yeah, I agree, proper profiling for different use cases will remain a challenging task ahead.
Just summarizing my chat with Gregor here: if we used more in-place operations, the memory usage would stay smaller. In this case you could do:
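Something along those lines, as an illustration (the variable names are mine, not the exact snippet from that chat):

```python
import numpy as np

target = np.random.rand(4096, 4096)   # accumulator, e.g. one trial-sized array
res = np.random.rand(4096, 4096)      # newly computed trial

# accumulate in place: no stacked (2, ...) temporary and no separate result array
np.add(target, res, out=target)       # equivalent to target += res
```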
I did a simple test of the NaN handling, this is how it behaves: if we want the NaNs replaced by 0 we can do the following without wasting much memory, I don't know about time though:
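A memory-sparing way to zero out NaNs, as a sketch of what is meant here (not necessarily the exact code that was tested):

```python
import numpy as np

res = np.random.rand(4096, 4096)
res[::100, ::100] = np.nan            # pretend some samples are NaN

# replace NaNs by 0 without allocating a second full-size float array
np.nan_to_num(res, copy=False)        # in-place substitution (needs numpy >= 1.17)
# alternative: res[np.isnan(res)] = 0  (extra cost is only a boolean mask)
```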
Yes, thx to @KatharineShapcott for coming up with the best solution! One operand has to be in memory, yet the result can be written directly into the dataset, leading to a memory footprint of less than 2 trials for that operation. Note that the 1st memory consumption peak/plateau of every function profiled is the creation of either 2 arrays or 1 array and 1 dataset. As a bonus, at least for this test here it's even slightly faster than the previous winner. It's great for the standard trial-average (that's what it's all about here).
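As a rough sketch of that kind of accumulation (file and dataset names are made up, and the exact form that went into syncopy may differ):

```python
import h5py
import numpy as np

res = np.random.rand(1024, 1024)      # current trial result, the one operand in RAM

with h5py.File("avg_data.h5", "a") as h5f:
    dset = h5f.require_dataset("sum", shape=res.shape, dtype=res.dtype)
    # read the running sum once, add the trial, and write the result straight
    # back into the HDF5 dataset; no stacked (2, ...) temporary as with np.sum([...])
    dset[...] += res
```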
- thx to @KatharineShapcott !
- addresses #380

Changes to be committed:
  modified: syncopy/shared/computational_routine.py
  modified: syncopy/shared/kwarg_decorators.py
I think we can close this with 06a2991
np.nansum
- see #380 for details

Changes to be committed:
  modified: syncopy/datatype/methods/statistics.py
Update
Ok, so it looks like it is not a real leak... we just underestimated how much RAM `np.sum` and `np.nansum` actually use. Here I profiled `np.ndarray` and `h5py.Dataset` creation and summation in the combinations indicated by the function names. Note that a single array has a size of 256 MB, so we expect two arrays to cover slightly more than 500 MB. For the summing operations, total temporary RAM usage goes up to over 1.2 GB.

From this ad-hoc analysis, I conclude we need at least 4 times the memory size of a single trial, and not just 2, which is only true for static memory usage.
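The profiling script itself is not shown here; a rough way to reproduce this kind of measurement (using memory_profiler, which may not be the tool that was actually used) would be:

```python
import numpy as np
from memory_profiler import memory_usage   # pip install memory_profiler

N = 256 * 1024 * 1024 // 8                 # ~256 MB worth of float64 values per array

def arr_arr_nansum():
    a, b = np.random.rand(N), np.random.rand(N)
    return np.nansum([a, b], axis=0)

def arr_arr_plus():
    a, b = np.random.rand(N), np.random.rand(N)
    return a + b

for func in (arr_arr_nansum, arr_arr_plus):
    # memory_usage samples the process RSS while the function runs
    peak = max(memory_usage((func, (), {}), interval=0.05))
    print(f"{func.__name__}: peak memory ~ {peak:.0f} MiB")
```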
Repeating with 1024 MB single-trial size, we see that `np.nansum` needs pretty much one extra trial in memory. So this means that, just by chance, switching from `np.nansum` to `np.sum` kept the memory usage below what the machine I used to find this had to offer (32 GB RAM vs. 8 GB single trial size).

Synopsis
Thanks to @kajal5888 we now have a real-life problem: data with trials of around 8 GB each. When trying to process the data, Syncopy consistently crashed (OOM kill) with `keeptrials=False`, with or without `parallel=True`, and with or without ACME, on a machine with 32 GB of RAM available. So this means that the basic promise, "as long as 2 trials fit into memory you are fine", did not hold up to reality.

@pantaray I guess no one (me inclusive) really tested this promise with seriously sized data before?
The Culprit and Quick Fix
After some searching I found the suspect: `np.nansum`, which gets used here:

syncopy/syncopy/shared/computational_routine.py, line 910 in 36b7005

Here is a relevant issue: pydata/bottleneck#201

Apparently, when combining `np.ndarray` and other datatypes within a `np.nansum([obj1, obj2])` call (they mention `list` and `pd.DataFrame`), there are memory leaks. Switching to `np.sum` for testing in the sequential branch of the CR fixed the OOM killing, and the processing succeeded for said dataset.

What Next
The issue above was of course recognized as being quite serious, but it apparently got fixed already in January 2019:

pydata/bottleneck#202

We need to first check if that fix made it into our environments. If yes, then maybe combining `h5py.Dataset` and `np.ndarray` still has issues; in that case we have to escalate that further.
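A quick way to see what is actually installed in a given environment (whether a specific bottleneck release already contains the fix from pydata/bottleneck#202 would still have to be checked against its changelog):

```python
import numpy as np

print("numpy:", np.__version__)

try:
    import bottleneck as bn
    print("bottleneck:", bn.__version__)
except ImportError:
    print("bottleneck: not installed")
```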