Excessive memory consumption of np.nansum #380

Closed

tensionhead opened this issue Nov 15, 2022 · 6 comments
Labels: Critical Bug (a critical error that either completely breaks (parts of) the package or causes erroneous calculations), Performance (improve the number crunching)

Comments

@tensionhead
Contributor

tensionhead commented Nov 15, 2022

Update

OK, so it looks like it is not a real leak after all; we just underestimated how much RAM np.sum and np.nansum actually use. Here I profiled np.ndarray and h5py.Dataset creation and summation in the combinations indicated by the function names. Note that a single array has a size of 256 MB, so we expect two arrays to take up slightly more than 500 MB. For the summing operations, total temporary RAM usage goes up to over 1.2 GB:

[Screenshot from 2022-11-16 13:59:41: memory profile of the array/dataset creation and summation functions]

From this ad-hoc analysis, I conclude that we need at least 4 times the memory size of a single trial, not just 2 times, which only holds for static memory usage.

Repeating with a 1024 MB single-trial size, we see that np.nansum needs pretty much one extra trial in memory:

[Screenshot from 2022-11-16 16:49:49: memory profile with 1024 MB single-trial size]

So this means that, just by chance, switching from np.nansum to np.sum kept the memory usage below what the machine I used to find this had to offer (32 GB RAM vs. 8 GB single-trial size).
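For reference, a rough sketch of how such a profile can be reproduced; this assumes the memory_profiler package, and the function names, sizes and file name here are illustrative rather than the exact ones used above:

```python
# Rough reproduction sketch (assumes `pip install memory_profiler`);
# sizes, function names and the file name are illustrative only.
import numpy as np
import h5py
from memory_profiler import profile

SHAPE = (4096, 8192)  # float64 -> 256 MB per array


@profile
def arr_arr_nansum():
    a = np.ones(SHAPE)
    b = np.ones(SHAPE)
    return np.nansum([a, b], axis=0)


@profile
def arr_dset_nansum(fname="nansum_profile.h5"):
    a = np.ones(SHAPE)
    with h5py.File(fname, "w") as f:
        dset = f.create_dataset("data", data=np.ones(SHAPE))
        dset[()] = np.nansum([dset, a], axis=0)


if __name__ == "__main__":
    arr_arr_nansum()
    arr_dset_nansum()
```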

Synopsis

Thanks to @kajal5888 we now have a real-life problem: data with trials of around 8 GB each. When trying to process the data, Syncopy consistently crashed (OOM kill) with keeptrials=False, with or without parallel=True, and with or without ACME, on a machine with 32 GB of RAM available. So the basic promise, "as long as 2 trials fit into memory you are fine", did not hold up to reality.

@pantaray I guess no one (myself included) really tested this promise with seriously sized data before?

The Culprit and Quick Fix

After some searching I found the suspect np.nansum, which gets used here:

target[()] = np.nansum([target, res], axis=0)

Here is a relevant issue:

pydata/bottleneck#201

Apparently, when combining np.ndarray and other datatypes (they mention list and pd.DataFrame) within a np.nansum([obj1, obj2]) call, there are memory leaks.

Switching to np.sum for testing in the sequential branch of the CR fixed the OOM killing and the processing succeeded for said dataset.

What Next

The issue above was of course recognized as quite serious, and apparently it was already fixed in January 2019:

pydata/bottleneck#202

We first need to check whether that fix made it into our environments. If it did, then maybe combining h5py.Dataset and np.ndarray still has issues; in that case we have to escalate this further.
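A quick way to check what a given environment actually provides (note that bottleneck is optional and may not be installed at all, since numpy's own np.nansum does not depend on it):

```python
# Report the numpy / bottleneck versions of the current environment;
# bottleneck is optional and may be missing entirely.
import numpy as np

print("numpy:", np.__version__)
try:
    import bottleneck as bn
    print("bottleneck:", bn.__version__)
except ImportError:
    print("bottleneck: not installed")
```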

tensionhead added the Critical Bug and Performance labels Nov 15, 2022
tensionhead added a commit that referenced this issue Nov 16, 2022
- the quick fix outlined in #380

Changes to be committed:
	modified:   syncopy/shared/computational_routine.py
	modified:   syncopy/shared/kwarg_decorators.py
@tensionhead
Contributor Author

OK, so the real culprit was this somewhat odd construction:

target[()] = np.nansum([target, res], axis=0)

Here, both operands first get put into a list, [target, res], and then np.nansum is called with an axis parameter. The memory blow-up actually makes sense then, as this list internally has to be converted into an (apparently new) array before the sum is applied. If we just replace this with an ordinary + operation, everything looks as it should:

[Screenshot from 2022-11-17 11:42:44: memory profile with the + operation]

So I will replace both appearances of that memory-hungry operation with a simple target + res, which can also handle np.nan. I think that wasn't the case with earlier numpy versions.
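To illustrate the point about the list conversion, here is a small sketch (with illustrative sizes) contrasting the two variants; the list argument first gets materialized as one big stacked array of shape (2, *target.shape), and np.nansum additionally makes an internal copy for its NaN replacement, before the reduction over axis 0 happens:

```python
# Illustration of the extra temporaries (sizes are examples, ~256 MB each)
import numpy as np

target = np.ones((4096, 8192))   # ~256 MB
res = np.ones((4096, 8192))      # ~256 MB

# List variant: a (2, 4096, 8192) temporary (~512 MB) is built first,
# plus nansum's internal NaN-replacement copy, plus the result.
summed_list = np.nansum([target, res], axis=0)

# Plain addition: only the ~256 MB result array is allocated.
summed_add = target + res

assert np.array_equal(summed_list, summed_add)
```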

tensionhead added a commit that referenced this issue Nov 17, 2022
- this old np.sum([target, res], axis=0) construction increased the memory footprint 2-3 fold
- ordinary `+` operator is memory sparing AND can also deal with NaNs
- fixes #380

Changes to be committed:
	modified:   syncopy/shared/computational_routine.py
	modified:   syncopy/shared/kwarg_decorators.py
tensionhead self-assigned this Nov 17, 2022
@pantaray
Member

Hi! Nice catch! Simply using + not only avoids the memory leak but is also just much easier to read. And the fact that np.nans are handled correctly is awesome!

@pantaray I guess no one (myself included) really tested this promise with seriously sized data before?

We have tested it (ages ago) but really focused on the parallel computing case (so the above leak in compute_sequential went unnoticed). That said, back then the only meta-functions we had implemented were parts of freqanalysis and selectdata, for which the "if the trial fits in memory everything's fine" promise is much easier to keep than, say, for a Granger causality calculation (where lots and lots of additional computation happens on top of each trial). Depending on where things go, I guess a reliable memory estimation would need to take into account what operation the user is performing.

@tensionhead
Contributor Author

Thanks, I also think that's a nice step forward 🚀

We have tested it (ages ago) but really focused on the parallel computing case (so the above leak in compute_sequential went unnoticed).

The parallel case was actually no different: the same problematic line sat inside the IO decorator, which springs into action for compute_parallel:

target[()] = np.nansum([target, res], axis=0)

So what you probably mean is that only now, due to the cross-spectral cFs, did this memory blow-up become apparent 😉

But yeah I agree, proper profiling for different use cases will remain a challenging task ahead.

@KatharineShapcott
Contributor

KatharineShapcott commented Nov 22, 2022

Just summarizing my chat with Gregor here: if we used more in-place operations, the memory usage would stay smaller. In this case you could do:

res += target
target[()] = res

I did a simple test of the NaN handling; this is how it behaves:

[image: output of the NaN-handling test]
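(A minimal sketch of such a NaN-handling test, with made-up values rather than the original screenshot, illustrates the difference: plain addition propagates NaNs, whereas np.nansum treats them as zero.)

```python
# Made-up values: plain addition propagates NaNs, np.nansum treats them as 0.
import numpy as np

target = np.array([1.0, np.nan, 3.0])
res = np.array([1.0, 1.0, np.nan])

print(np.nansum([target, res], axis=0))  # [2. 1. 3.]   -> NaNs counted as 0
print(target + res)                      # [2. nan nan] -> NaNs propagate
```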

If we want the NaNs replaced by 0, we can do the following without wasting much memory; I don't know about the runtime, though:

res[np.isnan(res)] = 0        # zero out NaNs in the trial result
target_vals = target[()]      # read the dataset once into memory
target_vals[np.isnan(target_vals)] = 0
res += target_vals
target[()] = res

@tensionhead
Contributor Author

Yes, thanks to @KatharineShapcott for coming up with the best solution! One operand has to be in memory, yet the result can be written directly into the dataset via the += assignment operator:

target[()] += res

leading to a memory footprint of less than 2 trials for that operation:

[Screenshot from 2022-11-22 15:11:41: memory profile of the in-place += variant]

Note that the first memory consumption peak/plateau of every profiled function is the creation of either 2 arrays or 1 array and 1 dataset.

As a bonus, at least for this test it is even slightly faster than the previous winner, target[()] = res + target.

It's great that the standard trial average (that's what this is all about: keeptrials=False) is now very memory safe 🚀
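For illustration, a small sketch of that accumulation pattern (file name and sizes are made up); sequential trial averaging with keeptrials=False then boils down to repeated in-place addition into the target dataset:

```python
# Sketch of the accumulation pattern (made-up file name and sizes): each
# trial result is added in place into the h5py target dataset, so no
# stacked temporary beyond the two operands is ever allocated.
import numpy as np
import h5py

n_trials, shape = 5, (1024, 2048)

with h5py.File("trial_average.h5", "w") as f:
    target = f.create_dataset("avg", shape=shape, dtype="float64", fillvalue=0.0)
    for _ in range(n_trials):
        res = np.random.rand(*shape)  # stand-in for one trial's result
        target[()] += res             # reads target once, adds, writes back
    target[()] /= n_trials            # turn the accumulated sum into the average
```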

tensionhead added a commit that referenced this issue Nov 22, 2022
- thx to @KatharineShapcott !
- addresses #380

Changes to be committed:
	modified:   syncopy/shared/computational_routine.py
	modified:   syncopy/shared/kwarg_decorators.py
@tensionhead
Contributor Author

I think we can close this with 06a2991

dfsp-spirit changed the title from "Memory leak due to np.nansum" to "Excessive memory consumption of np.nansum" Nov 29, 2022
tensionhead added a commit that referenced this issue Dec 7, 2022
- see #380 for details

Changes to be committed:
	modified:   syncopy/datatype/methods/statistics.py