Fix memory leak peaksplitting #309

Merged

JoranAngevaare
Contributor

What is the problem / what does the code in this PR do
The way hitlets were using the peak splitting introduced a serious memory leak. It also severely degraded strax's performance.

Can you briefly describe how it works?
I removed the changes from #275 that introduced this bug.

Can you describe the symptoms?
Running straxer, e.g. like this:

python -m memory_profiler straxer.py 170206_1355 --target peaklets --context xenon1t_dali --build_lowlevel

would yield something like:

Got 1525 items. Now 629.4 sec / 17.5% into the run. Using 7034.7 MB RAM. ETA 26021.48 sec.
....
Got 1936 items. Now 756.6 sec / 21.0% into the run. Using 8234.6 MB RAM. ETA 25669.28 sec.

This continued until the memory usage went up to 45 GB! The processing speed also dropped by a factor of ~5.
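
Growth of this kind can also be reproduced on a small scale without the full straxer job. The following is only a minimal sketch using Python's standard-library tracemalloc; `process_chunk`, `_leaked`, and `traced_memory_growth` are hypothetical stand-ins for a processing step that holds on to its inputs, not strax code.

```python
import tracemalloc

# Hypothetical stand-in for a leaking processing step: it keeps a
# reference to every chunk it has seen, so nothing can be freed.
_leaked = []


def process_chunk(chunk, leak=True):
    result = sum(chunk)
    if leak:
        _leaked.append(chunk)
    return result


def traced_memory_growth(n_chunks=20, chunk_len=10_000, leak=True):
    """Return the traced memory in bytes after each processed chunk."""
    _leaked.clear()
    tracemalloc.start()
    usage = []
    for _ in range(n_chunks):
        chunk = list(range(chunk_len))
        process_chunk(chunk, leak=leak)
        del chunk  # drop our own reference; a leak keeps the data alive
        current, _peak = tracemalloc.get_traced_memory()
        usage.append(current)
    tracemalloc.stop()
    return usage


leaky = traced_memory_growth(leak=True)
clean = traced_memory_growth(leak=False)
print(f"leaky: {leaky[0]} -> {leaky[-1]} bytes")
print(f"clean: {clean[0]} -> {clean[-1]} bytes")
```

With the leak enabled, the traced memory keeps rising with every chunk, much like the RAM figures in the log above; without it, the figure stays roughly flat.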

Collaborator

@WenzDaniel WenzDaniel left a comment

I have one more idea which prevents us from copy-pasting the code. I would like to try this first before we use your suggestion. It's a bit inspired by your solution. Can you forward me the memory and performance tests you did so I can compare?

# is computed
r['dt'] = orig_dt
r['length'] = (split_i - prev_split_i) * p['dt'] / orig_dt
r['max_gap'] = -1 # Too lazy to compute this
Collaborator

Hitlets do not support 'max_gap', so simply remove this line.

Contributor Author

Fixed in 17b4c4e
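
For context, the length assignment in the snippet under review converts a fragment's sample count from the current sample width back to the original one. A minimal sketch of that arithmetic follows; `fragment_length` is a hypothetical helper, and it assumes (unlike the reviewed line, which uses plain division) that dt is an integer multiple of orig_dt.

```python
def fragment_length(prev_split_i, split_i, dt, orig_dt):
    """Number of samples of width orig_dt covered by a fragment that
    spans samples [prev_split_i, split_i) of width dt."""
    # Assumption of this sketch only: dt is a whole multiple of orig_dt.
    if dt % orig_dt != 0:
        raise ValueError("dt must be an integer multiple of orig_dt")
    return (split_i - prev_split_i) * dt // orig_dt


# 15 samples at 20 ns each cover 300 ns,
# which is 30 samples at the original 10 ns per sample.
print(fragment_length(prev_split_i=10, split_i=25, dt=20, orig_dt=10))  # 30
```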

@WenzDaniel WenzDaniel mentioned this pull request Aug 29, 2020
@WenzDaniel
Collaborator

Here is a different kind of solution, #310, which avoids copying and pasting the code.

_result_buffer=None, result_dtype=None):
"""Loop over peaks, pass waveforms to algorithm, construct
new peaks if and where a split occurs.
"""
Collaborator

Please add a warning here that changes in this function may also need to be applied in _split_peaks.

Contributor Author

@JoranAngevaare JoranAngevaare Aug 31, 2020

Thanks Daniel, did so in 17b4c4e

@JoranAngevaare
Contributor Author

Thanks Daniel, I do think we have a different opinion on this one (see #310).

I just addressed your feedback. Would you mind verifying whether this line is okay for some nVeto data (I don't have any at hand):

@strax.growing_result(dtype=strax.hitlet_dtype(), chunk_size=int(1e4))

else:
raise TypeError(f'Unknown data_type. "{data_type}" is not supported.')
- new_peaks = self._split_peaks(
+ new_peaks = split_function[data_type](
# Numba doesn't like self as argument, but it's ok with functions...
split_finder=self.find_split_points,
peaks=peaks,
Collaborator

You have to pass these as positional arguments instead of keyword arguments, since the parameter called peaks here is called hits in _split_hitlets.
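
The point about parameter names can be illustrated with a small sketch. The signatures below are simplified, hypothetical versions of _split_peaks and _split_hitlets (the real functions take more arguments and are compiled with numba); only the difference between the parameter names peaks and hits is taken from the comment above.

```python
def _split_peaks(split_finder, peaks):
    return [split_finder(p) for p in peaks]


def _split_hitlets(split_finder, hits):
    return [split_finder(h) for h in hits]


# Dispatch table keyed on the data type, as in the diff above.
split_function = {"peaks": _split_peaks, "hitlets": _split_hitlets}


def find_split_points(waveform):
    # Toy split finder: index of the smallest sample.
    return waveform.index(min(waveform))


data = [[3, 1, 4], [5, 9, 2]]

# A keyword call only works if the parameter is really named "peaks".
try:
    split_function["hitlets"](split_finder=find_split_points, peaks=data)
except TypeError as err:
    print("keyword call failed:", err)

# A positional call works for both functions.
for data_type in ("peaks", "hitlets"):
    print(data_type, split_function[data_type](find_split_points, data))
```

Passing the arguments positionally keeps a single call site valid for every entry in the dispatch table, regardless of how each function names its parameters.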

Collaborator

@WenzDaniel WenzDaniel left a comment

Looks fine, go ahead.

@WenzDaniel WenzDaniel merged commit e94b1c7 into AxFoundation:master Aug 31, 2020
@JoranAngevaare JoranAngevaare deleted the fix_memory_leak_peaksplitting branch August 31, 2020 18:34
@JoranAngevaare JoranAngevaare mentioned this pull request Apr 29, 2021
4 tasks
@WenzDaniel WenzDaniel mentioned this pull request May 3, 2021
2 participants