-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix memory leak peaksplitting #309
Fix memory leak peaksplitting #309
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one more idea which prevents us from copy-pasting the code. I would like try this first before we use your suggestion. Its a bit inspired by your solution. Can you forward me the memory and performance tests you did to so I can compare?
strax/processing/peak_splitting.py
Outdated
# is computed | ||
r['dt'] = orig_dt | ||
r['length'] = (split_i - prev_split_i) * p['dt'] / orig_dt | ||
r['max_gap'] = -1 # Too lazy to compute this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hitlets does not support 'max_gap' so simply remove this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 17b4c4e
Here is a different kind of solution #310 which avoids the copy and pasting of the code. |
_result_buffer=None, result_dtype=None): | ||
"""Loop over peaks, pass waveforms to algorithm, construct | ||
new peaks if and where a split occurs. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add here some warning that changes in this function might also be applied in _split_peaks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Daniel, did so in 17b4c4e
Thanks Daniel, I do think we do indeed have a different opinion on this one. (see #310) I just addressed your feedback, would you mind verifying if this line is okay for some nVeto data (I don't have any at hand): strax/strax/processing/peak_splitting.py Line 176 in 17b4c4e
|
else: | ||
raise TypeError(f'Unknown data_type. "{data_type}" is not supported.') | ||
new_peaks = self._split_peaks( | ||
new_peaks = split_function[data_type]( | ||
# Numba doesn't like self as argument, but it's ok with functions... | ||
split_finder=self.find_split_points, | ||
peaks=peaks, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have to change these lines into arguments, since peaks
is called hits
in _split_hitlets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine go ahead
What is the problem / what does the code in this PR do
The way hitlets were now using the peak splitting induced a serious memory leak. Also it utterly degraded strax' performance.
Can you briefly describe how it works?
I removed the associated changes in #275 that induced this bug.
Can you describe the symptoms?
When running straxer (e.g. like this):
Would yield something like:
Until the memory usage went up to 45 GB! Also the processing speed by a factor of ~5.