Arbitrary chunk splitting #518

Draft: wants to merge 13 commits into master

Contributor

@jmosbacher jmosbacher commented Aug 28, 2021

This implements issue #431. It adds an option to split the data by overlap with the two target chunks instead of requiring full containment. Overlapping data is trimmed automatically on concatenation. This should reduce the complexity of chunk alignment for plugins with multiple dependencies and allow parallel processing of subclasses of OverlapWindowPlugin.

Can you briefly describe how it works?

  • Added an optional allow_overlap argument to the Chunk.split method, which enables splitting on overlap.
  • Added a strict_bounds property to Chunk that marks whether the chunk bounds (start, end) fully contain all of its data.
  • Chunk overlaps are trimmed on concatenation.
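
To illustrate the difference between splitting by full containment and splitting by overlap, here is a minimal NumPy sketch. It is not the strax implementation: `interval_dtype`, `split_contained` and `split_overlapping` are hypothetical names, and the data is reduced to bare `time`/`endtime` intervals.

```python
import numpy as np

# Hypothetical minimal interval dtype; real strax records carry more fields.
interval_dtype = np.dtype([("time", np.int64), ("endtime", np.int64)])


def split_contained(data, t):
    """Split at t, refusing if any interval straddles t (strict containment)."""
    straddling = (data["time"] < t) & (data["endtime"] > t)
    if straddling.any():
        raise ValueError(f"Cannot split at {t}: an interval straddles it")
    return data[data["endtime"] <= t], data[data["time"] >= t]


def split_overlapping(data, t):
    """Split at t, keeping on each side every interval that overlaps that side."""
    left = data[data["time"] < t]      # intervals that start before t
    right = data[data["endtime"] > t]  # intervals that end after t
    return left, right


data = np.array([(0, 10), (10, 20), (20, 30)], dtype=interval_dtype)

# The interval (10, 20) straddles t=15, so it lands in both pieces.
left, right = split_overlapping(data, 15)
```

In this sketch the straddling interval is duplicated rather than cut, which is why the pieces need to be de-duplicated again when they are joined.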

Can you give a minimal working example (or illustrate with a figure)?
import strax
import straxen

st = straxen.contexts.demo()
c = next(st.get_iter('180423_1021', 'raw_records'))
idx = len(c.data) // 2  # not important, but let's split approximately at the center
row = c.data[idx]
t = row['time'] + row['dt'] // 2  # select a time that falls within the record interval
try:
    c1, c2 = c.split(t)
except strax.CannotSplit:
    print("Previous splitting logic fails.")

c1, c2 = c.split(t, allow_overlap=True)  # with allow_overlap=True the split succeeds
assert c1.end == c2.start == t  # the split happens exactly at the requested time
assert c1.data['time'][-1] > c2.data['time'][0]  # the two resulting chunks overlap each other
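
The example above covers the split; the trimming on concatenation can be sketched in the same spirit. This is only one plausible reading of "trimmed", under the assumption that rows of the later piece which start before the split boundary are duplicates already present in the earlier piece. `concat_trimmed` and `interval_dtype` are hypothetical names, not part of strax.

```python
import numpy as np

# Hypothetical minimal interval dtype; real strax records carry more fields.
interval_dtype = np.dtype([("time", np.int64), ("endtime", np.int64)])


def concat_trimmed(left, right, boundary):
    """Concatenate two pieces, dropping rows of `right` that start before `boundary`.

    Assumption: those rows are straddling intervals already present in `left`.
    """
    keep = right["time"] >= boundary
    return np.concatenate([left, right[keep]])


# Two pieces produced by an overlap split at t=15; (10, 20) appears in both.
left = np.array([(0, 10), (10, 20)], dtype=interval_dtype)
right = np.array([(10, 20), (20, 30)], dtype=interval_dtype)

merged = concat_trimmed(left, right, boundary=15)
```

Under this assumption the merged array contains each interval exactly once, so a split followed by a concatenation returns the original data.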

@coveralls

Coverage decreased (-0.003%) to 85.889% when pulling c5a96b1 on jmosbacher:arbitrary_chunk_splitting into d3608ef on AxFoundation:master.

4 participants