Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow multiple targets to be computed simultaneously #408

Merged
merged 6 commits into from
Mar 12, 2021

Conversation

JoranAngevaare
Copy link
Contributor

@JoranAngevaare JoranAngevaare commented Mar 12, 2021

What is the problem / what does the code in this PR do

On the eventbuilders we need to compute targets that are not always depending on one another. For example event info and online_peak_monitor don't depend on each other and don't have the same datakind. This makes it impossible to use st.make on these target.

So far, I've been using a https://github.com/XENONnT/straxen/blob/master/bin/bootstrax#L327-L335 of registering a temporary plugin for merging these non-unify-able targets. However, this is causing quite some issues in the merging step especially given the fact that the online-peak monitor is a monolithic event per chunk (i.e. unable to split to get a nice input buffer). This is causing issues in the online processing and can be expected to get worse when the process is also dealing with other datakinds like the NV/MV/HE datastreams. We need something smarter from strax.

Can you briefly describe how it works?

We ditch the paradigm of allowing only multiple targets when asking for the processing components. This means the components can comprise multiple targets that can be computed simultaneously. However, we need to be careful when to allow this, we don't want people to use this functionality in st.get_array or st.get_df but only for mass producing datatypes (e.g. using st.make).

One thing I did notice is that we should be careful not to start multiple targets for the mailboxes, I spend a good hour or four trying to get that done properly but I got lost and also figured it is not the scope I'm aiming for. Instead, I allow only the first target to be subscribed. That should suffice.

Can you give a minimal working example (or illustrate with a figure)?

I tested it using a very slightly altered version of this notebook on live data and all works fine. I even went completely nuts by stetting the targets to all plugins in _plugin_class_registry (if parallel!='process') and it worked just fine starting from the real live data.

Tests

In b5c9474 I added a test, while writing this, I noticed that we should check that the mailbox subscribed plugin is not depended on, otherwise we are going to get a screwed up chain and infinite hangs. Got to love tests for figuring this out.

@JoranAngevaare JoranAngevaare added the enhancement New feature or request label Mar 12, 2021
@JoranAngevaare JoranAngevaare marked this pull request as ready for review March 12, 2021 11:06
Copy link
Contributor

@darrylmasson darrylmasson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a very useful idea to support mass production.

strax/context.py Outdated
@@ -1115,6 +1117,7 @@ def get_array(self, run_id: ty.Union[str, tuple, list],
targets,
save=save,
max_workers=max_workers,
allow_multiple=False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this argument isn't allowed in get_array or get_df, why is it here?

Copy link
Contributor Author

@JoranAngevaare JoranAngevaare Mar 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to make sure that it is not passed on as a **kwarg. It is already in the doc string that people shouldn't do it but I decided it's better to double check that it is not somehow ending up here and causing infinite problems.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm okay I see the problem, but I would prefer if we add in the first line of get_array as kwargs['allow_multiple'] = False to avoid error messages like "allow_multiple has been specified twice" which would for sure lead to confusion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a valid point, I changed it.

Copy link
Collaborator

@WenzDaniel WenzDaniel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look indeed very simple, but I can imagine that it took much time. To me it looks fine although you are much deeper in processing then I am. Just one small comment about the same thing as Darryl commented.

strax/context.py Outdated
@@ -1115,6 +1117,7 @@ def get_array(self, run_id: ty.Union[str, tuple, list],
targets,
save=save,
max_workers=max_workers,
allow_multiple=False,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm okay I see the problem, but I would prefer if we add in the first line of get_array as kwargs['allow_multiple'] = False to avoid error messages like "allow_multiple has been specified twice" which would for sure lead to confusion.

@JoranAngevaare JoranAngevaare removed the request for review from JelleAalbers March 12, 2021 13:44
@JoranAngevaare
Copy link
Contributor Author

Thanks for the fast review, I changed it according to your feedback: 4250ae5

@JoranAngevaare JoranAngevaare merged commit 5e7d568 into AxFoundation:master Mar 12, 2021
@JoranAngevaare JoranAngevaare deleted the multiple_targets branch March 12, 2021 17:52
@JelleAalbers
Copy link
Member

Nice to see the first steps towards multi-target processing! I'm pleasantly surprised this works without any changes to ThreadedMailboxProcessor; probably because you are still just passing one target (the final_plugin) to that part of strax.

You probably want to disable 'lazy mode' (pass allow_lazy=False to ThreadedMailboxProcessor in Context.get_iter) when _get_end_targets returns more than one plugin. Since you're still only pulling on only one "main target"s mailbox, non-dependencies of the main target will not get their fetch permissions under lazy mode. (If I remember this stuff correctly.) Lazy mode is already disabled when multiprocessing, so this all won't matter for mass production on the ebs.

Even nicer would be to let all non-dependencies of the final target be 'freely flowing', i.e. add them to the list here https://github.com/AxFoundation/strax/blob/master/strax/processor.py#L111. This excludes those datatypes from lazy mode's fetch permission system. It's still a good idea to keep your final target selection code even after this; letting things flow freely isn't great for memory management.

I'm not sure what happens if there is an exception on the non-main-target-part of the processing graph; there is at least one ominous comment here https://github.com/AxFoundation/strax/blob/master/strax/processor.py#L276. Maybe it's all fine, but you could get a hang if the exception isn't properly propagated back to the main chain.

@JoranAngevaare
Copy link
Contributor Author

Nice to see the first steps towards multi-target processing! I'm pleasantly surprised this works without any changes to ThreadedMailboxProcessor; probably because you are still just passing one target (the final_plugin) to that part of strax.

Thanks for the feedback, it should indeed be the first step, further tweaks are certainly required.

You probably want to disable 'lazy mode' (pass allow_lazy=False to ThreadedMailboxProcessor in Context.get_iter) when _get_end_targets returns more than one plugin. Since you're still only pulling on only one "main target"s mailbox, non-dependencies of the main target will not get their fetch permissions under lazy mode. (If I remember this stuff correctly.) Lazy mode is already disabled when multiprocessing, so this all won't matter for mass production on the ebs.

This is very useful indeed, will file a followup PR soon.

Even nicer would be to let all non-dependencies of the final target be 'freely flowing', i.e. add them to the list here https://github.com/AxFoundation/strax/blob/master/strax/processor.py#L111. This excludes those datatypes from lazy mode's fetch permission system. It's still a good idea to keep your final target selection code even after this; letting things flow freely isn't great for memory management.
I'm not sure what happens if there is an exception on the non-main-target-part of the processing graph; there is at least one ominous comment here https://github.com/AxFoundation/strax/blob/master/strax/processor.py#L276. Maybe it's all fine, but you could get a hang if the exception isn't properly propagated back to the main chain.

Thanks, this is something I hadn't realized. I guess for now we need to be careful what we specify as the first target because this indeed only gets caught in the final plugin. Especially in the lazy mode (long timeout) we are not going to like this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants