Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow for user ordered operations in aggregator #39

Closed
chuckha opened this issue Aug 8, 2017 · 10 comments
Closed

Allow for user ordered operations in aggregator #39

chuckha opened this issue Aug 8, 2017 · 10 comments

Comments

@chuckha
Copy link
Contributor

chuckha commented Aug 8, 2017

Allow for collections to occur in a specific order A->B->C possibly by making everything a plugin.

@timothysc timothysc added kind/enhancement New or improved functionality p3-low labels Aug 8, 2017
@timothysc timothysc self-assigned this Aug 8, 2017
@timothysc
Copy link
Contributor

Ordered operations make it excellent for pluggable test experiments in a controlled and reproducible fashion.

Run A
Run B
Collect C

This is super helpful for diagnostics when we're filling out the corpus of plugin(s).
It's not table stakes, but it is a shiny gloss coating.

@timothysc timothysc added this to the 0.10.0 milestone Oct 20, 2017
@timothysc
Copy link
Contributor

Stretch goal.

@timothysc timothysc removed this from the 0.10.0 milestone Nov 8, 2017
@timothysc timothysc added this to the v0.11.0 milestone Jan 12, 2018
@timothysc timothysc removed this from the v0.11.0 milestone Feb 12, 2018
@johnSchnake
Copy link
Contributor

So we've talked in the past about this potentially being a very extensive/robust DAG runner. I've even wondered whether not it would be useful to have plugins able to get results from the other plugins. So you could do things like run plugin A & B, then put both of those results into plugin C and C will consume those.

However, I think that all the possible "what ifs" slow down this feature (ordered operations) and may not be necessary for almost any case.

I think the easy version of this, simply ordered operations, should be much easier to implement. If we just add a new field to track some sort of "order", the aggregator should have everything it needs to do this. We just have to more carefully look at the plugin list and only run a plugin of order X if all plugins of order < X are done.

By implementing this we:

  1. unlock lots of new possible uses
  2. get faster feedback as to how/how much this is used
  3. can make queries their own plugin rather than a custom step which further clarifies the purpose of sonobuoy
  4. Something like this should be completely backwards compatible; if all the plugins dont specify the field then it should just be able to run all the plugins at the same time like we do now.

@stevesloka What do you think?

@stevesloka
Copy link
Contributor

Could also remove the reference to plugins in the config.json and follow what kubectl does by ordering the plugins with 01-, 02- which tells the execution order. Then Sonobuoy just looks at the files on disk.

Given your proposal, what if all plugins specify 2 as the order? Are they in parallel?

@johnSchnake
Copy link
Contributor

Given your proposal, what if all plugins specify 2 as the order? Are they in parallel?

Yes they'd just all be in parallel. I put up a demo PR and you can see that I just ensure that to run a plugin of order X, there needs to be no plugins of order Y such that Y<X.

@johnSchnake
Copy link
Contributor

@chuckha What do you think about this? Does the narrow implementation described above above have any particular downsides/shortcomings such that we were holding out for a more robust solution first? Or was this just not implemented due to time constraints?

I put up a PR which implemented this and its only ~80 lines (its a draft and needs some polishing & tests though).

Just trying to think this through so that we dont implement/rollback/reimplement logic like this.

@chuckha
Copy link
Contributor Author

chuckha commented Mar 12, 2019

Running plugins in some order would be nice but not super necessary for this -- I think what I called operations in the main ticket was a little misleading.

A little history to try and explain this: this feature request came from the scanner use case where we run one plugin, e2e, and 3 operations => run e2e, upload results, cleanup. An operation blocked based on a Donefile that gets written via shared mounts.

See this yaml for reference, https://scanner.heptio.com/7897558f1ff2547b17ab1cc16ed90f2d/yaml/.

I wanted a way to do this maybe without explicitly setting the shared mounts or maybe even having to write the logic

while donefile doesn't exist:
    sleep for a bit

actual operation action

@johnSchnake
Copy link
Contributor

OK; I think I understand. Tell me if I have this right:

Firstly, there are two separate issues. #432 mentions the necessity of a DAG runner in order to extract the queries that sonobuoy does to their own plugin. I thought this was the ticket for that logic and I was mistaken. The DAG runner, and my previous comments, are still addressed in the PR in question.

Secondly, looking at that yaml and the scanner flow, it is related (in that you want some ordering logic), but distinct because you want the steps to be able to communicate (ie the e2e results get sent to the forwarder and presumably something from the e2e or the forwarder gets sent to the cleanup routine). So currently those are separate containers but run as sidecars where you had to just write the order logic yourself. Does that sound like an accurate representation of the issue? Yours is also unique from plugin ordering in that, as written, you are modifying the aggregator and not the plugins.

That seems to overlap a bit with what I thought the intent of a DAG would be; each container would then actually be its own plugin but there could be data sharing between them. It is a hard problem to solve though if we are trying to support complex dependencies (which may never be used) and sharing of data between them. I worried about spending too much time on those features which may get little use and complicate things.

Also, the idea of extra steps within the aggregator was discussed in #51. At the time it seems that was even before the status annotation so it was just considering how to pass basic data about the run to the user. The ideas described there (e.g. webhook or some other sidecar to the aggregator) could be really useful so people could automatically do things with the results.

@chuckha
Copy link
Contributor Author

chuckha commented Mar 13, 2019

that's perfect!

@johnSchnake johnSchnake changed the title Allow for user ordered operations Allow for user ordered operations in aggregator Mar 13, 2019
@stale
Copy link

stale bot commented Nov 5, 2020

There has not been much activity here. We'll be closing this issue if there are no follow-ups within 15 days.

@stale stale bot added the misc/wontfix label Nov 5, 2020
@stale stale bot closed this as completed Nov 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants