-
Notifications
You must be signed in to change notification settings - Fork 897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ansible event catcher - mark event_monitor_runnning when there are no events at startup #13903
Conversation
@@ -22,6 +23,7 @@ def poll | |||
catch(:stop_polling) do | |||
begin | |||
loop do | |||
@before_poll.call if @before_poll && @before_poll.respond_to?(:call) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nil.respond_to?(:call)
is false, so code can be simplified to:
@before_poll.call if @before_poll.respond_to?(:call)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually don't prefer try
but this time I actually like @before_poll.try(:call)
Note, @before_poll
doesn't scream "proc/lambda" in my head. I wonder if it makes sense to make this interface more clear by calling it @before_poll_proc
or @pre_poll_proc
4124a67
to
2946ab9
Compare
@chessbyte @jrafanie thanks for the review. Added your suggestions. Hopefully this kind of micro setup is not needed anymore once the event catcher is refactored |
Why is there a need to have all of this proc stuff? Can you not just call heartbeat directly? All event catchers should have a heartbeat method implemented, and I can't see why it wouldn't be necessary for every provider. |
2946ab9
to
fc7b613
Compare
Usually In case there is a steady stream of events, this would be called once in a while. But if there are no events coming in, then the Runner thread would not heartbeat.
Looks like at least google, hawkular and vmware have the same problem :) @agrare am I missing something here? What happens when the vmware event catcher has no events coming in? Will it still heartbeat? |
@durandom I don't think we do anything special for heartbeat, its possible vmware is just verbose enough with events that it isn't an issue :) We do have a timeout on |
That wont do I guess, because an empty set will not reach the heartbeat here: def process_events(events)
events.to_miq_a.each do |event|
heartbeat
process_event(event)
Thread.pass
end
end |
I wonder if it makes sense making all workers have a dedicated thread just for heartbeating. We used to do this years ago before rails/activerecord was threadsafe. I think it would be something we could finally do. Although, we've had issues with the main thread getting "stuck" on a really long work item, such as processing a huge report, and I don't know that heartbeating in a dedicated thread would help there since we'd be still reporting to be alive but not actually able to do new work or respond to requests. 😕 So, yeah, I'm torn. I don't know what a dedicated heartbeat thread buys us since we'd still have a way to get the status or interrupt the busy thread from the heartbeat thread. I don't know. |
We could always heartbeat before entering the loop if we go that direction (aka not a dedicated heartbeat thread) |
re "dedicated heartbeat thread" If the consuming thread (the one that fetches events from the provider) gets stale, because of network stuff - the worker should be restarted. If the dispatching thread (the one that puts events on our miq queue) get stale, because of? crazyness - the worker should be restarted. So, actually my solution here could lead to a false positive, where one thread still heartbeats and the other got stale. 😭 Threading is wicked. I dont know if a dedicated thread makes sense at all, doesnt it defy the idea of heartbeating? Then you have a thread that just heartbeats and the other two are stale?! (just what @jrafanie said in the second paragraph) I guess, what we have now is probably best. Restart the worker if no events are coming in... ...and in the future I'd like to re-visit the whole threading approach here, because I actually think it brings more complexity into this than it actually solves a problem here (green threads) close this? |
Maybe I'm missing something very fundamental here, but we heartbeat in the If I'm reading this correctly, each time the If the worker doesn't get any events for a long time, then based on what I'm reading, the |
fc7b613
to
23da6c9
Compare
@blomquisg you are absolutely right - thanks for joining in and clarifying. This made me re-visit this whole situation. The reason the EventCatcher would not start was that it did not send Nevertheless, this is an indicator for how fragile this whole setup is - not just looking for cheap excuses 😄 |
@jrafanie could you re-review in the light of @blomquisg and my latest comment? If you want I can also close this PR and open a new one with just my last comment as description |
@durandom So... this statement in the description of the PR:
And this from @blomquisg seem to contradict each other:
This latter statement is true. Each iteration of the do_worker_loop will heartbeat as long as we didn't override this method. So, what are we solving in this PR again? Can you reclarify the description? It appears we are going from heartbeating with each event to each call to monitor_events. What is this solving? |
@jrafanie sure. Re-wrote the description and title. Sorry for the confusion, this was a long way coming. And thanks to @blomquisg for the correct comment |
No worries, it's easy to have PR comments/descriptions go stale so I try to ask dumb questions to see if we can help future "us" avoid confusion. |
This pull request is not mergeable. Please rebase and repush. |
in case there are no events when the event catcher starts up it would not send event_monitor_running
23da6c9
to
e7a344e
Compare
Checked commit durandom@e7a344e with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0 |
@jrafanie merge? |
Thanks @durandom, I didn't know the conflict was resolved. 👍 |
@durandom can you mark euwe/darga? I believe this is euwe/no, darga/no... |
@miq-bot add_label darga/no |
The reason the EventCatcher would not start was that it did not send
event_monitor_running
when there are no events coming in.So I moved that up, before the loop.
@miq-bot add_labels providers/ansible_tower, bug
@miq-bot assign @jrafanie
@jrafanie please review and merge - because you are the worker tamer 🦁