-
Notifications
You must be signed in to change notification settings - Fork 897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move signal handling into the MiqServer object #15206
Conversation
This will let them be properly handled, rather than attempting a reconnect.
app/models/miq_server.rb
Outdated
@@ -369,6 +369,13 @@ def monitor_loop | |||
_log.info "Server Monitoring Complete - Timings: #{timings.inspect}" unless timings[:total_time] < server_log_timings_threshold | |||
sleep monitor_poll | |||
end | |||
rescue Interrupt => e | |||
_log.info("Recieved #{e.message} signal, killing server") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor typo... Recieved
=> Received
(Same on the one below)
app/models/miq_server.rb
Outdated
rescue Interrupt => e | ||
_log.info("Recieved #{e.message} signal, killing server") | ||
self.class.kill | ||
exit 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's really strange for a regular method on a model to exit 1, however that being said, this is no normal method. We really need to rip out these "MiqServer" classes into standalone classes.
This insures the workers and workers_lock instance variables are still accessible when trying to communicate messages to the workers for a clean shut down. Previously the process survived, but the MiqServer instance was no longer in scope. This caused the workers and workers_lock objects used for message passing via drb to be nil, ensuring that the workers never got the message to exit.
7589f00
to
9944176
Compare
rescue Interrupt => e | ||
_log.info("Received #{e.message} signal, killing server") | ||
self.class.kill | ||
exit 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This got squashed from @Fryguy
I think it's really strange for a regular method on a model to exit 1, however that being said, this is no normal method. We really need to rip out these "MiqServer" classes into standalone classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Fryguy I agree, the alternative was to create a separate method (like #kill_and_exit
or something) and just call it from here.
I thought that was a bit overkill, but I have no real opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just raise instead of exit 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although, to be fair, we were exiting before so I guess that's unchanged behavior.
Checked commits carbonin/manageiq@a4504ac~...9944176 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0 app/models/miq_server.rb
|
_log.info("Received #{e.message} signal, killing server") | ||
self.class.kill | ||
exit 1 | ||
rescue SignalException => e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, we're now handling other soft signals I believe. I don't recall why we were only handling term, usr1, and usr2 as soft signals before. Maybe it doesn't matter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going with "it doesn't matter". Plus I think this is objectively better behavior. The alternative would be to exit the server process with the exception (it was being re-raised previously) and have the workers go down with DRb connection errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, let's see what happens and make it more complicated if we need to
|
||
it "unhandled signal SIGALRM" do | ||
allow(MiqServer).to receive(:start).and_raise(SignalException, "SIGALRM") | ||
expect { server.start }.to raise_error(SignalException, "SIGALRM") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, does this type matter? We're now treating them as soft signals where we weren't before...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above, we're going to let these other signals be treated differently when we find it's a problem.
This insures the
@workers
and@workers_lock
instance variables are still accessible when trying to communicate messages to the workers for a clean shut down.Previously the process survived, but the MiqServer instance was no longer in scope. This caused the
@workers
and@workers_lock
objects used for message passing via drb to be nil, ensuring that the workers never got the message to exit.