Move signal handling into the MiqServer object #15206

carbonin · 2017-05-23T18:40:37Z

This ensures the @workers and @workers_lock instance variables are still accessible when trying to communicate messages to the workers for a clean shutdown.

Previously the process survived, but the MiqServer instance was no longer in scope. This left the @workers and @workers_lock objects used for message passing via DRb as nil, so the workers never got the message to exit.

This will let them be properly handled, rather than attempting a reconnect.
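
For readers skimming the thread, here is a minimal sketch of why rescuing the signal inside the instance matters. It is not the actual MiqServer code: `SketchServer`, `notified`, and the worker names are hypothetical stand-ins, and the only part taken from the PR is the idea of rescuing the signal exception in the monitor loop so the instance variables are still reachable.

```ruby
# Hypothetical stand-in for MiqServer, for illustration only.
class SketchServer
  attr_reader :notified

  def initialize(workers)
    @workers      = workers   # stand-in for the worker table shared over DRb
    @workers_lock = Mutex.new # stand-in for the lock guarding that table
    @notified     = []
  end

  # Runs the given block in a loop until a signal exception is raised,
  # then records an exit message for every worker.
  def monitor_loop
    loop { yield }
  rescue SignalException => e
    # The rescue runs inside the instance, so @workers and @workers_lock
    # are the real objects here rather than nil.
    @workers_lock.synchronize do
      @workers.each { |w| @notified << "#{w}: exit (#{e.message})" }
    end
    :stopped
  end
end

server = SketchServer.new(%w(worker_a worker_b))
result = server.monitor_loop { raise SignalException, "SIGTERM" }
```

If the same rescue lived in a caller that only held the class, there would be no instance state to walk, which is the situation the description above calls out.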

Fryguy · 2017-05-23T18:57:41Z

app/models/miq_server.rb

@@ -369,6 +369,13 @@ def monitor_loop
      _log.info "Server Monitoring Complete - Timings: #{timings.inspect}" unless timings[:total_time] < server_log_timings_threshold
      sleep monitor_poll
    end
+  rescue Interrupt => e
+    _log.info("Recieved #{e.message} signal, killing server")


Minor typo... Recieved => Received (Same on the one below)

Fryguy · 2017-05-23T18:58:22Z

app/models/miq_server.rb

+  rescue Interrupt => e
+    _log.info("Recieved #{e.message} signal, killing server")
+    self.class.kill
+    exit 1


I think it's really strange for a regular method on a model to exit 1, however that being said, this is no normal method. We really need to rip out these "MiqServer" classes into standalone classes.

carbonin · 2017-05-23T19:00:16Z

app/models/miq_server.rb

+  rescue Interrupt => e
+    _log.info("Received #{e.message} signal, killing server")
+    self.class.kill
+    exit 1


This got squashed from @Fryguy

I think it's really strange for a regular method on a model to exit 1, however that being said, this is no normal method. We really need to rip out these "MiqServer" classes into standalone classes.

@Fryguy I agree, the alternative was to create a separate method (like #kill_and_exit or something) and just call it from here.

I thought that was a bit overkill, but I have no real opinion.

Why not just raise instead of exit 1?

Although, to be fair, we were exiting before so I guess that's unchanged behavior.
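
A rough sketch of the "separate method" alternative mentioned above, under the assumption that it would simply wrap the kill and the exit. `SketchServer`, `killed?`, and the body of `kill_and_exit` are hypothetical; only the method name comes from this thread. It also shows that `exit 1` raises `SystemExit`, which is what makes the behavior observable from a caller or a spec.

```ruby
# Hypothetical stand-in for MiqServer, for illustration only.
class SketchServer
  def self.kill
    @killed = true # the real class-level kill does far more than this
  end

  def self.killed?
    @killed == true
  end

  # Hypothetical helper: keeps the exit call out of the monitor loop itself.
  def kill_and_exit(signal_name)
    warn "Received #{signal_name} signal, killing server"
    self.class.kill
    exit 1 # raises SystemExit with status 1
  end

  def monitor_loop
    yield
  rescue SignalException => e
    kill_and_exit(e.message)
  end
end

# Capture the exit status instead of letting the process terminate.
status = begin
  SketchServer.new.monitor_loop { raise SignalException, "SIGTERM" }
  nil
rescue SystemExit => e
  e.status
end
```

Whether the extra method is worth it is the judgment call discussed above; the observable behavior is the same either way.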

miq-bot · 2017-05-23T19:03:44Z

Checked commits carbonin/manageiq@a4504ac~...9944176 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
3 files checked, 1 offense detected

app/models/miq_server.rb

❗ - Line 375, Col 5 - Rails/Exit - Do not use exit in Rails applications.

jrafanie · 2017-05-23T19:11:18Z

app/models/miq_server.rb

+    _log.info("Received #{e.message} signal, killing server")
+    self.class.kill
+    exit 1
+  rescue SignalException => e


Note, we're now handling other soft signals I believe. I don't recall why we were only handling term, usr1, and usr2 as soft signals before. Maybe it doesn't matter?

I was going with "it doesn't matter". Plus I think this is objectively better behavior. The alternative would be to exit the server process with the exception (it was being re-raised previously) and have the workers go down with DRb connection errors.

yeah, let's see what happens and make it more complicated if we need to
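
To make the "other soft signals" point concrete, a small sketch of how the two rescue clauses divide the work. `classify` is a hypothetical helper, not part of the PR; it relies only on the fact that `Interrupt` is a subclass of `SignalException` in Ruby, so the more specific clause has to come first.

```ruby
# Hypothetical helper that mirrors the order of the rescue clauses in the diff.
def classify
  yield
rescue Interrupt => e
  [:interrupt, e.message]
rescue SignalException => e
  # Any other signal raised as an exception lands here, e.g. SIGALRM.
  [:signal, e.message]
end

from_interrupt = classify { raise Interrupt }
from_alarm     = classify { raise SignalException, "SIGALRM" }
```

With the broader clause in place, a signal such as SIGALRM is handled in the loop rather than propagating out of the server, which is the behavior change being discussed.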

jrafanie · 2017-05-23T19:11:56Z

spec/lib/workers/evm_server_spec.rb

-
-    it "unhandled signal SIGALRM" do
-      allow(MiqServer).to receive(:start).and_raise(SignalException, "SIGALRM")
-      expect { server.start }.to raise_error(SignalException, "SIGALRM")


Yeah, does this type matter? We're now treating them as soft signals where we weren't before...

see above, we're going to let these other signals be treated differently when we find it's a problem.

Reraise SignalExceptions raised in MiqServer#monitor

a4504ac

This will let them be properly handled, rather than attempting a reconnect.

carbonin added bug core/workers labels May 23, 2017

carbonin assigned jrafanie May 23, 2017

carbonin requested a review from Fryguy May 23, 2017 18:40

Fryguy approved these changes May 23, 2017

Fryguy reviewed May 23, 2017

carbonin added 2 commits May 23, 2017 14:58

Move signal handling specs from evm_server_spec.rb to miq_server_spec.rb

9944176

carbonin force-pushed the wait_for_stop_on_sigterm branch from 7589f00 to 9944176 May 23, 2017 18:58

carbonin commented May 23, 2017

jrafanie reviewed May 23, 2017

carbonin mentioned this pull request May 23, 2017

Remove systemd in favor of dumb-init for process management ManageIQ/manageiq-pods#140

Merged

jrafanie merged commit 972cec5 into ManageIQ:master May 23, 2017

jrafanie added this to the Sprint 62 Ending Jun 5, 2017 milestone May 23, 2017

carbonin deleted the wait_for_stop_on_sigterm branch October 13, 2017 19:41
