Handle SIGTERM in run_single_worker.rb #15818
Termination of workers via the `MiqServer::WorkerManager` currently happens by sending an exit message over DRb to the worker, which the worker processes in between jobs (so nothing is halfway in flight). The `MiqServer` waits for a set amount of time (the timeout is defined in the settings, but usually defaults to `5.minutes` or so) before killing the worker (at this point, it is assumed the worker is stuck). Currently, SIGTERMs are trapped and handled by the worker runner class to run the `do_exit` method, but they don't allow the work currently in progress to finish. The addition of the `#setup_sigterm_trap` method introduces Kernel traps that properly inform the worker that it should exit once it has finished its next round of work.
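The trap-and-flag approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual ManageIQ runner: `SketchRunner`, its `completed` list, and the `exited?` helper are hypothetical names, and only `setup_sigterm_trap`, `start`, `do_work_loop`, and `do_exit` mirror names mentioned in this PR.

```ruby
# Hypothetical stand-in for a worker runner; not the real ManageIQ class.
class SketchRunner
  attr_reader :completed

  def initialize(jobs)
    @jobs             = jobs
    @completed        = []
    @exited           = false
    @sigterm_received = false
  end

  # The trap only flips a flag, so the job in flight is never cut short.
  def setup_sigterm_trap
    Signal.trap("TERM") { @sigterm_received = true }
  end

  def start
    do_work_loop
  rescue Interrupt
    do_exit
  end

  def do_work_loop
    until @jobs.empty?
      job = @jobs.shift
      job.call                              # current job runs to completion
      @completed << job
      raise Interrupt if @sigterm_received  # checked only between jobs
    end
  end

  def do_exit
    @exited = true
  end

  def exited?
    @exited
  end
end

jobs = [
  -> { Process.kill("TERM", Process.pid); sleep 0.1 }, # SIGTERM arrives mid-job
  -> { :never_run }
]
runner = SketchRunner.new(jobs)
runner.setup_sigterm_trap
runner.start
```

In this sketch the first job finishes even though the signal arrives while it is running, the second job is never started, and `do_exit` runs via the `rescue` in `start`.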
@miq-bot assign @gtanzillo
Instead of the `run_single_worker.rb` script calling `.start_worker` on the Runner class, have it instantiate the Runner class, call `#setup_sigterm_trap` on that instance, and then call `#start` (effectively the same as `.start_worker`, with the `#setup_sigterm_trap` call added in the middle).
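The change in call order can be illustrated with a small sketch. `ExampleRunner` and its `calls` log are hypothetical and exist only to make the ordering visible; the method bodies are placeholders, and it is an assumption here that `.start_worker` amounts to `new(...).start`.

```ruby
# Hypothetical runner; method names follow the PR text, bodies are placeholders.
class ExampleRunner
  attr_reader :calls

  # Assumed shape of the class-level entry point.
  def self.start_worker(options = {})
    new(options).start
  end

  def initialize(options = {})
    @options          = options
    @calls            = []
    @sigterm_received = false
  end

  def setup_sigterm_trap
    @calls << :setup_sigterm_trap
    Signal.trap("TERM") { @sigterm_received = true }
    self
  end

  def start
    @calls << :start
    self
  end
end

# Before: the script delegated everything to the class method.
before = ExampleRunner.start_worker

# After: the script builds the instance itself, so it can install the trap
# before the work loop begins.
after = ExampleRunner.new
after.setup_sigterm_trap
after.start
```

Building the instance in the script is what creates the opportunity to install the trap; the class method gives the caller no hook between construction and `start`.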
Checked commits NickLaMuro/manageiq@514af2a~...7776e35 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
LGTM, merging. @gtanzillo you might want to take a look.
@@ -341,6 +341,10 @@ def do_work_loop
      @backoff = nil
    end

    # Should be caught by the rescue in `#start` and will run do_exit from
    # there.
    raise Interrupt if @sigterm_recieved
LOL, received... I noticed the spelling mistake after merging 😢
Does it make sense to also initialize `@sigterm_received = false` somewhere?
@jrafanie good catch on both of those. Accessing an uninitialized instance var will cause warnings in verbose mode (so these probably exist in the test output from travis), so yeah, those should be fixed.
Will open up a follow-up PR shortly.
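For illustration, the initialization fix under discussion might look like the sketch below. The class names are hypothetical. On the Ruby 2.x series used here, reading an instance variable that was never assigned returns `nil` and emits a warning in verbose mode (`ruby -w`); assigning it in `initialize` avoids that and makes the default explicit.

```ruby
# Hypothetical classes contrasting the two approaches.
class RunnerWithoutInit
  def sigterm_received?
    @sigterm_received ? true : false # reads nil when never assigned
  end
end

class RunnerWithInit
  def initialize
    @sigterm_received = false # explicit default, nothing left undefined
  end

  def sigterm_received?
    @sigterm_received
  end
end

without_init = RunnerWithoutInit.new
with_init    = RunnerWithInit.new

without_defined = without_init.instance_variable_defined?(:@sigterm_received)
with_defined    = with_init.instance_variable_defined?(:@sigterm_received)
```

Both versions behave the same in the work loop's `if` check, since `nil` and `false` are both falsy; the difference is only whether the variable exists before the trap ever fires.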
YOLO MERGE FTW
…gterm_bugfix Fix minor typos/issues with PR #15818
Steps for Testing/QA
Apply the following diff (needed because the VimBroker won't be present, and adds some handy debugging output):
Create some long-running work for your worker to act on (the time it sleeps in `:args` can be adjusted as needed):
$ bin/rails runner 'MiqQueue.put :class_name => "Kernel", :method_name => "sleep", :args => 200, :zone => MiqServer.my_server.zone.name, :queue_name => "generic"'
Start a worker using the `run_single_worker.rb` script:
$ ruby lib/workers/bin/run_single_worker.rb MiqGenericWorker
Wait for the worker to start processing the job, then interrupt it (Ctrl-C).
The worker should output `done!` before exiting.