Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add lock to retire_now start #17280

Merged
merged 3 commits into from
Apr 19, 2018

Conversation

d-m-u
Copy link
Contributor

@d-m-u d-m-u commented Apr 10, 2018

This is a solution for the fact that in a multi-appliance setup we can have multiple workers start retirement for the same service or vm.

(this needs to be backported but the following doesn't):

DO NOT BACKPORT THE RELATED CONTENT PR PLEASE

related to:
ManageIQ/manageiq-content#281

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1563627

@miq-bot miq-bot added the wip label Apr 10, 2018
@d-m-u d-m-u force-pushed the adding_lock_to_retire_now branch 3 times, most recently from c3248ab to 8612ffa Compare April 10, 2018 19:45
@d-m-u
Copy link
Contributor Author

d-m-u commented Apr 10, 2018

@tinaafitz

@d-m-u d-m-u changed the title [WIP] Add lock to retire_now start Add lock to retire_now start Apr 10, 2018
@d-m-u
Copy link
Contributor Author

d-m-u commented Apr 10, 2018

@miq-bot add_label retirement

@d-m-u
Copy link
Contributor Author

d-m-u commented Apr 10, 2018

@miq_bot assign @gmcculloug

@gmcculloug gmcculloug self-assigned this Apr 10, 2018
@gmcculloug gmcculloug requested review from tinaafitz and mkanoor April 10, 2018 21:26
end
else
_log.info("#{retirement_object_title}: [#{name}] has already started retirement")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@d-m-u
Should we log what the current state is, it might help us debugging.
Can we add some spec for the retirement_state = initializing

@d-m-u d-m-u force-pushed the adding_lock_to_retire_now branch from 8612ffa to ff94364 Compare April 10, 2018 21:50
raise_retirement_event(event_name, requester)
rescue => err
_log.log_backtrace(err)
elsif retirement_state != "initializing"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tinaafitz doesn't this check also need to test for all the other states -- like retiring -- that a vm/service that is in the process of being retired could be?

Copy link
Contributor Author

@d-m-u d-m-u Apr 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

retiring needs check here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if retired is already handled at the top of this condition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@d-m-u Right, as we discussed, "retiring" needs to be checked here. We should add a constant "initializing" with the other retirement_states.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that if we are adding the "initializing" state it should be a constant. In addition, the retiring? method should be updated to account for both RETIRING and RETIREMENT_INITIALIZING

There are also two methods that call retired? that should also be checking for retiring?: retirement_check and start_retirement.

The current logic around the lock is not preventing multiple from running since it is not reloading and checking the retirement_state after the lock is freed.

I would purpose a flow like this:

if retired?
  log existing "already retired" message
elsif retiring?
  log retirement in-progress message
else
  lock do
    reload object # in case object was updated in separate process while waiting for lock to be released
    if retirement_state is blank or "error"
      process retirement
    else
      log that retirement_state was updated while waiting to lock to be released and log current retirement_state value.  No further action.
    end
  end
end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gmcculloug Yes, I agree that:

  1. The retiring? method should include both the RETIRING and RETIREMENT_INITIALIZING retirement_states.
  2. The retirement_check and start_retirement methods should check for retiring?

I propose that we add one more method, retirement_initialized to check for the retirement_state of RETIREMENT_INITIALIZING for the start_retirement Automate method to check that the object has been initialized for retirement. The start_retirement Automate method currently checks for retired? and retiring?. The retiring? check should be replaced with retirement_initialized method to prevent the retirement state machine from processing an object that was not "initialized" for retirement.(This could happen if someone directly called the retirement state machine without having gone through retire_now.)

@d-m-u d-m-u changed the title Add lock to retire_now start [WIP] Add lock to retire_now start Apr 11, 2018
@miq-bot miq-bot added the wip label Apr 11, 2018
Copy link
Member

@tinaafitz tinaafitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@d-m-u We'll have to modify the start_retirement Automate methods to check for the retirement_state of "initializing".

@@ -27,6 +27,7 @@
expect(@service.retirement_state).to be_nil
expect(MiqEvent).to receive(:raise_evm_event).once
@service.retire_now
expect(@service.retirement_state).to eq('initializing')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@d-m-u Can we add more tests for the various possible retirement_states?

@@ -29,6 +29,7 @@
expect(MiqEvent).to receive(:raise_evm_event).once

@vm.retire_now
expect(@vm.retirement_state).to eq('initializing')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@d-m-u same here.

raise_retirement_event(event_name, requester)
rescue => err
_log.log_backtrace(err)
elsif retirement_state != "initializing"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@d-m-u Right, as we discussed, "retiring" needs to be checked here. We should add a constant "initializing" with the other retirement_states.

@@ -110,7 +111,7 @@ def raise_retire_audit_event(message)
end

def retirement_check
return if self.retired?
return if retired? || retiring?
Copy link
Member

@tinaafitz tinaafitz Apr 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@d-m-u We should check for retirement_initialized here as well.

@d-m-u d-m-u force-pushed the adding_lock_to_retire_now branch from 9f9fca9 to 4cc59b1 Compare April 16, 2018 18:09
RETIRED = 'retired'.freeze
RETIRING = 'retiring'.freeze
ERROR_RETIRING = 'error'.freeze
RETIREMENT_INITIALIZING = 'initializing'.freeze
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these constants get used outside of this module? If not it would be nice to standardize on naming and sort them. I would suggest something like:

RETIREMENT_ERROR = 'error'.freeze
RETIREMENT_INITIALIZING = 'initializing'.freeze
RETIREMENT_RETIRED  = 'retired'.freeze
RETIREMENT_RETIRING = 'retiring'.freeze

@@ -170,6 +180,10 @@ def retirement_base_model_name
@retirement_base_model_name ||= self.class.base_model.name
end

def retirement_initialized
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be def retirement_initialized?

_log.log_backtrace(err)
lock do
reload
if retirement_state == '' || retirement_state == 'error' || retirement_state.nil?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplify to: if error_retiring? || retirement_state.blank?

@d-m-u d-m-u force-pushed the adding_lock_to_retire_now branch from 4cc59b1 to 10be4c6 Compare April 16, 2018 20:29
@d-m-u d-m-u changed the title [WIP] Add lock to retire_now start Add lock to retire_now start Apr 16, 2018
@miq-bot miq-bot removed the wip label Apr 16, 2018
@d-m-u
Copy link
Contributor Author

d-m-u commented Apr 17, 2018

@tinaafitz can you re-review please?

Copy link
Member

@tinaafitz tinaafitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@d-m-u Looks good.
@mkanoor Please review.

@d-m-u
Copy link
Contributor Author

d-m-u commented Apr 18, 2018

@gmcculloug can you give this another 👀 please?

expect(MiqEvent).to receive(:raise_evm_event).once
@service.retire_now
@service.retire_now
@service.retire_now
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I be a pain and ask this to be 3.times { @service.retire_now } ? 😬


@vm.retire_now
@vm.retire_now
@vm.retire_now
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

@vm.update_attributes(:retirement_state => 'retiring')
expect(MiqEvent).to receive(:raise_evm_event).exactly(0).times
@vm.retire_now
@vm.reload
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these reload calls needed at the end of many of these tests?

@d-m-u d-m-u force-pushed the adding_lock_to_retire_now branch from 10be4c6 to d6b63e1 Compare April 19, 2018 12:13
@miq-bot
Copy link
Member

miq-bot commented Apr 19, 2018

Checked commits d-m-u/manageiq@7bbd902~...d6b63e1 with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
3 files checked, 1 offense detected

app/models/mixins/retirement_mixin.rb

@gmcculloug gmcculloug merged commit 2deffa5 into ManageIQ:master Apr 19, 2018
@gmcculloug gmcculloug added this to the Sprint 84 Ending Apr 23, 2018 milestone Apr 19, 2018
@d-m-u d-m-u deleted the adding_lock_to_retire_now branch April 19, 2018 12:48
@JPrause
Copy link
Member

JPrause commented Apr 19, 2018

@gmcculloug can you add the gaprindashvili/yes and fine/yes labels if this can be back-ported.
Ref: #17280

@d-m-u
Copy link
Contributor Author

d-m-u commented Apr 19, 2018

@miq-bot add_label gaprindashvili/yes
@miq-bot add_label fine/yes

simaishi pushed a commit that referenced this pull request Apr 23, 2018
@simaishi
Copy link
Contributor

Gaprindashvili backport details:

$ git log -1
commit b8bc7bed2cd1a3fdbd81454cb8e176c7ec5c02d5
Author: Greg McCullough <[email protected]>
Date:   Thu Apr 19 08:47:41 2018 -0400

    Merge pull request #17280 from d-m-u/adding_lock_to_retire_now
    
    Add lock to retire_now start
    (cherry picked from commit 2deffa58399461396e3bbd57b93fc744f0c64e88)
    
    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1570950

simaishi pushed a commit that referenced this pull request Apr 23, 2018
@simaishi
Copy link
Contributor

Fine backport details:

$ git log -1
commit 611b2a414d562f2ea9ef9ed4a88a3d0652683bb8
Author: Greg McCullough <[email protected]>
Date:   Thu Apr 19 08:47:41 2018 -0400

    Merge pull request #17280 from d-m-u/adding_lock_to_retire_now
    
    Add lock to retire_now start
    (cherry picked from commit 2deffa58399461396e3bbd57b93fc744f0c64e88)
    
    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1570951

d-m-u pushed a commit to d-m-u/manageiq that referenced this pull request Jun 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants