Fix a bug where salt-minion process does not restart during upgrade #3281

Merged
merged 6 commits into from
Apr 12, 2021

Conversation

TeddyAndrieux
Collaborator

Component:

'salt', 'upgrade'

Context:

#3247

Summary:

  • Backport 2caba9e
  • Backport 73835be
  • Backport 21d679a
  • Remove all logic from the Salt states that waits for salt-minion to be ready; instead, check readiness as part of the "deploy_node" orchestrate
  • Rewrite parts of the salt-minion init.sls, which was somewhat outdated
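
As an illustration of the readiness check moving into the orchestrate, a minimal sketch could look like the following. The state ID, the target `node-1`, and the retry values are assumptions for illustration, and `test.ping` combined with `retry` is a stand-in, not necessarily the check this PR implements:

```yaml
# Orchestrate sketch (hypothetical state ID, target and retry values).
# Pings the minion until it answers, so later steps only run once it is back.
Wait for salt-minion to be ready:
  salt.function:
    - name: test.ping
    - tgt: node-1             # hypothetical minion ID
    - expect_minions: True    # treat a missing answer as a failure, so it is retried
    - retry:
        attempts: 10          # assumed value
        interval: 3           # seconds between attempts, assumed value
```

Keeping the wait in the orchestrate means the minion-side states do not need to block on their own restart.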

Fixes: #3247

By default the salt-master timeout is set to 5 seconds, which is sometimes
not sufficient: pillar computation may take some time, and a salt-minion
can occasionally be slow to answer a job listing while executing
Salt states.

(cherry picked from commit 2caba9e)
From time to time, especially on really slow platforms, we get failures because
Salt state execution times out. Increase the salt-master default timeout to 20
seconds.

(cherry picked from commit 73835be)
(cherry picked from commit d6c226f)
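
For reference, the setting in question is the `timeout` option of the salt-master configuration. A minimal fragment, assuming a drop-in file under `/etc/salt/master.d/` (the file name is hypothetical, and how this repository actually templates its master configuration is not shown here):

```yaml
# /etc/salt/master.d/99-timeout.conf (hypothetical file name)
# Timeout, in seconds, for salt commands and the API; Salt's default is 5.
timeout: 20
```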
From time to time the salt-master gets overloaded because it receives too many
queries. For example, during an upgrade on a somewhat slow environment, some
Salt states may time out and make the upgrade fail.
To avoid this kind of issue, bump `sock_pool_size` on the salt
master (from 1 to 15) to avoid blocking while waiting for ZeroMQ communications,
and also bump `worker_threads` on the salt master (from 5 to 10) to
avoid failures when there is a lot of communication with the salt
master (e.g. because of upgrade + storage operator).

See: saltstack/salt#53147
(cherry picked from commit 21d679a)
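
Expressed as salt-master configuration, the two values described above would look roughly like this. This is only a sketch of the resulting options; the file they live in and the way they are templated in this repository are not shown in the PR text:

```yaml
# salt-master configuration fragment (values taken from the commit message)
sock_pool_size: 15   # Salt default: 1
worker_threads: 10   # Salt default: 5
```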
@TeddyAndrieux TeddyAndrieux requested a review from a team April 9, 2021 10:57
@bert-e
Contributor

bert-e commented Apr 9, 2021

Hello teddyandrieux,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Status report is not available.

@bert-e
Contributor

bert-e commented Apr 9, 2021

Conflict

A conflict has been raised during the creation of
integration branch w/2.6/bugfix/salt-minion-restart with contents from bugfix/salt-minion-restart
and development/2.6.

I have not created the integration branch.

Here are the steps to resolve this conflict:

 $ git fetch
 $ git checkout -B w/2.6/bugfix/salt-minion-restart origin/development/2.6
 $ git merge origin/bugfix/salt-minion-restart
 $ # <intense conflict resolution>
 $ git commit
 $ git push -u origin w/2.6/bugfix/salt-minion-restart

As suggested by Salt, to restart the salt-minion process we should just run
`service.restart` in the background and then wait for the salt-minion to be
ready, in our case as part of the orchestrate.

See: https://docs.saltproject.io/en/latest/faq.html#restart-using-states
Fixes: #3247
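
The pattern from the Salt FAQ linked above can be sketched as follows. The state IDs, file paths and the `onchanges` target are hypothetical and only serve to make the fragment complete; the `cmd.run` with `bg: True` is the part the FAQ recommends:

```yaml
# Hypothetical state IDs and paths; only the cmd.run/bg pattern comes from the Salt FAQ.
Configure salt-minion:
  file.managed:
    - name: /etc/salt/minion.d/99-example.conf
    - source: salt://example/salt-minion/files/minion.conf

Restart salt-minion:
  cmd.run:
    # Run the restart through salt-call in the background, so the state run
    # is not killed together with the minion process it is restarting.
    - name: 'salt-call service.restart salt-minion'
    - bg: True
    - onchanges:
      - file: Configure salt-minion
```

Because the restart is detached, the state returns immediately; the caller still has to wait until the minion is reachable again.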
Add some explanation about every state in the `init.sls` of the Salt minion states.
Before this commit we only called the "Reconfigure salt-minion"
state when not on the first node deployment, but states may make
changes that require the salt-minion process to restart.

For example, the salt package only gets held when the salt-minion
state is called with salt already installed.
@TeddyAndrieux TeddyAndrieux force-pushed the bugfix/salt-minion-restart branch from 67b8978 to 14321d2 Compare April 9, 2021 10:57
@bert-e
Contributor

bert-e commented Apr 9, 2021

Conflict

A conflict has been raised during the creation of
integration branch w/2.7/bugfix/salt-minion-restart with contents from w/2.6/bugfix/salt-minion-restart
and development/2.7.

I have not created the integration branch.

Here are the steps to resolve this conflict:

 $ git fetch
 $ git checkout -B w/2.7/bugfix/salt-minion-restart origin/development/2.7
 $ git merge origin/w/2.6/bugfix/salt-minion-restart
 $ # <intense conflict resolution>
 $ git commit
 $ git push -u origin w/2.7/bugfix/salt-minion-restart

@bert-e
Contributor

bert-e commented Apr 9, 2021

Conflict

A conflict has been raised during the creation of
integration branch w/2.8/bugfix/salt-minion-restart with contents from w/2.7/bugfix/salt-minion-restart
and development/2.8.

I have not created the integration branch.

Here are the steps to resolve this conflict:

 $ git fetch
 $ git checkout -B w/2.8/bugfix/salt-minion-restart origin/development/2.8
 $ git merge origin/w/2.7/bugfix/salt-minion-restart
 $ # <intense conflict resolution>
 $ git commit
 $ git push -u origin w/2.8/bugfix/salt-minion-restart

@bert-e
Contributor

bert-e commented Apr 9, 2021

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3
  • development/2.4

You can set option create_pull_requests if you need me to create
integration pull requests in addition to integration branches, with:

@bert-e create_pull_requests

@bert-e
Contributor

bert-e commented Apr 9, 2021

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • one peer

Peer approvals must include at least 1 approval from the following list:

@TeddyAndrieux TeddyAndrieux marked this pull request as draft April 9, 2021 11:12
@TeddyAndrieux TeddyAndrieux linked an issue Apr 9, 2021 that may be closed by this pull request
@TeddyAndrieux TeddyAndrieux marked this pull request as ready for review April 9, 2021 18:19
@TeddyAndrieux
Collaborator Author

/approve

@bert-e
Contributor

bert-e commented Apr 12, 2021

In the queue

The changeset has received all authorizations and has been added to the
relevant queue(s). The queue(s) will be merged in the target development
branch(es) as soon as builds have passed.

The changeset will be merged in:

  • ✔️ development/2.5

  • ✔️ development/2.6

  • ✔️ development/2.7

  • ✔️ development/2.8

  • ✔️ development/2.9

The following branches will NOT be impacted:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3
  • development/2.4

There is no action required on your side. You will be notified here once
the changeset has been merged. In the unlikely event that the changeset
fails permanently on the queue, a member of the admin team will
contact you to help resolve the matter.

IMPORTANT

Please do not attempt to modify this pull request.

  • Any commit you add on the source branch will trigger a new cycle after the
    current queue is merged.
  • Any commit you add on one of the integration branches will be lost.

If you need this pull request to be removed from the queue, please contact a
member of the admin team now.

The following options are set: approve

@bert-e
Contributor

bert-e commented Apr 12, 2021

I have successfully merged the changeset of this pull request
into targeted development branches:

  • ✔️ development/2.5

  • ✔️ development/2.6

  • ✔️ development/2.7

  • ✔️ development/2.8

  • ✔️ development/2.9

The following branches have NOT changed:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3
  • development/2.4

Please check the status of the associated issue None.

Goodbye teddyandrieux.

@bert-e bert-e merged commit 14321d2 into development/2.5 Apr 12, 2021
@bert-e bert-e deleted the bugfix/salt-minion-restart branch April 12, 2021 07:01
Development

Successfully merging this pull request may close these issues.

MetalK8s Upgrade may fail because salt-minion failed to restart
3 participants