fix: Restart=always docker systemd service #3758
Codecov Report
@@           Coverage Diff           @@
##           master    #3758   +/-   ##
=======================================
  Coverage   73.19%   73.20%
=======================================
  Files         148      148
  Lines       25367    25372    +5
=======================================
+ Hits        18568    18573    +5
  Misses       5663     5663
  Partials     1136     1136
Continue to review full report at Codecov.
lgtm
/lgtm. Thanks @jackfrancis for sharing!
/lgtm
f7ce667
New changes are detected. LGTM label has been removed.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: jackfrancis, mboersma
The full list of commands accepted by this bot can be found here.
The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
After more investigation, it appears that Restart=always is already present for the docker systemd service (though not via aks-engine-delivered config):
$ grep Restart /lib/systemd/system/docker.service
Even so, we should not rely on that being there, and should explicitly declare that config (as we're doing in this PR).
However, the fact that we're still seeing terminally stopped docker systemd services in the wild suggests that we aren't as resilient as we want to be in terms of guaranteeing the availability of the docker systemd service. These docs suggest that Restart=always in fact has limits: https://www.freedesktop.org/software/systemd/man/systemd.service.html#Restart=
The key statement is:
We aren't specifying any non-qualifying exit codes for docker:
So there must be some "systemctl stop"-equivalent events that occasionally happen. In order to restart docker after that, I've added
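For readers following along, a fail-safe of the kind discussed here could look roughly like the sketch below. It is not the code from this PR: the function name `ensure_service_running` and the `SYSTEMCTL` override variable are hypothetical, introduced only so the logic can be exercised without a real systemd. It relies only on `systemctl is-active --quiet` and `systemctl start`.

```shell
# Illustrative sketch only, not the script shipped in this PR.
# SYSTEMCTL is a hypothetical override so the command can be stubbed in tests;
# it defaults to the real systemctl binary.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# ensure_service_running <unit>
# Starts the unit if systemd reports it as not active. This covers the case
# where the unit was stopped in a way that Restart=always does not recover from.
ensure_service_running() {
  # "is-active --quiet" exits 0 only when the unit is active.
  if "$SYSTEMCTL" is-active --quiet "$1"; then
    return 0
  fi
  echo "$1 is not active; starting it"
  "$SYSTEMCTL" start "$1"
}
```

A monitor loop could call `ensure_service_running docker` and `ensure_service_running kubelet` on each pass, after its regular health checks.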
Reason for Change:
This PR ensures that Restart=always is configured for the docker systemd service, and that the included monitor script for both docker and kubelet has a fail-safe that starts the systemd service if systemd's own restart handling doesn't do so.
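One common way to declare Restart=always explicitly, without editing the packaged unit file, is a systemd drop-in. The sketch below is an assumption about how that could be done, not necessarily how this PR delivers the config: the function name `write_restart_dropin`, the root-prefix argument, and the file name `restart.conf` are all hypothetical.

```shell
# Illustrative sketch only; file and function names are hypothetical.
# write_restart_dropin <root-prefix>
# Writes a drop-in that sets Restart=always for docker.service. Pass an empty
# prefix to target the real /etc, or a scratch directory to try it out safely.
write_restart_dropin() {
  dropin_dir="$1/etc/systemd/system/docker.service.d"
  mkdir -p "$dropin_dir"
  printf '[Service]\nRestart=always\n' > "$dropin_dir/restart.conf"
}
```

On a real node this would be followed by `systemctl daemon-reload` so that systemd picks up the new drop-in.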
Issue Fixed:
Requirements:
Notes: