[201911] Fix snmp subagent errors in shutdown path #259
- What I did
Fix an error raised in the snmp container stop path. The issue is seen when SIGTERM is sent to the SNMP supervisord and it attempts to stop snmp-subagent. There is a bug in the subagent's shutdown path where the error below is seen upon receiving SIGTERM:
This issue turned out to be very costly in the warmboot shutdown path. The SIGTERM sent to the subagent never shuts the agent down because the agent has crashed, so supervisord waits for 10s before sending SIGKILL.
In some instances even the kill signal to the agent fails because the subagent has crashed, which makes dockerd wait an extra 10s before sending SIGKILL to the container itself.
The 20s wait in the shutdown process is costly not only in time spent; it also makes dockerd lock resources, causing other docker exec commands to time out.
- How I did it
Add the future instance to the tasks list, instead of another event loop task.
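For illustration only, the idea of keeping the scheduled future in the tasks list can be sketched with plain asyncio. This is not the subagent's actual code: `worker`, `stop_event`, `tasks`, and `run_demo` are hypothetical names, and the sketch only assumes that the shutdown path waits on whatever is stored in the tasks list.

```python
import asyncio


async def worker(stop_event):
    # Hypothetical background job: runs until shutdown is requested.
    await stop_event.wait()
    return "stopped"


async def main():
    stop_event = asyncio.Event()
    tasks = []

    # Schedule the coroutine once and keep the returned future itself
    # in the tasks list, rather than wrapping it in yet another task.
    fut = asyncio.ensure_future(worker(stop_event))
    tasks.append(fut)

    # Shutdown path (assumed): signal the worker, then wait for every
    # tracked future, with a timeout so shutdown cannot hang forever.
    stop_event.set()
    return await asyncio.wait_for(asyncio.gather(*tasks), timeout=5)


def run_demo():
    return asyncio.run(main())


if __name__ == "__main__":
    print(run_demo())
```

Because the shutdown path awaits the same future objects that were scheduled, it can finish cleanly instead of depending on supervisord's SIGKILL timeout.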
- How to verify it
Tested the fix on a physical testbed:
Without fix:
With fix:
- Description for the changelog