Handle lock release with SIGHUP in VTGR #8472
Merged
Signed-off-by: crowu [email protected]
Description
VTGR relies on a lock from the topo server. Most topo lock implementations use a session health check to determine whether a process still holds the lock. This works poorly for releasing the lock when the process is restarted with SIGHUP. Take Consul as an example: by default the session check uses serfHealth, which is a node-level health check. Even if the process restarts, the node is still healthy, so Consul considers the lock held until the TTL expires. As a result, during a deploy there can be a window of up to one TTL in which VTGR cannot grab the lock and fix the cluster.
This PR handles the SIGHUP signal by explicitly releasing the lock that was held, which avoids the situation above.
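The general pattern can be sketched in Go as follows. This is a minimal illustration and not the code from this PR: `topoLock`, `unlockFn`, `releaseOnSignal`, and `demo` are hypothetical names, and `unlockFn` stands in for whatever call actually removes the lock in the topo server. The sketch simulates signal delivery so that it terminates on its own.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"sync"
	"syscall"
)

// topoLock is a hypothetical stand-in for a lock held in the topo server.
type topoLock struct {
	mu       sync.Mutex
	held     bool
	unlockFn func() error // hypothetical: removes the lock in the topo server
}

// Release unlocks at most once, so it is safe to call from both the
// signal path and the normal shutdown path.
func (l *topoLock) Release() error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if !l.held {
		return nil
	}
	l.held = false
	return l.unlockFn()
}

// releaseOnSignal blocks until a signal arrives on sigs, then releases
// the lock explicitly instead of waiting for the session TTL to expire.
func releaseOnSignal(sigs <-chan os.Signal, l *topoLock) error {
	<-sigs
	return l.Release()
}

// demo wires the pieces together and returns how many times the
// underlying unlock was invoked.
func demo() int {
	unlockCalls := 0
	lock := &topoLock{held: true, unlockFn: func() error {
		unlockCalls++
		return nil
	}}

	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGHUP)
	defer signal.Stop(sigs)

	// Simulate delivery of SIGHUP so the sketch terminates; a real
	// process would simply wait for the operating system to deliver it.
	sigs <- syscall.SIGHUP

	if err := releaseOnSignal(sigs, lock); err != nil {
		fmt.Println("failed to release lock:", err)
	}
	// A second release during normal shutdown is a no-op.
	_ = lock.Release()
	return unlockCalls
}

func main() {
	fmt.Println("unlock calls:", demo())
}
```

Making `Release` idempotent is a design choice of this sketch: the signal handler and the regular shutdown path can both call it without unlocking twice.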
Related Issue(s)
#8386
Checklist
Deployment Notes