node drain default behavior for -ignore-system #8622

tgross · 2020-08-10T17:04:34Z

In a comment on #8606, @jippi asked whether nomad node drain should set the -ignore-system flag by default.

Currently the default behavior is that service and batch jobs are drained first, and then system jobs are drained. However, internal allocation runner post-run hooks (e.g. deregistering from Consul, cleaning up disk) don't block draining the system jobs, so those hooks may still be running while the system jobs are being drained. With the -ignore-system flag, the system jobs are never drained, so they remain available during shutdown.
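To make the two invocations concrete, here is a minimal sketch of a wrapper that builds the command line for each mode. The helper name drain_command and the NODE_ID placeholder are assumptions for illustration and are not part of Nomad; the flags -enable, -yes, and -ignore-system are the existing nomad node drain flags.

```python
import shlex
from typing import List


def drain_command(node_id: str, ignore_system: bool = False) -> List[str]:
    """Build the argv for `nomad node drain`.

    Hypothetical helper for illustration only; it does not run anything.
    """
    # -enable turns draining on; -yes skips the interactive confirmation.
    argv = ["nomad", "node", "drain", "-enable", "-yes"]
    if ignore_system:
        # Leave system jobs running on the node while other jobs drain.
        argv.append("-ignore-system")
    argv.append(node_id)
    return argv


if __name__ == "__main__":
    node = "NODE_ID"  # placeholder; substitute a real node ID
    # Current default: service and batch jobs drain, then system jobs.
    print(shlex.join(drain_command(node)))
    # Behavior proposed as the new default: system jobs are left in place.
    print(shlex.join(drain_command(node, ignore_system=True)))
```

Passing the returned list to subprocess.run would execute the drain; the sketch only prints the commands so the difference between the two modes is visible without touching a cluster.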

Setting the -ignore-system flag appears to be strictly more useful than leaving it unset. Some example scenarios where you'd want a system job to keep running until all other workloads have completed:

log shippers
monitoring agents
ingress proxies (e.g. Nginx or HAProxy in front of web services)
CSI node plugins

Changing the default behavior would break backwards compatibility, so we want to solicit feedback from the community about whether this change would be disruptive.

josh-m-sharpe · 2023-05-25T19:37:05Z

💯 💯 💯

tgross added type/enhancement stage/needs-discussion theme/drain labels Aug 10, 2020

tgross mentioned this issue Aug 10, 2020

docs: always use -ignore-system on node drain with CSI #8606

Merged

tgross mentioned this issue May 25, 2023

draining system jobs toggle should not be enabled by default #17317

Open
