allocrunner: prevent panic on network manager #16921
LGTM!
LGTM!
As far as reproduction goes, I have a strong suspicion that there are other cases of #16722 where the alloc runner and task runner aren't initializing state on all code paths, but I haven't yet figured out a way to validate that automatically without just sitting down and tracing every path 😀
Found a repro:

    job "example" {
      group "sleep" {
        task "sleep" {
          driver = "exec"
          config {
            command = "/bin/bash"
            args    = ["-c", "while true; do sleep 1; done"]
          }
          resources {
            network {
              mode = "bridge"
            }
          }
        }
      }
    }

I missed this:

nomad/client/allocrunner/network_manager_linux.go, lines 56 to 59 in bef109d
So the panic happens when you have a bridge network defined at the task level with a driver that doesn't support [...]. I added a test case for this scenario and updated the CHANGELOG to better describe the conditions under which the panic happens.
Check the task group network length before trying to access the first element.
I haven't been able to reproduce the problem but the fix seems clear enough.
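For illustration, the length check described above could look roughly like the sketch below. This is not the code from this PR: `NetworkResource`, `TaskGroup`, `groupNetworkMode`, and the `"host"` fallback are simplified, hypothetical stand-ins chosen only to show the guard around indexing the first element of a possibly empty slice.

```go
package main

import "fmt"

// NetworkResource and TaskGroup are hypothetical, simplified stand-ins
// for the real job structs; only the fields needed here are modeled.
type NetworkResource struct {
	Mode string
}

type TaskGroup struct {
	Networks []*NetworkResource
}

// groupNetworkMode returns the mode of the first group-level network.
// It returns fallback instead of panicking when the group is nil, has
// no networks, or its first network entry is nil. The fallback value is
// an assumption made for this sketch.
func groupNetworkMode(tg *TaskGroup, fallback string) string {
	if tg == nil || len(tg.Networks) == 0 || tg.Networks[0] == nil {
		return fallback
	}
	return tg.Networks[0].Mode
}

func main() {
	// A group whose only network block lives at the task level has an
	// empty group-level slice; indexing [0] without the guard would panic.
	taskLevelOnly := &TaskGroup{}
	fmt.Println(groupNetworkMode(taskLevelOnly, "host"))

	bridged := &TaskGroup{Networks: []*NetworkResource{{Mode: "bridge"}}}
	fmt.Println(groupNetworkMode(bridged, "host"))
}
```

The guard short-circuits left to right, so `tg.Networks[0]` is only evaluated once the slice is known to be non-empty.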
Closes #16863