-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add network lifecycle management #5626
Conversation
feb9dff
to
82333ac
Compare
Adds a new Prerun and Postrun hooks to manage set up of network namespaces on linux. Work still needs to be done to make the code platform agnostic and support Docker style network initalization.
3944cf5
to
ddf125f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did one round of code comments but I'll read other networking PRs for more context before reviewing it again.
return | ||
} | ||
|
||
// Put this thread back to the orig ns, since it might get reused (pre go1.10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious how you noticed this issue? I assume you aren't running golang 1.10?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This func is actually borrowed from: https://github.com/containernetworking/plugins/blob/master/pkg/testutils/netns_linux.go#L34
Though the background on this is covered pretty well here if interested: https://github.com/containernetworking/plugins/tree/master/pkg/ns#namespace-switching
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome - might be nice to include a link; also may need to copy the whole file along with the copyright and license notice on top.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to what @notnoop said. This code is 💥 and we shouldn't copy it without proper attribution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be worth refactoring the default device manager out of the network hook and seeing if we can ship the general network support as fully crossplatform even if there's no non-linux network manager.
NewAllocRunner errors
I also have mixed opinions on adding another error return from NewAR. We were hoping to someday make NewAR/TR never return an error, but I have a feeling we're never going to get there. The current way client.go handles errors from NewAR seems fine, so it's probably not worth worrying about.
Adding a new AR/TR Validation method or hook might be useful? Or it might just complicate things more.
I think it's safe to accept your approach as-is and treat my ideas for out-of-scope of your networking efforts.
|
||
// noop for non linux clients | ||
func (ar *allocRunner) initPlatformRunnerHooks(hookLogger hclog.Logger) ([]interfaces.RunnerHook, error) { | ||
return nil, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we detect if they're trying to use an unsupported network mode and return an error?
@@ -0,0 +1,243 @@ | |||
// +build linux |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docker on Windows being able to run Linux images really complicates this. #2633
Not something we need to worry about right now, but it is interesting Windows introduces new complexity where a task's OS doesn't match the host OS!
|
||
// manager is used when creating the network namespace. This defaults to | ||
// bind mounting a network namespace descritor under /var/run/netns but | ||
// can be created by a driver if nessicary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// can be created by a driver if nessicary | |
// can be created by a driver if necessary |
// alloc should only be read from | ||
alloc *structs.Allocation | ||
|
||
// spec described the network namespace and is syncronized by specLock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// spec described the network namespace and is syncronized by specLock | |
// spec described the network namespace and is syncronized by specLock |
// spec described the network namespace and is syncronized by specLock | |
// spec described the network namespace and is synchronized by specLock |
// spec described the network namespace and is syncronized by specLock | ||
spec *drivers.NetworkIsolationSpec | ||
specLock sync.Mutex | ||
logger hclog.Logger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logger hclog.Logger | |
logger hclog.Logger |
NetIsolationModeTask = NetIsolationMode("task") | ||
|
||
// NetIsolationModeNone indicates that there is no network to isolate and is | ||
// inteded to be used for tasks that the client manages remotely |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// inteded to be used for tasks that the client manages remotely | |
// intended to be used for tasks that the client manages remotely |
Although maybe just remove that part as (a) it doesn't exist yet and (b) none
may be useful for other things like system batch jobs that also don't exist yet but can be faked.
@@ -280,6 +288,27 @@ message ExecTaskResponse { | |||
ExitResult result = 3; | |||
} | |||
|
|||
message CreateNetworkRequest { | |||
|
|||
// AllodID of the allocation the network is associated with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// AllodID of the allocation the network is associated with | |
// AllocID of the allocation the network is associated with |
|
||
message DestroyNetworkRequest { | ||
|
||
// AllodID of the allocation the network is associated with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// AllodID of the allocation the network is associated with | |
// AllocID of the allocation the network is associated with |
@@ -350,6 +397,10 @@ message TaskConfig { | |||
|
|||
// AllocId is the ID of the associated allocation | |||
string alloc_id = 15; | |||
|
|||
// NetworkIsolationSpec specifies the configuration for the network namespace | |||
// to use for the task. *Only supported on Linux |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// to use for the task. *Only supported on Linux | |
// to use for the task. Only supported on Linux |
...although that may not be true if you rearrange the hook code above? Couldn't somebody make a Windows driver that provided Create/Destroy Network?
@@ -414,6 +406,26 @@ func tasksUpdated(jobA, jobB *structs.Job, taskGroup string) bool { | |||
return false | |||
} | |||
|
|||
func networkUpdated(netA, netB []*structs.NetworkResource) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, I wonder why this doesn't use https://godoc.org/github.com/hashicorp/nomad/nomad/structs#Networks.Equals
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
This PR adds a few notable changes to support lifecycle management of network namespaces.
Alloc Runner Hooks
The first big change here was adding a call to initialize platform dependent hooks since the network hook is currently only supported on linux. There is new dependency here on a small packaged names
ns
that only builds on linux due to references to the syscall package that are platform dependent.The second is adding the actual network hook that includes checking if a driver needs to initialize the network.
plugins/drivers
ChangesNew proto messages and rpcs were added to support network namespaces. These aren't implemented by any driver in this PR but I did add a helper struct that can be embedded when driver network management is not needed (aka everything but docker). The typical proto to go conversion funcs are there as well.