
pkg/netns: NewNS() check if file name already exists #1381

Merged: 2 commits merged into containers:main from netns-duplicate on Mar 27, 2023

Conversation

@Luap99 (Member) commented Mar 24, 2023

The code picks a random name but never checks whether that name is already in use. If the RNG generates the same name again, we will end up with containers sharing the same netns, which can be very bad. It is very unlikely to happen, but this is the only explanation I can find for containers/podman#17903.
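
To illustrate the fix, here is a minimal sketch of the retry-on-collision approach; the directory path, helper name, and retry limit are assumptions for illustration, not the exact c/common code:

```go
package netns

import (
	"crypto/rand"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// nsRunDir and maxTries are illustrative values, not the exact c/common code.
const nsRunDir = "/run/netns"
const maxTries = 10000

// newNSPath picks a random netns name and retries if a file with that name
// already exists, so two callers can never end up sharing the same mount point.
func newNSPath() (string, error) {
	for i := 0; i < maxTries; i++ {
		b := make([]byte, 16)
		if _, err := rand.Reader.Read(b); err != nil {
			return "", fmt.Errorf("failed to generate random netns name: %v", err)
		}
		nsName := fmt.Sprintf("netns-%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:])
		nsPath := filepath.Join(nsRunDir, nsName)

		// O_EXCL makes the create fail if the name is already taken;
		// in that case simply retry with a new random name.
		f, err := os.OpenFile(nsPath, os.O_RDWR|os.O_CREATE|os.O_EXCL, 0o600)
		if err != nil {
			if errors.Is(err, os.ErrExist) {
				continue
			}
			return "", err
		}
		f.Close()
		return nsPath, nil
	}
	return "", errors.New("failed to find a free netns path name")
}
```

Using O_CREATE|O_EXCL makes the existence check and the file creation a single atomic step, so two parallel callers cannot both claim the same name.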

pkg/netns: remove old pre go 1.10 workaround
Because we do not unlock the OS thread, it will be killed by the runtime and not reused on go 1.10+. Therefore there is no point in joining the original netns back, which never worked as rootless anyway.

Go 1.10 has been EOL for years and our code doesn't compile on such old versions anyway, so it is safe to remove.
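
For context, here is a minimal sketch of the Go runtime behavior this relies on; the doInNetns helper and its signature are illustrative, not the c/common API. Since Go 1.10, a goroutine that exits while still locked to its OS thread causes that thread to be terminated rather than returned to the scheduler, so the thread never has to be switched back to the original netns.

```go
package netns

import (
	"runtime"

	"golang.org/x/sys/unix"
)

// doInNetns is an illustrative helper, not the c/common API. It runs fn on a
// dedicated OS thread that has been moved into the network namespace behind
// nsFD. The thread is locked and deliberately never unlocked, so when the
// goroutine returns the Go runtime (1.10+) destroys the thread instead of
// reusing it, and there is no need to switch back to the original netns.
func doInNetns(nsFD int, fn func() error) error {
	errCh := make(chan error, 1)
	go func() {
		runtime.LockOSThread()
		// No runtime.UnlockOSThread on purpose: the tainted thread dies
		// together with this goroutine.
		if err := unix.Setns(nsFD, unix.CLONE_NEWNET); err != nil {
			errCh <- err
			return
		}
		errCh <- fn()
	}()
	return <-errCh
}
```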

openshift-ci bot (Contributor) commented Mar 24, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Luap99

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Luap99 (Member, Author) commented Mar 24, 2023

@mheon PTAL

@Luap99 force-pushed the netns-duplicate branch 2 times, most recently from 9112515 to 1b2e8de on March 24, 2023 15:24
```go
for i := 0; i < 1000000; i++ {
	_, err := rand.Reader.Read(b)
	if err != nil {
		return nil, fmt.Errorf("failed to generate random netns name: %v", err)
	}
	// ...
}
```
Review comment from a Member on this diff:
Maybe a little lower on the iteration count here? 10,000 or so?

Reply from Luap99 (Member, Author):

Yeah, honestly not sure what a good number is; 10k would be fine for me as well.

Luap99 added 2 commits March 27, 2023 14:51
The code picks a random name but never checks whether that name is already in use. If the RNG generates the same name again, we will end up with containers sharing the same netns, which can be very bad. It is very unlikely to happen, but this is the only explanation I can find for containers/podman#17903.

Signed-off-by: Paul Holzinger <[email protected]>
Because we do not unlock the OS thread, it will be killed by the runtime and not reused on go 1.10+. Therefore there is no point in joining the original netns back, which never worked as rootless anyway.

Go 1.10 has been EOL for years and our code doesn't compile on such old versions anyway, so it is safe to remove.

Signed-off-by: Paul Holzinger <[email protected]>
@rhatdan (Member) commented Mar 27, 2023

LGTM, the chance of this happening is very slim, but over 10,000 tries, yikes.

@rhatdan commented Mar 27, 2023

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Mar 27, 2023
@openshift-merge-robot openshift-merge-robot merged commit d08155f into containers:main Mar 27, 2023
@Luap99 Luap99 deleted the netns-duplicate branch March 28, 2023 08:44
Luap99 added a commit to Luap99/libpod that referenced this pull request Apr 3, 2023
I made a change in c/common[1] to prevent duplicates in netns names. This now causes a problem in podman[2] where the rootless netns will no longer work after the netns became invalid but the underlying path still exists. AFAICT this happens when the podman pause process got killed and we are now in a different user namespace.

While I do not know what causes this, this commit should at least make it possible to recover from this situation automatically, as it was possible before[1].

The problem with that is that containers started before it will not be able to talk to containers started after this. A restart of the previous container will fix it, but this was also the case before.

[NO NEW TESTS NEEDED]

[1] containers/common#1381
[2] containers#17903 (comment)

Signed-off-by: Paul Holzinger <[email protected]>