flake: error reading /etc/cni/net.d/something #9041

Closed
edsantiago opened this issue Jan 20, 2021 · 4 comments · Fixed by #9046
Labels
flakes (Flakes from Continuous Integration), kind/bug (Categorizes issue or PR as related to a bug), locked - please file new issue/PR (Assist humans wanting to comment on an old issue or PR with locked comments)

Comments

edsantiago commented Jan 20, 2021

Starting to see a lot of this one:

Running: podman [options] network rm -f testNetThreeCNI2
Error: in /etc/cni/net.d/testNetSingleCNI.conflist: error reading /etc/cni/net.d/testNetSingleCNI.conflist: open /etc/cni/net.d/testNetSingleCNI.conflist: no such file or directory

...which then cascades into:

Running: podman [options] network create testNetThreeCNI2
Error: the network name testNetThreeCNI2 is already used

I seem to recall this being a race condition between network create and network rm, and I thought there was already an open issue about it, but I can't find it now.

Podman network [It] podman inspect container two CNI networks (container not running)

edsantiago added the flakes and kind/bug labels Jan 20, 2021
Luap99 commented Jan 20, 2021

I'll take a look. I think we should get rid of static network names and only use random ones.
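
A minimal sketch of what that could look like, assuming a hypothetical helper rather than the test suite's actual code:

package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomNetName is a hypothetical helper: instead of a static name like
// "testNetThreeCNI2", each test gets a unique suffix so parallel runs
// never create or remove the same network.
func randomNetName(prefix string) string {
	b := make([]byte, 8)
	if _, err := rand.Read(b); err != nil {
		panic(err) // crypto/rand failing is not recoverable here
	}
	return prefix + hex.EncodeToString(b)
}

func main() {
	fmt.Println(randomNetName("testNet")) // e.g. testNet3f9a1c2e4b7d6a05
}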

edsantiago (Author) commented Jan 20, 2021

@Luap99 I don't think that's the issue; I think it would happen even with stringid.GenerateNonCryptoID(). This seems to be a race where something reads the conf directory, then acts on that list of files, but in the interim one of the files has disappeared.
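
A minimal sketch of the kind of race being described, using only standard-library calls and hypothetical logic (not podman's actual code): one process snapshots the directory listing, a parallel process removes a conflist in the meantime, and the first process then hits "no such file or directory" when it acts on the stale listing.

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	confDir := "/etc/cni/net.d"

	// Process A (e.g. "network rm -f"): snapshot the directory listing.
	entries, err := os.ReadDir(confDir)
	if err != nil {
		panic(err)
	}

	// In the flake, a parallel test's "network rm" unlinks one of the
	// listed conflist files right about here.

	// Process A then acts on the stale listing; any entry that vanished
	// in the meantime yields "open ...: no such file or directory".
	for _, e := range entries {
		if _, err := os.ReadFile(filepath.Join(confDir, e.Name())); err != nil {
			fmt.Println("error reading", e.Name()+":", err)
		}
	}
}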

Luap99 commented Jan 20, 2021

It's not the real issue but at least it would prevent a full failure in CI.
I agree with you that there is a race somewhere.

Luap99 commented Jan 20, 2021

OK, @baude added a lockfile which should prevent these races, but the lockfile is located in the tmpdir and not in the CNI config dir. Therefore a second parallel test with a network create/remove call did not get the lock.

I think the best solution is to move the lockfile into the CNI config directory.
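
A rough sketch of that idea using a plain flock(2) on a lock file that lives inside the CNI config directory; the file name "cni.lock" and the use of syscall.Flock are assumptions for illustration, not podman's actual implementation:

package main

import (
	"os"
	"path/filepath"
	"syscall"
)

// lockCNIConfDir takes an exclusive flock on a lock file inside the CNI
// config dir itself, so every process touching the directory, regardless
// of which tmpdir it was started with, contends on the same lock.
func lockCNIConfDir(confDir string) (release func(), err error) {
	f, err := os.OpenFile(filepath.Join(confDir, "cni.lock"), os.O_RDWR|os.O_CREATE, 0o600)
	if err != nil {
		return nil, err
	}
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
		f.Close()
		return nil, err
	}
	return func() {
		syscall.Flock(int(f.Fd()), syscall.LOCK_UN)
		f.Close()
	}, nil
}

func main() {
	release, err := lockCNIConfDir("/etc/cni/net.d")
	if err != nil {
		panic(err)
	}
	defer release()
	// ... create or remove conflist files while holding the lock ...
}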

Luap99 pushed a commit to Luap99/libpod that referenced this issue Jan 21, 2021
Commit fe3faa5 introduced a lock file for network create/rm calls.
There is a problem with the location of the lock file. The lock file was
stored in the tmpdir. Running multiple podman network create/remove
commands in parallel with different tmpdirs made the lockfile inaccessible
to the other process, and so parallel read/write operations to the cni
config directory continued to occur. This scenario happened frequently
during the e2e tests and caused some flakes.

Fixes containers#9041

Signed-off-by: Paul Holzinger <[email protected]>
iwita pushed a commit to iwita/podman that referenced this issue Jan 26, 2021
github-actions bot added the locked - please file new issue/PR label Sep 22, 2023
github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023