Aardvark/Netavark flakes #14173
Comments
This is flaking a lot lately. Can a network person PTAL?
I am unable to reproduce this flake on my host. I am not entirely sure, but I think I know why this happens in CI. One reason is below, but there could be more. One root cause could be that I think adding a backoff on the netavark end should help here. Something like this: #14173 The above case is specifically for the test
PS: Also, while running the tests on my host I am only running
Podman run networking [It] Aardvark Test 2: Two containers, same subnet
This is one of the redesign PRs for aardvark-dns and netavark. The design proposes that double-forking will happen on the aardvark end instead of in netavark.

Redesign proposal
* Aardvark will invoke the server in a child process by double-forking.
* The parent waits for the child to show up and verifies it with a dummy DNS query to check that the server is running.
* On success, the parent exits and detaches the child.
* The calling process waits for aardvark's parent process to return.
* On a successful return from the parent, it is assumed that aardvark is running properly.

Once the new design is implemented and merged, and netavark starts using it, it should close
* containers/podman#14173
* containers/podman#14171

Signed-off-by: Aditya R <[email protected]>
There is a bigger design change under discussion that should help with this as well as some other issues; please track it here. However, it will take some time to get merged: containers/aardvark-dns#148
This is one of the redesign PRs for aardvark-dns and netavark. The design proposes that forking will happen on the aardvark end instead of in netavark, and that aardvark will verify that the servers are up before the parent goes away.

Redesign proposal
* Aardvark will invoke the server in a child process by forking.
* The parent waits for the child to show up and verifies it with a dummy DNS query to check that the server is running.
* On success, the parent exits and detaches the child.
* The calling process waits for aardvark's parent process to return.
* On a successful return from the parent, it is assumed that aardvark is running properly.

Once the new design is implemented and merged, and netavark starts using it, it should close
* containers/podman#14173
* containers/podman#14171

Signed-off-by: Aditya R <[email protected]>
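To make the proposed startup handshake concrete, here is a minimal Python sketch of the idea. It is not the aardvark-dns implementation: it uses a single fork, and a UDP echo server stands in for the DNS server, so the "dummy query" is just a short datagram. The names `serve`, `probe`, and `start_and_verify` are hypothetical and exist only for illustration.

```python
import os
import socket
import time


def serve(port: int) -> None:
    """Child side: stand-in for the DNS server; echoes every datagram back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("127.0.0.1", port))
        while True:
            data, peer = sock.recvfrom(512)
            sock.sendto(data, peer)


def probe(port: int, timeout: float = 0.2) -> bool:
    """Parent side: send one dummy query and report whether a reply arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(b"ping", ("127.0.0.1", port))
            return sock.recvfrom(512)[0] == b"ping"
        except OSError:  # covers timeouts and "port unreachable" errors
            return False


def start_and_verify(port: int, attempts: int = 20) -> int:
    """Fork the server and return its pid only once it answers a probe."""
    pid = os.fork()
    if pid == 0:
        os.setsid()  # detach the child from the caller's session
        try:
            serve(port)
        finally:
            os._exit(1)  # never fall back into the caller's code
    for _ in range(attempts):
        if probe(port):
            return pid  # server is verified; the caller may proceed
        time.sleep(0.05)
    # The server never answered: clean up and report the failure.
    os.kill(pid, 9)
    os.waitpid(pid, 0)
    raise RuntimeError("server did not answer the probe")
```

The point of the design is that the caller only sees a successful return after the server has answered at least one query, so a container started right afterwards should not race against server startup.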
The netavark/aardvark changes are fine, but I don't think they are required to fix this flake in CI. The test already "tries" to implement backoff logic: podman/test/e2e/common_test.go Lines 1045 to 1062 in 6dffa45
This does not work because dig always returns exit code 0 even when there is no match.
The retry logic in digshort() did not work because dig always exits with 0 even when the domain name is not found. To make it work we have to check the standard output.

We are working on fixing the underlying issue in aardvark/netavark, but this will take more time.

Fixes containers#14173
Fixes containers#14171

Signed-off-by: Paul Holzinger <[email protected]>
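As an illustration of the fix described above, the following Python sketch retries a lookup and decides success from the command's standard output rather than its exit code. It is not the code from common_test.go: `run_dig_short` and `dig_with_retry` are hypothetical helpers, and the doubling delay is an assumed backoff policy.

```python
import subprocess
import time
from typing import Callable, Tuple


def run_dig_short(server: str, name: str) -> Tuple[int, str]:
    """Run `dig +short NAME @SERVER` and return (exit code, stdout)."""
    proc = subprocess.run(
        ["dig", "+short", name, f"@{server}"],
        capture_output=True, text=True, check=False,
    )
    return proc.returncode, proc.stdout


def dig_with_retry(query: Callable[[], Tuple[int, str]],
                   expected_ip: str,
                   attempts: int = 6,
                   delay: float = 1.0,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Retry `query` until its output contains `expected_ip`."""
    for attempt in range(attempts):
        code, out = query()
        # Exit code 0 alone proves nothing here, so inspect the answer itself.
        if code == 0 and expected_ip in out.split():
            return True
        if attempt < attempts - 1:
            sleep(delay)
            delay *= 2  # assumed policy: double the wait after each miss
    return False
```

A caller would pass something like `lambda: run_dig_short(server, name)` as `query`, where `server` and `name` are the DNS server address and container name under test. Taking the query as a callable keeps the retry logic testable without a real DNS server.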
The netavark/aardvark change is meant for more scenarios, like #14356 (comment). I am not able to collect all the references, but it was reported a few times that DNS resolution failed intermittently on the first container run. But we can close these issues early if the dig output match is fixed.
I'm still seeing flakes on this test, and I'm somewhat sure that it's after the "fix" merged. Error message seems different:
Shall I file a new issue? Podman run networking [It] Aardvark Test 2: Two containers, same subnet
Yeah, something is wrong. I have to add more debugging information, but at this point it does not look like a race. The timeout should be more than long enough.
New set of symptoms in remote f36 root:
The surprising thing about this is that a lot of tests fail, and all of them have the same presumably-random string. I would expect each test to have a unique string. Whoever groks the random-netns-name code, that might be a good starting place to look.
Renaming & repurposing this issue as a general place to log *vark flakes
This should not affect netavark/aardvark at all; using a custom net namespace will never call into the network backend.
Another one, this time podman (non-remote) f36 root, same "file exists" symptom
A friendly reminder that this issue had no activity for 30 days.
Last seen November 10. I will assume this is fixed.
Podman run networking [It] Aardvark Test 2: Two containers, same subnet