Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

main: run aardvark as a daemon via forking and maintain its partial daemon like nature. #148

Merged
merged 1 commit into from
Jul 14, 2022

Conversation

flouthoc
Copy link
Collaborator

@flouthoc flouthoc commented Jun 14, 2022

This is one the redesign PR for aardvark-dns and netavark. Design
proposes that forking will happen at aardvark-end instead of
netavark and aarvark verifies validity of server before returning
control to aardvark.

Redesign proposal

  • Aardvark will invoke server on the child process by double-forking.

  • Parent waits for child to show up and verify against a dummy DNS
    query to check if server is running.

  • Exit parent on success and deatch child.

  • Calling process will wait for aardvark's parent process to return.

  • On successful return from parent it will be assumed that aardvark is
    running properly

One new design is implemented and merged and netavark starts using this
it should close

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 14, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: flouthoc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

src/commands/run.rs Outdated Show resolved Hide resolved
@flouthoc flouthoc force-pushed the support-double-forking branch from 780f281 to 29eac3c Compare June 14, 2022 10:04
Copy link
Member

@Luap99 Luap99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you are misunderstanding what I want.
The goal is to daemonize ourself, see daemon(3). exec() is not required.

src/commands/run.rs Outdated Show resolved Hide resolved
src/commands/run.rs Outdated Show resolved Hide resolved
src/commands/run.rs Outdated Show resolved Hide resolved
src/main.rs Outdated Show resolved Hide resolved
@flouthoc
Copy link
Collaborator Author

It looks like you are misunderstanding what I want. The goal is to daemonize ourself, see daemon(3). exec() is not required.

Yeah can do that well, exec might not be needed. We can just continue with actual serve logic in the child fork.

@flouthoc flouthoc force-pushed the support-double-forking branch from 29eac3c to aa61a9c Compare June 15, 2022 10:15
@flouthoc flouthoc changed the title [WIP] main: run aardvark as a daemon via double-forking and maintain its partial daemon like nature. [WIP] main: run aardvark as a daemon via forking and maintain its partial daemon like nature. Jun 15, 2022
@flouthoc flouthoc marked this pull request as ready for review June 15, 2022 10:17
@flouthoc flouthoc changed the title [WIP] main: run aardvark as a daemon via forking and maintain its partial daemon like nature. main: run aardvark as a daemon via forking and maintain its partial daemon like nature. Jun 15, 2022
flouthoc added a commit to flouthoc/netavark that referenced this pull request Jun 15, 2022
Netavark now does not double-forks aardvark-dns's server instead it
waits for aardvark-process to return back and success return means
aardvark-dns is ready to serve requests and now forking happens at
aardvark end.

This needs: containers/aardvark-dns#148

Signed-off-by: Aditya R <[email protected]>
flouthoc added a commit to flouthoc/netavark that referenced this pull request Jun 15, 2022
Netavark now does not double-forks aardvark-dns's server instead it
waits for aardvark-process to return back and success return means
aardvark-dns is ready to serve requests and now forking happens at
aardvark end.

This needs: containers/aardvark-dns#148

Signed-off-by: Aditya R <[email protected]>
@flouthoc flouthoc force-pushed the support-double-forking branch 2 times, most recently from f3d9392 to b08879c Compare June 15, 2022 10:46
Ok(ForkResult::Parent { child, .. }) => {
log::debug!("starting aardvark on a child with pid {}", child);
// verify aardvark here and block till all the ip are ready
match config::parse_configs(&input_dir) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should try to query the server.

If we want to sync with the child we could create a pipe and make a blocking read in the parent and a write in the child directly before the child starts listening on the socket.

Copy link
Collaborator Author

@flouthoc flouthoc Jun 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is again that it does not ensures e2e testing on the actual interface and does not ensures if something is taking time after it performs listening. We can use pipe but current check ensure that we actually create a DNS client and check against it, is there any problem if we use this approach for verification ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just doesn't feel right to ensure this by making a query. First this adds a hardcoded value with a special meaning which leads to unexpected behaviour when a user for some reason ends up using this name. It also adds a lot of overhead.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Luap99 I see but one issue is that we have different servers running in different threads and will have to ensure that each one of them is running. My intention is to only return from parent when event loop for all the servers have already started which happens after we start listening and its a blocking call so we cannot notify after that with unix pipe approach we will end up notifying parent before even event loop starts but i doubt that this delay would matter so I don't have a strict opinion here and we could switch to unix pipe approach if everyone is fine with that so no issues. @mheon Do you have any opinion here ?

Copy link
Collaborator Author

@flouthoc flouthoc Jun 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched this a unix pipe blocking approach.

src/main.rs Outdated Show resolved Hide resolved
src/commands/run.rs Show resolved Hide resolved
@flouthoc flouthoc force-pushed the support-double-forking branch from b08879c to 7b8dd89 Compare June 15, 2022 13:01
flouthoc added a commit to flouthoc/netavark that referenced this pull request Jun 15, 2022
Netavark now does not double-forks aardvark-dns's server instead it
waits for aardvark-process to return back and success return means
aardvark-dns is ready to serve requests and now forking happens at
aardvark end.

This needs: containers/aardvark-dns#148

Signed-off-by: Aditya R <[email protected]>
@flouthoc flouthoc force-pushed the support-double-forking branch from 7b8dd89 to 0441806 Compare June 15, 2022 13:22
src/commands/run.rs Outdated Show resolved Hide resolved
src/commands/run.rs Outdated Show resolved Hide resolved
@flouthoc flouthoc force-pushed the support-double-forking branch 3 times, most recently from 4107b83 to 383e425 Compare June 16, 2022 07:50
@flouthoc flouthoc requested a review from Luap99 June 16, 2022 07:52
@edsantiago
Copy link
Member

Since this seems to be the clearinghouse for all the different Aardvark CI flakes, here are the ones seen in the last fifteen days in podman:

Podman run networking [It] Aardvark Test 2: Two containers, same subnet

Podman run networking [It] Aardvark Test 5: Two containers on their own subnets, one container on both

Podman run networking [It] Aardvark Test 3: Two containers, same subnet w/aliases

Podman run networking [It] podman run check dnsname plugin with Netavark

Podman run networking [It] Aardvark Test 6: Three subnets, first container on 1/2 and second on 2/3, w/ network aliases

Podman run networking [It] Aardvark Test 1: One container

Copy link
Member

@Luap99 Luap99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit otherwise LGTM
We should definitely put this and the netavark change in the podman CI to make sure everything still works before merging either PR.

@cevich Is it possible to use netavark/aardvark build from this PR in the podman CI?

// fork and verify if server is running
// and exit parent
// setsid() ensures that there is no controlling terminal on the child process
match unsafe { fork() } {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this unsafe?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nix's fork is returns unsafe most likely because it invokes fork( directly in such cases rust compiler will bark however our use-case is okay here so all such functions can be invoked under unsafe. See https://docs.rs/nix/latest/nix/unistd/fn.fork.html and this https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html

But invocation is fine for our use-case.

@cevich
Copy link
Member

cevich commented Jul 12, 2022

@cevich Is it possible to use netavark/aardvark build from this PR in the podman CI?

Yes, but only with some manual intervention[*]. Here's what I would suggest (in podman):

  1. Check out main
  2. Edit .cirrus.yml and comment out the podman_machine task (workaround a bug: fix expected in a day or less).
  3. Use hack/get_ci_vm.sh int podman fedora-36 root host (or one of the other tasks if you prefer).
  4. Do an rpm -e --nodeps aardvark-dns netavark to strip out the distro-provided files w/o also removing any dependencies.
  5. Manually download the latest netavark and aardvark builds:
    1. curl -o /tmp/netavark.zip https://api.cirrus-ci.com/v1/artifact/github/containers/netavark/success/binary.zip
    2. curl -o /tmp/aardvark-dns.zip https://api.cirrus-ci.com/v1/artifact/github/containers/aardvark-dns/success/binary.zip
  6. Unzip those files and manually copy the binaries into place.
  7. Execute the podman tests (if desired) cd $GOSRC; ./contrib/cirrus/runner.sh (should "just work").

[*]Note: If you expect to be doing this regularly, this semi-manual method will quickly become a PITA. In this case, I would propose new CI work for podman. For example, we make setup_environment.sh sensitive to a PR label, and if present perform the steps above automatically.

@Luap99
Copy link
Member

Luap99 commented Jul 12, 2022

Manually download the latest netavark and aardvark builds:

I want the one from this PR and containers/netavark#300, we should verify that this works before merging.

I don't plan on doing this regular, I just want to be sure this passes podman CI

@cevich
Copy link
Member

cevich commented Jul 12, 2022

I want the one from this PR and...

Ahh okay, this can be done in a similar way, the URL is simply different:

  1. Click on the 'success' for each, hit the 'view more details in Cirrus-CI' link.
  2. Look at the URL, it will have a "task ID" (number) at the end. Copy this.
  3. In your VM, use a URL of the form: https://api.cirrus-ci.com/v1/artifact/task/<CIRRUS_TASK_ID>/binary.zip
  4. Do the same thing to get the URL for the other repo's CI run.

Does that make sense?

(Let me know if it doesn't work, I didn't actually try it this time).

@Luap99
Copy link
Member

Luap99 commented Jul 12, 2022

Makes sense I will try it tomorrow.

This is one the redesign PR for aardvark-dns and netavark. Design
proposes that  forking will happen at aardvark-end instead of
netavark and aardvark will verify if servers are up before parent goes
away.

Redesign proposal

* Aardvark will invoke server on the child process by  forking.
* Parent waits for child to show up  and verify against a dummy DNS
  query to check if server is running.
* Exit parent on success and deatch child.

* Calling process will wait for aardvark's parent process to return.
* On successful return from parent it will be assumed that aardvark is
  running properly

One new design is implemented and merged and netavark starts using this
it should close

* containers/podman#14173
* containers/podman#14171

Signed-off-by: Aditya R <[email protected]>
@flouthoc flouthoc force-pushed the support-double-forking branch from 383e425 to 631a2b6 Compare July 13, 2022 06:19
@flouthoc
Copy link
Collaborator Author

Rebased.

flouthoc added a commit to flouthoc/netavark that referenced this pull request Jul 13, 2022
Netavark now does not double-forks aardvark-dns's server instead it
waits for aardvark-process to return back and success return means
aardvark-dns is ready to serve requests and now forking happens at
aardvark end.

This needs: containers/aardvark-dns#148

Signed-off-by: Aditya R <[email protected]>
@Luap99
Copy link
Member

Luap99 commented Jul 13, 2022

OK I checked both the podman integration and system test with this and they passed, LGTM

@baude @mheon PTAL

@baude
Copy link
Member

baude commented Jul 14, 2022

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Jul 14, 2022
@openshift-ci openshift-ci bot merged commit 6ba6818 into containers:main Jul 14, 2022
flouthoc added a commit to flouthoc/netavark that referenced this pull request Jul 14, 2022
Netavark now does not double-forks aardvark-dns's server instead it
waits for aardvark-process to return back and success return means
aardvark-dns is ready to serve requests and now forking happens at
aardvark end.

This needs: containers/aardvark-dns#148

Signed-off-by: Aditya R <[email protected]>
flouthoc added a commit to flouthoc/netavark that referenced this pull request Jul 15, 2022
Netavark now does not double-forks aardvark-dns's server instead it
waits for aardvark-process to return back and success return means
aardvark-dns is ready to serve requests and now forking happens at
aardvark end.

This needs: containers/aardvark-dns#148

Signed-off-by: Aditya R <[email protected]>
flouthoc added a commit to flouthoc/netavark that referenced this pull request Jul 15, 2022
Netavark now does not double-forks aardvark-dns's server instead it
waits for aardvark-process to return back and success return means
aardvark-dns is ready to serve requests and now forking happens at
aardvark end.

This needs: containers/aardvark-dns#148

Signed-off-by: Aditya R <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants