Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dns: wait for aardvark server just after spawning aardvark-dns. #300

Closed

Conversation

flouthoc
Copy link
Collaborator

Since starting DNS server is the last task for netavark's network setup so netavark must verify at least a few times if aardvark-dns is
in a state where it can serve request before passing context back to
container manager.

Potential fix: containers/podman#14173

@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 20, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: flouthoc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@flouthoc flouthoc requested a review from Luap99 May 20, 2022 10:13
@flouthoc flouthoc force-pushed the netavark-wait-for-aardvark branch 2 times, most recently from 34d100e to 76045b7 Compare May 20, 2022 10:54
Since starting DNS server is the last task for netavark's `network
setup` so netavark must verify at least a few times if aardvark-dns is
in a state where it can serve.

Signed-off-by: Aditya R <[email protected]>
@flouthoc flouthoc force-pushed the netavark-wait-for-aardvark branch from 76045b7 to 7b394da Compare May 20, 2022 11:57
@Luap99
Copy link
Member

Luap99 commented May 23, 2022

I don't think this is the right way. There is still no guarantee that this waits for aardvark to be ready.

@flouthoc
Copy link
Collaborator Author

flouthoc commented May 23, 2022

@Luap99 Yes as of now it only waits for while and then proceeds. I am just afraid that if we wait for infinite duration and aardvark fails this netavark would end up in blocking state.

I can think of two options. Do you have any suggestion other than these.

  • Wait till aardvark pid shows up and not break on time up limit.
  • Perform a dummy dns request from netavark to aardvark and proceed if true.

while retry_count > 0 {
let aardvark_pid = self.get_aardvark_pid();
if aardvark_pid != -1
&& signal::kill(Pid::from_raw(aardvark_pid), Signal::SIGHUP).is_ok()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid SIGHUP, send signal 0 if possible - it takes no action on the target process, just returns if the process is alive.

@mheon
Copy link
Member

mheon commented May 23, 2022

Waiting for a basic DNS request to process (with a time limit) seems like the best course of action.

{
break;
}
let duration_millis = time::Duration::from_millis(500);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

500ms is a little coarse. Recommend adjusting down to 250ms and doubling retries.

@Luap99
Copy link
Member

Luap99 commented May 23, 2022

I think the best thing to do is do a fork on aardvark side. Basically netavark spawns aardvark and then waits for exit. aardvark then sets up everything and once it is ready if forks itself and the parent exits so netavark know it is ready.

This will also fix issues with the error reporting since we can keep the stderr open until the fork.

This however does not fix the issue when adding new entries. In this case we still would need a bidirectional way to update so aardvark could signal us that it is ready.

@flouthoc
Copy link
Collaborator Author

I think we could this i am trying out a spike here: containers/aardvark-dns#148 and we might need few changes on netavark end as well.

@Luap99
Copy link
Member

Luap99 commented Jul 22, 2022

we went with #308 so I am closing this one

@Luap99 Luap99 closed this Jul 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Aardvark/Netavark flakes
3 participants