network-online.target is reached before network is online #35567

Open
Baughn opened this issue Feb 25, 2018 · 7 comments
Labels
0.kind: bug Something is broken 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS

Comments

@Baughn
Contributor

Baughn commented Feb 25, 2018

Issue description

This isn't fully debugged, but I'll share what I know so far:

Services that depend on reaching the internet, e.g. ZNC, should be marked as depending on systemd's network-online.target, to prevent them from coming up too early.

When multiple network addresses are expected, it appears that the target is reached as soon as any one of them is brought up. For the particular case that bit me, this was a statically configured v6 address combined with a DHCP'd v4 address, with ZNC as a user unit. ZNC started the moment the v6 address was configured, multiple seconds before IPv4 access was available.

In the particular case of ZNC, once it has started, it'll never look for any network interfaces that come up later -- so if the IPv4 interface is down, it'll never be able to reach any v4-only IRC networks unless manually restarted.

I don't know what would happen if there are multiple distinct NICs on the system, but network-online should only be reached once all interfaces and addresses are fully configured.
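
For reference, the dependency described above might be declared roughly like this. This is a minimal sketch, assuming ZNC runs as a system service whose unit is named `znc` (the unit name is an assumption, not taken from the configuration below):

```nix
{
  # Assumption: the service unit is called "znc".
  systemd.services.znc = {
    # Order the service after the target and pull the target in,
    # so the service is not started before the target is reached.
    after = [ "network-online.target" ];
    wants = [ "network-online.target" ];
  };
}
```

Note that this only delays startup until the target is reached; it does not change when the target itself is considered reached, which is the problem reported here.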

Steps to reproduce

Here's a listing of every possibly relevant configuration element:

  networking.hostName = "madoka";
  networking.hostId = "f7fcf93e";
  # networking.defaultGateway = "138.201.133.1";
  # Doesn't work due to missing interface specification.
  # networking.defaultGateway6 = "fe80::1";
  networking.localCommands = ''
    ${pkgs.nettools}/bin/route -6 add default gw fe80::1 dev enp0s31f6 || true
  '';
  networking.nameservers = [ "8.8.8.8" "8.8.4.4" ];
  networking.interfaces.enp0s31f6 = {
    ip6 = [{
      address = "2a01:4f8:172:3065::2";
      prefixLength = 64;
    }];
  };
  networking.firewall = {
    allowPing = true;
    allowedTCPPorts = [ 
      80 443  # Web-server
      25565 25566 25567  # Minecraft
      4000  # ZNC
      12345  # JMC's ZNC
    ];
    allowedUDPPorts = [
      34197 # Factorio
    ];
  };
  networking.nat = {
    enable = true;  # For mediawiki.
    externalIP = "138.201.133.39";
    externalInterface = "enp0s31f6";
    internalInterfaces = [ "ve-eln-wiki" ];
  };

Technical details

Please run nix-shell -p nix-info --run "nix-info -m" and paste the
results.

 - system: `"x86_64-linux"`
 - host os: `Linux 4.9.81, NixOS, 18.03pre-git (Impala)`
 - multi-user?: `yes`
 - sandbox: `relaxed`
 - version: `nix-env (Nix) 1.11.16`
 - channels(root): `"nixos-18.03pre125750.a6dca042722"`
 - nixpkgs: `/etc/nix-system-pkgs`

Amusing aside

This originally manifested as the IRCCloud Android client repeatedly reconnecting to their servers. It turned out there was a bug in IRCCloud's error parsing for ZNC, causing their server to repeatedly crash whenever ZNC sent an error about being unable to reach systemnet, which of course led to never seeing that error.

So in addition to this bug, there were two others -- IRCCloud's, and ZNC's -- combining forces to throw me out of IRC. Debugging this was fun.

@abbradar
Member

cc @fpletz. Related: #35141

@fpletz
Member

fpletz commented Feb 26, 2018

This is the default behaviour of dhcpcd -w. It will wait for any address to be configured. Back when I added support for network-online.target I thought that this was the best solution because the host could either have IPv4 or IPv6 connectivity. If there are multiple NICs it will also wait for only one address to be configured.

dhcpcd also has the --waitip [4|6] option to wait for either an IPv4 or an IPv6 address. As we don't know which protocol is expected from the current options in the module system, it may wait indefinitely or fail due to a systemd timeout. This will also not work for multiple interfaces.
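
As a rough illustration of the --waitip approach, the equivalent dhcpcd.conf directive could be passed through `networking.dhcpcd.extraConfig`. This is only a sketch and assumes the reporter's situation, where IPv4 is the address obtained via DHCP; as noted above, on a host that never gets an IPv4 lease this may block until a timeout:

```nix
{
  # Assumption: IPv4 is provided by dhcpcd, IPv6 is configured statically.
  # "waitip 4" is the dhcpcd.conf counterpart of the --waitip 4 flag:
  # dhcpcd waits for an IPv4 address before forking to the background.
  networking.dhcpcd.extraConfig = ''
    waitip 4
  '';
}
```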

networkd, on the other hand, will wait for all interfaces it manages to be configured before activating network-online.target. It is also possible to configure interfaces to be ignored for network-online.target (RequiredForOnline=).
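
A sketch of what the networkd variant might look like, reusing the interface name and address from the report. The unit name `10-wan` and the choice of `routable` are assumptions for illustration, and this has not been tested against the reporter's setup:

```nix
{
  systemd.network.enable = true;

  # Hypothetical unit name; matches the NIC from the original report.
  systemd.network.networks."10-wan" = {
    matchConfig.Name = "enp0s31f6";
    # Static IPv6 address plus IPv4 via DHCP, as in the report.
    address = [ "2a01:4f8:172:3065::2/64" ];
    networkConfig.DHCP = "ipv4";
    # Count this link towards network-online.target once it is routable.
    # Setting this to "no" instead would make the target ignore the link.
    linkConfig.RequiredForOnline = "routable";
  };
}
```

Whether this waits for both address families depends on how networkd judges the link's state, so treat it as a starting point rather than a fix.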

@fpletz fpletz added 0.kind: bug Something is broken 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS labels Feb 26, 2018
@fpletz fpletz added this to the 18.03 milestone Feb 26, 2018
@fpletz
Member

fpletz commented Feb 26, 2018

Regarding your config: Note that there is also networking.defaultGateway6.address and networking.defaultGateway6.interface.
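
Applied to the configuration from the report, that would replace the `networking.localCommands` workaround with something like the following (a sketch using the gateway and interface values quoted above):

```nix
{
  networking.defaultGateway6 = {
    address = "fe80::1";
    # A link-local gateway needs an explicit interface.
    interface = "enp0s31f6";
  };
}
```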

@matthewbauer matthewbauer modified the milestones: 18.03, 18.09 Apr 17, 2018
@matthewbauer matthewbauer modified the milestones: 18.09, 19.03 Nov 5, 2018
@matthewbauer matthewbauer modified the milestones: 19.03, 19.09 May 27, 2019
@fpletz fpletz modified the milestones: 19.09, 20.03 Nov 17, 2019
@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/boot-faster-by-disabling-udev-settle-and-nm-wait-online/6339/1

@stale

stale bot commented Sep 17, 2020

Hello, I'm a bot and I thank you in the name of the community for opening this issue.

To help our human contributors focus on the most-relevant reports, I check up on old issues to see if they're still relevant. This issue has had no activity for 180 days, and so I marked it as stale, but you can rest assured it will never be closed by a non-human.

The community would appreciate your effort in checking if the issue is still valid. If it isn't, please close it.

If the issue persists, and you'd like to remove the stale label, you simply need to leave a comment. Your comment can be as simple as "still important to me". If you'd like it to get more attention, you can ask for help by searching for maintainers and people that previously touched related code and @ mention them in a comment. You can use Git blame or GitHub's web interface on the relevant files to find them.

Lastly, you can always ask for help at our Discourse Forum or at #nixos' IRC channel.

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Sep 17, 2020
@veprbl veprbl removed this from the 20.03 milestone May 31, 2021
@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label May 31, 2021
@Baughn
Contributor Author

Baughn commented Feb 12, 2022

I've since come to believe that this is really a bug in ZNC, and there's no way I can think of to solve it at the NixOS level. So I'd suggest closing as infeasible.

@AleXoundOS
Contributor

As I remember, the workaround is to call systemctl try-reload-or-restart for needed services in something like networking.dhcpcd.runHook.
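
A minimal sketch of that workaround, assuming the affected unit is named `znc.service` (an assumption) and that the hook runs with enough privileges to call systemctl, which may not hold on every NixOS release:

```nix
{ pkgs, ... }:
{
  networking.dhcpcd.runHook = ''
    # dhcpcd exports $reason to its hooks; BOUND means a new lease was obtained.
    if [ "$reason" = BOUND ]; then
      # Restart (or reload) the service only if it is already running.
      ${pkgs.systemd}/bin/systemctl try-reload-or-restart znc.service
    fi
  '';
}
```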


7 participants