Restarting socket units #73853

Open
domenkozar opened this issue Nov 21, 2019 · 11 comments
Labels
0.kind: bug Something is broken 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS

Comments

@domenkozar
Member

As mentioned in d37458a#diff-0a057d6ff3f6f83f68b859178484f4feR233,
I'd argue the current behavior of restarting sockets is wrong.

Motivation

One of the features of socket activation (per Lennart):

The service's IP socket stays continuously connectable, no connection attempt ever fails, and all connects will be processed promptly.

Existing behavior

If a unit changes, its socket is restarted alongside the unit.

Expected behavior

If a unit changes, its socket is not restarted.

cc @abbradar

@domenkozar domenkozar added 0.kind: bug Something is broken 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS labels Nov 21, 2019
@pasqui23
Contributor

Wouldn't this, in long-running systems, leave an obsolete (and potentially vulnerable) daemon running even though the user thinks their system is updated?

@pasqui23
Contributor

Also, what software is this breaking exactly?

@domenkozar
Member Author

domenkozar commented Nov 21, 2019

Wouldn't this, in long-running systems, leave an obsolete (and potentially vulnerable) daemon running even though the user thinks their system is updated?

I'm not sure I understand the question. The daemon (unit) would restart; only the socket wouldn't.
Nothing would be outdated. There's a separate issue about restarting the socket when it changes, but that's out of scope for this bug.

Also, what software is this breaking exactly?

Any software that wants to have 100% uptime across changes.

@stale

stale bot commented Jun 1, 2020

Thank you for your contributions.
This has been automatically marked as stale because it has had no activity for 180 days.
If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.
Here are suggestions that might help resolve this more quickly:

  1. Search for maintainers and people that previously touched the
    related code and @ mention them in a comment.
  2. Ask on the NixOS Discourse.
  3. Ask on the #nixos channel on
    irc.freenode.net.

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 1, 2020
@pennae
Contributor

pennae commented Oct 22, 2021

@dasJ just to confirm: #141192 fixes exactly this, right?

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Oct 22, 2021
@dasJ
Member

dasJ commented Oct 22, 2021

With a lot of bells and whistles, yes. I'm not doing exactly what Domen is suggesting here; rather, I'm restarting sockets and stopping associated services, both conditional on what changed and how the units are configured.

@dasJ
Member

dasJ commented Oct 22, 2021

Now that I have a proper keyboard, it's roughly:

  • Changing the socket restarts it
  • Changing the socket while the service is active restarts it and stops the service
  • Changing the service does nothing when the service is not active
  • Changing the service while it is active stops it but leaves the socket untouched
  • Changing both while the service is active stops the service and restarts the socket

This may not always work because there is a race in systemd that we cannot fix, but we output a message if that happens.
The race happens when the service should be stopped and the socket should be restarted: if the service gets re-activated between these two steps, restarting the socket fails.
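
For readers who want the rules above in one place, here is a minimal sketch of them as a decision function. It is not the actual switch-to-configuration code; the function name `plan` and the action strings are made up for illustration, and the "both changed while the service is inactive" case is an assumption derived from the first and third rules.

```python
def plan(socket_changed, service_changed, service_active):
    """Return the ordered actions for one socket/service pair (illustrative only)."""
    actions = []
    # An active service is stopped whenever either unit changed; the socket
    # can activate it again on the next incoming connection.
    if service_active and (socket_changed or service_changed):
        actions.append("stop service")
    # The socket is restarted only when the socket unit itself changed.
    if socket_changed:
        actions.append("restart socket")
    return actions
```

When both actions are returned, the stop comes before the restart, and the window between the two steps is where the race described above can occur.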

@pennae
Contributor

pennae commented Oct 22, 2021

thanks for explaining! :) probably better to keep this open then until the race can be resolved somehow.

@dasJ
Member

dasJ commented Oct 22, 2021

The D-Bus API design of systemd makes it feel like this will never be resolved, but we can wait for the stale bot to close this 🤷‍♂️

@stale

stale bot commented Apr 25, 2022

I marked this as stale due to inactivity. → More info

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Apr 25, 2022
@c4710n
Contributor

c4710n commented May 24, 2023

As @domenkozar said, there is indeed a problem when restarting socket-activated services.

The meaning of socket activation is:

If a service dies, its listening socket stays around, not losing a single message. After a restart of the crashed service it can continue right where it left off. If a service is upgraded we can restart the service while keeping around its sockets, thus ensuring the service is continuously responsive. Not a single connection is lost during the upgrade.
-- http://0pointer.de/blog/projects/socket-activation.html
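
To make the mechanism concrete, here is a minimal sketch of the service side of socket activation, assuming a stream socket with the default `Accept=no` and the `LISTEN_PID` / `LISTEN_FDS` environment convention documented in sd_listen_fds(3). The helper names `inherited_fds` and `serve` are hypothetical. The point is that the service never binds the socket itself, so the listening socket can outlive any individual service process.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first passed descriptor, per the sd_listen_fds(3) convention


def inherited_fds(environ, pid):
    """Return the descriptor numbers passed to process `pid`, or [] if none."""
    # LISTEN_PID guards against a child process inheriting the variables by accident.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    count = int(environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def serve():
    """Accept connections on the inherited listening socket (never returns)."""
    fds = inherited_fds(os.environ, os.getpid())
    if not fds:
        raise SystemExit("not socket-activated: no descriptors were passed in")
    # Wrap the already-listening socket; no bind() or listen() happens here.
    listener = socket.socket(fileno=fds[0])
    while True:
        conn, _addr = listener.accept()
        with conn:
            conn.sendall(b"hello\n")
```

A service started from the matching *.service unit would call `serve()`. Connections that arrive while the service is stopped wait in the socket's backlog, which is exactly what is lost if the *.socket unit is restarted.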

If we restart the *.socket when restarting the *.service, the messages queued in the listening socket are dropped, and we get no benefit from socket activation.

If I understand this code right, we restart every *.socket related to a *.service whenever the *.service changes.

IMHO, a better strategy should be:

  1. If a *.socket is added to the current configuration, then start it.
  2. If a *.socket is changed, then restart it.
  3. If a *.socket is not changed, then do nothing.
  4. If a *.socket is removed from the current configuration, then stop it.
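
The four cases above could be sketched roughly as follows. This is only an illustration, not the existing activation script: `socket_actions` is a hypothetical helper, and it assumes each configuration is available as a mapping from unit name to unit file contents.

```python
def socket_actions(old_units, new_units):
    """Map each affected *.socket name to 'start', 'restart', or 'stop'.

    old_units / new_units: dict of unit name -> unit file contents (assumed input shape).
    Unchanged sockets get no entry, meaning "do nothing".
    """
    actions = {}
    for name in sorted(set(old_units) | set(new_units)):
        if not name.endswith(".socket"):
            continue  # changes to *.service units never touch the socket here
        if name not in old_units:
            actions[name] = "start"
        elif name not in new_units:
            actions[name] = "stop"
        elif old_units[name] != new_units[name]:
            actions[name] = "restart"
    return actions
```

Note that a changed *.service does not appear in the result at all, which is the decoupling being argued for.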

Currently, I use a workaround to keep the *.socket running: creating the *.socket and the *.service with different names, for example:

Generated my-app-persistent.socket:

# ...

[Socket]
# link to the service
Service=my-app.service

# ...

Generated my-app.service:

# ...
[Unit]
# the next three lines link to the socket
After=network.target my-app-persistent.socket
BindsTo=my-app-persistent.socket
Requires=network.target my-app-persistent.socket
# ...

The basic idea is to decouple the socket and the service so that Nix cannot directly analyze their
relationship (hence Nix can do nothing), and then to establish the relationship manually.

This is ugly but works.

I am still a beginner with Nix, and I hope to be able to fix this issue one day.

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label May 24, 2023
5 participants