Alternative (non-CLI) Socket Handover Mechanisms #2

Open
Ferdi265 opened this issue Jul 22, 2024 · 22 comments
Labels
enhancement New feature or request question Further information is requested

Comments

@Ferdi265
Owner

Ferdi265 commented Jul 22, 2024

The current handover mechanism is based on the CLI options that Kwin uses for socket handover and not on any agreed-upon standard. These CLI arguments are inconsistently named (--socket vs --wayland-fd) and also have a CLI option format that not all compositors share (River uses single dash CLI options, cosmic-comp currently has no CLI parsing at all).

This means that there is a need for a different socket handover mechanism. Options include:

  • CLI-based, keep the Kwin option naming
    • pro: Kwin and Hyprland already have support
    • con: inconsistent naming
    • con: forces a specific kind of CLI argument format onto compositors
  • CLI-based, but with more consistent naming (--wayland-socket and --wayland-fd)
    • pro: consistent naming
    • con: forces a specific kind of CLI argument format onto compositors
  • Environment-based, Systemd Socket Activation style: LISTEN_PID, LISTEN_FDS, LISTEN_FDNAMES with the fd as file descriptor 3.
    • pro: potential future integration with systemd possible
    • pro: simple to implement even without systemd
    • pro: these specific env vars are starting to become a kind of standard, with varlink also using them
    • con: very generic env var name
    • con: unclear how to pass the wayland socket name (LISTEN_FDNAMES is also used by other fd passing and we should use a generic name like wayland)
  • Environment-based, but with custom env vars: WAYLAND_SOCKET_NAME, WAYLAND_SOCKET_FD
    • pro: even simpler to implement
    • con: different from systemd mechanism
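
To make the systemd-style option more concrete, here is a minimal sketch of what the compositor side of `LISTEN_PID` / `LISTEN_FDS` / `LISTEN_FDNAMES` parsing could look like. The function names `listen_fds` and `wayland_listen_fd` are made up for illustration, the fallback name `"unknown"` for unnamed descriptors is an assumption about how systemd's own helper behaves, and the FD name `wayland` is just the generic name floated above.

```python
import os

SD_LISTEN_FDS_START = 3  # first passed descriptor under the systemd convention


def listen_fds(environ, pid):
    """Return [(name, fd), ...] for descriptors passed via LISTEN_FDS."""
    # LISTEN_PID names the intended recipient; if it is not us, the
    # variables were merely inherited and must be ignored.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    try:
        count = int(environ.get("LISTEN_FDS", "0"))
    except ValueError:
        return []
    raw_names = environ.get("LISTEN_FDNAMES")
    names = raw_names.split(":") if raw_names else []
    result = []
    for i in range(count):
        # Assumption: unnamed descriptors get the placeholder "unknown".
        name = names[i] if i < len(names) else "unknown"
        result.append((name, SD_LISTEN_FDS_START + i))
    return result


def wayland_listen_fd():
    """Hypothetical helper: find the descriptor named "wayland", if any."""
    for name, fd in listen_fds(os.environ, os.getpid()):
        if name == "wayland":
            return fd
    return None
```

Note that this only yields the descriptor; the socket name still has to come from somewhere else, which is the open question in the last "con" of that option.
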
@0x5a4

0x5a4 commented Jul 23, 2024

Here's my 2ct on this:

  1. I feel like maintaining KDE compatibility isn't really necessary, as they have their own tool and anyone using KDE will probably just stick with that. Hyprland's options are so young that breaking them might not be that big of a deal.
  2. I like this, but in regards to River and cosmic it's kind of a non-option?
  3. I've spent around 5 minutes researching systemd socket activation and in my understanding it's meant for on-demand kind of services (like pipewire). I just don't think this is the case with wayland compositors. Maybe if someone runs firefox from the tty and then the compositor would start on demand, but I feel like that is a very, very uncommon situation. (Also the naming is too generic IMO)
  4. I like this.

Even though I dislike environment variables for their implicit behaviour, I think the WAYLAND_SOCKET_NAME and WAYLAND_SOCKET_FD ones are the best option here.

@Ferdi265
Owner Author

Ferdi265 commented Jul 23, 2024

> Here's my 2ct on this:
>
> 1. I feel like maintaining KDE compatibility isn't really necessary, as they have their own tool and anyone using KDE will probably just stick with that. Hyprland's options are so young that breaking them might not be that big of a deal.

Right, makes sense. I would still support this as an alternative to not break compatibility so early.

> 2. I like this, but in regards to River and cosmic it's kind of a non-option?

Yeah, the options are nice, but it doesn't really work in their cases. I would only go for this if some compositors are absolutely against env vars, in which case I would support both env vars and CLI options.

> 3. I've spent around 5 minutes researching systemd socket activation and in my understanding it's meant for on-demand kind of services (like pipewire). I just don't think this is the case with wayland compositors. Maybe if someone runs firefox from the tty and then the compositor would start on demand, but I feel like that is a very, very uncommon situation. (Also the naming is too generic IMO)

Socket activation can also be used to allow dependent services to start up in parallel. I tried to launch Sway as a systemd user service once, and had an issue where sometimes services like kanshi or mako failed to start up because they launched before the wayland socket existed. Socket activation can make sure the socket exists earlier and "buffer" connections until the compositor is ready. I need to look into the specifics, but potentially the socket unit can be made to only be active after a session is created (e.g. a user logs in).

> 4. I like this.
>
> Even though I dislike environment variables for their implicit behaviour, I think the WAYLAND_SOCKET_NAME and WAYLAND_SOCKET_FD ones are the best option here.

Yes, these env vars are basically a direct translation of the CLI options to env vars. Main disadvantage with this over the CLI is that if the compositor doesn't support it, those env vars (and the socket FD!) will leak to all clients in the session. Unknown CLI options are much less likely to be silently ignored (counterexample: cosmic-comp completely ignores its args and the socket would leak even if the CLI option is used if cosmic doesn't support socket handover).
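
For the supporting side, a rough sketch of how a compositor could consume these variables without passing them on to its clients. This is only an illustration under assumptions: `take_wayland_socket` is a made-up name, and nothing here helps with the case described above where the compositor does not know about the mechanism at all.

```python
import os


def take_wayland_socket(environ):
    """Pop the handover variables and return (name, fd), or None if absent."""
    name = environ.pop("WAYLAND_SOCKET_NAME", None)
    fd_text = environ.pop("WAYLAND_SOCKET_FD", None)
    if name is None or fd_text is None:
        return None
    fd = int(fd_text)
    # Mark the descriptor close-on-exec so processes spawned later
    # (the session's clients) do not inherit the listening socket.
    os.set_inheritable(fd, False)
    return name, fd
```

Called as `take_wayland_socket(os.environ)`, the `pop()` calls also remove the variables from the process environment, so children started afterwards see neither the variables nor the descriptor.
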


Thanks for the feedback! I'm still quite interested in what compositor developers have to say to this, since they are ultimately who will have to consume this API.

@stefonarch

> > 1. I feel like maintaining KDE compatibility isn't really necessary, as they have their own tool and anyone using KDE will probably just stick with that. Hyprland's options are so young that breaking them might not be that big of a deal.
>
> Right, makes sense. I would still support this as an alternative to not break compatibility so early.

It could be useful when kwin_wayland is used outside Plasma, as with lxqt-wayland-session. I tested the feature when it came out and it didn't recover from a crash. And there are several ways to make it crash reliably (using hot corners, for example).

@Ferdi265
Owner Author

Ferdi265 commented Jul 23, 2024

> > > 1. I feel like maintaining KDE compatibility isn't really necessary, as they have their own tool and anyone using KDE will probably just stick with that. Hyprland's options are so young that breaking them might not be that big of a deal.
> >
> > Right, makes sense. I would still support this as an alternative to not break compatibility so early.
>
> It could be useful when kwin_wayland is used outside Plasma, as with lxqt-wayland-session. I tested the feature when it came out and it didn't recover from a crash. And there are several ways to make it crash reliably (using hot corners, for example).

I recently tested LXQt + Kwin with both wl-restart and kwin_wayland_wrapper, and the basics work with both for clients that support recovering (e.g. konsole) if the start script manually exports QT_WAYLAND_RECONNECT=1 before starting Kwin.

@stefonarch

> I recently tested LXQt + Kwin with both wl-restart and kwin_wayland_wrapper, and the basics work with both for clients that support recovering (e.g. konsole) if the start script manually exports QT_WAYLAND_RECONNECT=1 before starting Kwin.

Yes, I tested again more thoroughly. lxqt-panel doesn't recover, pcmanfm-qt's desktop recovers most of the time, and featherpad, telegram and qterminal always recover.

@stefonarch

Tested now with wl-restart kwin_wayland and it works fine too. The issue with kwin is that the argument lxqt-session will be executed again, autostarting all applications again.

But using hot corners on kwin_wayland to make it crash can also lead to a complete input lockup that requires a hard reset. This has happened twice now...

@emersion

I agree the naming of the CLI flags is not super consistent (--wayland-display would make more sense than --socket…). LISTEN_FDS is great because it's the closest to a standard as we're going to get. My suggestion would be to define a listen FD with the name "wayland" and leave it up to the parent to set WAYLAND_DISPLAY accordingly.

@Ferdi265
Owner Author

> I agree the naming of the CLI flags is not super consistent (--wayland-display would make more sense than --socket…). LISTEN_FDS is great because it's the closest to a standard as we're going to get. My suggestion would be to define a listen FD with the name "wayland" and leave it up to the parent to set WAYLAND_DISPLAY accordingly.

Now that you mention it, you're right, LISTEN_FDS seems to be quite popular currently (varlink also uses it).

One issue with having the parent set WAYLAND_DISPLAY is that it makes it hard for the compositor to be run in nested (Wayland in Wayland) mode, since the parent (wl-restart in this case) already overwrote the old WAYLAND_DISPLAY. Not a very common case, but should potentially be considered.

@emersion

Hm right that's a good point… Other options would include:

  • Include the WAYLAND_DISPLAY value inside the FD name ("wayland:", e.g. "wayland:wayland-1", or is there a common pattern used for this situation by other software already?)
  • Pick a new env var for this

@davidedmundson

A challenge to think about is FDs from wayland security-contexts. We need not just one FD, but many, with metadata and dynamically changing.

There's a bunch of options for fixing this; we (kwin) were trying a port to systemd's FD store. It's basically an API to dynamically give it an FD and a name, then get them back. I'm not sure how well that fits into here.

It means you have a standard third-party program that has all the FDs, but I don't think it'll share them if you swap what you're launching.

@Ferdi265
Owner Author

> A challenge to think about is FDs from wayland security-contexts. We need not just one FD, but many, with metadata and dynamically changing.
>
> There's a bunch of options for fixing this; we (kwin) were trying a port to systemd's FD store. It's basically an API to dynamically give it an FD and a name, then get them back. I'm not sure how well that fits into here.
>
> It means you have a standard third-party program that has all the FDs, but I don't think it'll share them if you swap what you're launching.

Thanks for the insight! Security contexts are something I didn't think about when making this.

@emersion

I think the security-context stuff is less of an issue because it's all contained inside each compositor impl, no? IOW: when being restarted it's up to the compositor to grab any previous security-context FD from the systemd store or anywhere else, and tools such as wl-restart don't need to do anything special.

@Ferdi265
Owner Author

Ferdi265 commented Oct 19, 2024

> • Include the WAYLAND_DISPLAY value inside the FD name ("wayland:", e.g. "wayland:wayland-1", or is there a common pattern used for this situation by other software already?)

: is not allowed in file descriptor names, since it is used to separate names for multiple fds passed to the same process. Other ASCII punctuation is allowed though. A simple solution is to just use wayland-1, wayland-2, etc. as the FD name, as long as it starts with wayland, since I currently don't know of any Wayland compositors that use Wayland socket names that start with anything other than wayland.
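
As a small illustration of that constraint, here is a sketch of a name check. The rules encoded here (ASCII only, no control characters, no `:`, at most 255 characters) are my reading of the systemd documentation, and rejecting empty names is an extra choice made for this example; `fdname_is_valid` is a hypothetical helper, not an existing API.

```python
def fdname_is_valid(name):
    """Check a candidate FD name against the rules described above."""
    # Assumption: empty names are rejected here to keep parsing unambiguous.
    if not name or len(name) > 255:
        return False
    # ASCII without control characters, and no ':' since it separates names.
    return all(32 <= ord(c) < 127 and c != ":" for c in name)
```

With this check, `wayland-1` passes while `wayland:wayland-1` does not.
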

> There's a bunch of options for fixing this; we (kwin) were trying a port to systemd's FD store. It's basically an API to dynamically give it an FD and a name, then get them back. I'm not sure how well that fits into here.

> I think the security-context stuff is less of an issue because it's all contained inside each compositor impl, no? IOW: when being restarted it's up to the compositor to grab any previous security-context FD from the systemd store or anywhere else, and tools such as wl-restart don't need to do anything special.

Regarding FD store: IIUC file descriptors previously stored in the systemd FD store are passed back to the application by systemd on restart using extra entries in LISTEN_FDS, which means any compositor using the systemd FD store likely can't use a generic restart wrapper such as wl-restart, since the stored FDs would only be passed back on a full service restart (and not when only the compositor dies and wl-restart restarts it). At least I know of no active way for a service to poll the FD store, only adding FDs to it and receiving them on start via LISTEN_FDS.
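
For reference, the "adding FDs" half of that can be sketched roughly as follows. This assumes the notification protocol as I understand it: a datagram on the socket named by `NOTIFY_SOCKET` carrying `FDSTORE=1` and `FDNAME=...`, with the descriptor attached as ancillary data, and a service unit that has `FileDescriptorStoreMax=` set above zero. The function names are made up, and `socket.send_fds` needs Python 3.9 or newer.

```python
import os
import socket


def open_notify_socket(path=None):
    """Connect a datagram socket to the service manager's notify socket."""
    path = path or os.environ["NOTIFY_SOCKET"]
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    # A leading '@' denotes a socket in the abstract namespace.
    sock.connect("\0" + path[1:] if path.startswith("@") else path)
    return sock


def push_to_fdstore(notify_sock, fd, name):
    """Send `fd` together with an FDSTORE=1 message over `notify_sock`."""
    message = f"FDSTORE=1\nFDNAME={name}".encode("ascii")
    # The descriptor travels as SCM_RIGHTS ancillary data next to the text.
    socket.send_fds(notify_sock, [message], [fd])
```

There is deliberately no "fetch" counterpart in this sketch, which matches the limitation described above: the stored descriptors only come back through LISTEN_FDS when the service is started again.
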

wl-restart with the LISTEN_FDS socket passing protocol could be useful though to give compositors some form of restart recovery without them being tied to a specific service manager such as systemd, and would at the same time increase systemd compatibility for them (by having them support LISTEN_FDS). This is likely interesting for both the systemd and non-systemd crowds and would reduce friction. Maybe wl-restart could grow a standalone implementation of a systemd-compatible FD store in the future as well if the need arises.

@Ferdi265
Owner Author

So my takeaway from the discussion up until now is that there are significant reasons to choose LISTEN_FDS and co, and most compositors seem to prefer env vars over CLI args.

@emersion

> I currently don't know of any Wayland compositors that use Wayland socket names that start with anything other than wayland

gamescope uses gamescope-0, gamescope-1, etc.

@emersion

emersion commented Oct 20, 2024

> any compositor using the systemd FD store likely can't use a generic restart wrapper such as wl-restart, since the stored FDs would only be passed back on a full service restart

Couldn't the wrapper:

  • Create the socket FD and push it to the fdstore
  • exec() the compositor (ie, exit before starting the compositor)
  • Passthrough all other LISTEN_FDS it's been started with

@Ferdi265
Owner Author

> > any compositor using the systemd FD store likely can't use a generic restart wrapper such as wl-restart, since the stored FDs would only be passed back on a full service restart
>
> Couldn't the wrapper:
>
> * Create the socket FD and push it to the fdstore
> * `exec()` the compositor (ie, exit before starting the compositor)
> * Passthrough all other `LISTEN_FDS` it's been started with

In that case there'd be no need for a wrapper, since its whole purpose is restarting the compositor, which it can't do if it exec()s itself away. Just creating the wayland socket and moving it to the FD store is certainly a thing that can be done (potentially also needed for the lock file?), but I'm not sure if it's worth the complexity, since it would rely on systemd anyway.

I see two (maybe three) use-cases:

  • full systemd: compositor is started and restarted by a systemd service, wayland socket is handled by a systemd socket unit, FD store is managed by systemd
    • this is for compositors and distros that fully embrace systemd
  • no systemd: wl-restart is started manually or via some other mechanism, compositor is started and restarted by wl-restart, FD store either doesn't exist or is implemented compatibly by wl-restart
    • this is for compositors or distros/users that don't want all to be systemd, or devs that want to start a systemd FD store compositor manually, outside systemd for development
  • both: wl-restart is started by systemd service, everything else happens as in 'no systemd', FD store could be transparently passed through to systemd
    • this is weird, but technically possible, not sure why I listed this

Not sure if I interpret this correctly, I'm open to suggestions and corrections.

@Ferdi265
Owner Author

> > I currently don't know of any Wayland compositors that use Wayland socket names that start with anything other than wayland
>
> gamescope uses gamescope-0, gamescope-1, etc.

Thanks, didn't know about that one. Another possibility would be fd names wayland-wayland-1 and wayland-gamescope-1 or use some other separator character?

@davidedmundson

> Create the socket FD and push it to the fdstore

I have a WIP doing that. https://invent.kde.org/plasma/kwin/-/merge_requests/6270
It solves the problem of sandboxed clients and keeping the X connection alive so XWayland survives restart, so there's a bunch of advantages and a direction I would like to head.

There are a few tiny annoying details that make implementations non-trivial:

  • you need to not just have the FD for wayland-X but also know the name of the socket in case wayland-0 is taken.
    Apparently a common path here is to have an FD which is just a blob of random other metadata.

  • the lock file needs keeping alive / recreating in case someone starts a nested session.
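
To illustrate both details, here is a hedged sketch of how a wrapper could pick a free name and hold its lock. It is loosely modeled on what I believe libwayland does when it picks a socket automatically (try `wayland-N` in order, take a non-blocking `flock()` on `wayland-N.lock`, remove a stale socket, then bind); the function name, the upper bound of 32 and the listen backlog are assumptions for this example.

```python
import fcntl
import os
import socket


def bind_free_display(runtime_dir, max_displays=32):
    """Return (name, lock_fd, sock) for the first wayland-N that can be locked."""
    for n in range(max_displays):
        name = f"wayland-{n}"
        path = os.path.join(runtime_dir, name)
        lock_fd = os.open(path + ".lock",
                          os.O_CREAT | os.O_RDWR | os.O_CLOEXEC, 0o660)
        try:
            # Non-blocking exclusive lock: failure means the name is in use.
            fcntl.flock(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            os.close(lock_fd)
            continue
        # We hold the lock, so any socket file left behind is stale.
        if os.path.lexists(path):
            os.unlink(path)
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.bind(path)
        sock.listen(128)
        return name, lock_fd, sock
    raise RuntimeError("no free Wayland display name")
```

The returned name is exactly the piece of metadata that has to travel with the FD, and `lock_fd` has to stay open in some long-lived process (the wrapper or an FD store) for the lock to survive a compositor restart.
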

@Ferdi265
Owner Author

Ferdi265 commented Oct 20, 2024

So as I gather the main questions with LISTEN_FDS passing are:

  • how do we pass the Wayland socket name? (preferably in a way that doesn't overwrite WAYLAND_DISPLAY so that wl-restart can also be used to start a nested session)
    • we could use another env var like WAYLAND_SOCKET_NAME (simple, also easy to add to a systemd service file if going systemd only)
      • from the perspective of a compositor it would get the wayland FD as e.g. LISTEN_FDS=1 (FD3 is then the socket fd), LISTEN_FDNAMES=wayland and WAYLAND_SOCKET_NAME=wayland-2
    • we could encode the wayland socket name into LISTEN_FDNAMES with a common prefix (IMO an ugly solution since the string parsing is ugly enough with : separation already)
    • pass another metadata FD (IMO also ugly, since it's complex and I'd rather not impose too much on compositor implementations; if compositors want to use a metadata FD and an FD store, it's their choice how they represent their data)
  • how do we deal with the lock file?
    • wl-restart currently locks the lock file after creating the socket, which works, but this doesn't work for systemd only
    • a simpler wrapper (call it wl-sd-socket) could also find/create a lock/socket, and then pass that socket and lock on to the systemd FD store before exec()ing into the compositor
    • for systemd only, a separate unit that locks the lock file that the socket unit binds to could be created (this ensures the socket is only ever bound when the lock could be acquired). This only works though if the wayland display number is hardcoded into the unit, which is a bummer.
    • from the perspective of a compositor, receiving a socket FD from systemd or a wrapper means the socket is already locked and bound, so it doesn't have to do anything at all. Locks for other things can be kept alive by adding them to the FD store (if I understand flock() correctly?).
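
The wrapper side of the first variant (LISTEN_FDS plus WAYLAND_SOCKET_NAME) could look roughly like the sketch below. This is not how wl-restart is implemented; `spawn_compositor` is a made-up name and the code only shows the one subtle part, namely that LISTEN_PID has to be the PID of the process that ends up running the compositor, so it is set in the child between fork and exec.

```python
import os

SD_LISTEN_FDS_START = 3  # first passed descriptor under the systemd convention


def spawn_compositor(argv, sock_fd, socket_name):
    """Fork and exec `argv` with `sock_fd` passed as the only LISTEN_FDS entry."""
    pid = os.fork()
    if pid != 0:
        return pid  # parent: keeps sock_fd open so it can restart the child
    try:
        # Child: put the socket at fd 3 and make sure it survives exec.
        if sock_fd == SD_LISTEN_FDS_START:
            os.set_inheritable(sock_fd, True)
        else:
            os.dup2(sock_fd, SD_LISTEN_FDS_START)  # inheritable by default
        env = dict(os.environ)
        # exec keeps the PID, so the child's PID is the compositor's PID.
        env["LISTEN_PID"] = str(os.getpid())
        env["LISTEN_FDS"] = "1"
        env["LISTEN_FDNAMES"] = "wayland"
        env["WAYLAND_SOCKET_NAME"] = socket_name
        os.execvpe(argv[0], argv, env)
    finally:
        os._exit(127)  # only reached if exec failed
```

WAYLAND_DISPLAY is left untouched here, so a compositor started this way could still connect to an outer session and run nested.
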

@Ferdi265
Owner Author

FYI: I just published wl-restart 0.3.0 with added support for the WAYLAND_SOCKET_NAME/WAYLAND_SOCKET_FD environment-based mechanism, since it was already merged into the main branch.

I will likely work on integrating and fleshing out LISTEN_FDS-style socket passing for 0.4, and intend to try and implement some form of compatibility or handling of systemd fd stores for 0.5 or later.

@WhyNotHugo

A simple solution for this situation:

usage: wl-restart --fd FD [[options] --] compositor-cmd <compositor-args>

FD is the number of the file descriptor for the socket that is being passed to the compositor.

Example usage:

wl-restart --fd 4 -- my-compositor --socket 4

--fd 4 tells wl-restart to pass the socket as fd 4. --socket 4 tells my-compositor that it is receiving the socket as fd 4.

Pros:

  • Doesn't impose a specific naming or argument to compositors; a compositor just needs some flag where the fd can be specified.
  • Socket activation can still be implemented separately in an intermediate command in the chain of execution. It can also be done with a command further up in the chain of execution.

Cons:

  • Requires substantial glue code to work with inetd-style socket activation.
  • The change in API is a breaking change for wl-restart.


6 participants