Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VPN-4798: Linux cgroup application matching heuristics #8801

Merged
merged 19 commits into from
Dec 18, 2023

Conversation

oskirby
Copy link
Collaborator

@oskirby oskirby commented Dec 12, 2023

Description

App exclusions on Linux mostly make use of the GTK Launched signal over D-Bus to kick off application detection by connecting a *.desktop file and a control group, and this has worked okay-ish for the most part, but it leaves a lot of edge cases where applications can fall through the cracks if the PID that we get from the GTK launch event dies before the following control group is created. And it just so happens that this has become the case for Snap-packaged applications as of Ubuntu/23.04 Lunar.

To fix this, I think it's finally time that we get off our butts and do some proper heuristic matching between the control group name and typical launcher commands used for Linux desktop environments. To make this work, we first convert the *.desktop file into a desktop file ID and then try to parse a matching desktop file ID out of the cgroup path name.

So far this supports applications launched via the Gnome shell launcher, Snaps and Flatpaks. Perhaps with works with more desktop environments too? Only testing will tell!

Platforms tested thus far:

  • Ubuntu 20.04 native apps
  • Ubuntu 20.04 snaps
  • Ubuntu 23.04 native apps
  • Ubuntu 23.04 snaps
  • Ubuntu 23.04 flatpaks
  • Debian 12 native apps on Gnome shell
  • Debian 12 native apps on KDE Plasma
  • Fedora 38 native apps
  • Fedora 38 flatpaks

Reference

Github issue #6917 (VPN-4798)
Github issue #8016 (VPN-5535)
Freedesktop spec: https://specifications.freedesktop.org/desktop-entry-spec/desktop-entry-spec-latest.html#desktop-file-id

Checklist

  • My code follows the style guidelines for this project
  • I have not added any packages that contain high risk or unknown licenses (GPL, LGPL, MPL, etc. consult with DevOps if in question)
  • I have performed a self review of my own code
  • I have commented my code PARTICULARLY in hard to understand areas
  • I have added thorough tests where needed

@oskirby oskirby requested a review from strseb December 13, 2023 20:34
@oskirby oskirby marked this pull request as ready for review December 13, 2023 20:34
@vinocher
Copy link
Contributor

cracks if the PID that we get from the GTK launch event dies before the following control group is created

Trying to understand this better: Why does this always happen for snap-packaged apps? It sounds like this is not an edge case for snap apps, is that right?


// Find the application dir in the path.
const QString dirComponent("/applications/");
qsizetype index = path.lastIndexOf(dirComponent);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this search be CaseInsensitive?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, path names in Linux are case sensitive, so I think the answer is no.

// swapped from a dot to a dash. Which makes the parsing a bit of a pain.
//
// See: https://github.com/snapcore/snapd/blob/master/sandbox/cgroup/scanning.go
QString AppTracker::snapDesktopId(const QString& scope) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these string manipulation functions be moved to LinuxDependencies or some String utils file since they don't need any class state? Also, consider adding some tests for the string manipulation functions, now or later.

// HACK: Quick and dirty split tunnelling.
// TODO: Apply filtering to currently-running apps too.
if (m_excludedApps.contains(appId)) {
// Remove any firewall rules applied to this control group.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this app also be removed from m_excludedApps?

Copy link
Collaborator Author

@oskirby oskirby Dec 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the gist of this, is that m_excludedApps holds the applications that the user wishes to exclude, so it kind of hangs around as a persistent setting, but m_excludedCgroups holds the matched applications that are currently running.

This signal occurs when the application closes, in which case the cgroup has been destroyed, but we still need to remember the setting in case the user re-opens it.

Copy link
Contributor

@vinocher vinocher Dec 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any risk of desktopId re-use that can impact m_excludedApps, because the ID is not removed when the app is terminated? For example, could an excluded app be uninstalled and another app installed with the same desktopId while the VPN is on? Or is this an unlikely edge case that is difficult to handle?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this scenario would be commonly encountered if a user updates software on their device while the VPN is active, and I think the correct behavior in that case is to treat the newly-installed application identically to the previous version.

The recommendation in the freedesktop specification is for application names to follow a reverse-DNS naming convention, in which case they are expected to be a kind of globally unique identifier.

src/platforms/linux/daemon/dbusservice.cpp Outdated Show resolved Hide resolved
@@ -156,3 +156,31 @@ QString LinuxDependencies::kdeFrameworkVersion() {

return QString();
}

// static
QString LinuxDependencies::desktopFileId(const QString& path) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method seems to return a non-empty string even if '/applications/' is not in the path. Should it try to detect error conditions and return an empty string in those cases, or is that not possible? Callers seem to pass on the result to other methods without additional checks.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not entirely clear how to handle that edge case, but I wanted to try and err on the side of caution and make sure that we always return something suitable for cgroup matching. I guess taking the file name and using that might be a more suitable fallback for that case.

return m_runningApps.value(cgroup);
}
QStringList findByDesktopId(const QString& desktopId) const {
return m_runningApps.keys(desktopId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the cgroup path <--> desktopId mapping 1:1? Or can there be multiple cgroup paths that map to a desktopId?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The desktopId -> cgroup mapping is one-to-many. Each cgroup matches at most one desktopId, but a desktopId can match multiple cgroups.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. It may be useful to add this information as a comment above the m_runningApps definition. I'm a bit confused because the cgroups man page says:
A cgroup is a collection of processes that are bound to a set of limits or parameters defined via the cgroup filesystem

src/platforms/linux/daemon/apptracker.cpp Outdated Show resolved Hide resolved
src/platforms/linux/daemon/dbusservice.h Outdated Show resolved Hide resolved
Comment on lines 284 to 292
if (m_excludedCgroups.contains(cgroup)) {
m_wgutils->resetCgroup(cgroup);
}
m_excludedCgroups[cgroup] = state;
if (state == Excluded) {
// Excluded control groups are given special netfilter rules to direct
// their traffic outside of the VPN tunnel.
m_wgutils->excludeCgroup(cgroup);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check if the cgroup state is already in the expected state before doing this work?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That felt like an unnecessary edge case to handle. If the cgroup is already in the expected state then this will simply remove the firewall rules for it, and then re-add them.

@oskirby
Copy link
Collaborator Author

oskirby commented Dec 15, 2023

Trying to understand this better: Why does this always happen for snap-packaged apps? It sounds like this is not an edge case for snap apps, is that right?

I think this breakage occurs on snap because applications are launched as a two-stage process. The first step is that snaps create a .desktop file that the desktop environment can see, which contains a command like /usr/bin/snap <target app>. The GTK Launched event does successfully report the creation of this process, and we can split-tunnel it just fine, but that command is just a wrapper that creates a sandbox environment for the target app (including a new cgroup) and launches the target app within it. The detection of that new cgroup has always been brittle because it depends on matching the PID from the GTK Launched event and there is no real guarantee that PIDs will stay running for any length of time after that signal, which opens us up to race conditions.

So, the real fix happening here is the addition of the AppTracker::snapDesktopId() to try and figure out the desktopId of a snap without any dependency on the lifetime of the contained process IDs, and then converting the AppTracker to match apps by desktopId rather than the full file path. The rest of this PR is just trying to clean up and modernize some of my old code.

And given that we are now using desktopIds instead of files, we can also do the same for Gnome and Flatpak apps, which also embed the desktopId in the cgroup path name - which makes the GTK Launched signal unnecessary.

src/platforms/linux/daemon/apptracker.h Outdated Show resolved Hide resolved
src/platforms/linux/daemon/apptracker.h Outdated Show resolved Hide resolved
return m_runningApps.value(cgroup);
}
QStringList findByDesktopId(const QString& desktopId) const {
return m_runningApps.keys(desktopId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. It may be useful to add this information as a comment above the m_runningApps definition. I'm a bit confused because the cgroups man page says:
A cgroup is a collection of processes that are bound to a set of limits or parameters defined via the cgroup filesystem

src/platforms/linux/daemon/apptracker.h Outdated Show resolved Hide resolved
result.replace(start, match.capturedLength(0), QString(code));
offset = start + 1;
} else {
offset = match.capturedEnd(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just double checking that if the match is at the end of the original string, captureEnd will return an offset beyond the end of the string. Is that right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not a problem, the loop will terminate in such a case because the loop condition requires the offset to be contained within the string.

src/platforms/linux/daemon/apptracker.cpp Show resolved Hide resolved
data->rootpid = m_lastLaunchPid;
break;
// Make an attempt to resolve the desktop ID from a cgroup scope.
QString AppTracker::findDesktopId(const QString& cgroup) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding some examples of cgroup scope names, to make it easier for the reader to understand this method. Examples for various names that start with app-gnome-, app-flatpak-, gnome-launched- etc. would be helpful.

@vinocher
Copy link
Contributor

I think this breakage occurs on snap because applications are launched as a two-stage process. The first step is that snaps create a .desktop file that the desktop environment can see, which contains a command like /usr/bin/snap <target app>. ...

Thanks for the explanation. It may be helpful to add some of this information as a class header comment in AppTracker.

@oskirby oskirby requested a review from vinocher December 18, 2023 21:38
Copy link
Contributor

@vinocher vinocher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Comments are very helpful!

@oskirby oskirby enabled auto-merge (squash) December 18, 2023 22:15
@github-actions github-actions bot added the 🛬 Landing This PR is marked as "auto-merge" label Dec 18, 2023
@oskirby oskirby merged commit caf70fc into main Dec 18, 2023
125 of 127 checks passed
@oskirby oskirby deleted the vpn-4798-cgroup-heuristic-matching branch December 18, 2023 22:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🛬 Landing This PR is marked as "auto-merge"
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants