Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[teleport-update] Isolated installation suffix #49364

Merged
merged 22 commits into from
Dec 3, 2024

Conversation

sclevine
Copy link
Member

@sclevine sclevine commented Nov 22, 2024

This PR adds support for isolated installations of agents that can operate and be updated independently on the same Linux machine.

Any time teleport-update is run with the --install-suffix flag, it modifies the paths to use isolated directories and suffixes.

Note that the non-isolated paths have changed significantly as well. See this comment.

System package install:

Binaries: /opt/teleport/system/bin/* -> /usr/local/bin/*
Service: /opt/teleport/system/lib/systemd/system/teleport.service -> /lib/systemd/system/teleport.service
(others same as below)

Non-isolated, auto-updates:

Binaries: /opt/teleport/default/versions/1.2.3/bin/* -> /usr/local/bin/*
Service: /opt/teleport/default/versions/1.2.3/lib/systemd/system/teleport.service ->
/lib/systemd/system/teleport.service
Lock: /opt/teleport/default/update.lock
Data dir: /var/lib/teleport
PID: /run/teleport.pid
Config: /etc/teleport.yaml
Defaults: /etc/defaults/teleport
Updater service: /etc/systemd/system/teleport-update.service

Isolated, auto-updates:

Binaries: /opt/teleport/mycluster/versions/1.2.3/bin/* -> /opt/teleport/mycluster/bin/*
Service: /opt/teleport/mycluster/versions/1.2.3/lib/systemd/system/teleport.service ->
/etc/systemd/system/teleport_mycluster.service
Lock: /opt/teleport/mycluster/update.lock
Data dir: /var/lib/teleport_mycluster
PID: /run/teleport_mycluster.pid
Config: /etc/teleport_mycluster.yaml
Defaults: /etc/defaults/teleport (shared)
Updater service: /etc/systemd/system/teleport-update_mycluster.service

Note that the non-isolated installation uses system package paths for the systemd service, config, PID file, and data dir to maintain compatibility with existing installation methods, such as the package.

Also note that the underlying struct is called Namespace. I plan to change this to InstallSuffix in the future, but I have several PRs chained on top of this one, so I want to leave it as-is for now.


The teleport-update binary will be used to enable, disable, and trigger automatic Teleport agent updates.

RFD: #47126
Goal (internal): https://github.com/gravitational/cloud/issues/10289

Example:

ubuntu@legendary-mite:~$ sudo ./teleport-update enable --install-suffix mycluster --proxy=proxy.example.com
2024-11-22T21:52:21Z WARN [UPDATER]   Custom namespace is specified. Teleport's data_dir must be configured accordingly. data_dir:/var/lib/teleport_mycluster path:/opt/teleport/mycluster/bin config:/etc/teleport_mycluster.yaml service:teleport_mycluster.service pid:/run/teleport_mycluster.pid teleport-update/main.go:284
2024-11-22T21:52:22Z INFO [UPDATER]   Initiating installation. target_version:16.4.7 active_version: agent/updater.go:331
2024-11-22T21:52:22Z INFO [UPDATER]   Version already present. version:16.4.7 agent/installer.go:150
2024-11-22T21:52:22Z INFO [UPDATER]   Executing new teleport-update binary to update configuration. agent/updater.go:150
2024-11-22T21:52:23Z INFO [UPDATER]   Systemd configuration synced. unit:teleport-update_mycluster.timer agent/process.go:259
2024-11-22T21:52:23Z INFO [UPDATER]   [stderr] Created symlink /etc/systemd/system/teleport_mycluster.service.wants/teleport-update_mycluster.timer → /etc/systemd/system/teleport-update_mycluster.timer. agent/process.go:392
2024-11-22T21:52:23Z INFO [UPDATER]   Service enabled. unit:teleport-update_mycluster.timer agent/process.go:276
2024-11-22T21:52:23Z INFO [UPDATER]   Finished executing new teleport-update binary. agent/updater.go:156
2024-11-22T21:52:23Z INFO [UPDATER]   Target version successfully installed. target_version:16.4.7 agent/updater.go:702
2024-11-22T21:52:23Z WARN [UPDATER]   Systemd service not running. unit:teleport_mycluster.service agent/process.go:84
2024-11-22T21:52:23Z INFO [UPDATER]   Configuration updated. agent/updater.go:347
ubuntu@legendary-mite:~$ ls -la /opt/teleport/mycluster/bin/
lrwxrwxrwx 1 root root   66 Nov 22 21:08 fdpass-teleport -> /opt/teleport/mycluster/versions/16.4.7/bin/fdpass-teleport
lrwxrwxrwx 1 root root   55 Nov 22 21:08 tbot -> /opt/teleport/mycluster/versions/16.4.7/bin/tbot
lrwxrwxrwx 1 root root   55 Nov 22 21:08 tctl -> /opt/teleport/mycluster/versions/16.4.7/bin/tctl
lrwxrwxrwx 1 root root   59 Nov 22 21:08 teleport -> /opt/teleport/mycluster/versions/16.4.7/bin/teleport
lrwxrwxrwx 1 root root   65 Nov 22 21:10 teleport-update -> /home/ubuntu/mounts/teleport/tool/teleport-update/teleport-update
lrwxrwxrwx 1 root root   54 Nov 22 21:08 tsh -> /opt/teleport/mycluster/versions/16.4.7/bin/tsh
ubuntu@legendary-mite:~$ ls -la /etc/systemd/system/teleport*
-rw-r--r-- 1 root root  138 Nov 22 00:23 teleport-update.service
-rw-r--r-- 1 root root  171 Nov 22 00:23 teleport-update.timer
-rw-r--r-- 1 root root  203 Nov 22 21:52 teleport-update_mycluster.service
-rw-r--r-- 1 root root  205 Nov 22 21:52 teleport-update_mycluster.timer
-rw-r--r-- 1 root root  504 Nov 22 21:52 teleport_mycluster.service
ubuntu@legendary-mite:~$ ls -la /etc/systemd/system/teleport.service 
-rw-r--r-- 1 root root 435 Nov 22 00:03 /etc/systemd/system/teleport.service
ubuntu@legendary-mite:~$ ls -la /opt/teleport/mycluster/
drwxr-xr-x 5 root root 4096 Nov 22 21:08 16.4.7
-rw-r--r-- 1 root root  177 Nov 22 21:55 update.yaml

@sclevine
Copy link
Member Author

@hugoShaka FYI, decided to ship this sooner since initial users of the time-based schedule had asked for it. Let me know if the UX seems reasonable. Everything gets a --namespace flag, namespaces must be alphanumeric + -, and _ is used to separate the name and namespace.

@sclevine sclevine force-pushed the sclevine/teleport-update-uninstall branch from b417dcb to f9d3e81 Compare November 22, 2024 20:38
@sclevine sclevine force-pushed the sclevine/teleport-update-namespace branch from 186a738 to ab388e0 Compare November 22, 2024 20:42
@sclevine sclevine marked this pull request as ready for review November 22, 2024 21:57
@sclevine sclevine added the no-changelog Indicates that a PR does not require a changelog entry label Nov 22, 2024
Base automatically changed from sclevine/teleport-update-uninstall to master November 22, 2024 22:25
@sclevine sclevine force-pushed the sclevine/teleport-update-namespace branch 2 times, most recently from 1cf971b to eeaab53 Compare November 23, 2024 01:25
@sclevine sclevine force-pushed the sclevine/teleport-update-namespace branch from eeaab53 to 9f5117e Compare November 23, 2024 02:32
@sclevine
Copy link
Member Author

sclevine commented Nov 25, 2024

After testing this out over the weekend, and researching prior art for similar packages, I'd like to recommend an alternative set of paths, for both namespaced and non-namespaced installations.

I found that while it's common to keep package-sized data in /var/lib, keeping executables there can cause problems for CIS Benchmarks that require certain subdirectories of /var to be mounted noexec, such as CCE-82008-4. While it's possible to work around these requirements, there are other issues that make /var/lib/teleport less ideal:

  1. Existing automated backups for /var/lib/teleport will unexpectedly increase in size
  2. Attempting to reset the agent installation by deleting /var/lib/teleport will leave broken symlinks
  3. /var may be more likely to need selinux rule changes than other locations.

Examples of issues caused by noexec on /var:

Additionally, while it's somewhat common for packages to create /usr/local/<package>, such as /usr/local/ssl for some installations of OpenSSL, it is technically not FHS-compliant. FHS is clear that /usr/local should not have non-FHS defined subdirectories.

On top that, I noticed that /var/lib/teleport is intended to have 700 permissions by default, which is incompatible with path traversal for running Teleport binaries. I would prefer not to change the default permissions on the data directory.

For namespaces specifically, I think it would be more convenient to create namespaced data directories outside of the primary data dir /var/lib/teleport, so that they can easily be stored on separate volumes and backed-up separately. Adding other data directories (and Teleport packages) to /var/lib/teleport also makes the `--data-dir argument confusing.

Finally, all systemd service files that are not installed by packages should be installed in /etc/systemd/system. I cannot find any examples of non-packaged software installing into /lib. For example, K3s installs its services into /etc/systemd/system, and also supports auto-updates.

My new proposal is this:

System package install:

Binaries: /opt/teleport/system/bin/* -> /usr/local/bin/*
Service: /opt/teleport/system/lib/systemd/system/teleport.service -> /lib/systemd/system/teleport.service
(others same as below)

Non-namespaced, auto-updates:

Binaries: /opt/teleport/default/versions/1.2.3/bin/* -> /usr/local/bin/*
Service: /opt/teleport/default/versions/1.2.3/lib/systemd/system/teleport.service ->
/lib/systemd/system/teleport.service
Lock: /opt/teleport/default/update.lock
Data dir: /var/lib/teleport
PID: /run/teleport.pid
Config: /etc/teleport.yaml
Defaults: /etc/defaults/teleport
Updater service: /etc/systemd/system/teleport-update.service

Namespaced, auto-updates:

Binaries: /opt/teleport/mycluster/versions/1.2.3/bin/* -> /opt/teleport/mycluster/bin/*
Service: /opt/teleport/mycluster/versions/1.2.3/lib/systemd/system/teleport.service ->
/etc/systemd/system/teleport_mycluster.service
Lock: /opt/teleport/mycluster/update.lock
Data dir: /var/lib/teleport_mycluster
PID: /run/teleport_mycluster.pid
Config: /etc/teleport_mycluster.yaml
Defaults: /etc/defaults/teleport (shared)
Updater service: /etc/systemd/system/teleport-update_mycluster.service

Let me know if this seems better. I have updated the PR description accordingly, and I'm working on updating the PR.

@sclevine
Copy link
Member Author

CC: @vapopov re: moving the system package paths to /opt/teleport/system

@sclevine sclevine changed the title [teleport-update] Namespaces [teleport-update] Isolated installation suffix Nov 26, 2024
@sclevine
Copy link
Member Author

sclevine commented Dec 2, 2024

This is blocking a few other PRs, looking for reviews when anyone has a chance 🙂

dataDir: dataDir,
linkDir: linkDir,
versionsDir: filepath.Join(namespaceDir(name), versionsDirName),
serviceFile: filepath.Join(systemdAdminDir, prefix+".service"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible that we define a name of the existing namespace? should we validate this case in constructor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewNamespace is intended to be used with existing namespaces. It's just used to generate the appropriate set of paths the Updater to read/write.

@sclevine
Copy link
Member Author

sclevine commented Dec 3, 2024

@timothyb89 @flyinghermit @hugoShaka looking for one more review 🙂

Copy link
Collaborator

@zmb3 zmb3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to replace the trace.Errorf calls, but LGTM otherwise.

@sclevine
Copy link
Member Author

sclevine commented Dec 3, 2024

@zmb3 thanks, #49388 is based on this PR, so these will get fixed after rebasing 🙂

@sclevine sclevine added this pull request to the merge queue Dec 3, 2024
Merged via the queue into master with commit 08f8e55 Dec 3, 2024
43 of 45 checks passed
@sclevine sclevine deleted the sclevine/teleport-update-namespace branch December 3, 2024 22:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no-changelog Indicates that a PR does not require a changelog entry size/md
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants