-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Heartbeat] Unpack beats to enable capabilities inside container #30200
[Heartbeat] Unpack beats to enable capabilities inside container #30200
Conversation
This pull request does not have a backport label. Could you fix it @emilioalvap? 🙏
NOTE: |
dev-tools/packaging/packages.yml
Outdated
@@ -482,6 +482,8 @@ shared: | |||
user: '{{ .BeatName }}' | |||
linux_capabilities: '' | |||
image_name: '' | |||
unpack_beats: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're saying this is the default behavior, do we need this option, or should we just always do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should make this the default and remove this option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, removed the option and it's now unpacking by default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks acceptable, just some open questions and comments first.
dev-tools/packaging/packages.yml
Outdated
@@ -482,6 +482,8 @@ shared: | |||
user: '{{ .BeatName }}' | |||
linux_capabilities: '' | |||
image_name: '' | |||
unpack_beats: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should make this the default and remove this option.
RUN mkdir -p {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }} && \ | ||
for beatPath in {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/downloads/*.tar.gz; do \ | ||
tar xf $beatPath -C {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }} && \ | ||
rm $beatPath; \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not remove this $beatPath. It's not going to save any space in the image anyway because it is included in a previous layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True that, removed
tar xf $beatPath -C {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }} && \ | ||
rm $beatPath; \ | ||
done && \ | ||
chown -R {{ .user }} {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }} && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this leave the group as?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Group was root
, changed that to be elastic-agent
too.
rm $beatPath; \ | ||
done && \ | ||
chown -R {{ .user }} {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }} && \ | ||
setcap cap_net_raw,cap_setuid+p {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }}/heartbeat-*/heartbeat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the cap_setuid+p for? Why is that needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heartbeat requires it for switching user id at runtime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To add some detail here @blakerouse heartbeat runs node as a subprocess, and node hates to run as root. It's also just a best practice to setuid out of root if you can regardless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. It was just more of a question so I can understand. Thanks for the explanation.
32348ba
to
d732b36
Compare
@emilioalvap What is the status of this PR? Can we get it up for full review? We believe this will also fix https://github.com/elastic/observability-dev/issues/1935. |
chown -R root:root {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }}/*/*.yml && \ | ||
chmod 0644 {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }}/*/*.yml && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@blakerouse @andrewvc These two are workarounds for a couple of issues I've found, I'd welcome your suggestions on how to fix those.
First one is related to permission checking in libbeat. This method is checking that config file owner is either root
or executing user, which is not the case if container is extracted as elastic-agent
and executed as root
. At the moment, k8s templates for elastic-agent
containers in docs portal use root
user to execute pods, possibly for ECK templates too. The workaround is to change config file ownership to root.
The second one is related to this post-build check. Config files extracted from beat .tar
s do not follow the specified permission mask:
package_test.go:258: file usr/share/elastic-agent/data/elastic-agent-d732b3/install/osquerybeat-8.2.0-linux-x86_64/osquerybeat.yml has wrong permissions: expected=-rw-r--r-- actual=-rw-------
I wasn't sure what the rationale is behind this check and what the impact could be for another beats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@emilioalvap I think the first workaround is okay, as Elastic Agent never writes this file.
As for the post-build check I see you do a chmod 0644
for the *.yml
, can that be changed to chmod 0600
or would that break it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@blakerouse Some of the beats' *.yml
files come with 0600
and that is actually what breaks the test. 0644
seems to be the permission mask applied to *.yml
files outside zipped beats. So now that we are unpacking beats, the test that used to check elastic.-agent
*.yml
files is also checking other beats config files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@emilioalvap Still confused? So are you saying that its working with the change and the test is passing or we need to change either one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@blakerouse Sure, let me clarify. I added the following to make the test pass:
chmod 0644 {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }}/*/*.yml
But I wanted to make sure that this change wouldn't impact other beats in a way that I cannot foresee, as its changing file permission for all beats unpacked in the build phase.
What does this PR do?
This PR enables unpacking of beats inside the container at build time, so that required
cap_net_raw, cap_setuid
capabilities can be assigned to the binary.Why is it important?
Without the required capabilities, heartbeat cannot execute ICMP pings or setuid calls. As it is now, agent is unpacking beats at runtime, most likely with a user that doesn't have permission to assign capabilities.
Checklist
CHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.Author's Checklist
How to test this PR locally
x-pack/elastic-agent
run:env PLATFORMS="+all linux/amd64" mage dev:package
Related issues
Use cases
Screenshots
Logs