[Heartbeat] Unpack beats to enable capabilities inside container #30200

Closed

Collaborator

@emilioalvap emilioalvap commented Feb 3, 2022

What does this PR do?

This PR unpacks beats inside the container at build time so that the required cap_net_raw and cap_setuid capabilities can be assigned to the heartbeat binary.

Why is it important?

Without the required capabilities, heartbeat cannot execute ICMP pings or setuid calls. Currently, agent unpacks beats at runtime, most likely as a user that doesn't have permission to assign capabilities.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

  • Build the elastic-agent containers. From x-pack/elastic-agent, run:
    env PLATFORMS="+all linux/amd64" mage dev:package
  • Run one of the built containers and provide some heartbeat configuration:
    docker run --name agent -it -u root \
      --env FLEET_ENROLL=1 \
      --env FLEET_URL=<url> \
      --env FLEET_ENROLLMENT_TOKEN=<token> \
      docker.elastic.co/beats/elastic-agent:8.1.0-SNAPSHOT

@emilioalvap emilioalvap added bug Team:obs-ds-hosted-services Label for the Observability Hosted Services team labels Feb 3, 2022
@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Feb 3, 2022
@mergify
Contributor

mergify bot commented Feb 3, 2022

This pull request does not have a backport label. Could you fix it @emilioalvap? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v./d./d./d is the label to automatically backport to the 7./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.

@mergify mergify bot added the backport-skip Skip notification from the automated backport with mergify label Feb 3, 2022
@elasticmachine
Collaborator

elasticmachine commented Feb 3, 2022

💚 Build Succeeded

Build stats

  • Start Time: 2022-02-24T14:42:42.541+0000

  • Duration: 229 min 34 sec

Test stats 🧪

Test Results

  • Failed: 0
  • Passed: 38335
  • Skipped: 3320
  • Total: 41655

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@@ -482,6 +482,8 @@ shared:
user: '{{ .BeatName }}'
linux_capabilities: ''
image_name: ''
unpack_beats: true
Contributor

If we're saying this is the default behavior, do we need this option, or should we just always do this?

Contributor

I think we should make this the default and remove this option.

Collaborator Author

Ok, removed the option and it's now unpacking by default

Contributor

@blakerouse blakerouse left a comment

Overall looks acceptable, just some open questions and comments first.

RUN mkdir -p {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }} && \
for beatPath in {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/downloads/*.tar.gz; do \
tar xf $beatPath -C {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }} && \
rm $beatPath; \
Contributor

We should not remove this $beatPath. It's not going to save any space in the image anyway because it is included in a previous layer.

Collaborator Author

True that, removed

tar xf $beatPath -C {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }} && \
rm $beatPath; \
done && \
chown -R {{ .user }} {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }} && \
Contributor

What does this leave the group as?

Collaborator Author

@emilioalvap emilioalvap Feb 24, 2022

Group was root, changed that to be elastic-agent too.

rm $beatPath; \
done && \
chown -R {{ .user }} {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }} && \
setcap cap_net_raw,cap_setuid+p {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }}/heartbeat-*/heartbeat
Contributor

What is the cap_setuid+p for? Why is that needed?

Collaborator Author

Heartbeat requires it for switching user id at runtime

Contributor

To add some detail here, @blakerouse: heartbeat runs node as a subprocess, and node hates to run as root. It's also just a best practice to setuid out of root if you can, regardless.

Contributor

Okay. It was just more of a question so I can understand. Thanks for the explanation.
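
For anyone wanting to confirm that the two capabilities discussed above actually reach the process, one option is to decode the CapPrm mask from /proc/<pid>/status. The following is only a sketch: has_cap is a hypothetical helper name, the mask is an example value rather than output captured from a real heartbeat process, and the bit numbers (7 for CAP_SETUID, 13 for CAP_NET_RAW) come from the Linux capability header.

```shell
#!/bin/sh
# has_cap is a hypothetical helper: succeed if bit number $2 is set in the
# hexadecimal capability mask $1 (the format used by the CapPrm line in
# /proc/<pid>/status).
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Capability numbers from <linux/capability.h>.
CAP_SETUID=7
CAP_NET_RAW=13

# Example mask with exactly those two bits set: (1 << 7) | (1 << 13) = 0x2080.
# This is an illustrative value, not output from a real process.
mask=0000000000002080

if has_cap "$mask" "$CAP_NET_RAW"; then echo "cap_net_raw permitted"; fi
if has_cap "$mask" "$CAP_SETUID"; then echo "cap_setuid permitted"; fi
if ! has_cap "$mask" 0; then echo "cap_chown (bit 0) not permitted"; fi
```

Because the template assigns the capabilities with +p, they would be expected in the permitted set (CapPrm) rather than the effective set, which is why the sketch looks at that mask.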

@emilioalvap emilioalvap force-pushed the heartbeat-agent-capabilities branch from 32348ba to d732b36 Compare February 24, 2022 14:42
@blakerouse
Contributor

@emilioalvap What is the status of this PR? Can we get it up for full review?

We believe this will also fix https://github.com/elastic/observability-dev/issues/1935.

Comment on lines +190 to +191
chown -R root:root {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }}/*/*.yml && \
chmod 0644 {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }}/*/*.yml && \
Collaborator Author

@blakerouse @andrewvc These two lines are workarounds for a couple of issues I've found. I'd welcome your suggestions on how to fix them.

The first one is related to permission checking in libbeat. This method checks that the config file owner is either root or the executing user, which is not the case if the files are extracted as elastic-agent and the container is executed as root. At the moment, the k8s templates for elastic-agent containers in the docs portal use the root user to execute pods, and possibly the ECK templates do too. The workaround is to change config file ownership to root.
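
The ownership rule described above can be sketched as a small shell predicate. This is an illustration of the rule as stated in this comment, not libbeat's actual code: owner_allowed is a hypothetical helper name, and uid 1000 stands in for the elastic-agent user.

```shell
#!/bin/sh
# Sketch of the ownership rule: a config file is accepted only if its owner
# is root (uid 0) or the uid of the user running the beat.
# owner_allowed is a hypothetical helper, not a libbeat function.
owner_allowed() {
    owner_uid=$1   # uid that owns the config file
    exec_uid=$2    # uid the beat process runs as
    [ "$owner_uid" -eq 0 ] || [ "$owner_uid" -eq "$exec_uid" ]
}

# Extracted as elastic-agent (uid 1000 assumed here), executed as root.
if owner_allowed 1000 0; then echo "accepted"; else echo "rejected"; fi

# After chown root:root, the same file passes for a root-run container.
if owner_allowed 0 0; then echo "accepted"; else echo "rejected"; fi
```

The first call prints "rejected" and the second prints "accepted", which matches the reason for the chown root:root line in the template.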

The second one is related to this post-build check. Config files extracted from the beat .tar.gz archives do not match the expected permission mask:

package_test.go:258: file usr/share/elastic-agent/data/elastic-agent-d732b3/install/osquerybeat-8.2.0-linux-x86_64/osquerybeat.yml has wrong permissions: expected=-rw-r--r-- actual=-rw-------

I wasn't sure what the rationale behind this check is or what the impact could be for other beats.
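
A rough way to reproduce what the post-build check complains about, outside the Go test, is to list config files whose mode is not exactly 0644. This is a sketch under assumptions: list_wrong_perms is a hypothetical helper, and the throwaway directory only stands in for the unpacked install path.

```shell
#!/bin/sh
# list_wrong_perms is a hypothetical helper: print every *.yml file under
# directory $1 whose mode is not exactly 0644.
list_wrong_perms() {
    find "$1" -type f -name '*.yml' ! -perm 0644
}

# Demo on a temporary directory standing in for the unpacked install path.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/osquerybeat" "$demo_dir/heartbeat"
touch "$demo_dir/osquerybeat/osquerybeat.yml" "$demo_dir/heartbeat/heartbeat.yml"
chmod 0600 "$demo_dir/osquerybeat/osquerybeat.yml"   # mode seen in the failing check
chmod 0644 "$demo_dir/heartbeat/heartbeat.yml"       # mode the check expects

echo "before workaround:"
list_wrong_perms "$demo_dir"    # lists only the osquerybeat.yml path

# Same idea as the chmod line in the Dockerfile template.
chmod 0644 "$demo_dir"/*/*.yml

echo "after workaround:"
list_wrong_perms "$demo_dir"    # lists nothing

rm -rf "$demo_dir"
```

Running the helper against a real image would require pointing it at the actual install directory, whose name includes the commit hash.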

Contributor

@emilioalvap I think the first workaround is okay, as Elastic Agent never writes this file.

As for the post-build check I see you do a chmod 0644 for the *.yml, can that be changed to chmod 0600 or would that break it?

Collaborator Author

@emilioalvap emilioalvap Mar 9, 2022

@blakerouse Some of the beats' *.yml files come with 0600, and that is actually what breaks the test. 0644 seems to be the permission mask applied to *.yml files outside the zipped beats. Now that we are unpacking beats, the test that used to check only the elastic-agent *.yml files is also checking the other beats' config files.

Contributor

@emilioalvap I'm still confused. Are you saying that it's working with the change and the test is passing, or do we need to change one of them?

Collaborator Author

@blakerouse Sure, let me clarify. I added the following to make the test pass:

chmod 0644 {{ $beatHome }}/data/{{.BeatName}}-{{ commit_short }}/{{ .beats_install_path }}/*/*.yml

But I wanted to make sure that this change wouldn't impact other beats in a way that I cannot foresee, as it's changing file permissions for all beats unpacked in the build phase.

Labels
backport-skip Skip notification from the automated backport with mergify bug Team:obs-ds-hosted-services Label for the Observability Hosted Services team
4 participants