Osquerybeat #24456

Merged
merged 22 commits into from
Mar 26, 2021

aleksmaus
Contributor

What does this PR do?

This PR is the first cut of the agent actions with osquerybeat. It allows the agent to run osquerybeat, dispatch the ad-hoc queries received as "agent actions", and send the results back.
The osquerybeat was migrated from https://github.com/elastic/osquerybeat, the original POC we worked on a while back.

Here is how the actions work with this implementation:

  1. The agent starts the apps (osquerybeat in particular)
  2. The osquerybeat receives the configuration from the agent
  3. The osquerybeat parses the configuration and sends the supported input types and osqueryd version back to the agent as a payload with the status update. Along with the payload, it also sends the version of the osqueryd binary that is shipped with the agent. The product requirement is to be able to surface this information to end users.
  4. The osquerybeat also registers the action handlers with the rpc client for its input types (in this case the input type is "osquery")
  5. The agent receives the input types in the status update payload and updates the inputTypes set of the corresponding ApplicationState
  6. When a new action is received from the Fleet Server, the Agent finds the matching ApplicationState for the input type and dispatches the action.
  7. The result of the action is forwarded back to Fleet Server. The query result data is sent directly to Elasticsearch by osquerybeat.

Packaging:

The osquerybeat build downloads the official distro from https://osquery.io/downloads/official and caches it locally for incremental builds.
The official distro binaries are provided as:

  • .tar.gz for Linux
  • .pkg for macOS
  • .msi for Windows

For the Linux osquerybeat package, the osqueryd binary (the only thing we need) is extracted from the official distro. For Windows and Mac, the .pkg or .msi file is included in the package.
When osquerybeat is run for the first time, depending on the platform, it either unpacks osqueryd from the distro first or starts osqueryd as a child process right away.
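
A minimal sketch of the per-platform decision described above, assuming the platform is identified by its Go `GOOS` value. `distroInfo` and `distroFor` are hypothetical names used for illustration; the actual unpacking logic in osquerybeat may be organized differently.

```go
package main

import (
	"fmt"
	"runtime"
)

// distroInfo describes, per the packaging notes above, what is shipped in
// the osquerybeat package for a platform. Hypothetical type.
type distroInfo struct {
	Ext          string // extension of the official distro file
	NeedsExtract bool   // true if osqueryd must be unpacked on first run
}

// distroFor returns the packaging info for a GOOS value.
func distroFor(goos string) (distroInfo, error) {
	switch goos {
	case "linux":
		// osqueryd is extracted at build time, so it can start right away.
		return distroInfo{Ext: ".tar.gz", NeedsExtract: false}, nil
	case "darwin":
		return distroInfo{Ext: ".pkg", NeedsExtract: true}, nil
	case "windows":
		return distroInfo{Ext: ".msi", NeedsExtract: true}, nil
	default:
		return distroInfo{}, fmt.Errorf("unsupported platform %q", goos)
	}
}

func main() {
	info, err := distroFor(runtime.GOOS)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if info.NeedsExtract {
		fmt.Println("first run: unpack osqueryd from the bundled", info.Ext, "file, then start it")
	} else {
		fmt.Println("start osqueryd as a child process right away")
	}
}
```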

Disclaimer:

This is my first attempt at digging through the agent code, so there is a possibility that the lines of responsibility between different pieces of the code are not exactly where the original contributors intended them to be. Would need some feedback.
Also not sure I covered all the bases with the CI build for osquerybeat to get it correctly published to artifactory etc.
Might need some advice on this.
Fully expect to iterate on osquerybeat customization further after this.

Why is it important?

This allows the security assets management team to ship osquery integration in the upcoming release.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

Prerequisites:

  1. The osquerybeat-8.0.0-SNAPSHOT-[platform]-x86_64.tar.gz and corresponding .sha512 files need to be dropped into the agent download folder until osquerybeat is published to artifactory.
  2. The Osquery Elastic Managed integration needs to be added to the agent policy.

Screen Shot 2021-03-09 at 10 00 02 PM

  3. The agent needs to be enrolled with the latest Fleet Server (under development)

Related issues

Sample action document with the query:
Screen Shot 2021-03-09 at 10 07 29 PM

Corresponding action result document:
Screen Shot 2021-03-09 at 6 09 39 PM

Corresponding query result datastream:
Screen Shot 2021-03-09 at 6 10 19 PM

@botelastic botelastic bot added the needs_team and Team:Ingest Management labels Mar 10, 2021
@elasticmachine
Collaborator

Pinging @elastic/ingest-management (Team:Ingest Management)

@botelastic botelastic bot removed the needs_team label Mar 10, 2021
@elasticmachine
Collaborator

elasticmachine commented Mar 10, 2021

💚 Build Succeeded

Build stats

  • Build Cause: Pull request #24456 updated

  • Start Time: 2021-03-26T12:51:42.844+0000

  • Duration: 136 min 32 sec

  • Commit: a3de591

Test stats 🧪

Test Results
Failed 0
Passed 46429
Skipped 5104
Total 51533

💚 Flaky test report

Tests succeeded.


@ruflin ruflin requested a review from urso March 10, 2021 13:16
@urso

urso commented Mar 11, 2021

Is it possible to split the PR such that we can review the libbeat management changes independent of osquerybeat, and also independent of the changes to packaging?

@aleksmaus
Contributor Author

Is it possible to split the PR such that we can review the libbeat management changes independent of osquerybeat, and also independent of the changes to packaging?

Sure. I did this in the same PR initially to make it easier to change if I need to move more of the stuff into libbeat based on the feedback. Will create two new PRs referring to this one.


@scunningham scunningham left a comment

Looks good. Couple minor security questions.

x-pack/osquerybeat/beater/osquerybeat.go
// Create temp directory for socket and possibly other things.
// The unix domain socket path is limited to 108 chars, so the socket
// cannot always be created in a subdirectory.
tmpdir, removeTmpDir, err := createTempDir()

Try to use an unguessable dir name with minimal permissions, e.g. a UUIDv4.

x-pack/osquerybeat/beater/osquerybeat.go

func (s *Scheduler) isCancelled() bool {
select {
case <-s.ctx.Done():

can also do:

return s.ctx.Err() != nil

@ph ph added the Team:Elastic-Agent label and removed the Team:Ingest Management label Mar 15, 2021
Based on the conversation on the PR, made the change so that the input types
are no longer passed back to the agent. Instead they are derived by the agent
from the application spec.

This is the additional property that was added to the spec to declare
accepted action_input_types:

action_input_types:
- osquery
* Try to use /var/run directory for a socket, fallback on platform
  specific temp dir if access denied (running as non-root)
* Upgrade to build with osquery 4.7.0
@aleksmaus aleksmaus requested a review from a team as a code owner March 23, 2021 01:39
@botelastic botelastic bot added the Team:Automation label Mar 23, 2021
@@ -0,0 +1,67 @@
when:
Member

Is ARM support required? If so, an ARM stage is missing, similar to the packaging stage.

Contributor Author

@aleksmaus aleksmaus Mar 23, 2021

This currently uses the official distro of osquery, and at the time this project started there was no official ARM distro. The Linux ARM distro is now available in beta (https://osquery.io/blog/linux-arm64-beta-support), so we could add it later.

Another and possibly better option for GA is to build osquery for all the platforms from source and host in our artifactory. This would allow us to avoid packaging official distros .msi or .pkg for windows and mac into the distribution and build any flavor we need possibly earlier:
https://osquery.readthedocs.io/en/latest/development/building/

Can anybody assist with setting this up on our CI infrastructure?

One of the downsides of building osquery ourselves: we would have to figure out any issues for platforms they don't officially support, we would have to sign the binaries ourselves, possible extra efforts with testing.

Member

Can anybody assist with setting this up on our CI infrastructure?

If signing is required, the unified release process supports such a service, though we need to run the build in another CI. But if required, we can start a conversation with the unified release team about where to host those binaries.

@aleksmaus I think we can skip arm64 for 7.13 since it's only in beta on osquery and add it later when it's out of beta.

Contributor Author

yeah, that's the plan: to land what we have and then iterate.

@v1v
Member

v1v commented Mar 23, 2021

Build&Test / metricbeat-goIntegTest / TestData – github.com/elastic/beats/v7/metricbeat/module/kafka/broker

As far as I can see, those failures are unrelated to these changes. They were triggered because some of the changed files force the whole pipeline to run, similar to the one triggered on any of the branches.

Member

@v1v v1v left a comment

+1 from the CI point of view, I cannot say anything else about the other changes

@aleksmaus
Contributor Author

aleksmaus commented Mar 23, 2021

Build&Test / metricbeat-goIntegTest / TestData – github.com/elastic/beats/v7/metricbeat/module/kafka/broker

As far as I can see, those failures are unrelated to these changes. They were triggered because some of the changed files force the whole pipeline to run, similar to the one triggered on any of the branches.

So it looks like we are going to rely on the official osquery binaries for the initial release. The official binaries support 64-bit Windows/Linux/Mac, and they are properly signed, for macOS for example.
There are no official 32-bit builds, and there is only a beta release for Linux ARM 64-bit.

Not sure if there is a way to have different agent specs that would exclude osquerybeat on these unsupported platforms and architectures for now, or if there is a better way to handle it.
@blakerouse @urso ?

@blakerouse
Contributor

@aleksmaus From the build of the osquerybeat itself I do not know. When it comes to Agent, you can limit where an Application (aka osquerybeat) can run based on an EQL condition.

See https://github.com/elastic/beats/blob/master/x-pack/elastic-agent/spec/endpoint.yml#L73 (endpoint currently doesn't support arm64)
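
For illustration only, the same platform gate can be written as a plain Go check. This is not the EQL condition syntax used in the agent specs. `osquerySupported` is a hypothetical helper that encodes the constraints discussed in this thread: official 64-bit builds for Windows, Linux and Mac, no 32-bit builds, and arm64 excluded for now.

```go
package main

import (
	"fmt"
	"runtime"
)

// osquerySupported reports whether osquerybeat should run on the given
// platform, based on the availability of official osquery binaries as
// described in this conversation.
func osquerySupported(goos, goarch string) bool {
	switch goos {
	case "linux", "darwin", "windows":
		// Only 64-bit x86 for now; arm64 is excluded until it leaves beta.
		return goarch == "amd64"
	default:
		return false
	}
}

func main() {
	fmt.Println("osquery supported here:", osquerySupported(runtime.GOOS, runtime.GOARCH))
}
```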

@aleksmaus
Contributor Author

Just did a quick test: the amd64 osqueryd runs on Apple ARM architecture. So this PR should just work as is if we enable ARM. For Linux ARM we can pick up the beta ARM distro. Easy to update after the initial merge.

@aleksmaus aleksmaus changed the title from "Osquerybeat with Agent actions" to "Osquerybeat" Mar 25, 2021
@aleksmaus
Contributor Author

[2021-03-25T13:09:36.127Z] {"log.level":"error","@timestamp":"2021-03-25T13:09:33.905Z","log.origin":{"file.name":"log/reporter.go","file.line":36},"message":"2021-03-25T13:09:33Z: type: 'ERROR': sub_type: 'FAILED' message: Application: endpoint-security--8.0.0-SNAPSHOT[fca78797-9936-4851-b91c-a75ba2f104a2]: State changed to FAILED: 2 errors occurred:\n\t* package '/opt/Elastic/Agent/data/elastic-agent-cfa785/downloads/endpoint-security-8.0.0-SNAPSHOT-linux-x86_64.tar.gz' not found: open /opt/Elastic/Agent/data/elastic-agent-cfa785/downloads/endpoint-security-8.0.0-SNAPSHOT-linux-x86_64.tar.gz: no such file or directory\n\t* fetching package failed: Get \"https://artifacts.elastic.co/downloads/endpoint-dev/endpoint-security-8.0.0-SNAPSHOT-linux-x86_64.tar.gz\": x509: certificate signed by unknown authority\n\n","ecs.version":"1.6.0"}

@v1v? @blakerouse ?

The PR was up to date with the latest master as of yesterday.

@v1v
Member

v1v commented Mar 25, 2021

@aleksmaus

what do you need from my side?

@aleksmaus
Contributor Author

@aleksmaus

what do you need from my side?

any advice on how to make it green? :-)
it seems to be failing on

downloads/endpoint-security-8.0.0-SNAPSHOT-linux-x86_64.tar.gz: no such file or directory

doesn't look related to the osquerybeat anyhow

@urso

urso commented Mar 26, 2021

The PR can be merged already. The E2E test failures (all on metricbeat) are unfortunate, but currently no blocker.

@aleksmaus aleksmaus merged commit 591685e into elastic:master Mar 26, 2021
aleksmaus added a commit to aleksmaus/beats that referenced this pull request Apr 6, 2021
* Osquerybeat with Agent actions supported

* Revert grpc upgrade in this PR back to what it was before v1.29.1

* Make check happy

* Check in forgotten spec file, regenerate the spec

* Some regenerated after clean build fields

* Agent Actions: Part 1 of Osquerybeat with Agent actions

* Rollback some mods upgrade. Address some code review feedback

* Address code review feedback

* Add missing copyright header

* Address more of the code review feedback

* Remove input types from payload communicated back to the agent

* Change the way the inputs are tied to the applications

Based on the conversation on the PR, made the change so that the input types
are no longer passed back to the agent. Instead they are derived by the agent
from the application spec.

This is the additional property that was added to the spec to declare
accepted action_input_types:

action_input_types:
- osquery

* Address code review feedback

* Try to use /var/run directory for a socket, fallback on platform
  specific temp dir if access denied (running as non-root)
* Upgrade to build with osquery 4.7.0

* Update CI scripts to get osquerybeat building

* Exclude arm64 from running osquery for now

(cherry picked from commit 591685e)
urso pushed a commit that referenced this pull request Apr 7, 2021
Cherry-pick of PR #24456 to 7.x branch.
Labels
enhancement, Team:Automation, Team:Elastic-Agent, v7.13.0
8 participants