Osquerybeat #24456
Pinging @elastic/ingest-management (Team:Ingest Management)
💚 Build Succeeded
Is it possible to split the PR such that we can review the libbeat management changes independent of osquerybeat, and also independent of the changes to packaging?
Sure. I did this in the same PR initially to make it easier to change if I need to move more of the stuff into libbeat based on the feedback. Will create two new PRs referring to this one.
Looks good. Couple minor security questions.
// Create temp directory for socket and possibly other things
// The unix domain socket path is limited to 108 chars and would
// not always be able to create in subdirectory
tmpdir, removeTmpDir, err := createTempDir()
Try to use unguessable dir name with minimal permissions; eg. uuidv4.
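One way to follow this suggestion, as a minimal sketch: `os.MkdirTemp` already appends a random suffix to the directory name and creates the directory with mode 0700 on Unix-like systems. The helper names `createSocketDir` and `socketPath` and the `osq-` prefix are hypothetical and are not the PR's `createTempDir`; the 108-byte check only mirrors the limit mentioned in the code comment above.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// createSocketDir is a hypothetical helper. os.MkdirTemp picks a random,
// hard-to-guess suffix and creates the directory with mode 0700, so only
// the owning user can enter it. An empty base means os.TempDir().
func createSocketDir(base string) (dir string, cleanup func(), err error) {
	dir, err = os.MkdirTemp(base, "osq-")
	if err != nil {
		return "", nil, err
	}
	return dir, func() { os.RemoveAll(dir) }, nil
}

// socketPath joins dir and name and rejects results that would not fit
// within the 108-byte socket path limit (one byte is kept for the NUL).
func socketPath(dir, name string) (string, error) {
	p := filepath.Join(dir, name)
	if len(p) >= 108 {
		return "", fmt.Errorf("socket path too long (%d bytes)", len(p))
	}
	return p, nil
}

func main() {
	dir, cleanup, err := createSocketDir("")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer cleanup()
	p, err := socketPath(dir, "osquery.sock")
	fmt.Println(p, err)
}
```

A short prefix keeps the resulting path well under the length limit, which is the reason the socket cannot always live in a deeper subdirectory.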
func (s *Scheduler) isCancelled() bool {
	select {
	case <-s.ctx.Done():
can also do:
return s.ctx.Err() != nil
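A small sketch of why the two forms agree. The quoted diff is truncated, so the `default` branch in `isCancelledSelect` below is an assumption about how the original function continues, and the `Scheduler` type here is a stand-in with only the `ctx` field.

```go
package main

import (
	"context"
	"fmt"
)

// Scheduler is a stand-in holding only the context used by the check.
type Scheduler struct{ ctx context.Context }

// isCancelledSelect is the non-blocking select form; the default branch
// is assumed, since the quoted fragment stops after the Done case.
func (s *Scheduler) isCancelledSelect() bool {
	select {
	case <-s.ctx.Done():
		return true
	default:
		return false
	}
}

// isCancelled is the suggested form: Err is nil until Done is closed.
func (s *Scheduler) isCancelled() bool {
	return s.ctx.Err() != nil
}

// checkBoth evaluates both forms before and after cancellation.
func checkBoth() (before, after [2]bool) {
	ctx, cancel := context.WithCancel(context.Background())
	s := &Scheduler{ctx: ctx}
	before = [2]bool{s.isCancelledSelect(), s.isCancelled()}
	cancel()
	after = [2]bool{s.isCancelledSelect(), s.isCancelled()}
	return before, after
}

func main() {
	fmt.Println(checkBoth()) // prints "[false false] [true true]"
}
```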
Based on the conversation on the PR, made the change so that the input types are no longer passed back to the agent. Instead they are derived by the agent from the application spec. This is the additional property that was added to the spec to declare the accepted action_input_types:
action_input_types:
  - osquery
* Try to use /var/run directory for a socket, fallback on platform specific temp dir if access denied (running as non-root)
* Upgrade to build with osquery 4.7.0
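The fallback described in this commit message could look roughly like the following. This is a sketch under assumptions: `socketDir` and the `osquerybeat-` prefix are hypothetical names, and for simplicity the sketch falls back on any error from the preferred location, not only on access denied.

```go
package main

import (
	"fmt"
	"os"
)

// socketDir tries to create a private directory under preferred
// (for example /var/run) and falls back to the platform temp dir when
// that fails, which is the usual outcome when running as non-root.
func socketDir(preferred string) (dir string, cleanup func(), err error) {
	dir, err = os.MkdirTemp(preferred, "osquerybeat-")
	if err != nil {
		// Fallback: an empty base makes MkdirTemp use os.TempDir().
		dir, err = os.MkdirTemp("", "osquerybeat-")
		if err != nil {
			return "", nil, err
		}
	}
	return dir, func() { os.RemoveAll(dir) }, nil
}

func main() {
	dir, cleanup, err := socketDir("/var/run")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer cleanup()
	fmt.Println("socket directory:", dir)
}
```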
@@ -0,0 +1,67 @@
when:
Is ARM support required? If so, the ARM stage is missing, similar to the packaging stage.
This currently uses the official distro of osquery, and at the time this project started there was no official ARM distro. The Linux ARM distro is now available in beta https://osquery.io/blog/linux-arm64-beta-support, so we could add it later.
Another and possibly better option for GA is to build osquery for all the platforms from source and host it in our artifactory. This would allow us to avoid packaging the official .msi or .pkg distros for Windows and Mac into the distribution, and to build any flavor we need, possibly earlier:
https://osquery.readthedocs.io/en/latest/development/building/
Can anybody assist with setting this up on our CI infrastructure?
Some downsides of building osquery ourselves: we would have to figure out any issues on platforms they don't officially support, we would have to sign the binaries ourselves, and testing might take extra effort.
Can anybody assist with setting this up on our CI infrastructure?
If signing is required, the unified release process supports such a service, though we need to run the build in another CI. If required, we can start a conversation with the unified release team about where to host those binaries.
@aleksmaus I think we can skip arm64 for 7.13 since it's only in beta on osquery and add it later when it's out of beta.
yeah, that's the plan: to land what we have and then iterate.
As far as I can see, those are unrelated to these changes, though they were triggered because changing some files forced the whole pipeline to run, similar to the one triggered on any of the branches.
+1 from the CI point of view, I cannot say anything else about the other changes
So for the initial release it looks like we are going to rely on the official osquery binaries. The official binaries support 64-bit Windows/Linux/Mac. They are properly signed for macOS, for example. Not sure if there is a way to have different agent specs that would not include osquerybeat on the unsupported platforms and architectures for now, or if there is a better way to handle it.
@aleksmaus From the build of the osquerybeat itself I do not know. When it comes to Agent you can limit where an Application (aka. osquerybeat) can run based on an EQL condition. See https://github.com/elastic/beats/blob/master/x-pack/elastic-agent/spec/endpoint.yml#L73 (endpoint currently doesn't support arm64)
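In the agent this gate is a condition in the application spec, not Go code. Purely to illustrate the logic such a condition encodes, here is a hypothetical Go sketch; the `officialTargets` table and `canRunOsquery` are invented for this example and assume the "64-bit Windows/Linux/Mac" coverage stated earlier in the thread.

```go
package main

import (
	"fmt"
	"runtime"
)

// officialTargets lists the platforms the thread says the official
// osquery binaries cover. The table is an illustration only and is not
// taken from the agent spec.
var officialTargets = map[string]bool{
	"linux/amd64":   true,
	"darwin/amd64":  true,
	"windows/amd64": true,
}

// canRunOsquery reports whether osquerybeat should be started on the
// given OS and architecture (Go's GOOS and GOARCH naming).
func canRunOsquery(goos, goarch string) bool {
	return officialTargets[goos+"/"+goarch]
}

func main() {
	fmt.Println("this host:", canRunOsquery(runtime.GOOS, runtime.GOARCH))
	fmt.Println("linux/arm64:", canRunOsquery("linux", "arm64"))
}
```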
Just did a quick test: the amd64 osqueryd runs on Apple ARM architecture. So this PR should just work as is if we enable ARM. For Linux ARM we can pick up the beta ARM distro. Easy to update after the initial merge.
@v1v? @blakerouse? The PR was up to date with the latest master as of yesterday.
what do you need from my side?
any advice how to make it green? :-)
doesn't look related to the osquerybeat anyhow
The PR can be merged already. The E2E test failures (all on metricbeat) are unfortunate, but currently no blocker.
* Osquerybeat with Agent actions supported
* Revert grpc upgrade in this PR back to what it was before v1.29.1
* Make check happy
* Check in forgotten spec file, regenerate the spec
* Some regenerated after clean build fields
* Agent Actions: Part 1 of Osquerybeat with Agent actions
* Rollback some mods upgrade. Address some code review feedback
* Address code review feedback
* Add missing copyright header
* Address more of the code review feedback
* Remove input types from payload communicated back to the agent
* Change the way the inputs are tied to the applications
  Based on conversation on PR made the change so now the input type are not passed back to the agent. Instead they are derived by the agent from the application spec. This is the additional property that was added to the spec to declare accepted action_input_types:
  action_input_types:
    - osquery
* Address code review feedback
* Try to use /var/run directory for a socket, fallback on platform specific temp dir if access denied (running as non-root)
* Upgrade to build with osquery 4.7.0
* Update CI scripts to get osquerybeat building
* Exclude arm64 from running osquery for now
(cherry picked from commit 591685e)
Cherry-pick of PR #24456 to 7.x branch. Original message:
## What does this PR do?
This PR is the first cut of the agent actions with osquerybeat. This allows agent to run osquerybeat, dispatch the ad-hoc queries received with "agent actions" and send the results back.
The osquerybeat was migrated from https://github.com/elastic/osquerybeat, the original POC we worked on a while back.
Here is how the actions work with this implementation:
1. The agent starts the apps (osquerybeat in particular).
2. The osquerybeat receives the configuration from the agent.
3. The osquerybeat parses the configuration and sends the supported input types and osqueryd version back to the agent as a payload with the status update. Along with the payload it also sends the version of the osqueryd binary that is shipped with the agent. The requirement for the product is to be able to surface this information to the end-users.
4. The osquerybeat also registers the action handlers with the rpc client for its input types (in this case the input type is "osquery").
5. The agent receives the input types in the status update payload and updates the inputTypes set of the corresponding ApplicationState.
6. When a new action is received from the Fleet Server, the Agent finds the matching ApplicationState for the input type and dispatches the action.
7. The result of the action is forwarded back to Fleet Server. The query result data is sent directly to elasticsearch by osquerybeat.
#### Packaging:
The osquerybeat build downloads the official distro from https://osquery.io/downloads/official and caches it locally for incremental builds.
The official distro binaries are provided as:
* .tar.gz for Linux
* .pkg for MacOS
* .msi for Windows
For the osquerybeat Linux package, the osqueryd binary (the only thing that we need) is extracted from the official distro. For Windows and Mac the .pkg or .msi file is included in the package.
When osquerybeat is run for the first time, depending on the platform it either unpacks osqueryd from the distro on that platform first or starts osqueryd as a child process right away.
#### Disclaimer:
This is my first attempt at digging through the agent code, so there is a possibility that the lines of responsibility for different pieces of the code are not exactly where the original contributors intended them to be. Would need some feedback.
Also not sure I covered all the bases with the CI build for osquerybeat to get it correctly published to artifactory etc. Might need some advice on this.
Fully expect to iterate on osquerybeat customization further after this.
What does this PR do?
This PR is the first cut of the agent actions with osquerybeat. This allows agent to run osquerybeat, dispatch the ad-hoc queries received with "agent actions" and send the results back.
The osquerybeat was migrated from https://github.com/elastic/osquerybeat, the original POC we worked on awhile back.
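Here is how the actions work with this implementation:
1. The agent starts the apps (osquerybeat in particular).
2. The osquerybeat receives the configuration from the agent.
3. The osquerybeat parses the configuration and sends the supported input types and osqueryd version back to the agent as a payload with the status update. Along with the payload it also sends the version of the osqueryd binary that is shipped with the agent. The requirement for the product is to be able to surface this information to the end-users.
4. The osquerybeat also registers the action handlers with the rpc client for its input types (in this case the input type is "osquery").
5. The agent receives the input types in the status update payload and updates the inputTypes set of the corresponding ApplicationState.
6. When a new action is received from the Fleet Server, the Agent finds the matching ApplicationState for the input type and dispatches the action.
7. The result of the action is forwarded back to Fleet Server. The query result data is sent directly to elasticsearch by osquerybeat.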
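The routing part of this flow (the agent keeps a set of input types per application and dispatches an incoming action to the application that accepts its input type) can be sketched as follows. All types and names here (`Action`, `ApplicationState`, `Dispatcher`, `ErrNoHandler`) are simplified, hypothetical stand-ins and not the agent's real types; the routing works the same whether the input types come from a status payload or from the application spec.

```go
package main

import (
	"errors"
	"fmt"
)

// Action is a simplified stand-in for an agent action from Fleet Server.
type Action struct {
	ID        string
	InputType string
	Query     string
}

// ApplicationState is a simplified stand-in: one running application,
// the input types it accepts, and a handler that executes an action.
type ApplicationState struct {
	Name       string
	InputTypes map[string]struct{}
	Handle     func(Action) (string, error)
}

// ErrNoHandler is returned when no application accepts the input type.
var ErrNoHandler = errors.New("no application accepts this input type")

// Dispatcher routes actions to applications by input type.
type Dispatcher struct{ apps []*ApplicationState }

// Dispatch finds the first application whose input type set contains
// the action's input type and hands the action to it.
func (d *Dispatcher) Dispatch(a Action) (string, error) {
	for _, app := range d.apps {
		if _, ok := app.InputTypes[a.InputType]; ok {
			return app.Handle(a)
		}
	}
	return "", fmt.Errorf("%w: %q", ErrNoHandler, a.InputType)
}

// newDemoDispatcher registers one fake application for "osquery".
func newDemoDispatcher() *Dispatcher {
	osq := &ApplicationState{
		Name:       "osquerybeat",
		InputTypes: map[string]struct{}{"osquery": {}},
		Handle: func(a Action) (string, error) {
			return "ran: " + a.Query, nil
		},
	}
	return &Dispatcher{apps: []*ApplicationState{osq}}
}

func main() {
	d := newDemoDispatcher()
	fmt.Println(d.Dispatch(Action{ID: "1", InputType: "osquery", Query: "select 1"}))
	fmt.Println(d.Dispatch(Action{ID: "2", InputType: "unknown"}))
}
```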
Packaging:
The osquerybeat build downloads the official distro from https://osquery.io/downloads/official and caches it locally for incremental builds.
The official distro binaries are provided as:
* .tar.gz for Linux
* .pkg for MacOS
* .msi for Windows
For the osquerybeat Linux package, the osqueryd binary (the only thing that we need) is extracted from the official distro. For Windows and Mac the .pkg or .msi file is included in the package.
When osquerybeat is run for the first time, depending on the platform it either unpacks osqueryd from the distro on that platform first or starts osqueryd as a child process right away.
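The "download once, cache locally for incremental builds" step might be sketched like this. This is not the PR's build code: `fetchCached`, `fetchFunc` and the example URL are hypothetical, and the download itself is injected as a function so the sketch runs without network access.

```go
package main

import (
	"fmt"
	"os"
	"path"
	"path/filepath"
)

// fetchFunc downloads the bytes at url. It is injected so the sketch
// runs offline; a real build would plug an HTTP download in here.
type fetchFunc func(url string) ([]byte, error)

// fetchCached returns the local path of the distro named by url and
// downloads it only when it is not already present in cacheDir.
func fetchCached(url, cacheDir string, fetch fetchFunc) (string, error) {
	local := filepath.Join(cacheDir, path.Base(url))
	if _, err := os.Stat(local); err == nil {
		return local, nil // cache hit: reuse the earlier download
	}
	data, err := fetch(url)
	if err != nil {
		return "", err
	}
	if err := os.MkdirAll(cacheDir, 0o755); err != nil {
		return "", err
	}
	if err := os.WriteFile(local, data, 0o644); err != nil {
		return "", err
	}
	return local, nil
}

// demoCache requests the same hypothetical URL twice and reports how
// many times the fetcher was actually invoked.
func demoCache() (int, error) {
	dir, err := os.MkdirTemp("", "distro-cache-")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	calls := 0
	fake := func(string) ([]byte, error) {
		calls++
		return []byte("payload"), nil
	}
	const url = "https://example.invalid/osquery-x.y.z.tar.gz"
	for i := 0; i < 2; i++ {
		if _, err := fetchCached(url, dir, fake); err != nil {
			return calls, err
		}
	}
	return calls, nil
}

func main() {
	calls, err := demoCache()
	fmt.Println(calls, err) // prints "1 <nil>": the second request hits the cache
}
```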
Disclaimer:
This is my first attempt at digging through the agent code, so there is a possibility that the lines of responsibility for different pieces of the code are not exactly where the original contributors intended them to be. Would need some feedback.
Also not sure I covered all the bases with the CI build for osquerybeat to get it correctly published to artifactory etc.
Might need some advice on this.
Fully expect to iterate on osquerybeat customization further after this.
Why is it important?
This allows the security assets management team to ship osquery integration in the upcoming release.
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
How to test this PR locally
Prerequisites:
Related issues
Sample action document with the query:
Corresponding action result document:
Corresponding query result datastream: