[Elastic-Agent] Modify output to be insecure if flag is provided #28007
Pinging @elastic/agent (Team:Agent)
@@ -147,7 +147,7 @@ func newManaged(
 		router,
 		&pipeline.ConfigModifiers{
 			Decorators: []pipeline.DecoratorFunc{modifiers.InjectMonitoring},
-			Filters: []pipeline.FilterFunc{filters.StreamChecker, modifiers.InjectFleet(rawConfig, sysInfo.Info(), agentInfo)},
+			Filters: []pipeline.FilterFunc{filters.StreamChecker, modifiers.InjectInsecureOutput(cfg.Fleet), modifiers.InjectFleet(rawConfig, sysInfo.Info(), agentInfo)},
This seems like it will affect more than just Fleet Server running under Elastic Agent; it will also affect all the other beats, correct?
If this does affect the other beats, I don't think we want this, because how will this work when it comes to multiple outputs? I believe the effect will be that if --insecure is used, all outputs become insecure.
Yes, that's the purpose. If we don't pass it, events won't get consumed.
This is to ease the initial experience. You won't use insecure in prod; at least I hope nobody will.
They would get consumed if the policy is configured correctly. I don't think it should be Elastic Agent's job to override the output settings from Kibana.
If we had proper health reporting for outputs, it would be clear there is an issue. The real issue is that the Elastic Agent reports healthy when it is not.
If a value is provided by Kibana it should not be overridden. I'll add a test case to make that more visible.
Kibana, on the other hand, would need to know that the cert we're using is self-signed and may not be properly configured, in order to support some config automatically (providing values to out config option).
insecure is our concept, and I think it's up to us to help users who rely on it with the FRE.
if fleetConfig == nil ||
	fleetConfig.Server == nil ||
	fleetConfig.Server.TLS == nil ||
	fleetConfig.Server.TLS.VerificationMode == tlscommon.VerifyFull {
This does seem to have the effect that it will only do something if the Elastic Agent is running Fleet Server, but it will then affect all beats.
you're right. changed to use client config instead
e2e passes, but not within the time limit of 1 minute, which needs to be increased; 2 minutes seems to work fine locally.
I know that the e2e-testing Jenkins runs at night have a timeout factor, here:
@EricDavisX I did a fix on the e2e side, it should be fine now.
@michalpristas you mean this PR elastic/e2e-testing#1599, right? As this PR was merged two weeks ago, and we have not seen any instability related to that change since then, I'd say we are good to go from the e2e standpoint.
I had a review with Michal over this and understand it better now. It looks like this can solve a lot of customer / user pain as is, and if we ever did decide to change the fix on the Kibana side, this merge doesn't preclude that or make it harder. I understand that we inject into each output by setting verification to 'none', but if the verification mode is already set then that is skipped, so it seems good to me. This is backed up by tests as well, so I'm giving it a thumbs up after discussion and a fairly light literal code review.
lgtm; will help fix customer pain
++ on having 2 flags. Maybe we should rename also
We could deprecate I wrote out
You need to add the option to the container command as well, adding it to setup_config.go and the environment variable FLEET_SERVER_ELASTICSEARCH_INSECURE=1.
lgtm, we should be explicit in the docs about what this does vs. what the --insecure flag does
Looks good, thanks for all the fixes!
…if flag is provided (#28387) * [Elastic-Agent] Modify output to be insecure if flag is provided (#28007) (cherry picked from commit 62d84db) # Conflicts: # x-pack/elastic-agent/pkg/agent/cmd/enroll.go # x-pack/elastic-agent/pkg/agent/cmd/enroll_cmd.go * conflicts Co-authored-by: Michal Pristas
) (#28386) [Elastic-Agent] Modify output to be insecure if flag is provided (#28007) (cherry picked from commit 62d84db) Co-authored-by: Michal Pristas
) (#28388) [Elastic-Agent] Modify output to be insecure if flag is provided (#28007) (cherry picked from commit 62d84db) Co-authored-by: Michal Pristas
Hi @michalpristas, cc @EricDavisX. I have re-attempted to validate this PR on a 7.16 BC3 on-prem environment, following the steps in the ticket summary, but we are unable to reproduce the desired results. Please check the steps below and let us know if we are missing anything.
Scenario 1: Ran a fleet-server command with the --insecure flag and observed the error below. Screenshot:
Scenario 2: Copied the same certs to VM2 and re-ran the updated fleet-server command with the --insecure flag.
Scenario 3: Created new certs on VM2 and re-ran the fleet-server command with the --insecure flag.
Build Details:
Please let us know what we are still missing. Thanks
ping @michalpristas could you have a look?
Can you set up the environment like this:
VM2:
You may see the agent on VM2 not reporting any events, as ES is not configured to accept connections from non-local sources, but enroll should work. agent2 should still be able to talk to the Fleet Server running under agent1 and appear healthy. Another thing is that we do not propagate insecure to beats, so they may fail on cert validation and you need to update the verification mode in the output settings in Fleet. This is what was agreed on in the PR.
We have already tested this and are able to connect agent2 with agent1 as per the scenario mentioned above. The only issue is that although agent2 appears healthy, it doesn't send any logs to the Kibana UI, the data streams, or the Discover tab. Tested on build 7.16 BC6. Steps used:
Observations:
Agent2 logs:
Please let us know if more info is required from our end. Thanks
@dikshachauhan-qasource I think what you have is a successful test! The data flow issue is, as Michal notes, due to the missing addition to the Fleet output settings (presumably missing, as you did not mention it in your setup steps). Can you set:
Hi @EricDavisX, we have tried with the above settings too and found no change in Agent 2 or Fleet Server behavior. Agent 2 is still not able to send any logs to the Kibana UI. Agent2 logs: Please let us know if anything is required from our end. Thanks
What does this PR do?
At the moment, when we pass --insecure during install/enroll, it applies only to communication between the agent and Fleet Server. This is a source of confusion, as users often expect all communication to be insecure.
This PR also propagates insecure to the fleet-server to ES communication and, after the agent is enrolled, it also updates the output definition passed to processes.
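The output rewrite described here, together with the rule from the review thread that a verification mode supplied by Kibana is never overridden, can be sketched roughly as follows. This is a simplified illustration over plain maps; the function name `injectInsecure` and the policy layout (`outputs`, `ssl.verification_mode`) are assumptions, and the actual filter in the PR works on the agent's own config representation.

```go
package main

import "fmt"

// injectInsecure is a hypothetical helper. When insecure is true it sets
// ssl.verification_mode to "none" on every output that does not already
// define a verification mode. Outputs with an explicit mode are left alone.
func injectInsecure(insecure bool, policy map[string]interface{}) {
	if !insecure {
		return
	}
	outputs, ok := policy["outputs"].(map[string]interface{})
	if !ok {
		return
	}
	for _, raw := range outputs {
		output, ok := raw.(map[string]interface{})
		if !ok {
			continue
		}
		ssl, ok := output["ssl"].(map[string]interface{})
		if !ok {
			ssl = map[string]interface{}{}
			output["ssl"] = ssl
		}
		if _, alreadySet := ssl["verification_mode"]; alreadySet {
			continue // a value coming from the policy wins
		}
		ssl["verification_mode"] = "none"
	}
}

func main() {
	policy := map[string]interface{}{
		"outputs": map[string]interface{}{
			"default": map[string]interface{}{"type": "elasticsearch"},
			"strict": map[string]interface{}{
				"type": "elasticsearch",
				"ssl":  map[string]interface{}{"verification_mode": "full"},
			},
		},
	}
	injectInsecure(true, policy)
	fmt.Println(policy["outputs"])
}
```

In this sketch the `default` output gains `verification_mode: none`, while `strict` keeps the `full` mode it was configured with.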
Why is it important?
When fleet-server to ES communication is not insecure, we often see x509 errors when self-signed certs are used, especially when ES is running on a different machine than the one running fleet-server. In that case the self-signed cert is usually generated for localhost/127.0.0.1, so validation fails when dialing the IP of the ES machine.
Passing this to processes is important when we have not only fleet-server running on the edge but also monitoring or other integrations. If we did not pass it, no events would be sent to ES (the same failures as the ones seen during enroll in the Fleet Server scenario).
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
How to test
--fleet-server-es pointing to ES from step 1 and --fleet-server-es-ca shared with the ES from step 1
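When testing, the x509 failure described under "Why is it important?" can be reproduced in isolation. The sketch below generates a self-signed certificate valid only for localhost/127.0.0.1 and checks it against a different address; the address 10.0.0.5 and the function name `selfSignedLocalhostCert` are made up for the example and stand in for the ES machine and its cert.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// selfSignedLocalhostCert builds a throwaway self-signed certificate whose
// subject alternative names cover only localhost and 127.0.0.1.
func selfSignedLocalhostCert() (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "localhost"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		DNSNames:              []string{"localhost"},
		IPAddresses:           []net.IP{net.ParseIP("127.0.0.1")},
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true,
		IsCA:                  true,
	}
	// Template and parent are the same certificate, so it is self-signed.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	cert, err := selfSignedLocalhostCert()
	if err != nil {
		panic(err)
	}
	// Dialing ES on the same machine: the name matches the certificate.
	fmt.Println("127.0.0.1:", cert.VerifyHostname("127.0.0.1"))
	// Dialing ES on another machine by IP: the name check fails.
	fmt.Println("10.0.0.5: ", cert.VerifyHostname("10.0.0.5"))
}
```

The first check returns a nil error and the second returns a hostname mismatch error, which is the class of failure that either a matching certificate or verification mode none avoids.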