
Changing from Elasticsearch to Logstash output and back causes agents to go offline #2602

Open
allamiro opened this issue May 12, 2023 · 5 comments
Labels: bug (Something isn't working), Team:Fleet (Label for the Fleet team)

Comments

allamiro commented May 12, 2023

  • Version: 8.6.1
  • Operating System: Windows 10 and Windows Server 2019
  • Discuss Forum URL:

• Steps to Reproduce:

  • Navigate to Fleet -> Agent policies tab
  • Select the Fleet policy
  • Click on the Fleet policy settings
  • Change the Fleet policy's output for integrations to Logstash
  • Change the output for agent monitoring to Logstash
  • Save the changes
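For reference, the same change can presumably also be driven through Kibana's Fleet API rather than the UI. The sketch below is a hypothetical illustration: the host, credentials, policy ID, and output ID are placeholders, and using `data_output_id`/`monitoring_output_id` to switch a policy's integration and monitoring outputs is my understanding of the API, not something taken from this issue. It prints the curl command rather than executing it.

```shell
# Hypothetical sketch of the UI steps above as a Fleet API call.
# KIBANA, the credentials, POLICY_ID, and LOGSTASH_OUTPUT_ID are placeholders.
KIBANA="https://kibana.example.com:5601"
POLICY_ID="fleet-server-policy"
LOGSTASH_OUTPUT_ID="my-logstash-output"

# Print the request instead of sending it, so the sketch runs offline.
# data_output_id switches the integrations output; monitoring_output_id
# switches the agent monitoring output.
cat <<EOF
curl -u elastic:changeme -X PUT "$KIBANA/api/fleet/agent_policies/$POLICY_ID" \\
  -H 'kbn-xsrf: true' -H 'Content-Type: application/json' \\
  -d '{"name": "Fleet policy", "namespace": "default",
       "data_output_id": "$LOGSTASH_OUTPUT_ID",
       "monitoring_output_id": "$LOGSTASH_OUTPUT_ID"}'
EOF
```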

Upon failure:

  • Remove the problematic Fleet Server
  • Under settings, update the Fleet Server policy output for integrations and monitoring to point back to Elasticsearch instead of Logstash
  • Set up and add a new Fleet Server
  • Add the Fleet Server to Kibana using the old Fleet policy
  • Check the agents' status
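The last step, checking agent status, can presumably also be done via the Fleet API instead of the UI. The host and credentials below are placeholders, and the sketch prints the request rather than sending it.

```shell
# Hypothetical sketch: list agents currently reported offline.
# KIBANA and the credentials are placeholders, not values from the issue.
KIBANA="https://kibana.example.com:5601"

# Print the request instead of sending it, so the sketch runs offline;
# the kuery parameter filters the agent list by status.
cat <<EOF
curl -u elastic:changeme "$KIBANA/api/fleet/agents?kuery=status:offline" \\
  -H 'kbn-xsrf: true'
EOF
```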

Another symptom we are seeing: agents that are offline report the following in their logs:

possible transient error during  checking with fleet-server , retrying 

I can't provide the logs.

I know we need to upgrade to 8.7 or even 8.8, and we are planning to do so to resolve the problems described in elastic/elastic-agent#2316. However, I suspect there is also a problem with the recovery functions: after we stood up a new Fleet Server following the changes above, most of the agents remained unhealthy. The majority of agents are showing as offline, which may also be what is discussed in elastic/elastic-agent#2554.

A reboot on some of the systems fixed the issue because the service gets hung after a service restart. However, it's not possible to send a mass reboot command to all systems, so multiple agents remain in an unhealthy status.

It's worth mentioning that I opened an enhancement request (2523) to add these functions.

allamiro added the bug label May 12, 2023
@jugsofbeer

We had a similar-sounding issue. Elastic Agent was running fine, then server OS patching occurred; Elastic Agent recognized a reboot was about to occur, so it shut down and restarted itself within 2-5 seconds, and then the server rebooted. After that, the Elastic Agent fails to start up automatically.

It starts up if you log in to the server and manually start the service, though.

allamiro commented May 16, 2023

@jugsofbeer I have seen multiple issues that were resolved by restarting the agent, even in the newer versions, and I think that function is needed just in case, as I mentioned previously:
elastic/elastic-agent#2628

cmacknz changed the title from "Elastic Agent Service Recovery Failure (Agents remain offline or unhealthy)" to "Changing from Elasticsearch to Logstash output and back causes agents to go offline" on May 17, 2023
cmacknz transferred this issue from elastic/elastic-agent on May 17, 2023

cmacknz commented May 17, 2023

The issue description seems similar to elastic/elastic-agent#2554, but affecting Fleet Server instead.

cmacknz added the Team:Fleet label May 17, 2023

cmacknz commented May 18, 2023

Another very similar problem that could be related: #2603

michel-laterman (Contributor) commented

Changing the output that the fleet-server integration uses to Logstash will put Fleet Server into an unrecoverable state; this is expected behaviour.
@kpollich, can we disable this in the Fleet UI?
