Propagate java-attacher errors to Kibana #7832
Comments
@joshdover are there any plans for adding a more fine-grained health check UI to Fleet where this might fit? I believe @ruflin previously mentioned some vague ideas for a per-agent health state that lists all the processes that are supposed to be running.
I think that if a policy contains both APM Server and APM Agent configurations (probably only relevant to the Java agent now, but hopefully relevant to others in the future), we can assume this APM Server is only used for local purposes and simply consider the entire APM integration unhealthy if there is an indication that the agent is unhealthy.
After APM Server has discovered the Java installation and before it calls the attacher, it should also validate that the Java installation is working as expected. Currently, APM Server logs this message when invoking the attacher fails: Checking whether the Java installation is working by invoking
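The validation step suggested above could be sketched roughly as follows. This is a minimal illustration, not APM Server's actual implementation (which is written in Go): the names `check_java` and `JavaCheckResult` are hypothetical, and running `java -version` is assumed here as the probe, since the exact command in the log message above is cut off.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class JavaCheckResult:
    # Hypothetical result type: whether the probe succeeded, plus a
    # human-readable detail that could be surfaced to the user.
    ok: bool
    detail: str


def check_java(java_bin: str, timeout_s: float = 10.0) -> JavaCheckResult:
    """Run `<java_bin> -version` and report whether the JVM starts cleanly."""
    try:
        proc = subprocess.run(
            [java_bin, "-version"],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except FileNotFoundError:
        return JavaCheckResult(False, f"java binary not found: {java_bin}")
    except OSError as exc:
        # For example, the path exists but is not executable.
        return JavaCheckResult(False, f"cannot execute {java_bin}: {exc}")
    except subprocess.TimeoutExpired:
        return JavaCheckResult(False, f"{java_bin} -version timed out after {timeout_s}s")

    if proc.returncode != 0:
        return JavaCheckResult(
            False, f"exit code {proc.returncode}: {proc.stderr.strip()}"
        )
    # `java -version` writes its banner to stderr, not stdout.
    return JavaCheckResult(True, (proc.stderr or proc.stdout).strip())
```

Running a check like this before invoking the attacher would give a specific failure reason (missing binary, not executable, non-zero exit) that could later be reported as a health status instead of only appearing in the logs.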
Tested on Windows; I get the same result, or slightly worse, since it also can't download the requested version.
Improving the agent integration health reporting is tracked under elastic/elastic-agent#100. We are just starting to design what this looks like.
Regarding #7832 (comment), it is not yet clear to me whether an integration is also supposed to signal whether the Elastic Agent should try to restart the process when it is reported unhealthy, or whether there will be a more fine-grained indication. A restart by the Elastic Agent would not make sense in the described cases. @cmacknz can you already share any more details on what this will look like, or on the expected timeline for defining the healthcheck work?
@simitt We have been iterating on the design details. The proposal is Integration Status Health Reporting. It was being reworked a bit last week, but the high-level details are right. I added you to the stakeholder list to make sure you are notified of changes. The new error reporting mechanism needs to be supported in the agent control protocol. @ph can comment on the timeline for implementing this, but I suspect implementation will start sometime in 8.4.
@felixbarny given the above conversation, I don't think it makes sense to implement something in the apm-server before the healthcheck endpoint in the Elastic Agent is defined. What do you think?
Yes, I agree.
@eyalkoren is looking into splitting the attacher off into its own integration, which would naturally enable surfacing errors. I don't think it makes sense to invest in a lot of changes to Elastic Agent, Fleet, and APM Server in the interim, when we plan to provide a more dedicated integration in the hopefully not too distant future. If needed we can reopen this.
When using the java-attacher, an error (e.g. failure to execute java) should be indicated in Kibana somehow. For example, this might be done by setting the status of the APM integration to degraded.
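As a rough illustration of the idea, the mapping from an attacher error to an integration status might look like the sketch below. Everything here is hypothetical: `Status`, `IntegrationHealth`, and `health_from_attacher` are illustrative names and do not correspond to an existing Fleet, Elastic Agent, or APM Server API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    # Hypothetical status values; the real set would be defined by the
    # Elastic Agent health reporting design.
    HEALTHY = "healthy"
    DEGRADED = "degraded"


@dataclass
class IntegrationHealth:
    status: Status
    message: str = ""


def health_from_attacher(attacher_error: Optional[str]) -> IntegrationHealth:
    """Mark the integration as degraded when the java-attacher reported an error."""
    if attacher_error:
        return IntegrationHealth(Status.DEGRADED, f"java-attacher: {attacher_error}")
    return IntegrationHealth(Status.HEALTHY)
```

For example, `health_from_attacher("failed to execute java")` yields a degraded status whose message carries the underlying error, which is the kind of information Kibana would need to display.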