[Fleet] Implement per-integration health reporting #154634

Closed
allamiro opened this issue Apr 8, 2023 · 5 comments · Fixed by #158826
Assignees
Labels
enhancement New value added to drive a business result QA:Validated Issue has been validated by QA Team:Fleet Team label for Observability Data Collection Fleet team

Comments

@allamiro

allamiro commented Apr 8, 2023

Describe the feature:
When an agent is in an unhealthy state, its status should provide more information about which components are failing or affected, such as their names. Currently the status only reports:

1 or more components/units in a failed state

Describe a specific use case for the feature:
Providing more information about the specific components that are failing or affected, including their names and what is affecting them, makes it easier to identify and resolve the issue and reduces the time and effort required for troubleshooting.
For example, if an Elastic Agent is reporting an unhealthy state due to a failure in a specific integration or module, providing the name of that component can help the administrator to quickly identify the root cause and take appropriate action to resolve the issue. This can include restarting the affected component, updating its configuration or dependencies, or contacting support for further assistance.

Screenshot 2023-04-08 at 5 11 15 PM

UI design

Figma link | Prototype

The reporting on the agent document that @cmacknz describes in #154634 (comment) should be extended to the Agent Details UI per the following design. There is prior art for this, custom to the Elastic Defend (Endpoint) integration: #133405

image

Related doc issue: elastic/ingest-docs#209

@botelastic botelastic bot added the needs-team Issues missing a team label label Apr 8, 2023
@jsanz jsanz added enhancement New value added to drive a business result Team:Fleet Team label for Observability Data Collection Fleet team labels Apr 12, 2023
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@botelastic botelastic bot removed the needs-team Issues missing a team label label Apr 12, 2023
@jen-huang jen-huang changed the title Improve Elastic Agent error reporting to include component-specific information in unhealthy state status [Fleet] Improve Elastic Agent error reporting to include component-specific information in unhealthy state status Apr 12, 2023
@cmacknz
Member

cmacknz commented Apr 24, 2023

#154826 added a way to view the agent status document, which contains the equivalent of the output from elastic-agent status --output=json. This should be available in 8.8.

Eventually there will be a proper UI for this.

@cmacknz
Member

cmacknz commented Apr 25, 2023

For reference, here is the "View agent JSON" link that will be in 8.8:
Screen Shot 2023-04-25 at 2 56 02 PM

Here is the raw list of components and their states, which explains how the "1 or more components/units in a failed state" status was determined:
Screen Shot 2023-04-25 at 2 56 18 PM

Just to reiterate, there will be a better UI for this eventually.
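Until that UI exists, the component list in the agent JSON can be scanned by hand or with a small script. The following is only a minimal sketch of that idea: the field names (`components`, `units`, `id`, `status`, `message`), the status values (`FAILED`, `DEGRADED`, `HEALTHY`), the sample document, and the helper `failing_units` are assumptions made for illustration, not a confirmed schema for the agent document.

```python
from typing import Any

# Status values treated as unhealthy in this sketch (assumed, not confirmed).
UNHEALTHY = {"FAILED", "DEGRADED"}

# Hypothetical agent document, loosely modelled on what "View agent JSON" shows.
SAMPLE_AGENT: dict[str, Any] = {
    "components": [
        {
            "id": "system/metrics-default",
            "status": "FAILED",
            "message": "1 or more components/units in a failed state",
            "units": [
                {"id": "system/metrics-default-unit-1", "status": "FAILED",
                 "message": "invalid config"},
                {"id": "system/metrics-default", "status": "HEALTHY",
                 "message": "Healthy"},
            ],
        },
        {
            "id": "log-default",
            "status": "HEALTHY",
            "message": "Healthy",
            "units": [
                {"id": "log-default-unit-1", "status": "HEALTHY",
                 "message": "Healthy"},
            ],
        },
    ]
}


def failing_units(agent_doc: dict[str, Any]) -> list[dict[str, str]]:
    """Return one entry per component or unit whose status is unhealthy."""
    problems: list[dict[str, str]] = []
    for component in agent_doc.get("components", []):
        comp_id = component.get("id", "<unknown>")
        # Report the component itself if it is unhealthy.
        if component.get("status") in UNHEALTHY:
            problems.append({
                "component": comp_id,
                "unit": "",
                "status": component["status"],
                "message": component.get("message", ""),
            })
        # Then report each unhealthy unit that belongs to the component.
        for unit in component.get("units", []):
            if unit.get("status") in UNHEALTHY:
                problems.append({
                    "component": comp_id,
                    "unit": unit.get("id", "<unknown>"),
                    "status": unit["status"],
                    "message": unit.get("message", ""),
                })
    return problems


if __name__ == "__main__":
    for problem in failing_units(SAMPLE_AGENT):
        target = problem["unit"] or problem["component"]
        print(f"{target}: {problem['status']} ({problem['message']})")
```

Listing the component and its units separately keeps the summary message on the component apart from the more specific per-unit error, which is usually the one that names the root cause.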

@jen-huang jen-huang changed the title [Fleet] Improve Elastic Agent error reporting to include component-specific information in unhealthy state status [Fleet] Implement per-integration health reporting May 5, 2023
@juliaElastic juliaElastic added the QA:Needs Validation Issue needs to be validated by QA label May 31, 2023
jillguyonnet added a commit that referenced this issue Jun 14, 2023
## Summary

Implement agent integration health reporting in Fleet UI.

Closes #154634

### Screenshots

These screenshots were taken with an error (invalid config) on the
system integration (metrics).

#### Before

The error affecting the `system` integration is not visible in the UI.
To find it, the user would need to inspect the agent JSON or run the
`elastic-agent status` command.

<img width="1917" alt="Screenshot 2023-06-08 at 16 59 31"
src="https://github.com/elastic/kibana/assets/23701614/03aa372b-15f0-4be3-99a9-d8e68d5e8486">

#### After

The error affecting the `system` integration is surfaced in the UI:

<img width="1917" alt="Screenshot 2023-06-08 at 14 59 38"
src="https://github.com/elastic/kibana/assets/23701614/de9ada67-84c9-4cd0-a1b2-81a2f964ae1a">

For reference, the following screenshots show existing behaviour with
the Elastic Defend integration (errors in the policy response):

<img width="1917" alt="Screenshot 2023-06-08 at 14 59 52"
src="https://github.com/elastic/kibana/assets/23701614/7f793316-cd6d-4b81-97ef-5cdb0b9f2659">

<img width="1917" alt="Screenshot 2023-06-08 at 15 00 05"
src="https://github.com/elastic/kibana/assets/23701614/1d9928dc-09db-4d62-87ca-b708f3f5807a">

### Steps to reproduce

1. Enroll an agent in Fleet and add some integrations.
2. Introduce some failure, e.g. malformed package policy.
3. The failures should correctly be surfaced in the UI.

### Checklist

Delete any items that are not applicable to this PR.

- [ ] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [ ] Any UI touched in this PR is usable by keyboard only (learn more
about [keyboard accessibility](https://webaim.org/techniques/keyboard/))
- [ ] Any UI touched in this PR does not create any new axe failures
(run axe in browser:
[FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/),
[Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
- [ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [ ] This renders correctly on smaller devices using a responsive
layout. (You can test this [in your
browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))
- [ ] This was checked for [cross-browser
compatibility](https://www.elastic.co/support/matrix#matrix_browsers)
@harshitgupta-qasource

Hi Team,
We have executed 4 test cases under the feature test run for the 8.9.0 release at the link:

Status:

Build details:
VERSION: 8.9 BC4
BUILD: 64661
COMMIT: ddf0c19

As the testing is completed on this feature, we are marking this as QA:Validated.

Please let us know if anything else is required from our end.
Thanks

@harshitgupta-qasource harshitgupta-qasource added QA:Validated Issue has been validated by QA and removed QA:Needs Validation Issue needs to be validated by QA labels Jul 19, 2023

7 participants