[Fleet] Show remote es output error state on UI #172181

juliaElastic · 2023-11-29T14:55:39Z

Summary

Reads the latest output health state from the logs-fleet_server.output_health-default data stream by output ID and displays the error state in the Edit output flyout in the UI.
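
For reviewers, here is a minimal in-memory sketch of the "latest state by output ID" selection described above. It is not the code in this PR. The `state` and `message` fields come from the diff under review; the `output` and `@timestamp` field names, the `latestOutputHealth` helper, and the `UNKNOWN` fallback are assumptions made for illustration.

```typescript
// Hypothetical shape of a document in the output health data stream.
// `state` and `message` appear in the diff; `output` and `@timestamp`
// are assumed field names.
interface OutputHealthDoc {
  output: string;
  state: string;
  message?: string;
  '@timestamp': string;
}

interface OutputHealth {
  state: string;
  message: string;
}

// Return the most recently reported health for one output, or an
// UNKNOWN placeholder (an assumption) when nothing has been reported yet.
function latestOutputHealth(docs: OutputHealthDoc[], outputId: string): OutputHealth {
  const latest = docs
    .filter((doc) => doc.output === outputId) // filter() copies, so sort() does not mutate the input
    .sort((a, b) => Date.parse(b['@timestamp']) - Date.parse(a['@timestamp']))[0];

  if (latest === undefined) {
    return { state: 'UNKNOWN', message: '' };
  }
  return { state: latest.state, message: latest.message ?? '' };
}

// Usage with illustrative data (state values are examples only).
const sampleDocs: OutputHealthDoc[] = [
  { output: 'remote1', state: 'HEALTHY', '@timestamp': '2023-11-29T14:00:00Z' },
  { output: 'remote1', state: 'DEGRADED', message: 'connection refused', '@timestamp': '2023-11-29T14:05:00Z' },
  { output: 'other', state: 'HEALTHY', '@timestamp': '2023-11-29T14:10:00Z' },
];
console.log(latestOutputHealth(sampleDocs, 'remote1'));
```

Against a real cluster the same selection would presumably be a search filtered on the output ID, sorted by timestamp descending, with size 1.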

Steps to verify:

Enable the feature flag remoteESOutput.
Add a remote_elasticsearch output; it can point to a non-existent host for this test.
Add the output as the monitoring output of an agent policy.
Run fleet-server with the changes here.
Enroll an agent.
Wait until fleet-server starts reporting a degraded state in the output health data stream.
Open the Edit output flyout in the UI and verify that the error state is visible.
When the connection is restored (the host was updated to a valid one, or the remote ES was only temporarily down), verify that the error state goes away.

The UI was suggested in the design doc: https://docs.google.com/document/d/19D0bX7oURf0yms4qemfqDyisw_IYB-OVw4oU-t4lf18/edit#bookmark=id.595r8l91kaq8

Notes/suggestions:

We might want to add the output state to the output list as well (maybe as badges, like agent health?), because it is not very visible in the flyout (you have to scroll down).
Also, the error state is first reported when an agent is enrolled and fleet-server can't create an API key, so not immediately after the output is added. It would be good to show the time of the last reported state (e.g. the way we display "last checkin x minutes ago" for agents).
I think it would be beneficial to display the healthy state too.

Added badges to output list:

Added healthy state UI to Edit output:

Checklist

Delete any items that are not applicable to this PR.

Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support
Unit or functional tests were updated or added to match the most common scenarios

elasticmachine · 2023-11-29T14:55:46Z

Pinging @elastic/fleet (Team:Fleet)

apmmachine · 2023-11-29T14:55:56Z

🤖 GitHub comments

Expand to view the GitHub comments

Just comment with:

/oblt-deploy : Deploy a Kibana instance using the Observability test environments.
/oblt-deploy-serverless : Deploy a serverless Kibana instance using the Observability test environments.
run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

juliaElastic · 2023-11-29T14:59:05Z

x-pack/plugins/fleet/server/services/output.ts

+    const latestHit = response.hits.hits[0]._source as any;
+    return {
+      state: latestHit.state,
+      message: latestHit.message ?? '',


We might want to add the timestamp so the UI can show when the state was last reported, in case health reporting has stopped and the state is stale.
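
A minimal sketch of how a last-reported label could be derived from such a timestamp. The `describeLastReported` helper and its wording are hypothetical and are not taken from this PR; the real UI strings would go through i18n.

```typescript
// Hypothetical helper: describe how long ago a health state was reported.
// `timestamp` is expected to be an ISO 8601 string; `now` is injected so
// the function stays deterministic and easy to test.
function describeLastReported(timestamp: string, now: Date): string {
  const reportedAt = Date.parse(timestamp);
  if (Number.isNaN(reportedAt)) {
    // Covers an empty or malformed timestamp, e.g. when nothing was reported yet.
    return 'Last reported time unknown';
  }
  // Clamp at zero in case of small clock skew between reporter and viewer.
  const minutes = Math.max(0, Math.floor((now.getTime() - reportedAt) / 60000));
  if (minutes < 1) {
    return 'Last reported less than a minute ago';
  }
  return `Last reported ${minutes} minute${minutes === 1 ? '' : 's'} ago`;
}

// Usage with illustrative values.
console.log(describeLastReported('2023-11-29T14:00:00Z', new Date('2023-11-29T14:07:30Z')));
```

Showing the age next to the state lets a user judge whether a "healthy" or "degraded" value is still current.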

Added a tooltip with the last reported time:

.../public/applications/fleet/sections/settings/components/edit_output_flyout/output_health.tsx

joshdover · 2023-11-30T11:02:19Z

Couple of questions:

How does this UX work when there are multiple Fleet Servers and some are having connectivity issues to a remote ES and others are not?
Could we show an overall status badge similar to the agent health badge on the output table in main Fleet Settings page? That way users can see the status problems without opening the flyout.

juliaElastic · 2023-11-30T11:51:27Z

> Couple of questions:
>
> How does this UX work when there are multiple Fleet Servers and some are having connectivity issues to a remote ES and others are not?

Is it possible for the same agent to check in to multiple Fleet Servers? If so, two Fleet Servers could start pinging the same remote ES. If they report different states to the data stream, the UI might oscillate between healthy and degraded.
What would be the scenario in which two Fleet Servers report different statuses? Something like an air-gapped Fleet Server and a public one?
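
To make the oscillation concern concrete, here is a toy model; it is not what this PR implements. The `reporter` field and both function names are hypothetical, and it is an assumption that health documents could identify the Fleet Server that wrote them. The second function is only one possible alternative: take the latest report per Fleet Server and show degraded if any of them is degraded.

```typescript
// Toy model of health reports written by several Fleet Servers.
interface HealthReport {
  reporter: string; // hypothetical Fleet Server ID, not confirmed to exist in the data stream
  state: 'HEALTHY' | 'DEGRADED';
  timestamp: number; // epoch milliseconds
}

// "Latest document wins": the result of reading only the newest document.
function latestState(reports: HealthReport[]): string | undefined {
  return reports.slice().sort((a, b) => b.timestamp - a.timestamp)[0]?.state;
}

// Alternative: keep the latest report per reporter, then report DEGRADED
// if any reporter's latest state is DEGRADED.
function worstLatestState(reports: HealthReport[]): string | undefined {
  const latestByReporter = new Map<string, HealthReport>();
  for (const report of reports) {
    const seen = latestByReporter.get(report.reporter);
    if (seen === undefined || report.timestamp > seen.timestamp) {
      latestByReporter.set(report.reporter, report);
    }
  }
  if (latestByReporter.size === 0) {
    return undefined;
  }
  const anyDegraded = Array.from(latestByReporter.values()).some((r) => r.state === 'DEGRADED');
  return anyDegraded ? 'DEGRADED' : 'HEALTHY';
}

// Two servers disagree: fs-a can reach the remote ES, fs-b cannot.
const reports: HealthReport[] = [
  { reporter: 'fs-a', state: 'HEALTHY', timestamp: 1 },
  { reporter: 'fs-b', state: 'DEGRADED', timestamp: 2 },
  { reporter: 'fs-a', state: 'HEALTHY', timestamp: 3 },
];
console.log(latestState(reports.slice(0, 2)), latestState(reports)); // flips as new reports arrive
console.log(worstLatestState(reports)); // stays the same while fs-b remains degraded
```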

> Could we show an overall status badge similar to the agent health badge on the output table in main Fleet Settings page? That way users can see the status problems without opening the flyout.

Yes, this would be nice, I was thinking about this. I'll add it.

kpollich

LGTM 🚀

kibana-ci · 2023-12-04T08:43:14Z

💛 Build succeeded, but was flaky

Failed CI Steps

FTR Configs #3

Test Failures

[job] [logs] FTR Configs #3 / endpoint Response Actions Responder from alerts should show Responder from alert details under alerts list page

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id	before	after	diff
`fleet`	949	950	+1

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id	before	after	diff
`fleet`	1.2MB	1.2MB	+2.5KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id	before	after	diff
`fleet`	151.9KB	152.3KB	+353.0B

History

💔 Build #180564 failed 240823a
💔 Build #180520 failed c3e3e8d
💔 Build #180494 failed 1e0f352
💔 Build #180470 failed 57fa51e
💚 Build #180452 succeeded ac6c72a

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @juliaElastic

Summary

Closes #104986

Enable feature flags for `remoteESOutput` and `outputSecretsStorage`. The feature is ready when #172181 and elastic/fleet-server#3127 are merged. Output secret storage [issues](#157458) are closed, so I think the feature flag for that should be enabled too.

cc @jillguyonnet

kilfoyle

LGTM! 🚀

output health

8293d77

juliaElastic self-assigned this Nov 29, 2023

juliaElastic requested a review from a team as a code owner November 29, 2023 14:55

botelastic bot added the Team:Fleet label (Team label for Observability Data Collection Fleet team) Nov 29, 2023

juliaElastic added the release_note:skip label (Skip the PR/issue when compiling release notes) and removed the Team:Fleet label Nov 29, 2023

botelastic bot added the Team:Fleet label Nov 29, 2023

juliaElastic marked this pull request as draft November 29, 2023 14:59

juliaElastic added the ci:cloud-deploy label (Create or update a Cloud deployment) Nov 29, 2023

juliaElastic mentioned this pull request Nov 29, 2023

Report remote output health elastic/fleet-server#3116

Closed

kpollich reviewed Nov 29, 2023

.../public/applications/fleet/sections/settings/components/edit_output_flyout/output_health.tsx (outdated, resolved)

Merge branch 'main' into output-health

bf7747c

juliaElastic and others added 8 commits November 30, 2023 13:01

Merge branch 'main' into output-health

ac6c72a

added badge to output list

ecb23c7

use react query to fetch output health

99322ed

reset output health if error

57fa51e

showing last reported time

1e0f352

renamed translations

fb3f477

added api tests

c3e3e8d

added ui tests

ee5caaa

juliaElastic marked this pull request as ready for review November 30, 2023 15:53

juliaElastic and others added 2 commits November 30, 2023 16:59

convert timestamp to string

264d3b6

Merge branch 'main' into output-health

240823a

kpollich approved these changes Nov 30, 2023

added openapi spec

1b258bd

juliaElastic requested a review from a team as a code owner December 4, 2023 08:00

juliaElastic requested a review from kilfoyle December 4, 2023 08:01

Merge branch 'main' into output-health

07e638a

juliaElastic mentioned this pull request Dec 4, 2023

[Fleet] enable feature flags #172464

Merged

kilfoyle approved these changes Dec 5, 2023

juliaElastic merged commit ae5e2fd into elastic:main Dec 5, 2023
33 checks passed

kibanamachine added the v8.12.0 and backport:skip (This commit does not require backporting) labels Dec 5, 2023
