report output health #3127
@@ -253,6 +255,32 @@ func (m *selfMonitorT) updateState(ctx context.Context) (client.UnitState, error
	return state, nil
}

func reportOutputHealth(ctx context.Context, bulker bulk.Bulk, logger zerolog.Logger) {
	//pinging logic
	bulkerMap := bulker.GetBulkerMap()
As mentioned on the previous PR, regular health reporting stops if fleet-server is restarted and does not resume until an agent tries to create an API key again (e.g. due to a change in output config). This is because the bulkerMap is stored in memory, and output bulkers are created when there is a config change or when an agent uses a new output for the first time.
@@ -218,6 +218,8 @@ func (m *selfMonitorT) updateState(ctx context.Context) (client.UnitState, error
		return client.UnitStateStarting, nil
	}

	reportOutputHealth(ctx, m.bulker, m.log)
Currently pinging remote outputs every 5s (default monitor interval) and writing out a doc to the output health data stream.
We could change this to only write out a doc if the state changed.
internal/pkg/dl/output_health.go
Outdated
type OutputHealth struct {
	Output     string     `json:"output,omitempty"`
	State      string     `json:"state,omitempty"`
	Message    string     `json:"message,omitempty"`
	Timestamp  string     `json:"@timestamp,omitempty"`
	DataStream DataStream `json:"data_stream,omitempty"`
}
If this is something that is being written to ES, does it make more sense to define it in model/schema.json instead?
moved to schema.json
	Dataset:   "fleet_server.output_health",
	Type:      "logs",
	Namespace: "default",
Should these be constants? Can Namespace ever be something else?
I think it will always be default.
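For reference, a sketch of what pulling these values into constants could look like. The constant names, the JSON tags on `DataStream`, and the `name` helper are assumptions for illustration; only the three string values and the resulting `logs-fleet_server.output_health-default` target come from this PR.

```go
package main

import "fmt"

// Hypothetical constant names; the values are the ones used in the diff.
const (
	outputHealthType      = "logs"
	outputHealthDataset   = "fleet_server.output_health"
	outputHealthNamespace = "default"
)

// DataStream carries the three fields set in the diff.
// The JSON tags here are assumed, not taken from the PR.
type DataStream struct {
	Dataset   string `json:"dataset,omitempty"`
	Type      string `json:"type,omitempty"`
	Namespace string `json:"namespace,omitempty"`
}

// outputHealthDataStream builds the value from the constants above.
func outputHealthDataStream() DataStream {
	return DataStream{
		Dataset:   outputHealthDataset,
		Type:      outputHealthType,
		Namespace: outputHealthNamespace,
	}
}

// name joins the parts as <type>-<dataset>-<namespace>.
func (d DataStream) name() string {
	return d.Type + "-" + d.Dataset + "-" + d.Namespace
}

func main() {
	fmt.Println(outputHealthDataStream().name())
}
```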
Co-authored-by: Michel Laterman <[email protected]>
## Summary
Closes #104986
Enable feature flags for `remoteESOutput` and `outputSecretsStorage`. The feature is ready when #172181 and elastic/fleet-server#3127 are merged. Output secret storage [issues](#157458) are closed, so I think the feature flag for that should be enabled too.
cc @jillguyonnet
lgtm
Quality Gate passed
The SonarQube Quality Gate passed, but some issues were introduced. 1 New issue
What is the problem this PR solves?
Report state of remote es outputs
How does this PR solve the problem?
Report HEALTHY/DEGRADED state of remote es outputs to logs-fleet_server.output_health-default.
How to test this PR locally
Design Checklist
Checklist
./changelog/fragments using the changelog tool
Related issues
Resolves #3116