Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[extension/opamp] Implement ReportsHealth capability #35488

Conversation

bacherfl
Copy link
Contributor

@bacherfl bacherfl commented Sep 30, 2024

Description: This PR adds the ReportsHealth capability to the OpAMP extension.

Link to tracking Issue: #35433

Testing: Add unit test to check the configuration being used for the OpAMP client. Also tested manually by running the supervisor and verifying the health messages are received there

Documentation: Describe the new option in the readme

Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
@bacherfl bacherfl marked this pull request as ready for review September 30, 2024 12:32
@bacherfl bacherfl requested a review from a team as a code owner September 30, 2024 12:32
Copy link
Member

@mwear mwear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this meant to be a complete implementation of this functionality? The healthcheckv2extension aggregates component status (via the componentstatus.Watcher interface) and produces a nested representation of collector health. See the "verbose example" here. This representation is structurally similar to the ComponentHealth message for OpAMP. I think we can extract the status aggregator from the healthcheckv2 extension, use it in the OpAMP extension, and convert AggregateStatuses into ComponentHealth messages. I believe work to extract the aggregator is underway as part of: #34692.

@bacherfl
Copy link
Contributor Author

bacherfl commented Oct 4, 2024

Is this meant to be a complete implementation of this functionality? The healthcheckv2extension aggregates component status (via the componentstatus.Watcher interface) and produces a nested representation of collector health. See the "verbose example" here. This representation is structurally similar to the ComponentHealth message for OpAMP. I think we can extract the status aggregator from the healthcheckv2 extension, use it in the OpAMP extension, and convert AggregateStatuses into ComponentHealth messages. I believe work to extract the aggregator is underway as part of: #34692.

Thanks for the input @mwear - this implementation is a rudimentary version of the health report to just indicate whether the agent is up or not (i kept the implementation close to how it is implemented in the supervisor). However, when the component health information from the healthcheckv2 extension becomes available to be used here as well using this here makes a lot of sense - I can also leave this PR as a draft for now and then adapt it to populate the component health map once #34692 has been resolved

@mwear
Copy link
Member

mwear commented Oct 4, 2024

@bacherfl, if this is helpful as is, we can move forward with it so long as we have an issue describing the end result we want.

Copy link
Contributor

@evan-bradley evan-bradley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One request, but I'm fine getting a basic implementation in since it inches us closer to proper health reporting.

extension/opampextension/opamp_agent.go Outdated Show resolved Hide resolved
# Conflicts:
#	extension/opampextension/go.mod
#	extension/opampextension/go.sum
Signed-off-by: Florian Bacher <[email protected]>
Copy link
Contributor

@evan-bradley evan-bradley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me as a first pass. The next step will be using status reporting.

# Conflicts:
#	extension/opampextension/go.mod
#	extension/opampextension/go.sum
@evan-bradley evan-bradley merged commit d8cad1c into open-telemetry:main Oct 17, 2024
158 checks passed
@github-actions github-actions bot added this to the next release milestone Oct 17, 2024
@evan-bradley
Copy link
Contributor

I've captured the next step here, which uses component status reporting to improve the health information we report: #35856.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants