Create tool for probing Agent's checkin / control protocol #3390

faec · 2023-09-08T19:16:28Z

A recurring difficulty in evolving Agent's control protocol, e.g. in issues like #2460, is that there is no definitive specification for Agent's behavior when interacting with a component. Current clients rely on heuristics and conventions that have evolved through the interactions of several repositories. While this works in the ideal case, error conditions and unusual environments can produce unexpected behavior that is hard to troubleshoot or even detect. The lack of a specification is also an obstacle to creating new clients: Beats (the biggest current Agent client) makes several unusual implementation choices to adapt its legacy codebase to run under Agent, and is thus a poor model for new work to imitate.

This issue is to create a testing/debugging component that will run under Agent, and will allow simulation and logging of control protocol interactions with a variety of error states. For example, it should be straightforward to inject delays in the RPC responses, trigger error states in active units, or send checkin data with outdated config indices. It should also provide simple hooks to create custom scenarios.
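As a rough illustration of the kind of hooks described above, here is a minimal sketch in Go. It does not use the real control protocol client: `Checkin`, `UnitStatus`, `Fault`, and `Scenario` are hypothetical, simplified stand-ins invented for this example, and an actual tool would apply the same idea to the real checkin messages.

```go
package main

import (
	"fmt"
	"time"
)

// Checkin is a hypothetical, simplified stand-in for the checkin message a
// component sends to Agent. The real protocol message carries more fields.
type Checkin struct {
	ConfigIdx uint64
	Units     []UnitStatus
}

// UnitStatus is a hypothetical per-unit state report.
type UnitStatus struct {
	ID    string
	State string // e.g. "HEALTHY" or "FAILED" (illustrative values)
}

// Fault mutates (or delays) an outgoing checkin before it is sent.
type Fault func(c *Checkin)

// Delay simulates a slow RPC response by sleeping before the checkin is sent.
func Delay(d time.Duration) Fault {
	return func(*Checkin) { time.Sleep(d) }
}

// StaleConfigIdx reports a config index that lags the real one by `lag`.
func StaleConfigIdx(lag uint64) Fault {
	return func(c *Checkin) {
		if c.ConfigIdx >= lag {
			c.ConfigIdx -= lag
		} else {
			c.ConfigIdx = 0
		}
	}
}

// FailUnit forces the unit with the given ID into a failed state.
func FailUnit(id string) Fault {
	return func(c *Checkin) {
		for i := range c.Units {
			if c.Units[i].ID == id {
				c.Units[i].State = "FAILED"
			}
		}
	}
}

// Scenario is a named, ordered list of faults: the "custom scenario" hook.
type Scenario struct {
	Name   string
	Faults []Fault
	Log    func(format string, args ...any)
}

// Apply copies the checkin, runs every fault on the copy, logs the result,
// and returns what would be sent. The input is left unmodified.
func (s Scenario) Apply(in Checkin) Checkin {
	out := in
	out.Units = append([]UnitStatus(nil), in.Units...)
	for _, f := range s.Faults {
		f(&out)
	}
	if s.Log != nil {
		s.Log("scenario %q: sending %+v", s.Name, out)
	}
	return out
}

func main() {
	s := Scenario{
		Name:   "stale-index-with-failed-unit",
		Faults: []Fault{Delay(50 * time.Millisecond), StaleConfigIdx(2), FailUnit("input-1")},
		Log:    func(format string, args ...any) { fmt.Printf(format+"\n", args...) },
	}
	s.Apply(Checkin{ConfigIdx: 5, Units: []UnitStatus{{ID: "input-1", State: "HEALTHY"}}})
}
```

Modeling each error condition as a small composable function keeps custom scenarios cheap to write, and the same scenario definitions could in principle be reused for both manual debugging and automated tests.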

The main uses of this tool are expected to be:

- Enable Agent's control protocol to evolve towards a more definitive specification while preserving interoperability with past versions
- Give Agent engineers a straightforward way to simulate unusual conditions encountered "in the wild" when investigating support cases or reported issues
- Enable automated testing over a wider range of error conditions

elasticmachine · 2023-09-08T19:16:30Z

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

cmacknz · 2023-09-08T20:25:34Z

Pulling this into the current sprint as the first step in solving github.com//issues/2460

faec added the enhancement and Team:Elastic-Agent labels Sep 8, 2023

faec self-assigned this Sep 8, 2023

faec mentioned this issue Sep 8, 2023

Control protocol checkin payloads can exceed the gRPC maximum message size when using autodiscovery #2460

Closed
