This repository has been archived by the owner on Sep 21, 2023. It is now read-only.

[Meta] elastic-agent-shipper journey to GA #197

Open
9 of 54 tasks
Tracked by #16
leehinman opened this issue Dec 6, 2022 · 9 comments
Assignees
Labels
Meta Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team

Comments

@leehinman
Contributor

leehinman commented Dec 6, 2022

Fill in checklists below with issues

Checklist to achieve experimental status

Checklist to achieve beta status

  • support global processors
  • Elasticsearch V2 output
  • diskqueue is beta (see [Meta] disk queue journey to GA #118)
  • performance use cases are finalized
  • performance is at least close to (~90% of) Beats under agent with own output
  • tests exist for memory and IO usage, and comparisons to Beats under agent with own output
  • can be selected as output in fleet UI
  • startup, shutdown, input, output & performance issues can all be debugged with only the data from the elastic-agent diagnostics command.
  • handle policy updates & queued events
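
The ~90% performance criterion above could be checked with a simple throughput ratio. This is only a minimal sketch: the function name, the events-per-second metric, and the sample numbers are assumptions for illustration, not part of any existing test framework or measured results.

```go
package main

import "fmt"

// meetsBetaTarget reports the ratio of shipper throughput to baseline
// throughput (Beats under agent with its own output) and whether that
// ratio reaches minRatio. The name and the events-per-second unit are
// hypothetical, chosen for this example.
func meetsBetaTarget(shipperEPS, baselineEPS, minRatio float64) (float64, bool) {
	if baselineEPS <= 0 {
		return 0, false // no valid baseline to compare against
	}
	ratio := shipperEPS / baselineEPS
	return ratio, ratio >= minRatio
}

func main() {
	// Illustrative numbers only, not measurements.
	ratio, ok := meetsBetaTarget(9200, 10000, 0.90)
	fmt.Printf("ratio=%.2f meets beta target=%t\n", ratio, ok)
}
```

The same check with `minRatio` set to 1.0 would correspond to the "as good as current Beats" wording used for GA.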

Checklist to achieve ga status

  • elastic-agent-shipper is default output in fleet UI
  • performance is as good as current Beats under agent for all performance use cases
  • disk queue is ga (see [Meta] disk queue journey to GA #118)
  • output automatic tuning finalized

Previous

Below is what we had when elastic-agent-shipper was a separate repo, with all new code.
Keeping for historic reasons.

Checklist to achieve experimental status

Checklist to achieve beta status

Checklist to achieve ga status

  • disk queue is ga (see [Meta] disk queue journey to GA #118)
  • output automatic tuning finalized
  • support for global processors
  • elastic-agent-shipper is default output in fleet UI
  • performance is as good as current Beats under agent for all performance use cases
@leehinman leehinman added Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team Meta labels Dec 6, 2022
@leehinman leehinman self-assigned this Dec 6, 2022
@leehinman leehinman changed the title [Meta] journey to GA [Meta] elastic-agent-shipper journey to GA Dec 6, 2022
@cmacknz
Member

cmacknz commented Dec 12, 2022

In either the experimental or beta criteria we need an item to track that the shipper is debuggable using only the information collected by the agent diagnostics command.

Checklist to achieve beta status

  • performance testing framework exists
  • performance use cases are finalized

Maybe this is implicit in the two items above, but I think we really want to know how the performance of the agent with the shipper compares to the performance of the agent without shipper before we can recommend anyone use it as a beta.

I think I would rather see "performance is as good as current Beats under agent for all performance use cases" as a Beta requirement to set expectations properly for ourselves; we don't want to pursue this only at the end. If there is some unexpected challenge here we can defer it from the Beta criteria later, but ideally we can make the shipper a performance improvement.

Checklist to achieve ga status

  • support for global processors

I don't think we need global processors to be GA, because this is a completely new feature. This could be done at any time.

  • Output automatic tuning finalized

What does "finalized" mean here? We may want to be cautious about coupling the shipper GA criteria to a GA-able implementation of automatic output tuning. Ideally we can include this, though; it is likely necessary to avoid annoying configuration migrations (for existing workers and bulk_max_size configurations).

@leehinman
Contributor Author

In either the experimental or beta criteria we need an item to track that the shipper is debuggable using only the information collected by the agent diagnostics command.

Added.

@leehinman
Contributor Author

Maybe this is implicit in the two items above, but I think we really want to know how the performance of the agent with the shipper compares to the performance of the agent without shipper before we can recommend anyone use it as a beta.

Moved the "performance as good as current Beats under agent" item up to beta.

@leehinman
Contributor Author

I don't think we need global processors to be GA, because this is a completely new feature. This could be done at any time.

I'm in favor of moving this post GA. The reason it is on the list is that we don't seem to have a list of features for MVP, so I was going off the assumption that all of the ones listed in the design doc would be needed for GA.

@leehinman
Contributor Author

  • Output automatic tuning finalized

What does "finalized" mean here? We may want to be cautious about coupling the shipper GA criteria to a GA-able implementation of automatic output tuning. Ideally we can include this, though; it is likely necessary to avoid annoying configuration migrations (for existing workers and bulk_max_size configurations).

I was thinking "finalized" would be the user facing portion, so if we need to change the configuration parameters we can up to this point, but after this we have to worry about configuration migration. Maybe it would be better to rename this to something like "finalize configuration options for GA"?

@cmacknz
Member

cmacknz commented Dec 12, 2022

Maybe it would be better to rename this to something like "finalize configuration options for GA"?

Agreed, let's make that change to clarify this.

@faec
Contributor

faec commented Dec 12, 2022

support for selectors for index / data_streams

This is listed in beta but to me it seems like we might want it for experimental, or at least some partial solution -- today the shipper can only target a single hardcoded Elasticsearch index. We could easily make that single index configurable, but it would still be a single fixed index. Targeting multiple indices with a single shipper would likely require updates to the support library (I've created an issue for the main technical dependency here).

I'm not sure how we expect people to use the experimental releases, but to me it seems like sending all inputs from all sources to a single fixed index would rule out an awful lot of use cases, even for testing.

Overall the question of index / data stream selection could use a lot more clarity... I gather that at some point the output data streams will all be managed through agent, but I'm not sure we have a definite plan how that will happen. Maybe "Event index / datastream can be derived from the agent policy" should be its own item on the checklist, since getting that information from upstream is a separate process than just supporting selectors internally?

@leehinman
Contributor Author

Overall the question of index / data stream selection could use a lot more clarity... I gather that at some point the output data streams will all be managed through agent, but I'm not sure we have a definite plan how that will happen. Maybe "Event index / datastream can be derived from the agent policy" should be its own item on the checklist, since getting that information from upstream is a separate process than just supporting selectors internally?

The gRPC Event has the datastream field. https://github.com/elastic/elastic-agent-shipper-client/blob/a7eedbe6bd6c711eac7ee1b2f7d7cf6ea03155be/api/messages/publish.proto#L56-L63 Is that sufficient for index / data stream selection?
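
For illustration, here is a minimal Go sketch of how a per-event target could be derived from that field, assuming it carries a type, dataset, and namespace. The `DataStream` struct and `indexFor` function below are hypothetical stand-ins written for this example, not the generated types from publish.proto or anything in the shipper itself.

```go
package main

import (
	"errors"
	"fmt"
)

// DataStream is a hypothetical local stand-in for the data stream
// information carried on each event; it is not the generated gRPC type.
type DataStream struct {
	Type      string
	Dataset   string
	Namespace string
}

// indexFor builds a per-event target name following the
// <type>-<dataset>-<namespace> data stream naming scheme.
// It returns an error when any part is missing, so the caller can
// decide whether to drop the event or fall back to a default.
func indexFor(ds DataStream) (string, error) {
	if ds.Type == "" || ds.Dataset == "" || ds.Namespace == "" {
		return "", errors.New("incomplete data stream fields")
	}
	return fmt.Sprintf("%s-%s-%s", ds.Type, ds.Dataset, ds.Namespace), nil
}

func main() {
	// Example values are illustrative only.
	name, err := indexFor(DataStream{Type: "logs", Dataset: "nginx.access", Namespace: "default"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
}
```

If the field is populated on every event, the output would not need the target from the policy at all; what to do with events where it is empty is a separate question.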

@leehinman
Contributor Author

This is listed in beta but to me it seems like we might want it for experimental, or at least some partial solution -- today the shipper can only target a single hardcoded Elasticsearch index. We could easily make that single index configurable, but it would still be a single fixed index. Targeting multiple indices with a single shipper would likely require updates to the support library (I've created an issue for the main technical dependency here).

Moved it to experimental. From the comments on #202 it looks like targeting multiple indexes should work.

3 participants