Add fields to integrations by default #949

ishleenk17 · 2022-08-25T12:03:27Z

As part of elastic package, it would be good to have the below fields in the default template of elastic package create, so that every integration inherits these.

Tags
Processors
ignore-older

These fields can be optional for the users.

The above would provide us unanimity across integrations and more flexibility to users.

cc: @ruflin

mtojek · 2022-08-25T12:33:28Z

pinging @elastic/ecosystem and @mukeshelastic for prioritization

ruflin · 2022-08-25T12:49:33Z

One of the options is to add these to each package. The other one is to have it as a feature of Fleet like data_stream.dataset value.

I think ignore_older falls under its own category as it only applies to the logs input.

jsoriano · 2022-08-25T16:03:46Z

I agree that they would offer a lot of flexibility to users, but I think that there are already other efforts intended to cover these areas.

For example custom processing is being added to Fleet. Having it in Fleet makes it available for all packages without adding anything. Any improvement in this area will benefit all packages, including existing ones.
When adding custom processing you will end up needing custom mappings or other features, that is better to solve in a single place.
Adding processors in packages would offer another way of doing custom processing, but limited, and not aligned with other efforts. Also, this processing would happen on the agent side, when the current tendency is to bring processing to the ingest nodes.

Tags would be covered by any solution intended for custom processing. If you can do custom processing, you can add tags.
If we want to have specific settings to add tags instead of using generic custom processing, I think that this should also be implemented in Fleet.

Regarding ignore_older. This is a specific setting of an specific input. There are many other settings and many other inputs, it is not possible to expose them all in all packages. For this a better solution would be input packages, another effort already in progress, that will allow to expose input-specific settings, and make them available for its use in combination with other packages.

ruflin · 2022-08-26T07:14:09Z

We need to split up a few things here:

global processing: should be done in ingest pipelines
local processing where local knowledge is needed: things like looking up host info, dropping filtering events

I don't think the custom processing covers the local case. For tags, it is also a bit different. Assuming we have multiple policies shipping system metrics. All metrics go to the same data stream, but for the "east coast" policy a tag should be added to each event and one for the "west coast" policy. This can't be done globally AFAIK.

jsoriano · 2022-08-26T08:17:23Z

Good points on global vs. local processing. I still wonder if, given that we want this in all packages (or all packages of certain kinds), the config for local processing should be also managed by Fleet. This would allow to change the implementation without needing to change all packages.
Imagine for example looking up host info (or any other local info), it could be based now on local processors, but there could be also an implementation based on the enrichment processor that could allow to centralize the collection of this data in one agent, instead of needing implementations and error-prone yaml configurations on each collector of this agent. Something like this has been discussed in some other places.

It is true that we are probably far from having anything based on enrichment processor, but still, I think it'd be preferred if things wanted the same in all packages of a kind are managed by Fleet. For consistency and to reduce the number of things package developers need to take into account.

For tags, it is also a bit different. Assuming we have multiple policies shipping system metrics. All metrics go to the same data stream, but for the "east coast" policy a tag should be added to each event and one for the "west coast" policy. This can't be done globally AFAIK.

Another good point, but this sounds to me more like a limitation of the current approach for custom processing. As a user I would expect to be able to add different custom processing per policy, the data stream is more an implementation detail.

My main point is that requiring to add something to all packages to have some desired functionality sounds like a source of problems: there can be inconsistencies between packages and it increases complexity (and error-proneness) of package development.

ishleenk17 · 2022-08-26T10:53:06Z

Processors:
I think from the above discussions we agree that local processing is required as well.
The next point for discussion being if we have to do it at package or fleet level.
To avoid inconsistencies across packages, I think it would be a good idea to have these implemented at Fleet level along with custom processing.

Tags:
Is there an agreement that we would be having custom processing at policy level ? That would help us avoid usage of tags.

ignore_older:
@jsoriano : You mentioned that input packages would take care of these. Would this be a default setting available in the appropriate input package or a customer using these input packages for their service would have to add these in config ?

Pinging @elastic/obs-service-integrations

jsoriano · 2022-08-26T11:08:52Z

@jsoriano : You mentioned that input packages would take care of these. Would this be a default setting available in the appropriate input package or a customer using these input packages for their service would have to add these in config ?

The log input package would have this option, yes. Ideally these inputs could use the pipelines of other packages, providing a similar functionality to the definition of custom inputs in Filebeat modules configuration.
The UX for this is still under discussion: elastic/kibana#133296.

ishleenk17 · 2022-08-29T05:28:12Z

@jsoriano : Is there any ticket for custom processing at policy level ?
@ruflin Also, I suppose we should proceed with having processors implementation at fleet level. We can check further on this with Fleet team.

ruflin · 2022-08-29T07:01:33Z

The log input package would have this option, yes.

This triggered a more fundamental idea on my end. What if Fleet becomes knowledge about certain "core" inputs like logfile? Instead of each package having to figure out what the necessary fields are for using the logs input instead Fleet would have some basic knowledge. Or a package would inherit it from the log input package? This could be an extension of the input package development elastic/kibana#133296 @kpollich It makes it even more clear that any non input package is basically an extension of an input package (or multiple of them).

jsoriano · 2022-08-29T07:18:22Z

Is there any ticket for custom processing at policy level ?

Not that I know, I'd say this is a new idea that has appeared here. Pinging @kpollich for his opinion here.

What if Fleet becomes knowledge about certain "core" inputs like logfile? Instead of each package having to figure out what the necessary fields are for using the logs input instead Fleet would have some basic knowledge. Or a package would inherit it from the log input package?

Yes, I think we should go in the direction of including this information in the specific input packages and not on every package. Fleet would also get the knowledge about these options from the input packages, this knowledge shouldn't be hardcoded in Fleet.

ishleenk17 · 2022-08-30T11:46:01Z

Also, I suppose we should proceed with having processors implementation at fleet level. We can check further on this with Fleet team.

@kpollich: Could you please let us know about any dependencies if these processors are required to be implemented at Fleet Level.

kpollich · 2022-08-30T12:30:14Z

Is there any ticket for custom processing at policy level ?

I believe this is the issue we're looking for here: elastic/kibana#122892

Could you please let us know about any dependencies if these processors are required to be implemented at Fleet Level.

I don't think we have any real blockers on our end, as we already support custom processors for the Custom Logs integration. We'd just need to do some definition/architecture around how we'll expand that support to all integrations.

What if Fleet becomes knowledge about certain "core" inputs like logfile? Instead of each package having to figure out what the necessary fields are for using the logs input instead Fleet would have some basic knowledge. Or a package would inherit it from the log input package?

Yes, I think we should go in the direction of including this information in the specific input packages and not on every package. Fleet would also get the knowledge about these options from the input packages, this knowledge shouldn't be hardcoded in Fleet.

I like these thoughts, but I wonder about the feasibility of "cross package lookups". e.g. I have the nginx package which "extends" the logs input package. The logs input package defines support for custom processors, ignore_older, etc. Does Fleet need to go fetch that input package (or otherwise have some internal reference to it) in order to resolve those fields/options? Right now a single package is its own source of truth. If we have this notion of extending "base packages" we break that assumption which feels incorrect to me.

If this isn't the case, and a single package manifest still contains all the package data Fleet needs to operate, then the "base package" notion is not a Fleet concern and instead should be handled by the elastic-package CLI when bootstrapping a package. e.g. we can create an nginx package that extends the logs base package and is pre-populated with the extended fields.

ruflin · 2022-08-31T08:59:59Z

I like these thoughts, but I wonder about the feasibility of "cross package lookups". e.g. I have the nginx package which "extends" the logs input package. The logs input package defines support for custom processors, ignore_older, etc. Does Fleet need to go fetch that input package (or otherwise have some internal reference to it) in order to resolve those fields/options? Right now a single package is its own source of truth. If we have this notion of extending "base packages" we break that assumption which feels incorrect to me.

If this isn't the case, and a single package manifest still contains all the package data Fleet needs to operate, then the "base package" notion is not a Fleet concern and instead should be handled by the elastic-package CLI when bootstrapping a package. e.g. we can create an nginx package that extends the logs base package and is pre-populated with the extended fields.

All very good thoughts. To not derail the initial conversation of the Github issue, should we take this to a new issue?

ishleenk17 · 2022-09-01T09:30:00Z

I don't think we have any real blockers on our end, as we already support custom processors for the Custom Logs integration. We'd just need to do some definition/architecture around how we'll expand that support to all integrations.

@kpollich I have raised a Github ticket for the same. elastic/kibana#139900.
Please let me know if this can be picked as part of next iteration. There have been multiple SDH requests for the same.

I believe this is the issue we're looking for here: elastic/kibana#122892

Also, regarding elastic/kibana#122892, for custom processing at policy level.
Could you please share the completion schedule for the same.

kpollich · 2022-09-01T11:30:37Z

cc @jen-huang @nimarezainia - See above for requested work in 8.6

ishleenk17 · 2022-09-08T13:07:33Z

If we implement processors in Fleet UI, what would be the behaviour of the integrations that already have a processor yaml box and are brought up on the new UI.
@ruflin @kpollich : Thoughts ?

ishleenk17 · 2022-09-12T10:47:44Z

Gisting out AI's here:

Processors: This is which we expect to be added through Fleet:
Support of processors across integrations through Fleet kibana#139900.

cc @jen-huang @nimarezainia - See above for requested work in 8.6

@jen-huang @nimarezainia : Could you please incorporate this in 8.6 planning. It would be a great addition for integrations from Fleet.

Tags: If the processing is at policy level, having tags can be avoided.
@kpollich, @jen-huang @nimarezainia : [Fleet] Add input settings at the agent policy level kibana#122892 Is this planned for 8.6 ?
ignore_older: For now if there is an urgent ask for this, we will add it to specific integration. Once input package are implemented, we expect these to be part of the input

ruflin · 2022-09-12T14:45:40Z

If we implement processors in Fleet UI, what would be the behaviour of the integrations that already have a processor yaml box and are brought up on the new UI.

I expect that we have quite a few scenarios in the future where certain configs become obsolete because these will be removed eventually or were taken over by another config. In an ideal scenario we can just migrate users but I guess this is not the case here. One idea would be to mark this fields as deprecated. For any user where the field already contains content, it still shows up and these users have to migrate over to the new feature manually. For any other user, it does not show up.

ishleenk17 · 2022-09-27T06:24:30Z

As per Kyles comment here, looks like having a generic raw yml box is not being looked at for near future.
In that case all of these processors, tags, condition addition has to be taken up individually.
@ruflin @gizas : Please share your thoughts here

gizas · 2022-09-27T09:07:07Z

Added a follow up elastic/kibana#139900 (comment) .
I would like to clarify how soonish can any kind of such changes start going into roadmap.

Additionally processors are a little different than conditions, because can live for start only in the top level (if that eases the implementation) and do not interfere with any existing data stream field. So the text box for conditions can start and independently continue the discussions for any new cases like processors that can be more complex

FYI exposing conditions is a critical thing for cloudnative team if we want to support auto-discovery feature in managed agent based on them

ChrsMark · 2022-09-27T09:21:17Z

@gizas @ishleenk17 @kpollich let's use elastic/kibana#108525 for the conditions specific topic since it's kind of different one.

ishleenk17 · 2022-09-27T09:55:06Z

If we are not taking the generic approach of raw yaml, it would be good to carry on discussions for processors/tags here and conditions in the other thread.

nimarezainia · 2022-09-28T18:35:06Z

@ishleenk17 Consistency of input processors across all integrations is a goal but we won't be able to address this in the 8.6 timeframe.

What we have done (based on customer requirement) has been to ensure that some of the top integrations already have implemented tags&processors for their datastreams. is there a specific customer request we need to consider? if not we should punt this to a later release and ensure the technical design is appropriate.

We also do plan to address the Global processors on a per policy basis via: https://github.com/elastic/ingest-dev/issues/2442

ishleenk17 · 2022-09-29T05:25:05Z

@nimarezainia , there are a couple of reasons why we think having this consistency of processors/tags is important:

There have been couple of SDH's/customer asks for this enhancement.
For all the new integrations we are building this has been an ask in PRD's
Solution architects have reached out stating importance of having these while being on call with the customer

if not we should punt this to a later release and ensure the technical design is appropriate.

I agree, the end to end technical design and challenges should be thought through. We will plan to close that in this iteration, post that I suppose we should prioritize this activity.

cc: @rameshelastic @lalit-satapathy

nimarezainia · 2022-09-29T20:02:04Z

@ishleenk17 I agree that we need this type of consistency across all integrations and the proposal to do it via fleet seems to be lowest effort approach. Regarding customer requests though, we have addressed those for the most part by modifying the individual packages - so I think there shoudl be less urgency in that sense.

Based on the lack of urgency on this I don' want to disrupt the work/planning that has already started for 8.6 and we will try and pursiue the correct implementation in 8.7 (once prioritized amongst other asks there)

ishleenk17 · 2022-11-08T05:24:03Z

we will try and pursiue the correct implementation in 8.7 (once prioritized amongst other asks there)

Hello @nimarezainia , wanted to check if we will be prioritizing this for 8.7 ?
cc: @rameshelastic @lalit-satapathy

ishleenk17 · 2023-04-07T16:50:01Z

@nimarezainia : There have been a couple of requests for processors in integrations yet again.
This time the ask came in for processors in Redis.
Can we look into this issue and take it up for 8.9 by fixing it through Fleet? It will help us cater to such requests in general across integrations.
cc: @rameshelastic @lalit-satapathy @ruflin @mukeshelastic

nimarezainia · 2023-04-09T06:32:21Z

@ishleenk17 our 8.9 is over subscribed. I have added this as a lower priority item for 8.9 but we may have to consider 8.10 or later.

(fyi @kpollich )

jlind23 · 2023-05-12T05:50:15Z

@andresrc for this is has been push out to alter sprint due to competing priorities.
If this is a high priority on your end feel free to have someone from your team pushing a PR and I will assign someone to review it.

elasticmachine · 2023-05-12T12:36:07Z

Pinging @elastic/fleet (Team:Fleet)

nimarezainia · 2023-06-19T23:13:05Z

The UX proposal for this, as was developed for some of the most utilized integrations, is here: https://docs.google.com/document/d/18x1q_jog3_xgCdsU20wVZcyZz1_ZiI4pXGE5ZMma6Ho/edit?usp=sharing

I'm moving this out of PM definition into tech.

kpollich · 2023-11-01T18:20:51Z

I've created elastic/kibana#170336 to capture the plan for adding default field appending logic for all data streams to Fleet

kpollich · 2024-07-19T17:02:03Z

Closing in favor of elastic/kibana#170336

jlind23 added the Team:Ecosystem Label for the Packages Ecosystem team label Aug 25, 2022

ishleenk17 assigned ishleenk17 and unassigned ishleenk17 Aug 25, 2022

jsoriano added the discuss label Aug 25, 2022

kpollich mentioned this issue Aug 31, 2022

[Discussion] Input packages as "templates" elastic/package-spec#404

Open

ishleenk17 changed the title ~~Add fields to default elastic package template~~ Add fields to integrtions by default Sep 5, 2022

ishleenk17 self-assigned this Sep 5, 2022

ishleenk17 changed the title ~~Add fields to integrtions by default~~ Add fields to integrtaions by default Sep 5, 2022

gizas mentioned this issue Sep 8, 2022

[Fleet] Add Processor Setting at the agent policy level elastic/kibana#140276

Open

ishleenk17 changed the title ~~Add fields to integrtaions by default~~ Add fields to integrations by default Sep 12, 2022

ritalwar mentioned this issue Sep 12, 2022

[mySQL Integration] Add tags & processors to all mySQL related datastreams elastic/integrations#3431

Closed

jen-huang unassigned ishleenk17 Sep 12, 2022

ChrsMark mentioned this issue Sep 26, 2022

[Meta][Fleet] Offer a UI for input conditions elastic/kibana#108525

Open

kpollich mentioned this issue Sep 26, 2022

Support of processors across integrations through Fleet elastic/kibana#139900

Open

zez3 mentioned this issue Oct 4, 2022

Filebeat Processors - make uppercase and lowercase processors (as in ES ingest nodes) available to filebeat elastic/beats#22254

Closed

nimarezainia mentioned this issue Oct 25, 2022

[Custom Log Integration] Add tags & processors to Custom Log datastreams elastic/integrations#3432

Closed

andresrc added Team:Fleet Label for the Fleet team and removed Team:Ecosystem Label for the Packages Ecosystem team labels May 12, 2023

nimarezainia removed their assignment Jul 27, 2023

jen-huang assigned kpollich and unassigned ishleenk17 Oct 25, 2023

This was referenced Oct 27, 2023

Integration policy update/edit error due to duplicate key elastic/integrations#8274

Open

[Fleet] Add default variables to all integration data streams elastic/kibana#170336

Open

ishleenk17 self-assigned this Nov 2, 2023

jlind23 unassigned ishleenk17 Nov 6, 2023

jlind23 unassigned kpollich Nov 13, 2023

kpollich closed this as not planned Won't fix, can't repro, duplicate, stale Jul 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add fields to integrations by default #949

Add fields to integrations by default #949

ishleenk17 commented Aug 25, 2022

mtojek commented Aug 25, 2022

ruflin commented Aug 25, 2022

jsoriano commented Aug 25, 2022 •

edited

Loading

ruflin commented Aug 26, 2022

jsoriano commented Aug 26, 2022

ishleenk17 commented Aug 26, 2022

jsoriano commented Aug 26, 2022

ishleenk17 commented Aug 29, 2022

ruflin commented Aug 29, 2022

jsoriano commented Aug 29, 2022

ishleenk17 commented Aug 30, 2022 •

edited

Loading

kpollich commented Aug 30, 2022 •

edited

Loading

ruflin commented Aug 31, 2022 •

edited

Loading

ishleenk17 commented Sep 1, 2022

kpollich commented Sep 1, 2022

ishleenk17 commented Sep 8, 2022

ishleenk17 commented Sep 12, 2022

ruflin commented Sep 12, 2022

ishleenk17 commented Sep 27, 2022

gizas commented Sep 27, 2022 •

edited

Loading

ChrsMark commented Sep 27, 2022

ishleenk17 commented Sep 27, 2022

nimarezainia commented Sep 28, 2022

ishleenk17 commented Sep 29, 2022

nimarezainia commented Sep 29, 2022

ishleenk17 commented Nov 8, 2022 •

edited

Loading

ishleenk17 commented Apr 7, 2023

nimarezainia commented Apr 9, 2023

jlind23 commented May 12, 2023

elasticmachine commented May 12, 2023

nimarezainia commented Jun 19, 2023

kpollich commented Nov 1, 2023

kpollich commented Jul 19, 2024

Add fields to integrations by default #949

Add fields to integrations by default #949

Comments

ishleenk17 commented Aug 25, 2022

mtojek commented Aug 25, 2022

ruflin commented Aug 25, 2022

jsoriano commented Aug 25, 2022 • edited Loading

ruflin commented Aug 26, 2022

jsoriano commented Aug 26, 2022

ishleenk17 commented Aug 26, 2022

jsoriano commented Aug 26, 2022

ishleenk17 commented Aug 29, 2022

ruflin commented Aug 29, 2022

jsoriano commented Aug 29, 2022

ishleenk17 commented Aug 30, 2022 • edited Loading

kpollich commented Aug 30, 2022 • edited Loading

ruflin commented Aug 31, 2022 • edited Loading

ishleenk17 commented Sep 1, 2022

kpollich commented Sep 1, 2022

ishleenk17 commented Sep 8, 2022

ishleenk17 commented Sep 12, 2022

ruflin commented Sep 12, 2022

ishleenk17 commented Sep 27, 2022

gizas commented Sep 27, 2022 • edited Loading

ChrsMark commented Sep 27, 2022

ishleenk17 commented Sep 27, 2022

nimarezainia commented Sep 28, 2022

ishleenk17 commented Sep 29, 2022

nimarezainia commented Sep 29, 2022

ishleenk17 commented Nov 8, 2022 • edited Loading

ishleenk17 commented Apr 7, 2023

nimarezainia commented Apr 9, 2023

jlind23 commented May 12, 2023

elasticmachine commented May 12, 2023

nimarezainia commented Jun 19, 2023

kpollich commented Nov 1, 2023

kpollich commented Jul 19, 2024

jsoriano commented Aug 25, 2022 •

edited

Loading

ishleenk17 commented Aug 30, 2022 •

edited

Loading

kpollich commented Aug 30, 2022 •

edited

Loading

ruflin commented Aug 31, 2022 •

edited

Loading

gizas commented Sep 27, 2022 •

edited

Loading

ishleenk17 commented Nov 8, 2022 •

edited

Loading