Enrich kubernetes events with involved_object related fields #6817

MichaelKatsoulis · 2023-07-05T07:34:28Z

Events collected by kubernetes events dataset contain informations about the involved_object kind and name.

It would be useful to enrich those events with fields that can help to link an event with a pod.name or deployment.name.
A use case for this is during the creation of meaningful dashboard where a user can notice an issue with a pod and then try to get the events related with this pod.

Currently this would require to manually set the kubernetes.event.involved_object.name == kubernetes.pod.name. Also this would mean that events and pod data could not easily co exist in the same dashboard as this would require to first unfilter by kubernetes.pod.name and then filter by kubernetes.event.involved_object.name.

This can be easily overcome by adding a script processor in kubernetes event integration.
The processor could look into the involved_object.kind and if it is Pod , it would add the kubernetes.pod.name equal to involved_object.name.
The same can be done with other resource kinds like deployment, job, daemonset, replicaset, cronjob, node but this should be evaluated. The pod.name is the most important one for our dashboards use case.

Another possibility would be to add this extra fields with an ingest pipeline but I am not sure if there is any gain.
Third option would be to add the fields in the source (beats kubernetes module) although this would mean to wait until beats next release.

The text was updated successfully, but these errors were encountered:

MichaelKatsoulis · 2023-07-05T07:35:43Z

example processor

- script:
        lang: javascript
        id: add_pod_name
        source: >
          function process(event) {
            var kind = event.Get("kubernetes.event.involved_object.kind");
            if (kind === "Pod") {
                var pod = event.Get("kubernetes.event.involved_object.name");
                event.Put("kubernetes.pod.name", pod);
            }
            return event;
          }

ChrsMark · 2023-07-05T08:14:05Z

I see the 3rd option as the most future proof one.

MichaelKatsoulis · 2023-07-05T13:43:49Z

Maybe we could split this in short and long term solution.
The short could be the processor which will only add the kubernetes.pod.name so the serverless project can take care of this in their dashboards.
And the long term would be to add it in beats source code.

ChrsMark · 2023-07-05T13:55:52Z

I would be fine if we deliver both options at the same time so as to avoid having Beat's source code left behind after the need is covered.
But still I would prefer to avoid duplicating work. The product's release cycle is what it is and we should abide by this in any case.

MichaelKatsoulis · 2023-07-05T14:41:45Z

But still I would prefer to avoid duplicating work. The product's release cycle is what it is and we should abide by this in any case.

But the use of packages allow us to deliver things without relying on beats release cycle. It would not be duplication per se. The work on beats code would add more fields for all kinds while the processor would temporarily add the pod.name only.

That being said I don't have strong arguments, other than it's a feature that would benefit serverless project the sooner the better.
cc @tommyers-elastic

ChrsMark · 2023-07-05T15:08:18Z

But the use of packages allow us to deliver things without relying on beats release cycle.

That's true but I'm worried of if we should split the implementation in 2 different places. By populating the k8s fields from Beats and the Integration/package at the same time can be tricky and looks like not a good practice. (I'm talking about different fields: kubernetes.foo from Beats while kubernetes.bar from the Integration)
If we want to do this it should be documented really well, but still can fall under the cracks.

In any case if we want to proceed anyways we would need to write down the implementation parts and keep track of their accomplishment.

MichaelKatsoulis · 2023-07-05T16:00:47Z

Maybe I was not clear. I don’t think we should populate same fields from two different places. When the beats is updated and available then we will update the integration to remove the processor.

ChrsMark · 2023-07-06T07:06:41Z

Ok, I think we talk about the same thing here. Let me write down our options to be clear:

A) we add the processor now and then the Beats part

Pros: support the field now
Cons: duplicate work. Introduces "technical debt" even for a small period of time. Maybe the processor solution cannot even be tested with unit tests.

B) we add the Beats part now and wait for the release as usual

Pros: one time work. No possible pitfalls. No need to come back to remove parts.
Cons: a small delay in supporting this

C) implement both solutions now and then just remove the processor

Pros: deliver the feature, cover Beats part now that is hot. Do not introduce tech debt even for a small period of time.
Cons: work duplication but still better than option A which will need duplicated work anyways with the risk of "forgetting" to implement Beats part properly.

To my mind option B or C would be more robust, clean and future proof but if there is good reasoning into compromising for option A then it's fine. However I think that we need a good process in general for cases where we want to deliver Integrations but Beats release cycle blocks us from doing it within the desired time.

@ruflin @gizas @tommyers-elastic , what are your thoughts here? Even if its just a small detail it seems that we will be hitting the generic issue. See also a related comparability discussion. Whatever we decide here we might need to think of how we properly handle similar issues in general and this issue might be a good exercise and could be used as reference in the future.

MichaelKatsoulis · 2023-07-06T07:26:31Z

There is an option D. As our focus is mainly to deliver robust features in the best possible way and we don't want to introduce any technical dept, we could just deliver solution B (beats).

Meanwhile any customer that wants this new filed (server less project) can create a script processor on their side. Our integration supports adding processors. I have already created a PR in their automation repo to do so. https://github.com/elastic/platform-observability-charts/pull/63

tommyers-elastic · 2023-07-12T10:33:50Z

i don't necessarily see the beats solution as the 'correct' solution here. we are not adding any new data, just an interpretation of the existing data. as we move to support more data collection methods, having logic like this at ingest time vs collection time could be beneficial. what do you see as the main benefits to having this logic at the beats level @ChrsMark @MichaelKatsoulis ?

in terms of testing - we have automated tests for ingest pipelines, we can test combinations of object kind and the resulting outputs.

another option for the MKI team to get unblocked here is to use runtime fields.

ChrsMark · 2023-07-12T11:00:16Z

Makes sense @tommyers-elastic but I'm worried of how this would scale? Kubernetes events related fields will be populated from 2 different places which I don't see as a good practice. So if we can avoid it would be better from project's structure, maintainability and readability.

Use case: Somebody changes Beats's code and remove or change the underlying event's fields. Even if the unit tests of Beats will be tuned the integration's processor will break. Why to risk this when you can just have the implementation all together?

If the processor/pipelines were the only way to provide support for these data then that would be another story.

In any case, whatever we choose here we should ensure the project's structure, maintainability and readability with docs, dev-doc s etc.

botelastic · 2024-07-11T11:22:41Z

Hi! We just realized that we haven't looked into this issue in a while. We're sorry! We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1. Thank you for your contribution!

ChrsMark changed the title ~~Enrich kubernetes events with with involved_object related fields~~ Enrich kubernetes events with involved_object related fields Jul 5, 2023

botelastic bot added the Stalled label Jul 11, 2024

andrewkroh added the Integration:kubernetes Kubernetes label Aug 14, 2024

botelastic bot removed the Stalled label Aug 14, 2024

andrewkroh added Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team [elastic/obs-cloudnative-monitoring] enhancement New feature or request labels Aug 14, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enrich kubernetes events with involved_object related fields #6817

Enrich kubernetes events with involved_object related fields #6817

MichaelKatsoulis commented Jul 5, 2023

MichaelKatsoulis commented Jul 5, 2023

ChrsMark commented Jul 5, 2023

MichaelKatsoulis commented Jul 5, 2023

ChrsMark commented Jul 5, 2023

MichaelKatsoulis commented Jul 5, 2023 •

edited

Loading

ChrsMark commented Jul 5, 2023

MichaelKatsoulis commented Jul 5, 2023

ChrsMark commented Jul 6, 2023

MichaelKatsoulis commented Jul 6, 2023

tommyers-elastic commented Jul 12, 2023

ChrsMark commented Jul 12, 2023 •

edited

Loading

botelastic bot commented Jul 11, 2024

Enrich kubernetes events with involved_object related fields #6817

Enrich kubernetes events with involved_object related fields #6817

Comments

MichaelKatsoulis commented Jul 5, 2023

MichaelKatsoulis commented Jul 5, 2023

ChrsMark commented Jul 5, 2023

MichaelKatsoulis commented Jul 5, 2023

ChrsMark commented Jul 5, 2023

MichaelKatsoulis commented Jul 5, 2023 • edited Loading

ChrsMark commented Jul 5, 2023

MichaelKatsoulis commented Jul 5, 2023

ChrsMark commented Jul 6, 2023

A) we add the processor now and then the Beats part

B) we add the Beats part now and wait for the release as usual

C) implement both solutions now and then just remove the processor

MichaelKatsoulis commented Jul 6, 2023

tommyers-elastic commented Jul 12, 2023

ChrsMark commented Jul 12, 2023 • edited Loading

botelastic bot commented Jul 11, 2024

MichaelKatsoulis commented Jul 5, 2023 •

edited

Loading

ChrsMark commented Jul 12, 2023 •

edited

Loading