Enrich kubernetes events with involved_object related fields #6817
Comments
I see the 3rd option as the most future-proof one.
Maybe we could split this into a short-term and a long-term solution.
I would be fine with delivering both options at the same time, so as to avoid having Beats' source code left behind once the immediate need is covered.
But the use of packages allows us to deliver things without relying on the Beats release cycle. It would not be duplication per se: the work on the Beats code would add more fields for all kinds, while the processor would temporarily add the fields we need now. That being said, I don't have strong arguments, other than that it's a feature that would benefit the serverless project the sooner the better.
That's true, but I'm worried about splitting the implementation across 2 different places. Populating the k8s fields from Beats and from the Integration/package at the same time can be tricky and does not look like good practice. (I'm talking about different fields.) In any case, if we want to proceed anyway, we would need to write down the implementation parts and keep track of their completion.
Maybe I was not clear. I don't think we should populate the same fields from two different places. When the Beats change is released and available, we will update the integration to remove the processor.
Ok, I think we are talking about the same thing here. Let me write down our options to be clear:

A) Add the processor now, and the Beats part later. Pros: supports the field now.

B) Add the Beats part now and wait for the release as usual. Pros: one-time work; no possible pitfalls; no need to come back to remove parts.

C) Implement both solutions now and later just remove the processor. Pros: delivers the feature and covers the Beats part now while it is hot; does not introduce tech debt even for a short period of time.

To my mind option B or C would be more robust, clean and future-proof, but if there is good reasoning for compromising on option A then it's fine. However, I think we need a good process in general for cases where we want to deliver Integrations but the Beats release cycle blocks us from doing it within the desired time. @ruflin @gizas @tommyers-elastic, what are your thoughts here? Even if it's just a small detail, it seems we will keep hitting this generic issue. See also a related compatibility discussion. Whatever we decide here, we might need to think about how to properly handle similar issues in general; this issue might be a good exercise and could be used as a reference in the future.
There is an option D. As our focus is mainly to deliver robust features in the best possible way, and we don't want to introduce any technical debt, we could just deliver solution B (Beats). Meanwhile, any customer that wants this new field (e.g. the serverless project) can create a script processor on their side; our integration supports adding processors. I have already created a PR in their automation repo to do so: https://github.com/elastic/platform-observability-charts/pull/63
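For illustration, a user-side workaround along these lines could be a Beats `script` processor added through the integration's processors setting. This is a minimal sketch, not the actual content of the linked PR; the field paths are taken from the fields discussed in this issue, and the exact policy layout depends on the integration version:

```yaml
# Sketch only: a script processor that mirrors the enrichment
# proposed in this issue, added via the integration's "processors"
# configuration. Field paths are assumptions from this thread.
processors:
  - script:
      lang: javascript
      source: >
        function process(event) {
          var kind = event.Get("kubernetes.event.involved_object.kind");
          if (kind === "Pod") {
            // Copy the involved object's name into kubernetes.pod.name
            // so event and pod data can be filtered together.
            event.Put("kubernetes.pod.name",
                      event.Get("kubernetes.event.involved_object.name"));
          }
        }
```

Other kinds (`Deployment`, `Job`, etc.) could be handled with additional branches in the same function.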
i don't necessarily see the beats solution as the 'correct' solution here. we are not adding any new data, just an interpretation of the existing data. as we move to support more data collection methods, having logic like this at ingest time vs collection time could be beneficial. what do you see as the main benefits of having this logic at the beats level, @ChrsMark @MichaelKatsoulis? in terms of testing, we have automated tests for ingest pipelines, so we can test combinations of object kind and the resulting outputs. another option for the MKI team to get unblocked here is to use runtime fields.
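As a rough illustration of the runtime-fields option mentioned above, a `kubernetes.pod.name` runtime field could be defined in the index mapping and computed at query time. This is a sketch under assumptions: the index name is hypothetical, and it presumes the `involved_object` fields are indexed as `keyword`:

```json
PUT kubernetes-events-index/_mapping
{
  "runtime": {
    "kubernetes.pod.name": {
      "type": "keyword",
      "script": {
        "source": "if (doc['kubernetes.event.involved_object.kind'].size() != 0 && doc['kubernetes.event.involved_object.kind'].value == 'Pod') { emit(doc['kubernetes.event.involved_object.name'].value); }"
      }
    }
  }
}
```

This avoids touching Beats or the ingest pipeline at all, at the cost of evaluating the script on every query that uses the field.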
Makes sense @tommyers-elastic, but I'm worried about how this would scale. Kubernetes events related fields would be populated from 2 different places, which I don't see as good practice. If we can avoid that, it would be better for the project's structure, maintainability and readability. Use case: somebody changes Beats' code and removes or changes the underlying event's fields. Even if the Beats unit tests are adjusted, the integration's processor will break. Why risk this when you can keep the implementation all in one place? If the processor/pipelines were the only way to provide support for these data, that would be another story. In any case, whatever we choose here, we should ensure the project's structure, maintainability and readability with docs, dev-docs, etc.
Events collected by the kubernetes events dataset contain information about the `involved_object` kind and name. It would be useful to enrich those events with fields that can help link an event to a `pod.name` or `deployment.name`.

A use case for this is the creation of meaningful dashboards, where a user notices an issue with a pod and then tries to get the events related to that pod. Currently this requires manually matching `kubernetes.event.involved_object.name == kubernetes.pod.name`. It also means that events and pod data cannot easily coexist in the same dashboard, as this would require first unfiltering by `kubernetes.pod.name` and then filtering by `kubernetes.event.involved_object.name`.

This can be easily overcome by adding a script processor to the kubernetes event integration. The processor could look at `involved_object.kind` and, if it is `Pod`, set `kubernetes.pod.name` equal to `involved_object.name`. The same can be done for other resource kinds like `deployment`, `job`, `daemonset`, `replicaset`, `cronjob` and `node`, but this should be evaluated; `pod.name` is the most important one for our dashboards use case.

Another possibility would be to add these extra fields with an ingest pipeline, but I am not sure there is any gain. A third option would be to add the fields at the source (the beats kubernetes module), although this would mean waiting until the next beats release.
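The ingest-pipeline variant of the enrichment described above could be sketched as follows. This is a minimal illustration, not an agreed implementation: the pipeline name is hypothetical, and the field paths are the ones discussed in this issue:

```json
PUT _ingest/pipeline/k8s-event-enrich
{
  "description": "Sketch: copy involved_object.name into kind-specific fields",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": "String kind = ctx.kubernetes?.event?.involved_object?.kind; String name = ctx.kubernetes?.event?.involved_object?.name; if (kind == 'Pod' && name != null) { if (ctx.kubernetes.pod == null) { ctx.kubernetes.pod = [:]; } ctx.kubernetes.pod.name = name; }"
      }
    }
  ]
}
```

Additional kinds (`Deployment`, `ReplicaSet`, etc.) would be further branches mapping `involved_object.name` onto the corresponding `kubernetes.*.name` field.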