Improved pipeline resource cleanup #298

Closed
rakesh-garimella opened this issue Jul 26, 2023 · 3 comments
Assignees
Labels
area/logs LogPipeline area/manager Manager or module changes area/traces TracePipeline kind/bug Categorizes issue or PR as related to a bug.
Milestone

Comments

@rakesh-garimella
Contributor

rakesh-garimella commented Jul 26, 2023

Description
Currently, we don't have a clear decision on when to restart the logging, tracing, and metrics services in corner cases. See the corner cases below.

Logging

When a LogPipeline uses a secret and that secret is deleted, the reconciliation removes this LogPipeline from the sections ConfigMap and restarts Fluent Bit. Currently, we cannot delete the Fluent Bit DaemonSet, because the current decision logic does not allow deleting the DaemonSet while the LogPipeline is in pending state.

Tracing/metrics

The same scenario applies to trace pipelines: we return early when we don't have any deployable pipelines. Thus we don't update the config, and the collector keeps running with a config that might eventually stop working (as the secret has been deleted).

Same is the case with metrics as well.
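
A minimal sketch of how the set of deployable pipelines could be derived from the secrets that still exist. The `Pipeline` type, the `SecretRefs` field, and the `deployablePipelines` helper are hypothetical simplifications for illustration, not the actual types of the manager.

```go
package main

import "fmt"

// Pipeline is a hypothetical, simplified stand-in for a LogPipeline,
// TracePipeline, or MetricPipeline. It is not the real resource type.
type Pipeline struct {
	Name       string
	SecretRefs []string // names of the secrets the pipeline depends on
}

// deployablePipelines returns the pipelines whose referenced secrets all
// exist. A pipeline with a deleted secret is treated as pending and skipped.
func deployablePipelines(all []Pipeline, existingSecrets map[string]bool) []Pipeline {
	var deployable []Pipeline
	for _, p := range all {
		ok := true
		for _, ref := range p.SecretRefs {
			if !existingSecrets[ref] {
				ok = false
				break
			}
		}
		if ok {
			deployable = append(deployable, p)
		}
	}
	return deployable
}

func main() {
	pipelines := []Pipeline{
		{Name: "with-secret", SecretRefs: []string{"backend-credentials"}},
		{Name: "no-secret"},
	}
	// The secret "backend-credentials" is assumed to be deleted here,
	// so only the pipeline without a secret reference stays deployable.
	existing := map[string]bool{}
	for _, p := range deployablePipelines(pipelines, existing) {
		fmt.Println(p.Name)
	}
}
```

With such a list in hand, an empty result is the signal that no collector or agent should be running at all.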

Expected result

When a LogPipeline/TracePipeline/MetricPipeline is in pending state (it was running but moved to pending because its secret has been deleted), we should not create resources.

Ideally, we should think of something like this:

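if len(deployablePipelines) > 0 {
  // generate the config for every deployable pipeline
  for _, pipeline := range deployablePipelines {
    generateConfig(pipeline)
  }
  deployOrRestartCollector()
} else {
  // nothing is deployable, so remove the collector
  deleteCollector()
}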
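
The decision above can be sketched as a small, testable Go function. This is only an illustration under stated assumptions: `Action`, `reconcileCollector`, and `generateConfig` are hypothetical names, pipelines are reduced to their names, and the real reconcilers would act on Kubernetes resources instead of returning a value.

```go
package main

import "fmt"

// Action is a hypothetical result type describing what the reconciler
// should do with the collector (or Fluent Bit) workload.
type Action string

const (
	ActionDeployOrRestart Action = "deployOrRestart"
	ActionDelete          Action = "delete"
)

// generateConfig is a placeholder for real config generation.
func generateConfig(pipelineName string) string {
	return fmt.Sprintf("# config section for pipeline %q", pipelineName)
}

// reconcileCollector decides between deploying and deleting the collector.
// With no deployable pipeline it asks for deletion instead of returning
// early and leaving a stale workload behind.
func reconcileCollector(deployable []string) (Action, []string) {
	if len(deployable) == 0 {
		return ActionDelete, nil
	}
	configs := make([]string, 0, len(deployable))
	for _, name := range deployable {
		configs = append(configs, generateConfig(name))
	}
	return ActionDeployOrRestart, configs
}

func main() {
	action, configs := reconcileCollector([]string{"pipeline-a"})
	fmt.Println(action, len(configs))

	action, configs = reconcileCollector(nil)
	fmt.Println(action, len(configs))
}
```

Returning the decision as a value keeps the branch logic easy to unit test separately from the code that talks to the cluster.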

@rakesh-garimella rakesh-garimella added area/logging kind/bug Categorizes issue or PR as related to a bug. labels Jul 26, 2023
@kyma-bot kyma-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 24, 2023
@kyma-bot kyma-bot closed this as completed Oct 1, 2023
@a-thaler a-thaler removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 1, 2023
@a-thaler a-thaler reopened this Oct 1, 2023
@a-thaler a-thaler added area/logs LogPipeline area/traces TracePipeline area/manager Manager or module changes and removed area/logging labels Oct 6, 2023
@kyma-bot kyma-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 5, 2023
@a-thaler a-thaler removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 5, 2023
@kyma-project kyma-project deleted a comment from kyma-bot Dec 5, 2023
@kyma-project kyma-project deleted a comment from kyma-bot Dec 5, 2023
@kyma-project kyma-project deleted a comment from kyma-bot Dec 5, 2023
@kyma-project kyma-project deleted a comment from kyma-bot Dec 5, 2023
@a-thaler
Collaborator

a-thaler commented Dec 5, 2023

The following situation occurred on one cluster and needs to be addressed in the same run:

There was a healthy TracePipeline until the referenced secret got deleted. The pipeline fell into pending state, while the collector and config stayed unchanged and continued to work healthily.
A new version rollout of the operator and collector does not get applied in that situation; the collector stays on the old version, which is a serious bug.


github-actions bot commented Feb 4, 2024

This issue has been automatically marked as stale due to the lack of recent activity. It will soon be closed if no further activity occurs.
Thank you for your contributions.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 4, 2024
@a-thaler a-thaler removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 5, 2024

github-actions bot commented Apr 6, 2024

This issue has been automatically marked as stale due to the lack of recent activity. It will soon be closed if no further activity occurs.
Thank you for your contributions.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 6, 2024
@a-thaler a-thaler removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 8, 2024
@a-thaler a-thaler changed the title from "Rethink the implementation of the log/trace/metric configs and restart of respective services" to "Improved pipeline resource cleanup" Jun 3, 2024
@shorim shorim self-assigned this Jun 11, 2024
4 participants