[DISCUSSION] deprecate / remove old internal drop stats? #2910

Open
incertum opened this issue Nov 14, 2023 · 9 comments

@incertum
Contributor

@Andreagit97 opening a dedicated issue to discuss falcosecurity/libs#1433 (comment)

I saw that in Falco (and I imagine also in other consumers) we use the get_capture_stats method to obtain the number of drops/events. On the other hand, with stats_v2 we are using an agnostic approach, where the final consumer receives a vector of metrics already populated by sinsp. My question here is, do we want scap_stats_v2 to replace the old scap_stats? If yes, how do we obtain the specific number of drops/events from this agnostic approach? Do we want to keep these specific numbers or the final goal is to expose a set of metrics with a Prometheus endpoint?
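To make the question concrete, here is a minimal sketch of how a consumer could pull specific counters back out of a generic metrics vector by name. The `Metric` shape, the `pick_counters` helper, and the metric names used below are assumptions for illustration only; they are not the actual sinsp or scap_stats_v2 interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One entry of a generic metrics vector (hypothetical shape)."""
    name: str
    value: int


def pick_counters(metrics, wanted):
    """Return {name: value} for the wanted names that are present in the vector."""
    by_name = {m.name: m.value for m in metrics}
    return {name: by_name[name] for name in wanted if name in by_name}


# Hypothetical snapshot; the metric names are assumptions for illustration.
snapshot = [
    Metric("n_evts", 1_000_000),
    Metric("n_drops", 2_500),
    Metric("n_preemptions", 12),
]

counters = pick_counters(snapshot, ["n_evts", "n_drops"])
print(counters)
```

The trade-off of this approach is that the consumer depends on metric names staying stable, whereas a dedicated struct makes that contract explicit.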

First and foremost, we are talking about https://github.com/falcosecurity/falco/blob/master/userspace/falco/event_drops.cpp, aka the "Falco internal: syscall event drop" rule, which we will call the "old drop stats". The alternative is the new metrics Falco option, which can also emit an internal rule, "Falco internal: metrics snapshot", containing not just the drop counters but many other metrics as well.

@leogr I am starting by summarizing a few shortcomings of the old stats from my perspective. At the same time, I want to honor that some adopters prefer to keep the old stats around for longer. Therefore I would be fine with keeping them, but I am also willing to help work out a transition plan.

Cons old drop stats:

  • Old stats report the number of drops within a fixed 1s window, which can be a less transparent or intuitive metric than deltas you define yourself (the new metrics framework snapshots the current drop counts at an interval you choose, and you can then derive deltas over whatever period is most meaningful to you)
  • Can generate an unpredictably high volume of logs
  • By default, only reports drops when the drop percentage is above 10%
  • Enabled by default
  • Not intuitive to customize or even turn off other than adjusting a logging level for Falco rules
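The point about deriving your own deltas can be sketched as follows. This is a minimal illustration, assuming each snapshot is a dict of cumulative counters with hypothetical keys `n_evts` and `n_drops`, and assuming the drop percentage is drops divided by events over the interval; the exact definition Falco uses may differ. The 10% value mirrors the default threshold mentioned above.

```python
THRESHOLD_PCT = 10.0  # default threshold of the old drop stats, per the list above


def drop_delta(prev, curr):
    """Derive per-interval events, drops, and drop percentage from two cumulative snapshots."""
    evts = curr["n_evts"] - prev["n_evts"]
    drops = curr["n_drops"] - prev["n_drops"]
    # Assumed definition: drops relative to events seen in the same interval.
    pct = 100.0 * drops / evts if evts > 0 else 0.0
    return {"evts": evts, "drops": drops, "drop_pct": pct}


# Two hypothetical snapshots taken one metrics interval apart.
prev = {"n_evts": 1000, "n_drops": 10}
curr = {"n_evts": 3000, "n_drops": 310}

delta = drop_delta(prev, curr)
above_threshold = delta["drop_pct"] > THRESHOLD_PCT
print(delta, above_threshold)
```

Because the snapshots are cumulative, the same function works for any pair of snapshots, so the delta window can be one interval, an hour, or the whole run.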

Pros old drop stats:

  • Not bound to a fixed metrics interval to send the first alert of drops occurring
Andreagit97 added this to the TBD milestone Nov 20, 2023
@Andreagit97
Member

Yeah, we need to explore all the usages of these old drop stats in Falco a bit, to understand whether we really need them or whether we can just replace them with the new ones.

@incertum
Contributor Author

incertum commented Jan 3, 2024

When I asked on Slack, no one seemed to urgently need this anymore.

In the last 3+ debugging sessions I have been involved in, we always found the newer metrics feature to provide more actionable insights.

I propose introducing a deprecation warning in Falco 0.37 or Falco 0.38 and then following the formal deprecation cycle.
WDYT @falcosecurity/falco-maintainers?

Besides the pros and cons I listed above, this would make debugging steps easier to communicate and follow, and it would reduce the config surface, making space for new configs that move the needle on Falco's performance and capabilities.

@poiana
Contributor

poiana commented Apr 2, 2024

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@Andreagit97
Member

/remove-lifecycle stale

@leogr
Member

leogr commented Apr 3, 2024

/assign

added to my backlog 👼

Tentatively for
/milestone 0.38.0

poiana modified the milestones: TBD, 0.38.0 Apr 3, 2024
LucaGuerra modified the milestones: 0.38.0, 0.39.0 May 30, 2024
@poiana
Contributor

poiana commented Aug 28, 2024

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@FedeDP
Contributor

FedeDP commented Aug 29, 2024

/remove-lifecycle stale

incertum modified the milestones: 0.39.0, 0.40.0 Aug 31, 2024
@poiana
Contributor

poiana commented Nov 30, 2024

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@FedeDP
Contributor

FedeDP commented Dec 2, 2024

/remove-lifecycle stale

6 participants