
Added enhancements/cluster-logging/content-filter.md #1513

Merged

Conversation

alanconway
Contributor

No description provided.

The openshift-ci bot requested review from jcantrill and periklis on November 3, 2023 15:54.

@openshift-bot

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

The openshift-ci bot added the lifecycle/stale label on Jan 5, 2024.
@alanconway
Contributor Author

/remove-lifecycle stale

The openshift-ci bot removed the lifecycle/stale label on Jan 5, 2024.
alanconway force-pushed the content-filter-enhancement branch from 9959e1c to 249287a on January 5, 2024 22:33.
@alanconway
Contributor Author

@jcantrill @r2d2rnd thanks for the excellent comments, and sorry for the slow response. I think the latest push addresses them all; re-comment if not.

alanconway force-pushed the content-filter-enhancement branch from 249287a to a5f5dbd on January 5, 2024 22:34.

@periklis (Contributor) left a comment

This is a very good starting proposal for enhancing log record control in many constrained production environments. It offers two important tools: dropping entire records, and pruning fields from each record.

After a thorough reading I have a couple of questions as food for thought, in case we intend to leave room for future improvements:

  1. What is the inspiration for naming the feature filters, and accordingly the first two prune (related to fields) and drop (related to entire records)? Is the inspiration how similar features work in the supported collector implementations (i.e. Vector, fluentd)? If so, can you point to them in a related-work section?
  2. Although the proposal aims to give users the means to control log record size and log volume, it is not clear whether we want to go a step further in future, e.g. relabeling records or constructing calculated fields. The latter is explicitly stated as a non-goal, but is this a definitive no, that we will never do it? If not, what can we learn from sibling tools like Prometheus' relabeling-config, which combines dropping/pruning with relabeling in one config block?
  3. Considering filter validation, it seems that each filter, and both together, have a serious impact on the ClusterLogForwarder status field, which must inform the user about broken and/or invalid rules as well as the impact on log records and volume. I believe error reporting deserves a separate section to make the proposal complete. The reader should be told how we report invalid rules/filters (support and docs will have an easier enablement in this regard).
  4. Do we need to provide default filters from the start? This looks like extra baggage that will require special handling once we support more formats than the traditional Viaq model, e.g. OpenTelemetry. My suggestion is to drop defaults until we understand the impact of supporting filters across multiple log record formats.
  5. Speaking of OpenTelemetry, I believe this proposal deserves a section elaborating on the impact on OpenTelemetry-formatted log records. Could we miss cases that render OpenTelemetry log records invalid if a user prunes/drops fields from them? The proposed section should provide a first analysis in that regard.
  6. Can we simplify/condense the drop filter matchers into a form like the Prometheus relabeling-config regex matchers (a sketch of that style follows this list)? Do we need numerical comparison early on? Supporting numerical comparison opens, IMHO, the desire for more than less/greater, e.g. lessOrEqual/greaterOrEqual, modulus, etc. Can we keep the matchers handling only text?
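
For readers unfamiliar with that style, here is a minimal sketch of the Prometheus relabel_config rules referenced in items 2 and 6 (the job, namespace, and label names are invented for illustration):

``` yaml
# prometheus.yml fragment: drop/keep and relabel combined in one config block
scrape_configs:
- job_name: kubernetes-pods
  relabel_configs:
  - source_labels: [__meta_kubernetes_namespace]
    regex: very-important
    action: keep                  # keep only targets from this namespace, drop the rest
  - source_labels: [__meta_kubernetes_pod_label_team]
    regex: (.+)
    target_label: team
    action: replace               # relabeling: copy the pod label into a "team" label
```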

@jcantrill
Copy link
Contributor

  1. What is the inspiration for naming the feature filters, and accordingly the first two prune (related to fields) and drop (related to entire records)? Is the inspiration how similar features work in the supported collector implementations (i.e. Vector, fluentd)? If so, can you point to them in a related-work section?

Mostly this stems from the "fluentd" naming, and we already have an API in place called "filters". I believe that as part of the v2.0 API we should consider renaming to "transforms", which agrees with Vector. I think it is also more open-ended and implies a range of activities: data modification, dropping, etc.

  2. Although the proposal aims to give users the means to control log record size and log volume, it is not clear whether we want to go a step further in future, e.g. relabeling records or constructing calculated fields. The latter is explicitly stated as a non-goal, but is this a definitive no, that we will never do it? If not, what can we learn from sibling tools like Prometheus' relabeling-config, which combines dropping/pruning with relabeling in one config block?

This specific API is related only to reducing log size and dropping records. The filter framework allows us to add more if required but we are refraining from doing so at this time.

  3. Considering filter validation, it seems that each filter, and both together, have a serious impact on the ClusterLogForwarder status field, which must inform the user about broken and/or invalid rules as well as the impact on log records and volume. I believe error reporting deserves a separate section to make the proposal complete. The reader should be told how we report invalid rules/filters (support and docs will have an easier enablement in this regard).
  4. Do we need to provide default filters from the start? This looks like extra baggage that will require special handling once we support more formats than the traditional Viaq model, e.g. OpenTelemetry. My suggestion is to drop defaults until we understand the impact of supporting filters across multiple log record formats.

I can imagine us modifying the internal implementation to make these special filters, which would simplify dropping them in future as needed (e.g. Viaq). I think this is also the correct direction for OpenTelemetry, but this enhancement is strictly for these very specific purposes.

  5. Speaking of OpenTelemetry, I believe this proposal deserves a section elaborating on the impact on OpenTelemetry-formatted log records. Could we miss cases that render OpenTelemetry log records invalid if a user prunes/drops fields from them? The proposed section should provide a first analysis in that regard.
  6. Can we simplify/condense the drop filter matchers into a form like the Prometheus relabeling-config regex matchers? Do we need numerical comparison early on? Supporting numerical comparison opens, IMHO, the desire for more than less/greater, e.g. lessOrEqual/greaterOrEqual, modulus, etc. Can we keep the matchers handling only text?

I would agree. I don't believe we need numerical comparison for now, and I have not seen any requirements for it.

@alanconway
Contributor Author

1. What is the inspiration for naming the feature `filters`

Vector uses 'filter' to drop records and 'transform' to modify them, but then blurs the distinction by allowing a 'transform' to also drop records using "abort".
I decided to call everything a 'filter'; the distinction doesn't seem all that helpful.
E.g. the "drop" and "prune" filters do just that (drop records, prune fields), but the apiAudit filter does both, since the kube-api audit policy is written that way.

2. Although the proposal aims to give users means to control log record size and log volume, it is not clear if we want to go a step further in future, i.e. relabeling records, constructing calculated fields. The latter is explicitly stated as a non-goal, but is this a definitive no that we won't ever do this?

A non-goal only for the current work; it could be extended in future.

If not what can we learn from other sibling tools like Prometheus' relabeling-config that combines dropping/pruning with relabeling in one config block?

The "filter" type is an open ended YAML struct so we have scope to add new filters that do whatever we want.

3. Considering filter validation, it seems that each filter, and both together, have a serious impact on the `ClusterLogForwarder` status field to inform the user about broken and/or invalid rules

I think this fits in the existing scheme of having a list of text error messages associated with each broken input/output/pipeline.
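
Purely as an illustration of that existing scheme (the exact layout and message text below are hypothetical, not part of the proposal), a broken filter referenced from a pipeline might be reported roughly like this:

``` yaml
status:
  pipelines:
    bar:                         # hypothetical pipeline name
    - type: Ready
      status: "False"
      reason: Invalid
      message: 'filter "foo": drop test 1: invalid regular expression: "(unclosed"'
```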

as well as the impact on log records and volume. I believe error reporting deserves a separate section to make the proposal complete. The reader should be informed on how we report invalid rules/filters (i.e. support and docs will have an easier enablement in this regard).

+1 to adding an error handling section.

4. Do we need to provide default filters from the start?

Not necessarily. The one we've had the most intense demand for is filtering by log level, but this isn't so hard to write by hand as a drop filter:

``` yaml
- field: level
  in: [critical, error, warning]
```
5. Speaking of OpenTelemetry, I believe this proposal deserves a section elaborating on the impact on [OpenTelemetry formatted log records](https://opentelemetry.io/docs/specs/otel/protocol/file-exporter/#examples). Could we miss cases that render OpenTelemetry log records invalid if a user prunes/drops fields from them? The proposed section should provide a first analysis in that regard.

Originally the proposal had such a section, but it proved confusing. The filters will work the same, just with different field names. Additional validation or guidelines for OTEL can be added when we have an OTEL implementation.
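
As a purely hypothetical illustration (the field paths below are guesses at OTEL-style names, not part of this proposal), the same prune filter shape could later be written against OpenTelemetry records:

``` yaml
filters:
- name: otel-prune                        # hypothetical name
  type: prune
  prune:
    notIn: [.body, .resource.attributes]  # hypothetical OTEL-style field paths
```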

6. Can we simplify/condense the `drop` filter matchers into a form like the Prometheus [relabeling-config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config) `regex` matchers? Do we need numerical comparison early on? Supporting numerical comparison opens, IMHO, the desire for more than less/greater, e.g. lessOrEqual/greaterOrEqual, modulus, etc. Can we keep the matchers handling only text?

+1 to dropping numerical comparison for the first cut.

alanconway force-pushed the content-filter-enhancement branch from a5f5dbd to d5fd9af on January 15, 2024 19:56.
@alanconway
Contributor Author

@periklis @jcantrill thanks again for more good comments; the proposal is simpler and better for them.
Please re-review and make sure all your concerns are covered.

alanconway force-pushed the content-filter-enhancement branch from d5fd9af to d29f350 on January 15, 2024 19:59.
alanconway force-pushed the content-filter-enhancement branch 2 times, most recently from 4319cb7 to a6a1feb on January 15, 2024 21:56.
@jcantrill
Contributor

/approve


openshift-ci bot commented Jan 15, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jcantrill

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

The openshift-ci bot added the approved label on Jan 15, 2024.
The following review thread refers to this excerpt from the proposal:

```
- No new security issues.
- Extensions to ClusterLogForwarder will be reviewed as part of normal code review.

### Drawbacks
```

A contributor left a comment

Regexp syntax can depend on the collector implementation, e.g. Ruby is case-sensitive, Rust is case-insensitive by default, etc.
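
One engine-independent way around such differences is an inline flag in the pattern itself; `(?i)` is understood by both Ruby's and Rust's regex engines (the field and pattern below are illustrative only):

``` yaml
- field: .level
  matches: "(?i)(warning|error|critical)"  # case-insensitive regardless of the collector's default
```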

@periklis (Contributor) left a comment

Just small formatting suggestions for the examples.

Comment on lines 92 to 107

``` yaml
spec:
  filters:
  - name: foo
    type: prune
    prune:
      notIn: [.message, .kubernetes]  # Keep only the message and kubernetes fields.
      in: [.kubernetes.flat_labels]   # Prune the kubernetes.flat_labels sub-field.
  pipelines:
  - name: bar
    filterRefs: ["foo"]
```

Suggested change: formatting only.
Comment on lines 112 to 127

``` yaml
spec:
  filters:
  - name: # Provided by the user
    type: drop
    drop:
    - test:
      - field: # JSON path to the field
        # Requires exactly one of the following conditions.
        matches:    # regular expression match
        notMatches: # regular expression does not match
```

Suggested change: formatting only.
Comment on lines 135 to 153

``` yaml
filters:
- name: important
  type: drop
  drop:
  - tests:
    - field: .kubernetes.namespace_name
      notMatches: "very-important"    # Keep everything from this namespace.
    - field: .level                   # Keep important levels
      matches: "warning|error|critical"
```

Suggested change: formatting only.
@periklis
Contributor

3. Considering filter validation, it seems that each filter, and both together, have a serious impact on the ClusterLogForwarder status field, which must inform the user about broken and/or invalid rules as well as the impact on log records and volume. I believe error reporting deserves a separate section to make the proposal complete. The reader should be told how we report invalid rules/filters (support and docs will have an easier enablement in this regard).

@alanconway ☝️ Regarding that, do you intend to provide examples of how the status field will look for error reporting?

``` yaml
- name: important
  type: drop
  drop:
  - tests:
```

@vparfonov (Contributor) commented on Jan 18, 2024

Is "tests" good name here, what about something like: "rules"

@dhellmann
Contributor

#1555 is changing the enhancement template in a way that will cause the header check in the linter job to fail for existing PRs. If this PR is merged within the development period for 4.16 you may override the linter if the only failures are caused by issues with the headers (please make sure the markdown formatting is correct). If this PR is not merged before 4.16 development closes, please update the enhancement to conform to the new template.

alanconway force-pushed the content-filter-enhancement branch 2 times, most recently from 746fa58 to 9084b95 on February 16, 2024 18:35.
alanconway force-pushed the content-filter-enhancement branch from 9084b95 to 3409e86 on February 16, 2024 18:54.

openshift-ci bot commented Feb 16, 2024

@alanconway: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@jcantrill (Contributor) left a comment

/lgtm

The openshift-ci bot added the lgtm label on Feb 16, 2024.
The openshift-merge-bot merged commit e7c9a21 into openshift:master on Feb 16, 2024.
2 checks passed