Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

confgenerator: move modify filter before the processors #419

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ridwanmsharif
Copy link
Contributor

@ridwanmsharif ridwanmsharif commented Feb 18, 2022

This change moves the modify filter to before the processors.
This is useful so the processors can reference the LogName moving
forward. This will be needed for modify_fields and exclude_logs.

This change also updates the parser component to preserve existing
fields on the record when parsing. We do this so the LogName is not
blown away when the parser is used.

@ridwanmsharif ridwanmsharif requested review from quentinmit, a team and igorpeshansky and removed request for a team February 18, 2022 04:52
@ridwanmsharif ridwanmsharif force-pushed the ridwanmsharif-log-name branch 2 times, most recently from a9d7ef4 to c5ef87e Compare March 1, 2022 15:28
This change moves the modify filter to before the processors.
This is useful so the processors can reference the LogName moving
foward. This will be needed for `modify_fields` and `exclude_logs`.

This change also updates the `parser` component to preserve existing
fields on the record when parsing. We do this so the LogName is not
blown away when the parser is used.

Signed-off-by: Ridwan Sharif <[email protected]>
@ridwanmsharif ridwanmsharif force-pushed the ridwanmsharif-log-name branch from c5ef87e to db6fd78 Compare March 1, 2022 15:30
@ridwanmsharif ridwanmsharif requested review from a team, davidbtucker and quentinmit and removed request for a team March 1, 2022 16:49

// We need to preserve existing fields (like LogName) that are present
// before parsing.
"Reserve_Data": "True",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately this will cause user-visible behavior changes. :( We need to make sure we still reset all the other fields by default. (As part of my modify_fields work I'm going to have to expose this as a boolean for users, since we can't just override it.)

Hm... to preserve only the one key, I'm thinking you'd need to "nest" the whole log entry, parse with Reserve_Data: True, pull out the log name field, and then delete the nested field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, using the alternative approach we'd do the following when using the parse processor:

  • Nest all keys under a temporary field
  • Parse the key that we care about (with "Reserve_Data:" "True")
  • Lift only the log name field from nested temporary field
  • Delete the temporary field (effectively erasing all the other fields that existing in record pre-parsing)

And in the modify_fields work you're working on, we'd just skip all of that and use "Reserve_Data:" "True"

Based on my understanding above I have a few thoughts:

  • Our current documentation doesn't specify what the expected behaviour is with the fields that are not the parse key field (https://cloud.google.com/stackdriver/docs/solutions/agents/ops-agent/configuration#logging-processors). I expected them to actually be preserved by default but turns out it wasn't true (I don't know if this was a conscious decision or a result of using the defaults from Fluent Bit). We might not be able to fix forward but figured I'd leave a note here so it is considered
  • This alternative approach might cause a lot of performance regression (see b/169784211#comment21 for more details about how a single add field and nest operation impacts performance).
  • It seems to me that we could alternatively also ask for a feature request in the Fluent Bit plugin directly where we as for an option like "Reserve_Fields": ["field_1", "field_2",...] and it will only reserve the specified fields. This way we can default to only preserving the LogName (or other fields that the ops agent adds in the future like resource_name in Add resource_name to log entries using modify + nest filter #414).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants