Initialize filtering functionality #10985
	"github.com/coreos/go-systemd/sdjournal"
)

type MatcherConfig struct {
exported type MatcherConfig should have comment or be unexported
c8df57b
to
2977a57
Compare
Is there a notable performance benefit in building journald filters instead of matching and filtering out the events/logs ourselves? Anyways, OR-combining same fields makes sense. One can always apply some post filtering via processors. One could also put it in config like this:
I think this syntax makes somewhat clear that I'm not sure I like the extra settings like kernel, unit, identifier. These can be expressed using the matchers as well (they are even compiled into matchers), yet I wonder if it's fully clear how these settings affect the actual configured matchers.
Can we split the backoff fix into another PR? So the fix doesn't get lost.
Separate PR for backoff: #11861
2977a57
to
6a65514
Compare
Any update?
There's a big difference from what I observed so I think the feature is necessary. I was working on deploying Journalbeat to collect a few select sources (two systemd units and kernel logs). The initial deployment was taking a very long time and consuming plenty of CPU since it was going through the backlog of all logs in the entire journal. Another option I considered was to create one input for each source so that I could use
On Filtering
It looks like we can construct filters with ANDs and ORs with a limitation on the number of levels. From: https://www.freedesktop.org/software/systemd/man/sd_journal_add_conjunction.html
There is an example in the test cases at https://github.com/systemd/systemd/blob/e7d5fe17db9d046b364f933c4dcdd145f47b024a/src/journal/test-journal-match.c#L59. I think that example equates to something like this in our typical YAML conditional format:
- and:
  - or:
    - and:
      - or:
        - QUUX: mmmm
        - QUUX: xxxxx
        - QUUX: yyyyy
      - or:
        - HALLO: ""
        - HALLO: WALDO
      - PIFF: paff
    - and:
      - or:
        - ONE: one
        - ONE: two
      - TWO: two
  - or:
    - and:
      - or:
        - L4_1: "yes"
        - L4_1: ok
      - or:
        - L4_2: "yes"
        - L4_2: ok
    - and:
      - or:
        - L3: "yes"
        - L3: ok
But I'm not immediately sure how to convert that back into a series of
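To make the nesting levels discussed above concrete, here is a small pure-Go model of how a sequence of add-match, add-disjunction and add-conjunction calls groups into an expression. It is only a sketch based on my reading of the man page and the linked test case: `matchBuilder` and its methods are hypothetical names, nothing here calls libsystemd, and terms are rendered in insertion order, which is not necessarily the order libsystemd prints them in.

```go
package main

import (
	"fmt"
	"strings"
)

// orTerm is one AND-group: matches on different fields are ANDed,
// matches on the same field are ORed.
type orTerm struct {
	fields  []string            // field names in first-seen order
	byField map[string][]string // field name -> matches on that field
}

// matchBuilder is a hypothetical model, not the sdjournal API.
// andTerms[i] is a list of OR-ed terms; the andTerms themselves are ANDed.
type matchBuilder struct {
	andTerms      [][]*orTerm
	newAnd, newOr bool
}

func newMatchBuilder() *matchBuilder { return &matchBuilder{newAnd: true} }

// AddDisjunction models sd_journal_add_disjunction: the next match starts a new OR term.
func (b *matchBuilder) AddDisjunction() { b.newOr = true }

// AddConjunction models sd_journal_add_conjunction: the next match starts a new AND term.
func (b *matchBuilder) AddConjunction() { b.newAnd = true }

// AddMatch models sd_journal_add_match for a "FIELD=value" string.
func (b *matchBuilder) AddMatch(m string) error {
	i := strings.IndexByte(m, '=')
	if i <= 0 {
		return fmt.Errorf("invalid match %q: expected FIELD=value", m)
	}
	if b.newAnd {
		b.andTerms = append(b.andTerms, nil)
		b.newAnd, b.newOr = false, true
	}
	last := len(b.andTerms) - 1
	if b.newOr {
		b.andTerms[last] = append(b.andTerms[last], &orTerm{byField: map[string][]string{}})
		b.newOr = false
	}
	t := b.andTerms[last][len(b.andTerms[last])-1]
	field := m[:i]
	if _, seen := t.byField[field]; !seen {
		t.fields = append(t.fields, field)
	}
	t.byField[field] = append(t.byField[field], m)
	return nil
}

// join wraps the parts in parentheses only when there is more than one.
func join(parts []string, op string) string {
	if len(parts) <= 1 {
		return strings.Join(parts, op)
	}
	return "(" + strings.Join(parts, op) + ")"
}

// String renders the four levels: AND of OR of AND of same-field OR.
func (b *matchBuilder) String() string {
	var ands []string
	for _, ors := range b.andTerms {
		var orParts []string
		for _, t := range ors {
			var groups []string
			for _, f := range t.fields {
				groups = append(groups, join(t.byField[f], " OR "))
			}
			orParts = append(orParts, join(groups, " AND "))
		}
		ands = append(ands, join(orParts, " OR "))
	}
	return join(ands, " AND ")
}

func main() {
	b := newMatchBuilder()
	for _, m := range []string{"ONE=one", "ONE=two", "TWO=two"} {
		_ = b.AddMatch(m)
	}
	b.AddDisjunction()
	_ = b.AddMatch("PIFF=paff")
	b.AddConjunction()
	_ = b.AddMatch("L3=ok")
	_ = b.AddMatch("L3=yes")
	fmt.Println(b)
}
```

Under this model there is no way to nest deeper than these four levels, which is the limitation the YAML form above runs into when translating arbitrary and/or trees.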
Did you use
Yep, this was my concern/question. The API is not easy to use. And it looks like the "language" only covers a subset of the logical expressions. The man page even states:
Also, the kinds of filters one can apply are limited. E.g. one cannot use regexes.
Yes, the POC already supports this for journald and windows event logs by configuring |
I started with a Beats drop_event processor with a conditional. This wasn't acceptable from a performance perspective (high latency and lots of CPU usage). It probably would be more performant if we could move our filtering earlier, but using the built-in journal match is the optimal way IMO. After rolling Journalbeat out to several machines for several services at different times I think the following combination of features would be ideal:
Is it OK if I open issues for (1) and (2) to further describe the features and use cases? (edit: I opened a PR for (1) because I personally could use this now.)
This pull request is now in conflicts. Could you fix it? 🙏
@kvch What should we do with this PR? It was stale for quite some time.
This PR should be adapted in the journald input. This filtering mode is a must-have: it mostly mimics the community Journalbeat and also adds more options for users to define their own journal filters. Systemd can do some prefiltering for us, so Filebeat/Journalbeat has to process fewer events. I suggest adding this to the Filebeat input before we enable it. The main problem with this PR was that it was a breaking change in Journalbeat.
Are you sure the number of levels is limited? I don't get that from the man page.
I think the translation would be to intercalate a sd_journal_add_disjunction between each member of an or list, and add_conjunction for an or list:
And I think matches in a leaf
That syntax could remain backward compatible, I think. If we consider an implicit
include_matches:
  - unit=a                   # sd_journal_add_match
  - unit=b                   # sd_journal_add_match
  - unit=c                   # sd_journal_add_match
  - severity=1               # sd_journal_add_match
  - severity=2               # sd_journal_add_match
  - or:                      # sd_journal_add_conjunction
    - transport=stuff        # sd_journal_add_match
                             # sd_journal_add_disjunction
    - something=the_thing    # sd_journal_add_match
The problem is when there is a match with the same key inside an
	Matches MatcherConfig
	// Units stores the units to monitor
	Units []string
	// Kernel stores whether kernel messages should be monitored
	Kernel bool
	// Identifiers stores the syslog identifiers to watch
	Identifiers []string
Should the mentioned syntax be used, the go code will not need to know about units, kernels, or identifiers.
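As a rough illustration of that point, the sketch below shows how the convenience settings could be lowered into plain FIELD=value matches before anything else looks at them. The type `filterSettings` and the function `toMatches` are hypothetical, and the field mapping (`_SYSTEMD_UNIT`, `_TRANSPORT=kernel`, `SYSLOG_IDENTIFIER`) is my assumption based on the standard journal field names, not necessarily what this PR compiles them into.

```go
package main

import "fmt"

// filterSettings is a hypothetical stand-in for the config fields in the diff above.
type filterSettings struct {
	Units       []string // systemd units to monitor
	Kernel      bool     // whether kernel messages should be monitored
	Identifiers []string // syslog identifiers to watch
	Matches     []string // user-supplied FIELD=value matches
}

// toMatches lowers the convenience settings into plain matches.
// The field names used here are assumptions for illustration.
func toMatches(s filterSettings) []string {
	var out []string
	for _, u := range s.Units {
		out = append(out, "_SYSTEMD_UNIT="+u)
	}
	if s.Kernel {
		out = append(out, "_TRANSPORT=kernel")
	}
	for _, id := range s.Identifiers {
		out = append(out, "SYSLOG_IDENTIFIER="+id)
	}
	// User-supplied matches are passed through unchanged.
	return append(out, s.Matches...)
}

func main() {
	s := filterSettings{
		Units:       []string{"docker.service", "sshd.service"},
		Kernel:      true,
		Identifiers: []string{"sudo"},
	}
	for _, m := range toMatches(s) {
		fmt.Println(m)
	}
}
```

Note that a flat list like this would AND the unit, transport and identifier fields together, so a real implementation would presumably need disjunctions between the groups; that is the part that is hard to see from the settings alone.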
This pull request does not have a backport label. Could you fix it @kvch? 🙏
NOTE:
Closing in favour of #29294
This PR initializes support for the similar filters provided by the community Journalbeat. It does not yet have the complete functionality as wildcards for unit names are missing.
But I am opening this PR to discuss the interface of include_matches. Previously include_matches expected a list of filter expressions. Now more advanced filtering is added, but it requires a different way of configuration.
Also, the way matches are applied to a systemd journal is a bit counterintuitive. If matcher expressions are applied to the same field, they are connected with OR. In case of different field names, the expressions are connected with AND. Thus, it does not let you define full logical expressions as our processor filtering does.
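The grouping rule can be sketched in a few lines of Go. This is only a model of the semantics described above, not code from this PR: `describeMatches` is a hypothetical helper, and the field names `_COMM` and `_TRANSPORT` are used purely as example journal fields.

```go
package main

import (
	"fmt"
	"strings"
)

// describeMatches renders the expression a flat list of FIELD=value
// matches evaluates to: matches on the same field are ORed, and the
// resulting per-field groups are ANDed.
func describeMatches(matches []string) (string, error) {
	byField := map[string][]string{}
	var order []string // field names in first-seen order
	for _, m := range matches {
		i := strings.IndexByte(m, '=')
		if i <= 0 {
			return "", fmt.Errorf("invalid match %q: expected FIELD=value", m)
		}
		field := m[:i]
		if _, seen := byField[field]; !seen {
			order = append(order, field)
		}
		byField[field] = append(byField[field], m)
	}
	groups := make([]string, 0, len(order))
	for _, field := range order {
		groups = append(groups, "("+strings.Join(byField[field], " OR ")+")")
	}
	return strings.Join(groups, " AND "), nil
}

func main() {
	expr, err := describeMatches([]string{
		"_COMM=chromium",
		"_COMM=dockerd",
		"_TRANSPORT=audit",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(expr)
}
```

With a flat list like this, the audit match narrows the two process matches instead of adding a third alternative, which is the counterintuitive part.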
The following configuration matches entries where process.name is either chromium or dockerd:
If we would like to get entries of chromium, dockerd and audit messages, the following configuration is required.
What I would like to discuss is whether the configuration above is OK. Another alternative which came to my mind is this:
But in my opinion the one I have implemented is simpler. WDYT?
Note that most of the filtering use cases are covered by unit, kernel and identifiers.
TODO
Depends on #10982