[8.0] iptables system test: found 0 hits in logs-iptables.log-ep data stream #2602
A breaking change in the journald input is breaking all integrations that use the input. @kvch I left you a few comments about making the change in a compatible manner. WDYT about adding an automatic translation to keep old configs working? elastic/beats#29294 (comment) There's another bug that will likely break integrations after the config issue is resolved: elastic/beats#30031. The field validation tests should fail because there would be a bunch of unexpected key names.
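For what it's worth, the automatic translation suggested above could be a small shim that rewrites the old flat `include_matches` list into the newer nested shape before the input parses its configuration. Below is a rough sketch with invented type names, assuming the old format was a flat list of `FIELD=value` strings and the new one nests them under a `match` key; the real beats config structs differ.

```go
package main

import "fmt"

// oldConfig and newConfig are invented stand-ins for the 7.x and 8.x journald
// input configuration shapes, used only to illustrate the idea.
type oldConfig struct {
	IncludeMatches []string // e.g. "_SYSTEMD_UNIT=iptables.service"
}

type newConfig struct {
	IncludeMatches struct {
		Match []string
	}
}

// upgrade copies the old flat list into the new nested field so that
// integrations still shipping the 7.x syntax keep working unchanged.
func upgrade(old oldConfig) newConfig {
	var cfg newConfig
	cfg.IncludeMatches.Match = append(cfg.IncludeMatches.Match, old.IncludeMatches...)
	return cfg
}

func main() {
	fmt.Printf("%+v\n", upgrade(oldConfig{IncludeMatches: []string{"_SYSTEMD_UNIT=iptables.service"}}))
}
```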
The config breaking change doesn't seem to be the culprit. Filebeat 8.1.0-SNAPSHOT appears to ignore the filters specified in the old config format and to read all journald logs. If I follow the Filebeat log, I can see it publishing the 3 expected logs, but then Filebeat appears to crash.
It's not possible to observe the stdout of the Filebeat process run by Agent, so I cannot see if there is a panic. I've tested Filebeat standalone with the console output and it does not crash. But it also does not add a cursor to the registry. That seems odd. This is what I see under 8.1.
And this is what I see under 7.17.
I think the problem is related to https://github.com/elastic/beats/pull/29070/files#diff-839f58c64b7063f75769a4d945fca4efb1e7f103cd6217a90c2363490f918dd0L144-R151. @belimawr, can you take a look at Filebeat? I think the change you made lost the translation of journald field names to ECS (…), among other things.
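For context, the translation that seems to have gone missing is essentially a rename table applied to the raw journal entry before the event is published. Here is a rough sketch with an illustrative subset of mappings and invented helper names; the real table in Filebeat is larger and structured differently.

```go
package main

import "fmt"

// journaldToECS is an illustrative subset of the journald-to-ECS rename
// table; the real mapping in Filebeat is larger and lives in the input code.
var journaldToECS = map[string]string{
	"MESSAGE":           "message",
	"PRIORITY":          "syslog.priority",
	"SYSLOG_IDENTIFIER": "syslog.identifier",
	"_SYSTEMD_UNIT":     "systemd.unit",
	"_HOSTNAME":         "host.hostname",
	"_PID":              "process.pid",
}

// translate renames known journal keys; unknown keys pass through unchanged,
// which is why skipping this step produces a pile of unexpected field names.
func translate(raw map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(raw))
	for k, v := range raw {
		if ecs, ok := journaldToECS[k]; ok {
			out[ecs] = v
		} else {
			out[k] = v
		}
	}
	return out
}

func main() {
	fmt.Println(translate(map[string]interface{}{
		"MESSAGE":       "dropped packet",
		"_SYSTEMD_UNIT": "iptables.service",
	}))
}
```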
Yup, journald is also affected: https://beats-ci.elastic.co/blue/organizations/jenkins/Ingest-manager%2Fintegrations/detail/main/67/tests
This data should now be set directly into the event here: https://github.com/belimawr/beats/blob/67b5ba70ed05ac3b6085ef023375dbcb48ad4b69/filebeat/input/journald/input.go#L239-L244 Anyway, I'm looking into it.
You can do this, but you have to log in to the special internal bucket: beats-ci-temp-internal. Anyway, I checked there and there is no panic, or we haven't logged it :)
@mtojek It's not a problem with our CI setup (that's great). It's that Elastic Agent does not direct the stdout/stderr anywhere when it uses Go os/exec, so that output is lost to /dev/null.
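For anyone less familiar with Go's os/exec behaviour, here is a minimal sketch (not the actual Agent code) of why the output disappears: if the caller leaves cmd.Stdout and cmd.Stderr nil, the child process is connected to /dev/null, so even a panic printed to stderr is discarded unless the parent explicitly wires the pipes up. The "filebeat" command is just a placeholder.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

func main() {
	// With Stdout/Stderr left nil, os/exec connects the child's output to
	// /dev/null, so anything it prints (including a Go panic) is lost.
	lost := exec.Command("filebeat", "-e")
	_ = lost.Run()

	// To keep the output, the parent has to capture it explicitly.
	var stderr bytes.Buffer
	kept := exec.Command("filebeat", "-e")
	kept.Stderr = &stderr // captures panics and error logs
	_ = kept.Run()
	fmt.Println(stderr.String())
}
```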
I followed that approach as I couldn't believe the processes were actually dying, but they were. Unfortunately @andrewkroh was right and stderr is dumped to /dev/null. I checked the PIDs of the running filebeat processes and excluded them from my strace trap:
then started the journald test. When it hung, it dumped the content:
I still believe we need a more convenient debugging setup that doesn't require extra tools.
I could not agree more! Putting that fact aside, do you know which version/commit of the beats you were running/debugging? Is it the same code as on …? Thanks for getting the stack trace! Do you know if this test uses a real journald or a journal file?
Diagnostics:
That's the test input:
We did some more debugging today with @belimawr and came to a point:
It looks like the problem is in libsystemd.
Based on the GitHub comments, we can overcome this by installing a newer version of the library. filebeat.yml:
Follow-up: As the libsystemd0 update isn't available on the Ubuntu release used by the base image, I tested with a newer Ubuntu instead.
No problems were observed there :) Here comes the question: should we update the base Docker image for Elastic Agent? It's confirmed now that it contains a broken library.
I am in favor of using a newer systemd version in both the Filebeat and Elastic Agent containers. There was a reported bug that required >=246 to address, so 247 from hirsute would fix the issue.
Maybe we should move to 21.04 (hirsute) until the next LTS release comes out in April 2022 (Jammy Jellyfish).
@elastic/elastic-agent-control-plane Because it is a problem with the Elastic Agent Docker image, could you take a look at it? @mtojek managed to fix it by updating the libsystemd library in the image.
I agree, this falls on our side. @jlind23 FYI.
@elastic/elastic-agent-control-plane OK, this is moving way too slowly in terms of triaging/investigating. I'm going to temporarily disable both failing tests (journald and iptables). Both are panicking:
@mtojek Sorry, but we had a couple of other SDHs to triage before those failing tests. I'll do my best to put someone on it.
Thanks, Julien. It would be great to do an initial triage and prepare a plan for this.
Filed the issue against Ubuntu; hopefully we can get this fixed in the LTS so we can keep using the LTS for the Docker image. I do not think switching to a development release for our images is a good idea. https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1959725
@mtojek This is an experimental feature, and thus it will not be fixed until there is further resolution on the Ubuntu side.
TBH I would keep it open, as we merged a skip-tests PR. This issue will help us remember to re-enable system tests. |
Hi! We just realized that we haven't looked into this issue in a while. We're sorry! We're labeling this issue as Stale.
There might be a workaround by moving the test data into a directory prefixed by |
I raised a PR to allow elastic-package to put test data into |
System tests for the journald input have been disabled due to a segfault. This uses a workaround to avoid that segfault so we can continue testing. Closes elastic#2602. Relates elastic/elastic-package#1236
System tests for the journald input have been disabled due to a segfault. This uses a workaround to avoid that segfault so we can continue testing. While performing that testing I discovered that neither iptables nor journald was aligned with the current ECS definition of the log.syslog.* fields. ECS added numerous log.syslog fields that should be used by journald/iptables instead of syslog.*. And because journald is an input package, this needs to be done without an ingest pipeline so that users with custom pipelines can benefit. Bump the stack version for the iptables integration to get the journald input fixes. Closes #2602. Relates elastic/elastic-package#1236
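To make the field change concrete, here is a hedged sketch of the kind of mapping the input has to perform itself, since an ingest pipeline is not an option for an input package. The function name and the exact subset of fields are illustrative, not the actual Filebeat code.

```go
package main

import "fmt"

// syslogFields builds the ECS log.syslog.* group from journald's raw
// PRIORITY / SYSLOG_* values instead of the legacy top-level syslog.* keys.
// The field choice is illustrative; the real input populates more of the group.
func syslogFields(severityCode, facilityCode int, appname, procID string) map[string]interface{} {
	return map[string]interface{}{
		"log": map[string]interface{}{
			"syslog": map[string]interface{}{
				// RFC 5424 PRI value: facility * 8 + severity.
				"priority": facilityCode*8 + severityCode,
				"facility": map[string]interface{}{"code": facilityCode},
				"severity": map[string]interface{}{"code": severityCode},
				"appname":  appname,
				"procid":   procID,
			},
		},
	}
}

func main() {
	// Example: a kernel iptables log line, facility 0 (kern), severity 4 (warning).
	fmt.Println(syslogFields(4, 0, "kernel", ""))
}
```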
Hi,
while reviewing daily CI jobs I found this flaky problem with iptables:
It looks like the iptables integration doesn't produce any metrics/logs? cc @andrewkroh