logp: don't write to files by default if running in a container environment #236

Merged
merged 5 commits into from
Oct 9, 2024

Conversation

mauri870
Member

@mauri870 mauri870 commented Oct 7, 2024

What does this PR do?

PR #208 introduced a bug that caused the default logger to stop writing to stderr when the -environment container|systemd flag is set. This PR restores the old behavior and adds a test that verifies logs are written to stderr.

Why is it important?

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works

Related issues

@mauri870 mauri870 added bug Something isn't working Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team labels Oct 7, 2024
@mauri870 mauri870 self-assigned this Oct 7, 2024
@mauri870 mauri870 requested a review from a team as a code owner October 7, 2024 14:46
@mauri870 mauri870 requested review from belimawr and rdner and removed request for a team October 7, 2024 14:46
@mauri870 mauri870 force-pushed the fix-output-container-environment branch from ce685aa to 0f7f948 Compare October 7, 2024 14:52
@mauri870 mauri870 changed the title fix: don't write to files by default if running in a container environment logp: don't write to files by default if running in a container environment Oct 7, 2024
logp/config.go Outdated
toFiles := true

// If running in a container environment, don't write to files by default.
if environment == ContainerEnvironment {
Member

The elastic-agent container always logs to files in the file system by default, even if logging to stdout+stderr is disabled. These are the logs that are included in diagnostics from the container.

Let's make sure we don't break agent diagnostics when this merges into elastic-agent.

Member Author

ACK. This is reverting to the previous behavior. When we silently made this change earlier, it didn’t break elastic-agent, nor was it broken before the change. I don’t believe it's affected at all, but I could be wrong.

Member Author

So, looking at the changes in elastic-agent, there are only two occurrences of logp.DefaultConfig, and both use logp.DefaultEnvironment, which always gets ToFiles = true.

Contributor

@belimawr belimawr left a comment

I was thinking a bit more about it, and I believe the problem is not just that we always set toFiles to true; it's also that DefaultConfig sets an output, so when logp.createLogOutput

func createLogOutput(cfg Config, enab zapcore.LevelEnabler) (zapcore.Core, error) {
	switch {
	case cfg.toIODiscard:
		return makeDiscardOutput(cfg, enab)
	case cfg.ToStderr:
		return makeStderrOutput(cfg, enab)
	case cfg.ToSyslog:
		return makeSyslogOutput(cfg, enab)
	case cfg.ToEventLog:
		return makeEventLogOutput(cfg, enab)
	case cfg.ToFiles:
		return makeFileOutput(cfg, enab)
	}
	switch cfg.environment {
	case SystemdEnvironment, ContainerEnvironment:
		return makeStderrOutput(cfg, enab)
	case MacOSServiceEnvironment, WindowsServiceEnvironment:
		return makeFileOutput(cfg, enab)
	default:
		return zapcore.NewNopCore(), nil
	}
}

gets called, the environment is not taken into consideration because an output has already been explicitly chosen.
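
The precedence described above can be reproduced with a small standalone sketch. The types below are simplified stand-ins for the logp ones (only ToStderr, ToFiles and the environment are modelled), and chooseOutput is a hypothetical function that returns a label instead of a zapcore.Core. It is meant only to show why a default of ToFiles = true hides the environment fallback.

```go
package main

import "fmt"

// Environment is a simplified stand-in for the logp environment type.
type Environment int

const (
	DefaultEnvironment Environment = iota
	SystemdEnvironment
	ContainerEnvironment
	MacOSServiceEnvironment
	WindowsServiceEnvironment
)

// Config models only the fields needed to illustrate output selection.
type Config struct {
	ToStderr    bool
	ToFiles     bool
	environment Environment
}

// chooseOutput (hypothetical) follows the same two-stage shape as
// createLogOutput: explicit output flags are checked first, and the
// environment is consulted only when no flag is set.
func chooseOutput(cfg Config) string {
	switch {
	case cfg.ToStderr:
		return "stderr"
	case cfg.ToFiles:
		return "files"
	}
	switch cfg.environment {
	case SystemdEnvironment, ContainerEnvironment:
		return "stderr"
	case MacOSServiceEnvironment, WindowsServiceEnvironment:
		return "files"
	default:
		return "none"
	}
}

func main() {
	// ToFiles defaulted to true: the container environment is never consulted.
	fmt.Println(chooseOutput(Config{ToFiles: true, environment: ContainerEnvironment})) // files
	// No explicit output: the environment fallback selects stderr.
	fmt.Println(chooseOutput(Config{environment: ContainerEnvironment})) // stderr
}
```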

I believe the systemd case (Filebeat running as a service) is also affected. Installing from the deb should trigger this case; I'm not sure whether there is an option to "manually" install Filebeat as a service.

Could you also test the deb/systemd flow to make sure we fixed it in all cases?

@mauri870
Member Author

mauri870 commented Oct 9, 2024

Could you also test the deb/systemd flow to make sure we fixed it in all cases?

Sure. I was able to build a deb package with PACKAGES=deb PLATFORMS=linux/amd64 mage package; I'll spin up a VM to test this case.

@mauri870
Member Author

mauri870 commented Oct 9, 2024

@belimawr From my testing with systemd, filebeat creates /var/log/filebeat/*.ndjson files and journalctl -u filebeat.service does not contain any logs. The proper behavior would be for the logs to go to stderr in the systemd case as well and appear in the journalctl output, right?

@mauri870 mauri870 marked this pull request as draft October 9, 2024 11:40
@elasticmachine
Collaborator

💚 Build Succeeded

History

cc @mauri870

@mauri870
Member Author

mauri870 commented Oct 9, 2024

@belimawr This should be working properly now for both systemd and container environments. I tested the deb package of v8.15.0 and it does not log to syslog by default. I managed to package filebeat (main) with this fix included and it seems to log to syslog properly when running via systemd as well as docker, see attached logs. Please let me know if there are any additional concerns on your end.

root@mauri-ubuntu:/home/azureuser# filebeat version
filebeat version 9.0.0 (amd64), libbeat 9.0.0 [775d26d94a5a1c5a16ed2537d629031e6ef6ee65 built 2024-10-09 12:47:12 +0000 UTC]
root@mauri-ubuntu:/home/azureuser# sudo systemctl status filebeat
● filebeat.service - Filebeat sends log files to Logstash or directly to Elasticsearch.
     Loaded: loaded (/usr/lib/systemd/system/filebeat.service; disabled; preset: enabled)
     Active: active (running) since Wed 2024-10-09 13:02:53 UTC; 1min 8s ago
       Docs: https://www.elastic.co/beats/filebeat
   Main PID: 3602 (filebeat)
      Tasks: 8 (limit: 9459)
     Memory: 22.4M (peak: 22.8M)
        CPU: 84ms
     CGroup: /system.slice/filebeat.service
             └─3602 /usr/share/filebeat/bin/filebeat --environment systemd -c /etc/filebeat/filebeat.yml --path.home /usr/share/filebeat --path.config /etc/filebeat --path.data /var/lib/filebeat --path.logs /var/log/filebeat

Oct 09 13:02:53 mauri-ubuntu filebeat[3602]: {"log.level":"info","@timestamp":"2024-10-09T13:02:53.261Z","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/cfgfile.(*Reloader).Run","file.name":"cfgfile/reload.go","file.l>
Oct 09 13:02:53 mauri-ubuntu filebeat[3602]: {"log.level":"error","@timestamp":"2024-10-09T13:02:53.339Z","log.logger":"add_cloud_metadata","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/processors/add_cloud_metadata>
Oct 09 13:02:53 mauri-ubuntu filebeat[3602]: {"log.level":"error","@timestamp":"2024-10-09T13:02:53.339Z","log.logger":"add_cloud_metadata","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/processors/add_cloud_metadata>
Oct 09 13:02:53 mauri-ubuntu filebeat[3602]: {"log.level":"error","@timestamp":"2024-10-09T13:02:53.339Z","log.logger":"add_cloud_metadata","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/processors/add_cloud_metadata>
Oct 09 13:02:53 mauri-ubuntu filebeat[3602]: {"log.level":"error","@timestamp":"2024-10-09T13:02:53.339Z","log.logger":"add_cloud_metadata","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/processors/add_cloud_metadata>
Oct 09 13:02:53 mauri-ubuntu filebeat[3602]: {"log.level":"error","@timestamp":"2024-10-09T13:02:53.340Z","log.logger":"add_cloud_metadata","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/processors/add_cloud_metadata>
Oct 09 13:02:53 mauri-ubuntu filebeat[3602]: {"log.level":"warn","@timestamp":"2024-10-09T13:02:53.358Z","log.logger":"add_cloud_metadata","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/processors/add_cloud_metadata.>
Oct 09 13:02:53 mauri-ubuntu filebeat[3602]: {"log.level":"info","@timestamp":"2024-10-09T13:02:53.358Z","log.logger":"add_cloud_metadata","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/processors/add_cloud_metadata.>
Oct 09 13:03:23 mauri-ubuntu filebeat[3602]: {"log.level":"info","@timestamp":"2024-10-09T13:03:23.262Z","log.logger":"monitoring","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/monitoring/report/log.(*reporter).logS>
Oct 09 13:03:53 mauri-ubuntu filebeat[3602]: {"log.level":"info","@timestamp":"2024-10-09T13:03:53.262Z","log.logger":"monitoring","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/monitoring/report/log.(*reporter).logS>
root@mauri-ubuntu:/home/azureuser#

@mauri870 mauri870 marked this pull request as ready for review October 9, 2024 13:17
@belimawr
Contributor

belimawr commented Oct 9, 2024

@belimawr From my testing with systemd, filebeat creates /var/log/filebeat/*.ndjson files and journalctl -u filebeat.service does not contain any logs. The proper behavior would be for the logs to go to stderr in the systemd case as well and appear in the journalctl output, right?

Yes, I believe this is the correct behaviour: when running under systemd, the logs go to stderr and can be seen using journalctl.

The best approach to be sure is to test an older version 😅

I also saw your last comment, the behaviour looks correct.

Comment on lines +76 to +78
// For container and systemd environments, we don't write to files by default.
switch environment {
case ContainerEnvironment, SystemdEnvironment:
Contributor

The fact that we have this logic in two different places bugs me... Because the FF for 8.15.3 was Tuesday, I'd get this PR in as it is and backport it if we still have time to make it into the release, then create an issue to revisit it at some point in the future.

I believe that's the correct function to define the default behaviour; createLogOutput should not need the environment switch.
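
As a rough illustration of this suggestion (not the code merged in this PR), the per-environment defaults could be decided in a single place, so that the output selection only ever looks at explicit flags. The types and the defaultConfig function below are simplified, hypothetical stand-ins; in particular, setting ToStderr for container and systemd environments is an assumption made for the sketch.

```go
package main

import "fmt"

// Environment is a simplified stand-in for the logp environment type.
type Environment int

const (
	DefaultEnvironment Environment = iota
	SystemdEnvironment
	ContainerEnvironment
)

// Config models only the output flags relevant to this sketch.
type Config struct {
	ToStderr    bool
	ToFiles     bool
	environment Environment
}

// defaultConfig (hypothetical) picks exactly one default output based on the
// environment, so later code never needs a second environment switch.
func defaultConfig(env Environment) Config {
	cfg := Config{environment: env}
	switch env {
	case ContainerEnvironment, SystemdEnvironment:
		// Assumption for this sketch: container and systemd log to stderr,
		// where the container runtime or journald collects the output.
		cfg.ToStderr = true
	default:
		cfg.ToFiles = true
	}
	return cfg
}

func main() {
	for _, env := range []Environment{DefaultEnvironment, SystemdEnvironment, ContainerEnvironment} {
		cfg := defaultConfig(env)
		fmt.Printf("env=%d ToFiles=%t ToStderr=%t\n", env, cfg.ToFiles, cfg.ToStderr)
	}
}
```

With defaults resolved this way, an output function would only need the explicit-flag switch. Whether that fits every caller of DefaultConfig, such as elastic-agent, would need to be checked in the follow-up issue.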

@mauri870
Member Author

mauri870 commented Oct 9, 2024

I also saw your last comment, the behaviour looks correct.

Thanks, I did test with 8.14.3 and the behavior seems consistent with filebeat main + this fix.

Labels
bug Something isn't working Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team
Development

Successfully merging this pull request may close these issues.

Beats do not log to stderr when running in a container or systemd environment
4 participants