Journalbeat has no ability to not process old journal events #17758

Closed
timp87 opened this issue Apr 16, 2020 · 8 comments

timp87 commented Apr 16, 2020

Describe the enhancement:
Journalbeat has no ability to not process journal events older than N ${time_units}.
For example, filebeat has ignore_older option which relies on logfile modification time (general approach for plain text logfiles).
Journalbeat has no similar option even for the same approach. It reads the latest event from any journal file it finds even in "tail" mode.

Describe a specific use case for the enhancement or feature:
We have a k8s cluster and want to see logs in ELK not only from containers, but also from daemons running on the host system.
We reconfigure docker's log-driver option from json-file to journald. Then we deploy a journalbeat daemonset instead of filebeat, of course with /var/log/journal and other valuable dirs mounted from the host system.
Journalbeat pulls and processes all events from the journal, even if they are a year old. Then the events go to logstash.
We don't want our ELK to grow in size uncontrolled, so we split indexes on a per-day basis and keep indexes open for the last week only. Older indexes are closed.
Now logstash tries to write events to the closed indexes. This leads to a file descriptor leak on logstash (long-living known logstash bug^W feature ;)).
I want a way to tell journalbeat to not process journal events older than a week.
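
To make the failure mode above concrete, here is a minimal sketch of how a per-day index layout interacts with old journal events. The `logs-YYYY.MM.DD` index naming, the 7-day boundary handling, and both helper names are assumptions for illustration; they are not taken from the setup described in this issue.

```python
from datetime import datetime, timedelta, timezone

# Assumption: indexes for the last 7 days are open, anything older is closed.
OPEN_WINDOW = timedelta(days=7)


def index_for(event_time: datetime) -> str:
    """Return the (hypothetical) daily index name for an event timestamp."""
    return event_time.astimezone(timezone.utc).strftime("logs-%Y.%m.%d")


def targets_closed_index(event_time: datetime, now: datetime) -> bool:
    """True if the event's UTC day falls outside the open window."""
    event_day = event_time.astimezone(timezone.utc).date()
    today = now.astimezone(timezone.utc).date()
    return today - event_day > OPEN_WINDOW


if __name__ == "__main__":
    now = datetime(2020, 4, 16, 12, 0, tzinfo=timezone.utc)
    year_old = datetime(2019, 4, 16, 8, 30, tzinfo=timezone.utc)
    # A year-old journal entry maps to an index that was closed long ago.
    print(index_for(year_old), targets_closed_index(year_old, now))
```

Every event for which `targets_closed_index` is true is one that the shipper would ideally never send.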

Describe possible workaround
We configure journald to rotate its journal files every day and keep them only for a week. That's it.
So nothing in journalbeat configuration unfortunately.

timp87 changed the title from "Journalbeat has not ability to not process old journal events" to "Journalbeat has no ability to not process old journal events" Apr 16, 2020
andresrc added the [zube]: Inbox and Team:Services (Deprecated) labels Apr 19, 2020

elasticmachine (Collaborator) commented

Pinging @elastic/integrations-services (Team:Services)

andrewkroh (Member) commented

I think it might be possible to seek the cursor based on time by using sd_journal_seek_realtime_usec. If so, then when a new reader starts for the first time (no previous state is found) then it could start reading at the time derived from ignore_older. If ignore_older: 24h was used then it would seek to now() - 24h and then start reading. Any restarts afterwards would resume from the last persisted cursor position.
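
The start-up decision described above can be sketched as follows. This is only an illustration of the logic, not Journalbeat code: `initial_position` and its return shape are hypothetical, and the sketch does not call libsystemd. It only computes the microsecond timestamp that would be handed to `sd_journal_seek_realtime_usec`, which expects microseconds since the Unix epoch.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple, Union

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def initial_position(
    saved_cursor: Optional[str],
    ignore_older: timedelta,
    now: datetime,
) -> Tuple[str, Union[str, int]]:
    """Decide where a (hypothetical) journal reader should start.

    With a persisted cursor, resume from it. Without one (first start),
    seek to now - ignore_older, expressed as microseconds since the epoch.
    """
    if saved_cursor is not None:
        return ("cursor", saved_cursor)
    start = now - ignore_older
    # Integer arithmetic avoids float rounding on microsecond values.
    usec = (start - EPOCH) // timedelta(microseconds=1)
    return ("realtime_usec", usec)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # First start: no saved state, so begin 24h in the past.
    print(initial_position(None, timedelta(hours=24), now))
    # Restart: the saved cursor wins and ignore_older is not consulted.
    print(initial_position("s=abc;i=1", timedelta(hours=24), now))
```

The cursor string in the usage example is a made-up placeholder, not a real journal cursor.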

kvch added the Filebeat and Team:Elastic-Agent-Data-Plane labels Jan 5, 2022

elasticmachine (Collaborator) commented

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

kvch removed the Team:Services (Deprecated) label Jan 5, 2022
jlind23 unassigned kvch Mar 7, 2022

botelastic bot commented Mar 7, 2023

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1.
Thank you for your contribution!

botelastic bot added the Stalled label Mar 7, 2023

timp87 commented Mar 7, 2023

It's relevant

botelastic bot removed the Stalled label Mar 7, 2023

botelastic bot commented Mar 6, 2024

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1.
Thank you for your contribution!

botelastic bot added the Stalled label Mar 6, 2024

timp87 commented Mar 6, 2024

Still need this

botelastic bot removed the Stalled label Mar 6, 2024

andrewkroh (Member) commented

> I want a way to tell journalbeat to not process journal event older than a week.

(journalbeat is no longer around and the functionality is part of Filebeat.)

This is possible. On first start, the following configuration will backfill at most 168h (one week) of data. After that, it will resume from the saved cursor when restarted (because a cursor for all_events will exist in the registry).

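filebeat.inputs:
- type: journald
  id: all_events
  # Resume from the cursor saved in the registry when one exists.
  seek: cursor
  # With no saved cursor (first start), fall back to the "since" position.
  cursor_seek_fallback: since
  # Start at most one week in the past.
  since: -168h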

zube bot removed the [zube]: Done label Jun 6, 2024