Filestream input: Fingerprint for inode race detection #27278

Closed
urso opened this issue Aug 9, 2021 · 5 comments
Labels
Team:Elastic-Agent Label for the Agent team Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team

Comments

@urso

urso commented Aug 9, 2021

Log collection using the filestream or logs input can sometimes treat an old file as a new file if we are seeing an inode being reused. The clean_removed setting allows us to remove the state from the registry earlier (the filestream input can even detect removal asynchronously), but especially with autodiscovery in place, the input might be shut down before we manage to detect that the file was removed.

In order to better detect the inode reuse race condition, we want the harvester/prospector to add a fingerprint to the file metadata in the registry. The fingerprint would be computed from the first 4KB (configurable). The harvester (prospector) would check the fingerprint after opening the file, in order to verify that the contents match the original file.

#19990 also discusses fingerprinting for identity tracking, but the solution proposed here is meant to be used in conjunction with any other identity tracking we already have in place.

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Aug 9, 2021
@urso urso added the Team:Elastic-Agent Label for the Agent team label Aug 9, 2021
@elasticmachine
Collaborator

Pinging @elastic/agent (Team:Agent)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Aug 9, 2021
@kvch kvch self-assigned this Aug 9, 2021
@kvch kvch removed their assignment Oct 7, 2021
@kvch kvch added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Jan 10, 2022
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@botelastic

botelastic bot commented Jan 10, 2023

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!

@nimarezainia
Contributor

This issue will be addressed by #34419
(@belimawr & @pierrehilbert please confirm and ensure this use case is addressed in sprint 10)

@botelastic botelastic bot removed the Stalled label Mar 28, 2023
@pierrehilbert
Collaborator

I will close this one for now and we will reopen it if it's not fixed by #34419
