Cherry-pick #18748 to 7.x: Add initial support for configurable file identity tracking #19885

Merged (1 commit) on Jul 14, 2020

@kvch commented Jul 14, 2020

Cherry-pick of PR #18748 to 7.x branch. Original message:

What does this PR do?

This PR adds a new option to the log input of Filebeat named file_identity. The option lets users configure file identity for state tracking.

Available strategies

  1. native (default): Filebeat identifies files based on their inode and device ID.
  2. path: Files are considered different if they have different paths.
  3. inode_marker: A special marker file and the inode are used to tell files apart. This strategy is not supported on Windows.
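
To make the difference between the strategies concrete, here is a minimal sketch of how each one could derive an ID. The fileInfo type, the function names, and the ID string formats are assumptions made for illustration; they are not taken from the Filebeat code base, and the real formats may differ.

```go
package main

import "fmt"

// fileInfo is a hypothetical, simplified stand-in for the data Filebeat
// tracks about a file. It is not the real file.State type.
type fileInfo struct {
	Path   string
	Inode  uint64
	Device uint64
}

// nativeID follows the idea of the native strategy: identity is tied to
// the inode and the device ID, so a rename keeps the same ID.
func nativeID(f fileInfo) string {
	return fmt.Sprintf("native::%d-%d", f.Inode, f.Device)
}

// pathID follows the idea of the path strategy: identity is the path,
// so a rename produces a different ID.
func pathID(f fileInfo) string {
	return "path::" + f.Path
}

// inodeMarkerID follows the idea of the inode_marker strategy: the
// contents of a marker file replace the device ID, so the ID stays the
// same even if the device ID changes.
func inodeMarkerID(f fileInfo, marker string) string {
	return fmt.Sprintf("inode_marker::%d-%s", f.Inode, marker)
}

func main() {
	before := fileInfo{Path: "/var/log/app.log", Inode: 42, Device: 7}
	renamed := fileInfo{Path: "/var/log/app.log.1", Inode: 42, Device: 7}
	remounted := fileInfo{Path: "/var/log/app.log", Inode: 42, Device: 9}

	fmt.Println(nativeID(before) == nativeID(renamed))   // same file after a rename
	fmt.Println(pathID(before) == pathID(renamed))       // different file after a rename
	fmt.Println(nativeID(before) == nativeID(remounted)) // device ID changed
	fmt.Println(inodeMarkerID(before, "m1") == inodeMarkerID(remounted, "m1"))
}
```

In this sketch, native survives a rename but not a device ID change, path does the opposite, and inode_marker survives a device ID change as long as the marker contents stay the same.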

Previously, state IDs were not saved to the registry file. They are now persisted to disk.
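
As a rough illustration of what persisting the ID means, the sketch below serializes a state to JSON and reads it back, so the ID survives a restart instead of being recomputed. The persistedState type, its field names, and the JSON keys are hypothetical; the actual registry layout is not described in this PR message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// persistedState is a hypothetical registry entry. The field names and
// JSON keys are assumptions for this sketch, not Filebeat's real schema.
type persistedState struct {
	ID             string `json:"id"`
	IdentifierName string `json:"identifier_name"`
	Source         string `json:"source"`
	Offset         int64  `json:"offset"`
}

// roundTrip encodes a state as it might be written to disk and decodes
// it again, as would happen after a restart.
func roundTrip(s persistedState) (persistedState, error) {
	raw, err := json.Marshal(s)
	if err != nil {
		return persistedState{}, err
	}
	var out persistedState
	if err := json.Unmarshal(raw, &out); err != nil {
		return persistedState{}, err
	}
	return out, nil
}

func main() {
	in := persistedState{
		ID:             "native::42-7",
		IdentifierName: "native",
		Source:         "/var/log/app.log",
		Offset:         1024,
	}
	out, err := roundTrip(in)
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println(out.ID == in.ID, out.IdentifierName)
}
```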

Architecture

I introduced a new interface: file.StateIdentifier. A StateIdentifier generates an identifier for a file.State based on the configuration. To add a custom identification strategy, create a struct that satisfies this interface.

// StateIdentifier generates an ID for a State.
type StateIdentifier interface {
	// GenerateID generates and returns the ID of the state
	GenerateID(State) (stateId, identifierType string)
}
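
A custom strategy could look roughly like the sketch below. The State type here is a simplified stand-in with an assumed Source field, and baseNameIdentifier and its "basename" type name are hypothetical; only the StateIdentifier interface itself comes from this PR.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// State is a simplified stand-in for file.State. The Source field
// (the file's path) is an assumption made for this sketch.
type State struct {
	Source string
}

// StateIdentifier generates an ID for a State.
type StateIdentifier interface {
	// GenerateID generates and returns the ID of the state
	GenerateID(State) (stateId, identifierType string)
}

// baseNameIdentifier is a hypothetical strategy that identifies a file
// by its base name only, ignoring the directory it lives in.
type baseNameIdentifier struct{}

// GenerateID returns "<type>::<base name>" together with the type name.
func (baseNameIdentifier) GenerateID(s State) (stateId, identifierType string) {
	const name = "basename"
	return name + "::" + filepath.Base(s.Source), name
}

// Compile-time check that the struct satisfies the interface.
var _ StateIdentifier = baseNameIdentifier{}

func main() {
	var ident StateIdentifier = baseNameIdentifier{}
	id, kind := ident.GenerateID(State{Source: "/var/log/app.log"})
	fmt.Println(id, kind)
}
```

Returning the identifier type alongside the ID makes it possible to tell which strategy produced a stored ID.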

Because every state has an ID, Filebeat only needs to compare the IDs of two states to decide whether they belong to the same file.
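
The comparison can be sketched as follows. The State type, its ID field, and the sameFile and findState helpers are hypothetical names for illustration, and the ID strings are made-up examples.

```go
package main

import "fmt"

// State is a simplified stand-in for file.State. The field names are
// assumptions made for this sketch.
type State struct {
	ID     string
	Source string
	Offset int64
}

// sameFile reports whether two states belong to the same file.
// Only the IDs are compared; paths and offsets are ignored.
func sameFile(a, b State) bool {
	return a.ID == b.ID
}

// findState returns the index of the stored state that belongs to the
// same file as s, or -1 if there is none.
func findState(stored []State, s State) int {
	for i := range stored {
		if sameFile(stored[i], s) {
			return i
		}
	}
	return -1
}

func main() {
	stored := []State{
		{ID: "native::42-7", Source: "/var/log/app.log", Offset: 1024},
		{ID: "native::43-7", Source: "/var/log/db.log", Offset: 10},
	}

	// The file was renamed, but its ID is unchanged, so the old state is found.
	renamed := State{ID: "native::42-7", Source: "/var/log/app.log.1"}
	fmt.Println(findState(stored, renamed))

	// An unknown ID means a new file.
	fmt.Println(findState(stored, State{ID: "native::99-7"}))
}
```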

Strategies that fingerprint the contents of a file are outside the scope of this PR.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made the corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

@kvch added the [zube]: In Review, backport, and Team:Services (Deprecated) labels Jul 14, 2020
@botelastic (bot) added the needs_team label Jul 14, 2020
@elasticmachine
Pinging @elastic/integrations-services (Team:Services)

@botelastic (bot) removed the needs_team label Jul 14, 2020
…18748)

(cherry picked from commit 8ff6894)
@kvch force-pushed the backport_18748_7.x branch from 05983de to a752e7b Jul 14, 2020 12:18
@kvch merged commit 94a7270 into elastic:7.x Jul 14, 2020