Use fingerprint file identity by default and migrate file state from native or path #41762

Merged: 35 commits, Dec 19, 2024
eabe0f8
[Filebeat/Filestream] Fix `sourceStore.UpdateIdentifiers`
belimawr Nov 21, 2024
a2798fe
Fix tests
belimawr Nov 21, 2024
a4ff07a
Check if source matches the real file
belimawr Nov 22, 2024
3ee0e78
Improve conditions to update registry and comments
belimawr Dec 6, 2024
4bcebe7
Fix exiting tests
belimawr Dec 6, 2024
12ac2f3
Working test
belimawr Dec 6, 2024
57e6129
Run mage check and add all generated files
belimawr Dec 9, 2024
2de77ca
Add unit tests for all common cases
belimawr Dec 9, 2024
817155f
Merge branch 'main' of github.com:elastic/beats into 40197-filestream…
belimawr Dec 9, 2024
c1915a4
Add integration tests
belimawr Dec 10, 2024
6f33fab
Clean up test config
belimawr Dec 10, 2024
9bd1bf6
fix exiting tests
belimawr Dec 10, 2024
937e671
Add test for corner case
belimawr Dec 10, 2024
fd8872a
Update tests to use require function
belimawr Dec 10, 2024
2af67ec
Ensure old entries are removed from the registry
belimawr Dec 10, 2024
4834d43
Merge branch 'main' of github.com:elastic/beats into 40197-filestream…
belimawr Dec 10, 2024
d8404b4
Update docs, changelog and fix lint warnings
belimawr Dec 11, 2024
b4f1f20
Update docs
belimawr Dec 11, 2024
3d6022b
Remove inode marker from tests
belimawr Dec 11, 2024
0cff3cc
Fix lint warnings
belimawr Dec 11, 2024
4e73c1e
Remove inode_marker from tests and small improvements
belimawr Dec 11, 2024
a91a4d4
Merge branch 'main' of github.com:elastic/beats into 40197-filestream…
belimawr Dec 11, 2024
7c8a3ae
Make fingerprint the default file identity
belimawr Dec 12, 2024
0feb3bb
Update old tests to use the old file identity
belimawr Dec 12, 2024
6730cb7
update reference
belimawr Dec 12, 2024
1e92ff2
Merge branch 'main' of github.com:elastic/beats into 40197-filestream…
belimawr Dec 12, 2024
c1693f2
Fix Filestream tests
belimawr Dec 12, 2024
09002a1
Fix filestream integration tests
belimawr Dec 12, 2024
9758447
Fix more tests
belimawr Dec 12, 2024
68c4a64
Fix more tests
belimawr Dec 13, 2024
6feba3f
Merge branch 'main' of github.com:elastic/beats into 40197-filestream…
belimawr Dec 13, 2024
e858f0e
Merge branch 'main' of github.com:elastic/beats into 40197-filestream…
belimawr Dec 16, 2024
8893029
implement review suggestions
belimawr Dec 19, 2024
d516a86
update generated files
belimawr Dec 19, 2024
4859c9a
Merge branch 'main' of github.com:elastic/beats into 40197-filestream…
belimawr Dec 19, 2024
Update docs, changelog and fix lint warnings
belimawr committed Dec 11, 2024

commit d8404b46234f0aaf41f19f38093315f7626b55bf
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
@@ -362,6 +362,7 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]
 - Improve S3 polling mode states registry when using list prefix option. {pull}41869[41869]
 - AWS S3 input registry cleanup for untracked s3 objects. {pull}41694[41694]
 - The environment variable `BEATS_AZURE_EVENTHUB_INPUT_TRACING_ENABLED: true` enables internal logs tracer for the azure-eventhub input. {issue}41931[41931] {pull}41932[41932]
+- The Filestream input now supports changing the file identity from `native` or `path` to `fingerprint` keeping the state (no data re-ingestion) {issue}40197[40197] {pull}41762[41762]

 *Auditbeat*

7 changes: 5 additions & 2 deletions filebeat/docs/inputs/input-filestream-file-options.asciidoc
@@ -547,8 +547,11 @@ limit of harvesters.
 Different `file_identity` methods can be configured to suit the
 environment where you are collecting log messages.

-WARNING: Changing `file_identity` methods between runs may result in
-duplicated events in the output.
+IMPORTANT: Changing `file_identity` is only supported when
+migrating from `native` or `path` to `fingerprint`.
+
+WARNING: Any unsupported change in `file_identity` methods between
+runs may result in duplicated events in the output.

 *`native`*:: The default behaviour of {beatname_uc} is to differentiate
 between files using their inodes and device ids.
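
Note: the supported migration described above (state kept only when moving from `native` or `path` to `fingerprint`, with the old entry removed) can be sketched as follows. This is a minimal illustration and not the code from this PR: the `state` type, the `entryKey` key layout, and the `supported` and `migrate` helpers are all hypothetical names introduced for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// state is a stand-in for a registry entry (hypothetical, not Filebeat's type).
type state struct {
	Offset int64
}

// entryKey builds a registry key. The "filestream::<input>::<identity>::<id>"
// layout is an assumption made for this example.
func entryKey(inputID, identity, fileID string) string {
	return strings.Join([]string{"filestream", inputID, identity, fileID}, "::")
}

// supported reports whether switching identities keeps the state:
// only native -> fingerprint and path -> fingerprint qualify.
func supported(from, to string) bool {
	return to == "fingerprint" && (from == "native" || from == "path")
}

// migrate moves the state stored under the old identity to the new one and
// removes the old entry. It returns false when nothing was migrated.
func migrate(reg map[string]state, inputID, from, fromID, to, toID string) bool {
	if !supported(from, to) {
		return false
	}
	oldKey := entryKey(inputID, from, fromID)
	newKey := entryKey(inputID, to, toID)

	st, found := reg[oldKey]
	if !found {
		return false // nothing recorded under the old identity
	}
	if _, exists := reg[newKey]; exists {
		return false // already tracked under the new identity
	}
	reg[newKey] = st    // keep the offset, so no data is re-ingested
	delete(reg, oldKey) // drop the stale entry
	return true
}

func main() {
	reg := map[string]state{
		entryKey("my-input", "native", "42-7"): {Offset: 1024},
	}
	ok := migrate(reg, "my-input", "native", "42-7", "fingerprint", "ab12cd")
	fmt.Println("migrated:", ok)
	for k, st := range reg {
		fmt.Println(k, st.Offset)
	}
}
```

Any other direction (for example `fingerprint` back to `native`) is rejected in this sketch, which mirrors the warning that unsupported changes may duplicate events.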
11 changes: 8 additions & 3 deletions filebeat/docs/inputs/input-filestream.asciidoc
@@ -86,7 +86,9 @@ multiple input sections:
 [[filestream-file-identity]]
 ==== Reading files on network shares and cloud providers

-WARNING: Filebeat does not support reading from network shares and cloud providers.
+WARNING: Some file identity methods do not support reading from
+network shares and cloud providers. To avoid duplicating events, use
+`fingerprint` when reading from network shares or cloud providers.

 However, one of the limitations of these data sources can be mitigated
 if you configure Filebeat adequately.
@@ -98,8 +100,11 @@ values might change during the lifetime of the file. If this happens
 of the file. To solve this problem you can configure the `file_identity` option. Possible
 values besides the default `inode_deviceid` are `path`, `inode_marker` and `fingerprint`.

-WARNING: Changing `file_identity` methods between runs may result in
-duplicated events in the output.
+IMPORTANT: Changing `file_identity` is only supported when
+migrating from `native` or `path` to `fingerprint`.
+
+WARNING: Any unsupported change in `file_identity` methods between
+runs may result in duplicated events in the output.

 Selecting `path` instructs {beatname_uc} to identify files based on their
 paths. This is a quick way to avoid rereading files if inode and device ids
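
Note: to illustrate why a content-based identity is stable where inode, device id, or path may change, the sketch below hashes a fixed-size prefix of a file and shows that the result survives a rename. It is not Filebeat's implementation; the helper names `hashPrefix` and `fingerprintFile`, the use of SHA-256, and the 1024-byte prefix length are assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// errTooSmall signals that the file does not yet hold enough bytes
// for a stable fingerprint.
var errTooSmall = errors.New("not enough data to fingerprint")

// hashPrefix returns the hex SHA-256 of the first `length` bytes of data.
func hashPrefix(data []byte, length int) (string, error) {
	if len(data) < length {
		return "", errTooSmall
	}
	sum := sha256.Sum256(data[:length])
	return hex.EncodeToString(sum[:]), nil
}

// fingerprintFile reads at most `length` bytes from path and hashes them.
func fingerprintFile(path string, length int) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	buf := make([]byte, length)
	n, err := io.ReadFull(f, buf)
	// A short file is reported through errTooSmall by hashPrefix below.
	if err != nil && !errors.Is(err, io.EOF) && !errors.Is(err, io.ErrUnexpectedEOF) {
		return "", err
	}
	return hashPrefix(buf[:n], length)
}

func main() {
	dir, err := os.MkdirTemp("", "fingerprint-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	oldPath := filepath.Join(dir, "app.log")
	newPath := filepath.Join(dir, "app.log.1")
	content := make([]byte, 2048)
	for i := range content {
		content[i] = byte('a' + i%26)
	}
	if err := os.WriteFile(oldPath, content, 0o600); err != nil {
		panic(err)
	}

	before, err := fingerprintFile(oldPath, 1024)
	if err != nil {
		panic(err)
	}
	// Simulate log rotation: the path changes, the content does not.
	if err := os.Rename(oldPath, newPath); err != nil {
		panic(err)
	}
	after, err := fingerprintFile(newPath, 1024)
	if err != nil {
		panic(err)
	}
	fmt.Println("identity unchanged after rename:", before == after)
}
```

A trade-off of this approach is that files smaller than the prefix length cannot be identified until they grow, and files that share the same leading bytes would collide.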
2 changes: 1 addition & 1 deletion filebeat/input/filestream/prospector_test.go
@@ -108,7 +108,7 @@
 }
 defer f.Close()
 tmpFileName := f.Name()
-fi, err := f.Stat()
+fi, err := f.Stat() // nolint:typecheck // It is used on L151

Check failure on line 111 in filebeat/input/filestream/prospector_test.go
GitHub Actions / lint (linux)
directive `// nolint:typecheck // It is used on L151` should be written without leading space as `//nolint:typecheck // It is used on L151` (nolintlint)

 if err != nil {
 t.Fatalf("cannot stat test file: %v", err)
 }
@@ -194,7 +194,7 @@
 }
 defer f.Close()
 tmpFileName := f.Name()
 fi, err := f.Stat()

Check failure on line 197 in filebeat/input/filestream/prospector_test.go
GitHub Actions / lint (linux)
ineffectual assignment to err (ineffassign)

 fd := loginp.FileDescriptor{
 Filename: tmpFileName,
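
Note: both lint failures above have conventional fixes. The sketch below is a standalone illustration, not the change that was eventually committed: the `statTempFile` helper is hypothetical. It writes the `nolint` directive without a space after the slashes, as nolintlint requests, and reads `err` immediately after the assignment so that ineffassign has nothing to report.

```go
package main

import (
	"fmt"
	"os"
)

// statTempFile creates a temporary file, stats it, and returns its name and
// size. Hypothetical helper used only to show the two lint-friendly patterns.
func statTempFile() (string, int64, error) {
	f, err := os.CreateTemp("", "lint-demo-*")
	if err != nil {
		return "", 0, fmt.Errorf("cannot create test file: %w", err)
	}
	defer os.Remove(f.Name())
	defer f.Close()

	// The directive starts with "//nolint", with no space after the slashes.
	fi, err := f.Stat() //nolint:typecheck // example directive in the expected form
	if err != nil {
		// Checking err here means the assignment above is never ineffectual.
		return "", 0, fmt.Errorf("cannot stat test file: %w", err)
	}
	return f.Name(), fi.Size(), nil
}

func main() {
	name, size, err := statTempFile()
	if err != nil {
		panic(err)
	}
	fmt.Println("stat ok:", name != "", "size:", size)
}
```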
2 changes: 2 additions & 0 deletions x-pack/filebeat/filebeat.reference.yml
@@ -3185,6 +3185,8 @@ filebeat.inputs:
 # batch of events has been published successfully. The default value is 1s.
 #filebeat.registry.flush: 1s

+# The interval at which to run the registry clean up
+#filebeat.registry.cleanup_interval: 5m

 # Starting with Filebeat 7.0, the registry uses a new directory format to store
 # Filebeat state. After you upgrade, Filebeat will automatically migrate a 6.x
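
Note: a periodic cleanup driven by an interval setting such as `filebeat.registry.cleanup_interval` can be sketched with a ticker, as below. This is only an illustration of the pattern and assumes that cleanup means dropping entries whose time-to-live has passed; the `registry` type and its `put`, `sweep`, and `size` methods, as well as `runCleanup`, are hypothetical and do not reflect Filebeat's internals.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// registry maps an entry key to its expiry time (hypothetical structure).
type registry struct {
	mu      sync.Mutex
	expires map[string]time.Time
}

func newRegistry() *registry {
	return &registry{expires: make(map[string]time.Time)}
}

// put records key with a time-to-live counted from now.
func (r *registry) put(key string, ttl time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.expires[key] = time.Now().Add(ttl)
}

// sweep removes every expired entry and returns how many were removed.
func (r *registry) sweep() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	now := time.Now()
	removed := 0
	for key, expiry := range r.expires {
		if now.After(expiry) {
			delete(r.expires, key)
			removed++
		}
	}
	return removed
}

// size returns the number of entries currently stored.
func (r *registry) size() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.expires)
}

// runCleanup sweeps the registry once per interval until ctx is cancelled.
func runCleanup(ctx context.Context, r *registry, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			r.sweep()
		}
	}
}

func main() {
	r := newRegistry()
	r.put("stale", -time.Second) // already expired
	r.put("fresh", time.Hour)

	// A short interval keeps the demo quick; the reference config uses 5m.
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	runCleanup(ctx, r, 10*time.Millisecond)

	fmt.Println("entries left:", r.size())
}
```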