filebeat: ID for filestream is required #30717
31a5f22 to d28345b
Could you add a small explanation of what your proposed change is doing? I suspect this is a tricky part of the code and some more explanation/comments would help make it easier to review.
861fc48 to 4f5bb40
@belimawr could you please backport it to 7.17.X and all versions above?
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
/test
cb397d1 to 2afb636
Co-authored-by: Craig MacKenzie <[email protected]>
Add some general exclude rules and update the linter's Go version to match `.go-version`
3701b54 to 6ff985a
I fixed all lint issues, rebased onto |
@@ -142,7 +148,7 @@ linters-settings:
     # Enable to require an explanation of nonzero length after each nolint directive. Default is false.
     require-explanation: true
     # Enable to require nolint directives to mention the specific linter being suppressed. Default is false.
-    require-specific: true
+    require-specific: false
At least one line was failing in 3+ linters, but they're reported one at a time, so each fix only revealed the next failure. Requiring a specific linter in every nolint directive looks too strict for a repo like Beats.
@@ -114,7 +120,7 @@ linters-settings:

   gosimple:
     # Select the Go version to target. The default is '1.13'.
-    go: "1.17.6"
+    go: "1.17.8"
Now it matches the .go-version file/version.
@rdner I made some small modifications to the lint config to reduce noise, what do you think?
I think we should leave an explanation comment for what we exclude, so people would not have to look it up all the time.
    # it on its name.
    - linters:
        - stylecheck
      text: "ST1003:"
-      text: "ST1003:"
+      text: "ST1003:" # Poorly chosen identifier
I had it at the top, but I agree the yaml structure didn't help:
# Exclude package name contains '-' issue because we have at least one package with
# it on its name.
What about having it like that:
    # Exclude package name contains '-' issue because we have at least one poorly chosen
    # package with '-' in its name.
    - text: "ST1003:"
      linters:
        - stylecheck
I missed that. I thought that comment was not related to the particular exception, but now I see it's a list and the comment is on top of the item.
I committed the changes in 63c0d7a; it should be clearer now.
The original yaml snippet I copied from is pretty confusing, but after taking some time to decipher it, I came up with something I hope is more readable.
CI is failing:
/test
…30717)

When a filestream input does not have an ID, we set the default .global ID at runtime, and this ID is used to identify which files in the registry belong to each input. But if more than one input has no ID, data is duplicated when Filebeat is restarted because it is not possible to identify which input owns each file.

We are adding a way to migrate those entries in the registry to the newly set ID. Ideally this will happen only once, when users update to this new version of Filebeat.

The way we do it is by looking in the registry for files whose path matches the input but whose input ID is .global. Those entries are then copied with the new key (which uses the input ID) so Filebeat can continue its operation without duplicating data.

Most of the logic is the same as in the existing method UpdateIdentifiers from the sourceStore, but for this PR it was also necessary to update the in-memory store right away. Also borrowing from how UpdateIdentifiers is used, this is done during the file prospector initialisation.

Log errors are written when we detect filestream inputs without an ID or when more than one input is set with the same ID.

Some files are refactored to pass the new linter; some test code also had some small refactoring.

Co-authored-by: Craig MacKenzie <[email protected]>

(cherry picked from commit c61b219)

# Conflicts:
#	.golangci.yml
#	dev-tools/templates/.golangci.yml
#	filebeat/docs/inputs/input-filestream-file-options.asciidoc
#	filebeat/input/filestream/internal/input-logfile/manager.go
#	filebeat/input/filestream/parsers_integration_test.go
…stream input without ID (#30717) (#30996)

Co-authored-by: Craig MacKenzie <[email protected]>

(cherry picked from commit c61b219)

# Conflicts:
#	.golangci.yml
#	dev-tools/templates/.golangci.yml
#	filebeat/docs/inputs/input-filestream-file-options.asciidoc
#	filebeat/input/filestream/internal/input-logfile/manager.go
#	filebeat/input/filestream/parsers_integration_test.go

Co-authored-by: Tiago Queiroz <[email protected]>
* filebeat: Migrates registry entries from filestream input without ID (#30717)

Co-authored-by: Craig MacKenzie <[email protected]>

(cherry picked from commit c61b219)
What does this PR do?
It makes configuring an ID mandatory for all filestream inputs. If an ID is added to an existing, working configuration, no data is duplicated.
How does it work
When a filestream input does not have an ID, we set the default `.global` ID at runtime, and this ID is used to identify which files in the registry belong to each input.
Now that we're making the IDs mandatory, we need a way to migrate those entries in the registry to the newly set ID. Ideally this will happen only once, when users update to this new version of Filebeat.
The way we do it is by looking in the registry for files whose path matches the input but whose input ID is `.global`. Those entries are then copied with the new key (which uses the input ID) so Filebeat can continue its operation without duplicating data.
Most of the logic is the same as in the existing method `UpdateIdentifiers` from the `sourceStore`, but for this PR it was also necessary to update the in-memory store right away. Also borrowing from how `UpdateIdentifiers` is used, this is done during the file prospector initialisation.
Why is it important?
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
## Author's Checklist
How to test this PR locally
filestream, do not set an ID for this input
Related issues
## Use cases
## Screenshots
## Logs