winlogbeat/eventlog: ensure event loggers restart metric collection when handling recoverable errors #36482

Closed
wants to merge 1 commit into from

@efd6 efd6 commented Sep 1, 2023

Proposed commit message

When winlogbeat's event loggers encounter recoverable errors, they handle them by closing and reopening the channel. This causes the beat and the dependent winlog filebeat input to lose metric collection, because metric registration only occurs on configuration. So move metric registration to the channel open method. To avoid re-registering metrics, use the nilness of the metric field as a guard.
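The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `eventLog`, `inputMetrics`, `registry` and their methods are hypothetical names, and the sketch assumes that closing the channel drops the registered metrics.

```go
package main

import "fmt"

// inputMetrics is a hypothetical stand-in for the metrics an event logger reports.
type inputMetrics struct {
	events int
}

// registry is a hypothetical metrics registry that counts registrations
// so the effect of the nil guard is visible.
type registry struct {
	registrations int
}

// register creates and returns a fresh metric set.
func (r *registry) register() *inputMetrics {
	r.registrations++
	return &inputMetrics{}
}

// eventLog is a hypothetical event logger holding channel state and metrics.
type eventLog struct {
	reg     *registry
	metrics *inputMetrics
	isOpen  bool
}

// Open opens the channel and registers metrics only if none are registered.
func (l *eventLog) Open() {
	if l.metrics == nil {
		l.metrics = l.reg.register()
	}
	l.isOpen = true
}

// Close closes the channel and drops the metrics (assumed behaviour).
func (l *eventLog) Close() {
	l.isOpen = false
	l.metrics = nil
}

func main() {
	l := &eventLog{reg: &registry{}}
	l.Open()
	l.Open() // Guarded by the nil check: no second registration.
	l.metrics.events++

	// Simulate a recoverable error: close and reopen the channel.
	l.Close()
	l.Open()
	fmt.Println(l.metrics != nil, l.reg.registrations) // prints "true 2"
}
```

In this sketch the metric set created after the reopen starts from zero, since `Close` discards the previous one.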

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

@efd6 efd6 self-assigned this Sep 1, 2023
@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Sep 1, 2023
@efd6 efd6 force-pushed the 36479-winlog_winlogbeat branch from 4ec6e57 to 17320f3 Compare September 1, 2023 02:01
@elasticmachine

💚 Build Succeeded

Build stats

  • Start Time: 2023-09-01T02:02:07.112+0000

  • Duration: 40 min 10 sec

Test stats 🧪

Test Results
Failed 0
Passed 936
Skipped 9
Total 945

💚 Flaky test report

Tests succeeded.

@efd6 efd6 marked this pull request as ready for review September 1, 2023 03:18
@efd6 efd6 requested a review from a team as a code owner September 1, 2023 03:18
@elasticmachine

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

andrewkroh
andrewkroh previously approved these changes Sep 1, 2023

@andrewkroh andrewkroh left a comment

This will make it better in that we will have metrics after a recovery. One concern I have is that we lose the cumulative metric state for the input after any major error. If you rely on the metrics alone (without looking at logged errors) then I think you would miss that there had been some issue (albeit recovered).

@efd6

efd6 commented Sep 1, 2023

The alternative that I considered was to add a Reset method, which would essentially be a Close/Open pair without the metrics reset. I did not do this because it would likely have been more invasive. The issue is that we are forced by OS design to conflate Close with reset. I'm happy to add the Reset approach if you would prefer.
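A rough sketch of what such a Reset method might look like, for comparison. All names here (`eventLog`, `inputMetrics`, `openChannel`, `closeChannel`, the `opens` counter) are hypothetical, and the sketch assumes that `Close` is what discards the metrics; it is not taken from either PR.

```go
package main

import "fmt"

// inputMetrics is a hypothetical stand-in for cumulative input metrics.
type inputMetrics struct {
	events int
}

// eventLog is a hypothetical event logger; opens counts channel opens.
type eventLog struct {
	metrics *inputMetrics
	isOpen  bool
	opens   int
}

// openChannel and closeChannel stand in for the OS-level channel handling.
func (l *eventLog) openChannel()  { l.isOpen = true; l.opens++ }
func (l *eventLog) closeChannel() { l.isOpen = false }

// Open sets up metrics if needed and opens the channel.
func (l *eventLog) Open() {
	if l.metrics == nil {
		l.metrics = &inputMetrics{}
	}
	l.openChannel()
}

// Close closes the channel and discards the metrics (assumed behaviour).
func (l *eventLog) Close() {
	l.closeChannel()
	l.metrics = nil
}

// Reset reopens the channel but leaves the metrics untouched,
// so cumulative counts survive a recoverable error.
func (l *eventLog) Reset() {
	l.closeChannel()
	l.openChannel()
}

func main() {
	l := &eventLog{}
	l.Open()
	l.metrics.events = 3

	// Simulate recovery from a recoverable error.
	l.Reset()
	fmt.Println(l.metrics.events, l.opens) // prints "3 2"
}
```

The trade-off in this sketch is an extra method on the logger in exchange for keeping cumulative metric state across recoveries.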

@efd6

efd6 commented Sep 1, 2023

Counter proposal is here.

@andrewkroh andrewkroh dismissed their stale review September 1, 2023 13:00

Shifting my vote toward #36483.

@mergify

mergify bot commented Sep 5, 2023

This pull request now has conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b 36479-winlog_winlogbeat upstream/36479-winlog_winlogbeat
git merge upstream/main
git push upstream 36479-winlog_winlogbeat

@efd6 efd6 closed this in #36483 Sep 5, 2023
Development

Successfully merging this pull request may close these issues.

[Filebeat] winlog input metrics are unregistered after recovery
3 participants