[libbeat] Fix parsing of RFC 3164 process IDs in syslog processor #38982

Merged

Conversation

taylor-swanson
Contributor

@taylor-swanson taylor-swanson commented Apr 16, 2024

Proposed commit message

  • The pattern for parsing process IDs was too permissive: it matched everything between the first opening square bracket and the last closing square bracket in a message. If a message contained more than one closing square bracket, the extracted process ID included not only the actual process ID but also all text up to the last closing square bracket.
  • The pattern has now been locked down to only digits.
  • Added test case.
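
To illustrate the behavior described above, here is a minimal Go sketch. It is not the libbeat implementation (the real parser is a Ragel state machine); the two regular expressions, the `extractPID` helper, and the sample message are hypothetical stand-ins that roughly mirror a `print+` pattern versus a `digit+` pattern.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical regexp analogues of the two patterns. These are assumptions
// for illustration only; libbeat uses a Ragel grammar, not regexp.
var (
	// Relaxed: any printable ASCII between the brackets (similar to print+).
	relaxedPID = regexp.MustCompile(`^([^\[\s:]+)\[([[:print:]]+)\]:`)
	// Strict: digits only between the brackets (similar to digit+).
	strictPID = regexp.MustCompile(`^([^\[\s:]+)\[([0-9]+)\]:`)
)

// extractPID returns the text captured as the process ID, or "" when the
// pattern does not match at all.
func extractPID(re *regexp.Regexp, s string) string {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[2]
}

func main() {
	// Hypothetical message whose body contains a second "]:" sequence.
	msg := "sshd[1234]: session opened [uid=0]: ok"

	// The greedy printable class runs through the first ']' and stops at the last one.
	fmt.Printf("relaxed: %q\n", extractPID(relaxedPID, msg)) // "1234]: session opened [uid=0"

	// Digits cannot cross a ']', so the match ends at the first closing bracket.
	fmt.Printf("strict:  %q\n", extractPID(strictPID, msg)) // "1234"
}
```

Because a digit class can never consume a closing bracket, the strict variant stops at the first `]` regardless of what the rest of the message contains.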

Checklist

  • My code follows the style guidelines of this project
  • [ ] I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

@taylor-swanson taylor-swanson added bug libbeat Team:Security-Deployment and Devices Deployment and Devices Team in Security Solution labels Apr 16, 2024
@taylor-swanson taylor-swanson self-assigned this Apr 16, 2024
@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Apr 16, 2024
@taylor-swanson taylor-swanson added the backport-v8.13.0 Automated backport with mergify label Apr 16, 2024
@elasticmachine
Collaborator

elasticmachine commented Apr 16, 2024

💚 Build Succeeded


Build stats

  • Duration: 111 min 26 sec

❕ Flaky test report

No test was executed to be analysed.


@taylor-swanson taylor-swanson marked this pull request as ready for review April 18, 2024 12:58
@taylor-swanson taylor-swanson requested a review from a team as a code owner April 18, 2024 12:58
@elasticmachine
Collaborator

Pinging @elastic/sec-deployment-and-devices (Team:Security-Deployment and Devices)

@taylor-swanson taylor-swanson requested a review from a team as a code owner April 18, 2024 13:11
Contributor

@pkoutsovasilis pkoutsovasilis left a comment

This LGTM! I am far from a Ragel expert, but based on the issue, replacing print+ with digit+ makes sense to me.

@taylor-swanson
Contributor Author

taylor-swanson commented Apr 18, 2024

This LGTM! I am far from a Ragel expert, but based on the issue, replacing print+ with digit+ makes sense to me.

I think digit should be the right choice here, since the value we are extracting is a process ID. If for some reason we needed alpha and special characters, we could use print, but remove square brackets as valid characters (similar to the tag pattern above it).
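
As a rough sketch of the alternative mentioned above (printable characters with square brackets excluded), the following Go code scans bytes by hand. It is only an illustration under stated assumptions: `isPIDChar` and `bracketedValue` are hypothetical names, "printable" is taken to mean ASCII 0x20 through 0x7e, and none of this is the Ragel grammar used in libbeat.

```go
package main

import (
	"fmt"
	"strings"
)

// isPIDChar reports whether b is printable ASCII (0x20..0x7e) other than
// '[' or ']'. Hypothetical helper; the ASCII range is an assumption.
func isPIDChar(b byte) bool {
	return b >= 0x20 && b <= 0x7e && b != '[' && b != ']'
}

// bracketedValue returns the text between the first '[' in s and the next
// ']', provided the text is non-empty and every byte satisfies isPIDChar.
func bracketedValue(s string) (string, bool) {
	start := strings.IndexByte(s, '[')
	if start < 0 {
		return "", false
	}
	rest := s[start+1:]
	for i := 0; i < len(rest); i++ {
		if rest[i] == ']' {
			if i == 0 {
				return "", false // empty brackets
			}
			return rest[:i], true
		}
		if !isPIDChar(rest[i]) {
			return "", false // nested '[' or a non-printable byte
		}
	}
	return "", false // no closing bracket found
}

func main() {
	fmt.Println(bracketedValue("app[worker-7]: done [ok]"))
	fmt.Println(bracketedValue("app[12[3]]: nested"))
}
```

Excluding the bracket characters from the allowed set means the value always ends at the first closing bracket, while still permitting alphabetic and special characters.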

Contributor

@leehinman leehinman left a comment

LGTM

@taylor-swanson
Contributor Author

/test

@taylor-swanson taylor-swanson added the backport-v8.14.0 Automated backport with mergify label Apr 18, 2024
@taylor-swanson taylor-swanson merged commit 8e9a276 into elastic:main Apr 22, 2024
194 of 199 checks passed
@taylor-swanson taylor-swanson deleted the bug/syslog-processor-rfc3164 branch April 22, 2024 13:32
mergify bot pushed a commit that referenced this pull request Apr 22, 2024
…8982)

(cherry picked from commit 8e9a276)
mergify bot pushed a commit that referenced this pull request Apr 22, 2024
…8982)

(cherry picked from commit 8e9a276)
taylor-swanson added a commit that referenced this pull request Apr 22, 2024
…8982) (#39123)

(cherry picked from commit 8e9a276)

Co-authored-by: Taylor Swanson <[email protected]>
taylor-swanson added a commit that referenced this pull request May 13, 2024
…8982) (#39124)

(cherry picked from commit 8e9a276)

Co-authored-by: Taylor Swanson <[email protected]>
Labels
backport-v8.13.0 Automated backport with mergify backport-v8.14.0 Automated backport with mergify bug libbeat Team:Security-Deployment and Devices Deployment and Devices Team in Security Solution
Development

Successfully merging this pull request may close these issues.

[libbeat] RFC 3164 process ID parsing issues
4 participants