Add log.flags and object metadata to aws-s3 input events #26267

Merged

Conversation

andrewkroh
Member

What does this PR do?

This adds the log.flags field created by the line readers to aws-s3 events. log.flags contains metadata like multiline and truncated to indicate how the data was processed.
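As a rough illustration of the idea (not the PR's actual code), the sketch below shows how flags reported by a line reader could be copied onto an event. The `lineResult` type, its fields, and `buildFields` are hypothetical names invented for this example; only the `log.flags` field name and the `multiline` and `truncated` values come from the description above.

```go
package main

import "fmt"

// lineResult is a hypothetical stand-in for what a line reader hands back.
type lineResult struct {
	Content   string
	Multiline bool // several physical lines were joined into one message
	Truncated bool // the message exceeded a size limit and was cut
}

// buildFields returns the event fields for one line. The "log.flags" key is
// only set when at least one flag applies, so plain lines stay unchanged.
func buildFields(r lineResult) map[string]interface{} {
	fields := map[string]interface{}{"message": r.Content}

	var flags []string
	if r.Multiline {
		flags = append(flags, "multiline")
	}
	if r.Truncated {
		flags = append(flags, "truncated")
	}
	if len(flags) > 0 {
		fields["log.flags"] = flags
	}
	return fields
}

func main() {
	fmt.Println(buildFields(lineResult{Content: "first\nsecond", Multiline: true}))
}
```

Leaving the key out when no flag applies keeps ordinary single-line events free of empty fields.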

This also adds a config option to include S3 object metadata in the event if it exists. My use case was to get the Last-Modified timestamp so it can serve as a fallback when the log has no timestamp or its timestamp cannot be parsed.

Why is it important?

It makes the S3 log reader behave more like the log reader. Having access to the S3 headers also gives users more flexibility in their use cases.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

@elasticmachine
Collaborator

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Jun 11, 2021
@andrewkroh andrewkroh force-pushed the feature/fb/s3-log-flags-last-modified branch from e1a50d5 to d0e7758 Compare June 11, 2021 20:55
@elasticmachine
Collaborator

elasticmachine commented Jun 11, 2021

💚 Build Succeeded

Build stats

  • Build Cause: Pull request #26267 updated

  • Start Time: 2021-06-14T15:19:01.307+0000

  • Duration: 103 min 25 sec

  • Commit: ae25a20

Test stats 🧪

Test Results
Failed 0
Passed 14052
Skipped 2292
Total 16344

💚 Flaky test report

Tests succeeded.

@andrewkroh andrewkroh force-pushed the feature/fb/s3-log-flags-last-modified branch from d0e7758 to 4618a93 Compare June 11, 2021 21:27
Contributor

@leehinman leehinman left a comment

What do you think about adding the metadata to the s3Info struct?

The metadata feels like a good fit with the other data in s3Info, and we are already passing an s3Info into all the functions, so it would shorten the argument list.
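A minimal sketch of this suggestion, under stated assumptions: the real `s3Info` fields are not shown in this thread, so the `bucket`, `key`, and `meta` fields and the `createEvent` function below are hypothetical. The point is only that metadata carried on the struct reaches every function that already receives an `s3Info`, without a new parameter.

```go
package main

import "fmt"

// s3Info is a hypothetical, trimmed-down version of the struct named in the
// review. The real struct's fields are not shown in this conversation.
type s3Info struct {
	bucket string
	key    string
	meta   map[string]interface{} // object metadata taken from the S3 response
}

// createEvent reads everything it needs about the object from info, so
// adding metadata does not change the function signature.
func createEvent(message string, info s3Info) map[string]interface{} {
	fields := map[string]interface{}{
		"message": message,
		"bucket":  info.bucket,
		"key":     info.key,
	}
	if len(info.meta) > 0 {
		fields["metadata"] = info.meta
	}
	return fields
}

func main() {
	info := s3Info{
		bucket: "example-bucket",
		key:    "logs/app.log",
		meta:   map[string]interface{}{"last-modified": "Mon, 14 Jun 2021 15:11:00 GMT"},
	}
	fmt.Println(createEvent("hello", info))
}
```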

@mergify
Contributor

mergify bot commented Jun 14, 2021

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b feature/fb/s3-log-flags-last-modified upstream/feature/fb/s3-log-flags-last-modified
git merge upstream/master
git push upstream feature/fb/s3-log-flags-last-modified

@andrewkroh andrewkroh force-pushed the feature/fb/s3-log-flags-last-modified branch from c414773 to e8a0f65 Compare June 14, 2021 15:11
@andrewkroh
Member Author

What do you think about adding the metadata to the s3Info struct?

@leehinman Good idea. Looks better after that change.

@andrewkroh andrewkroh merged commit c25fca8 into elastic:master Jun 14, 2021
mergify bot pushed a commit that referenced this pull request Jun 14, 2021
* Add log.flags and object metadata to aws-s3 input events

This adds the log.flags field created by the line readers to aws-s3 events. log.flags contains metadata like `multiline` and `truncated` to indicate how the data was processed.

This also adds a config option to include S3 object metadata in the event if it exists. The use case for me was to get the Last-Modified timestamp for cases where the log does not have a timestamp or it cannot be parsed. Then this can be used as a fallback.

* Pass metadata using s3Info struct to avoid adding new func params

(cherry picked from commit c25fca8)
andrewkroh added a commit that referenced this pull request Jun 14, 2021
…6298)

Co-authored-by: Andrew Kroh <[email protected]>
michalpristas pushed a commit to michalpristas/beats that referenced this pull request Jun 17, 2021
Labels
backport-v7.14.0, enhancement, Filebeat, review