x-pack/filebeat/input: Fix truncation of bodies in request tracing #42327

Merged Jan 27, 2025 (3 commits)

chrisberkhout
Contributor

@chrisberkhout chrisberkhout commented Jan 16, 2025

Proposed commit message

x-pack/filebeat/input: Fix truncation of bodies in request tracing

When logging request traces, truncate the request/response body to 10%
of the maximum log file size.

Previously, bodies were truncated to the maximum file size, less 10kB.
10kB is a reasonable number for the other trace details, but space is
also required for encoding the body data as a JSON string value.

One example JSON body was 15% larger after encoding, but the 10kB
margin is 1% or less of the total limit. A body approaching the size
limit would typically generate a log entry that exceeded the limit.

Truncating large log entries to fit the file size limit means there may
only be one such entry per file. By truncating body data to 10% of the
file limit, we can expect to see entries for several request/response
pairs in each file.

The default maximum file size of 1MB gives a default maximum body size
of 100kB.

The behavior of request tracing for the HTTP Endpoint input is
unchanged: it always truncates request bodies to a size of 10kiB.
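
The arithmetic behind the old and new truncation rules can be sketched as follows. This is a minimal illustration in Python, not the Filebeat implementation: the function names are hypothetical, and the units (1 MB = 1,000,000 bytes, 10 kB = 10,000 bytes) are assumptions chosen to match the round figures quoted above.

```python
# Hypothetical sketch of the old and new body truncation limits.
# Assumption: 1 MB = 1_000_000 bytes and 10 kB = 10_000 bytes.

MARGIN_BYTES = 10_000  # fixed margin used by the previous rule


def old_body_limit(max_file_bytes: int) -> int:
    """Previous rule: the maximum file size, less 10 kB."""
    return max_file_bytes - MARGIN_BYTES


def new_body_limit(max_file_bytes: int) -> int:
    """New rule: 10% of the maximum file size."""
    return max_file_bytes // 10


def truncate_body(body: bytes, limit: int) -> bytes:
    """Truncate the raw (not yet JSON-encoded) body to at most `limit` bytes."""
    return body[:limit]


if __name__ == "__main__":
    default_max = 1_000_000
    print(old_body_limit(default_max))  # 990000
    print(new_body_limit(default_max))  # 100000
```

With the default limit, the old rule leaves only 1% of the file for everything other than the raw body, which is why a body that grows by 15% when encoded would overflow it.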

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

@chrisberkhout chrisberkhout self-assigned this Jan 16, 2025
@chrisberkhout chrisberkhout requested a review from a team as a code owner January 16, 2025 16:51
@botelastic botelastic bot added the needs_team label Jan 16, 2025
@chrisberkhout chrisberkhout added the Filebeat and Team:Security-Service Integrations labels and removed the needs_team label Jan 16, 2025
@elasticmachine
Collaborator

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)

Contributor

mergify bot commented Jan 16, 2025

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @chrisberkhout? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fix up this pull request, add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit

Contributor

mergify bot commented Jan 16, 2025

backport-8.x has been added to help with the transition to the new branch 8.x.
If you don't need it please use backport-skip label and remove the backport-8.x label.

@mergify mergify bot added the backport-8.x label Jan 16, 2025
Member

@andrewkroh andrewkroh left a comment

That's an excellent commit message. Thank you.

The default maximum file size of 1MB gives a default maximum body size
of 100kB.

I find this behavior a bit surprising. I would have expected that if I needed to capture a full 5 MiB response body that I would need to increase the max size to something a little larger than 5 MiB, but not 10x the size. What do others think? At a minimum, I think we need to mention this behavior in the documentation associated with the tracer settings.

CHANGELOG.next.asciidoc (outdated review thread, resolved)
Contributor

mergify bot commented Jan 20, 2025

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b fix-req-tracing-truncation upstream/fix-req-tracing-truncation
git merge upstream/main
git push upstream fix-req-tracing-truncation

@chrisberkhout
Contributor Author

That's an excellent commit message. Thank you.

The default maximum file size of 1MB gives a default maximum body size
of 100kB.

I find this behavior a bit surprising. I would have expected that if I needed to capture a full 5 MiB response body that I would need to increase the max size to something a little larger than 5 MiB, but not 10x the size. What do others think? At a minimum, I think we need to mention this behavior in the documentation associated with the tracer settings.

@andrewkroh I get that perspective. There are two things that make me lean towards the current version:

Often getting several responses is more important than getting full response bodies. One per file isn't a great experience. If you're getting truncated, you can double the limit (or better, shorten the page length) until you get what you need.

More importantly, since we're doing the truncation on the raw data, we need a fair bit of spare space to avoid problems. The real-world JSON I checked expanded by 15%, but it could need significantly more. A body of backslashes would double in size and bytes encoded as \u00XX grow by 6x.
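
The growth factors mentioned here are easy to reproduce. The following sketch uses Python's standard `json` module as a stand-in for the encoder used by the tracer, which is an assumption: exact escaping rules can differ between encoders. The helper name `encoded_growth` is hypothetical.

```python
import json


def encoded_growth(raw: str) -> float:
    """Return the JSON-encoded length (without the enclosing quotes)
    divided by the raw length. Inputs here are ASCII, so one character
    corresponds to one byte."""
    encoded = json.dumps(raw)[1:-1]  # strip the enclosing double quotes
    return len(encoded) / len(raw)


if __name__ == "__main__":
    print(encoded_growth("a" * 100))     # plain ASCII letters: 1.0
    print(encoded_growth("\\" * 100))    # each backslash becomes \\ : 2.0
    print(encoded_growth("\x01" * 100))  # each byte becomes \u0001 : 6.0
```

Because truncation is applied to the raw data before encoding, the size of the final log entry depends on how much of the body needs escaping, which the truncation step cannot know in advance.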

It would be nice to truncate to a specific length as the data is written into the log but that's a much more complicated change.

I'm interested to know what others think.

If we go with this version I can improve the documentation.

@chrisberkhout chrisberkhout requested a review from a team January 21, 2025 09:59
@kcreddy
Contributor

kcreddy commented Jan 21, 2025

The default maximum file size of 1MB gives a default maximum body size
of 100kB.

@chrisberkhout , here are my thoughts.

Couple of things that may impact us.

  1. With the new default maximum body size of 100kB, more of our initial diagnostics will contain truncated responses, and we may need to go back and ask the customer to increase resource.tracer.maxsize when we need the full response. Some APIs put the pagination links/numbers at the end of the response, so they fall into this category. We may need a new resource.tracer.maxsize default, probably 5MB, to reduce this impact.
  2. resource.tracer.maxsize is not exposed in most of our integrations. Until now, we have added it selectively as individual integrations needed it; it needs to be made available in the same way as enable_request_tracer.
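
To get a feel for the values involved, the 10% rule can be inverted to estimate the file size limit needed to capture a body of a given size. This is a rough sketch only: the helper name is hypothetical, and it assumes decimal megabytes and a limit expressed in whole megabytes, neither of which is confirmed in this discussion.

```python
# Hypothetical helper: estimate the file size limit needed for a full body.
# Assumptions: 1 MB = 1_000_000 bytes; the limit is set in whole megabytes.

BYTES_PER_MB = 1_000_000
BODY_DIVISOR = 10  # the body limit is 10% of the file size limit


def required_maxsize_mb(body_bytes: int) -> int:
    """Smallest whole-MB file limit whose 10% body allowance fits body_bytes."""
    needed_file_bytes = body_bytes * BODY_DIVISOR
    # Integer ceiling division, avoiding floating point.
    return (needed_file_bytes + BYTES_PER_MB - 1) // BYTES_PER_MB


if __name__ == "__main__":
    print(required_maxsize_mb(100_000))          # 1
    print(required_maxsize_mb(500_000))          # 5
    print(required_maxsize_mb(5 * 1024 * 1024))  # 53
```

Under these assumptions, a proposed default of 5MB would allow bodies of up to 500kB, and the 5 MiB response mentioned earlier in the thread would need a limit of about 53MB.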

@chrisberkhout chrisberkhout force-pushed the fix-req-tracing-truncation branch from 6ae4006 to 73e639a Compare January 23, 2025 13:35
@chrisberkhout
Contributor Author

chrisberkhout commented Jan 23, 2025

As discussed in the team meeting, we'll go with the current approach unless @efd6 has a better idea.
We can add the option and set higher defaults in integrations as necessary.

I've updated the documentation to be clear about the body limit being 10% of the file size limit. I added a tracer.maxsize entry in the Entity Analytics documentation.

The HTTP Endpoint truncation always limits request bodies to 10kiB. That behavior is unchanged by this PR but I noted it in the documentation.

@chrisberkhout chrisberkhout requested a review from efd6 January 23, 2025 13:44
efd6

This comment was marked as resolved.

Contributor

@efd6 efd6 left a comment

Thanks

@chrisberkhout chrisberkhout merged commit 97c6f92 into elastic:main Jan 27, 2025
19 of 22 checks passed
mergify bot pushed a commit that referenced this pull request Jan 27, 2025
…42327)

(cherry picked from commit 97c6f92)
chrisberkhout added a commit that referenced this pull request Jan 28, 2025
…42327) (#42440)

(cherry picked from commit 97c6f92)

Co-authored-by: Chris Berkhout <[email protected]>
Labels
backport-8.x, bugfix, Filebeat, Team:Security-Service Integrations
5 participants