Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

formatter: print request header without query string #15711

Merged

Conversation

tsaarni
Copy link
Member

@tsaarni tsaarni commented Mar 26, 2021

Signed-off-by: Tero Saarni [email protected]

Commit Message:
Adds formatter extension implementing a command operator that writes HTTP request header to access log without including the query string. Resolves #7583.

Additional Description:
Motivation for this change: Query string may contain sensitive information introducing potential vulnerability since access logs may unintentionally expose this information. If the problem cannot be fixed in the proxied application itself, the vulnerability can be mitigated by removing query string from the access logs.

This PR is similar to #7847 but implemented as extension and it is first in-tree formatter extension (xref #14512).

It implements command %REQ_WITHOUT_QUERY(X?Y):Z% which works in the same way as existing command operator %REQ(X?Y):Z%. The proposed implementation closely follows SubstitutionFormatParser and reuses SubstitutionFormatParser::parseCommandHeader() for parsing the command arguments.

In order to get check_format.py to pass, I needed to add entry to CODEOWNERS but since I don't know who should be added, I added @dio and myself as a placeholder for now.

Risk Level: Low
Testing: unit test
Docs Changes: New extension category added to docs
Release Notes: formatter: command operator to log request headers without query string.
Platform Specific Features: N/A

@rgs1
Copy link
Member

rgs1 commented Mar 30, 2021

FWIW, we've achieved the equivalent of %REQ_WITHOUT_QUERY(X?Y):Z% by using the header-to-metadata filter, e.g.:

- typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.header_to_metadata.v3.Config
    request_rules:
      - header: ":path"
        on_header_present:
          metadata_namespace: com.example.sanitized_headers
          key: path
          regex_value_rewrite:
            pattern:
              google_re2: {}
              regex: "^(/[^?]*).*$"
            substitution: "\\1"

and then from the access log format string:

%DYNAMIC_METADATA(com.example.sanitized_headers:path)%

though I do still think it'll be nice to have an extension with REQ_WITHOUT_QUERY.

cc: @sc0ttbeardsley

Copy link
Member

@rgs1 rgs1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM, left two comments plus let's get the build fixed and I'll make another pass.

@tsaarni
Copy link
Member Author

tsaarni commented Apr 14, 2021

FWIW, we've achieved the equivalent of %REQ_WITHOUT_QUERY(X?Y):Z% by using the header-to-metadata filter, e.g.:

Thanks! I did not know about this filter.

In general, I have some doubts about about this PR. Adding a complete extension for the single purpose of removing ?querystring seems bit wrong. Do people see it as problematic if we have a lot of small extensions that do really trivial things?

I've also done some experiments with more general purpose formatter, inspired by your example with Header-To-Metadata, which allows following

formatters:
  - name: envoy.formatter.regex_substitute
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.formatter.regex_substitute.v3.RegexSubstituteConfig
      command_name: REQ_WITHOUT_QUERY
      regex_value_rewrite:
        pattern:
          google_re2: {}
          regex: "^(/[^?]*).*$"
        substitution: "\\1"

but would this then be duplication of functionality?

@repokitteh-read-only
Copy link

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/.
envoyproxy/api-shepherds assignee is @markdroth
CC @envoyproxy/api-watchers: FYI only for changes made to api/envoy/.

🐱

Caused by: #15711 was synchronize by tsaarni.

see: more, trace.

@dio
Copy link
Member

dio commented Apr 19, 2021

@tsaarni sorry I was out for a health issue. Sorry was not able to update this earlier. But generally, the PR is good. If you resolve the conflict I'll take a look at it as soon as I can. And it is nice to have that REQ_WITHOUT_QUERY shorthand.

class ReqWithoutQueryCommandParser : public ::Envoy::Formatter::CommandParser {
public:
ReqWithoutQueryCommandParser() = default;
::Envoy::Formatter::FormatterProviderPtr parse(const std::string& token, size_t,
Copy link
Member

@dio dio Apr 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simply FormatterProviderPtr doesn't work here? Sorry for nitpicking.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, it did not work...

@tsaarni
Copy link
Member Author

tsaarni commented Apr 22, 2021

@tsaarni sorry I was out for a health issue. Sorry was not able to update this earlier. But generally, the PR is good. If you resolve the conflict I'll take a look at it as soon as I can. And it is nice to have that REQ_WITHOUT_QUERY shorthand.

@dio No problem with delays, glad to see you back! I've now resolved the conflict.

I wonder if you have opinion about #15711 (comment), that is: more general purpose formatter (regex manipulation) vs this one (dedicated for query string alone)? I'm OK with anything at the end, but just wanted to hear your thoughts :-)

@tsaarni
Copy link
Member Author

tsaarni commented Apr 26, 2021

Check envoy-presubmit (check linux_x64 gcc) is failing for me due to too long argument list (full log here)

2021-04-24T19:45:27.1433253Z ERROR: /source/test/exe/BUILD:98:14: C++ compilation of rule '//test/exe:win32_scm_test' failed (Exit 1): gcc failed: error executing command 
...
2021-04-24T19:45:27.2269979Z gcc: fatal error: cannot execute ‘/usr/lib/gcc/x86_64-linux-gnu/9/cc1plus’: execv: Argument list too long

and in another case in different target (full log here)

2021-04-22T18:21:55.0427418Z ERROR: /source/test/exe/BUILD:62:14: C++ compilation of rule '//test/exe:main_common_test' failed (Exit 1): gcc failed: error executing command 
...
2021-04-22T18:21:55.1126016Z gcc: fatal error: cannot execute ‘/usr/lib/gcc/x86_64-linux-gnu/9/cc1plus’: execv: Argument list too long

I saw @rgs1 mentioning this in #14855 (comment) but there it was config_fuzz_test_lib. I don't understand why it is not always the same target that fails but it seems to hit more compilation targets as just a specific lib.

@tsaarni
Copy link
Member Author

tsaarni commented May 17, 2021

Could I kindly ask help in reviewing of this PR? I think @dio might be still unavailable, so I was hoping maybe others could be able to help and look at this also?

Copy link
Member

@rgs1 rgs1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left one nit, otherwise generally LGTM. My only remaining question is whether this should be included/built by default (it is right now, maybe it shouldn't be).

Over to @dio for another round. Thanks!

@tsaarni
Copy link
Member Author

tsaarni commented Jun 9, 2021

I think it's OK to make this one trusted for downstream and upstream.

Thanks @mattklein123. I've now added entry to release notes and changed the security posture.

zuercher
zuercher previously approved these changes Jun 9, 2021
mattklein123
mattklein123 previously approved these changes Jun 11, 2021
@mattklein123
Copy link
Member

Oops looks like it needs a main merge.

/wait

@tsaarni tsaarni dismissed stale reviews from mattklein123 and zuercher via 708c346 June 11, 2021 08:44
@tsaarni
Copy link
Member Author

tsaarni commented Jun 14, 2021

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #15711 (comment) was created by @tsaarni.

see: more, trace.

@tsaarni
Copy link
Member Author

tsaarni commented Jun 14, 2021

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #15711 (comment) was created by @tsaarni.

see: more, trace.

@tsaarni
Copy link
Member Author

tsaarni commented Jun 14, 2021

There has been quite many flakes but these tests now seem to be failing quite consistently and I don't seem to get clean CI run anymore

  • IpHttpVersions/HttpHealthCheckIntegrationTest.SingleEndpointImmediateHealthcheckFailHttp/IPv4_Http2Upstream (logs)
  • IpHttpVersions/HttpHealthCheckIntegrationTest.SingleEndpointImmediateHealthcheckFailHttp/IPv6_Http2Upstream (logs)

I saw some other PRs have had same failures, so I wonder if this is a flake as well, or could it have something to do with the changes in this PR?

@zuercher
Copy link
Member

Yeah, it's happening on master as well.

@mattklein123 mattklein123 merged commit 5736630 into envoyproxy:main Jun 14, 2021
leyao-daily pushed a commit to leyao-daily/envoy that referenced this pull request Sep 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

:PATH in access logs without query string
6 participants