Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal/envoy: configurable access log format #3694

Merged
merged 1 commit into from
Jul 26, 2021

Conversation

tsaarni
Copy link
Member

@tsaarni tsaarni commented May 19, 2021

This change adds following features:

  • New configuration option that allows setting the format string for Envoy's text based access log.
  • Adds support for Envoy's REQ_WITHOUT_QUERY command operator, which can be used to avoid writing query string to access log.

Fixes #3576

Signed-off-by: Tero Saarni [email protected]

@tsaarni tsaarni requested a review from a team as a code owner May 19, 2021 15:43
@tsaarni tsaarni requested review from danehans and skriss and removed request for a team May 19, 2021 15:43
@tsaarni tsaarni changed the title internal/envoy: configurable access log format WIP internal/envoy: configurable access log format May 19, 2021
@tsaarni
Copy link
Member Author

tsaarni commented May 19, 2021

This is a work-in-progress for requesting comments. The PR lacks e.g. validation, tests, documentation...

Example use in contour.yaml configuration file

accesslog-format-string: "[%START_TIME%] \"%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\" %RESPONSE_CODE% %RESPONSE_FLAGS% %BYTES_RECEIVED% %BYTES_SENT% %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% \"%REQ(X-FORWARDED-FOR)%\" \"%REQ(USER-AGENT)%\" \"%REQ(X-REQUEST-ID)%\" \"%REQ(:AUTHORITY)%\" \"%UPSTREAM_HOST%\"\n"

Motivation would be to allow user to set custom access log format and also utilize Envoy access log formatter command extensions. Example extension envoyproxy/envoy#15711.

Access log format extensions could be optionally built into Envoy binary and in theory they might be even out-of-tree code that was added into a custom build of Envoy. The list of valid access log commands cannot be anymore known in advance - full validation within Contour would not be possible.

I would appreciate your opinion about this feature!

@codecov
Copy link

codecov bot commented May 19, 2021

Codecov Report

Merging #3694 (e8888e1) into main (feeb684) will decrease coverage by 1.08%.
The diff coverage is 87.87%.

❗ Current head e8888e1 differs from pull request most recent head be9c0a8. Consider uploading reports for the commit be9c0a8 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3694      +/-   ##
==========================================
- Coverage   76.96%   75.88%   -1.09%     
==========================================
  Files         109      107       -2     
  Lines        8475     7361    -1114     
==========================================
- Hits         6523     5586     -937     
+ Misses       1818     1654     -164     
+ Partials      134      121      -13     
Impacted Files Coverage Δ
cmd/contour/serve.go 10.74% <0.00%> (-0.89%) ⬇️
internal/xdscache/v3/listener.go 90.70% <50.00%> (-0.10%) ⬇️
internal/envoy/v3/accesslog.go 100.00% <100.00%> (ø)
internal/featuretests/v3/envoy.go 100.00% <100.00%> (ø)
internal/controller/gatewayclass.go 0.00% <0.00%> (-71.74%) ⬇️
internal/status/gatewayclass.go 0.00% <0.00%> (-71.43%) ⬇️
internal/protobuf/testhelpers.go 50.00% <0.00%> (-8.34%) ⬇️
internal/xdscache/v3/endpointstranslator.go 80.36% <0.00%> (-6.70%) ⬇️
internal/certgen/output.go 50.00% <0.00%> (-6.00%) ⬇️
internal/envoy/bootstrap.go 45.83% <0.00%> (-4.17%) ⬇️
... and 105 more

@tsaarni tsaarni requested a review from youngnick May 19, 2021 15:51
@tsaarni tsaarni force-pushed the access-log-format-string branch 2 times, most recently from 5f7a1df to fd6283d Compare June 2, 2021 14:13
@skriss
Copy link
Member

skriss commented Jun 14, 2021

@tsaarni sorry we've been slow to provide feedback here. In principle I think this feature makes sense. With regard to the command operators, I'm not sure if you've seen but we've accepted a contribution to do something similar for headers: https://projectcontour.io/docs/v1.16.0/config/request-rewriting/#dynamic-header-values and https://github.com/projectcontour/contour/blob/main/internal/dag/policy.go#L262-L305. With that feature, we included an allow-list of possible command operators in the Contour code. That makes me wonder, do you know what Envoy's behavior is if a malformed access log format string is passed? Will it NACK the config update, or ignore invalid command operators, or something else? That may inform how paranoid we need to be about ensuring the passed value is valid.

@tsaarni
Copy link
Member Author

tsaarni commented Jun 14, 2021

Thanks @skriss for responding!

With that feature, we included an allow-list of possible command operators in the Contour code.

Yes, I think I would need to do that anyways: my underlying goal #3576 would require use of Envoy's (forthcoming) log format command operator extension. These formatter extension needs to be enabled explicitly before use. For Contour to allow using such command operator, it would need to recognize the command enable the corresponding extension like this #3576 (comment). On the other hand, if the command is not used, the extension would not need to be enabled (or it could be always enabled, but this would mean Envoy needs to be always compiled with that extension).

The JSON access log validation does bit more: it checks the command operator syntax with regex but I'm bit uncertain how far to go? It can only capture some specific errors, e.g. it cannot validate if header name in %REQ(MY-HTTP-HEADER)% is correct.

If letting Envoy to do the validation, one gets quite good error message in NACK, at least in this case. I guess the biggest difference is that when Contour is validating, it will exit() and crash loop. When Envoy validates, it will be just error printouts in logs. In both cases it is the same from traffic perspective: Envoy will serve requests with old config (until it happens to restart).

If new command operators are introduced, maintaining allow-list will always mean code impact on Contour. Though it seems that Envoy encourages to implement new command operators as formatter extensions, so it would mean code impact anyways - for maintaining a list of extensions to enable.

That makes me wonder, do you know what Envoy's behavior is if a malformed access log format string is passed? Will it NACK the config update, or ignore invalid command operators, or something else? That may inform how paranoid we need to be about ensuring the passed value is valid.

It will NACK the config update, and give error message which will be visible both in Contour and Envoy logs, example here #3576 (comment).

@tsaarni tsaarni force-pushed the access-log-format-string branch 2 times, most recently from 96cb7fa to 04cd6c0 Compare June 16, 2021 14:44
@tsaarni
Copy link
Member Author

tsaarni commented Jun 16, 2021

I've now adapted the existing validation code from JSON access log fields so that the same validation is also done for text access log format configuration.

Meanwhile, the access log formatter extension which I was working in Envoy was merged envoyproxy/envoy#15711. I uplifted go-control-plane to a unreleased version to experiment with this in Contour. The proposed implementation now recognizes when user has REQ_WITHOUT_QUERY command operator in either JSON or text based access log format string, and enables the formatter extension. When the command is not present the formatter is not enabled. This should allow user to run old Envoy version or Envoy version with disabled extension.

@tsaarni tsaarni changed the title WIP internal/envoy: configurable access log format internal/envoy: configurable access log format Jun 16, 2021
@tsaarni tsaarni force-pushed the access-log-format-string branch from 04cd6c0 to 58e2efa Compare June 16, 2021 15:07
Copy link
Member

@skriss skriss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @tsaarni - I think this looks pretty reasonable. I had some minor inline comments. I still need to look at the tests in some more detail as well.

It'd also be nice to find some place to document this more thoroughly than the config reference docs. One option would be to expand the JSON access logging guide (https://projectcontour.io/guides/structured-logs/) to cover non-JSON access logging as well, and to include the new operator / extension there. We could do that in a separate PR that we don't merge until the next release, since the guides aren't versioned so any changes are "live" immediately (xref #3829 for changing that, I don't see any reason why we shouldn't version the guides).

site/content/docs/main/configuration.md Outdated Show resolved Hide resolved
pkg/config/parameters.go Outdated Show resolved Hide resolved
@@ -5,7 +5,7 @@ go 1.15
require (
github.com/ahmetb/gen-crd-api-reference-docs v0.3.0
github.com/bombsimon/logrusr v1.0.0
github.com/envoyproxy/go-control-plane v0.9.9-0.20210111201334-f1f47757da33
github.com/envoyproxy/go-control-plane v0.9.10-0.20210614203518-782de910ff04
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to decide if we're comfortable shipping with this commit, or if we want to wait for an official v0.9.10.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've done this in the past as well, but not sure how quickly go-control-plane is releasing. Could even pull this out as a separate commit to verify tests, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think as long as tests are passing (which they are currently), I'm OK with shipping with a specific commit here.

internal/envoy/v3/accesslog.go Outdated Show resolved Hide resolved
pkg/config/parameters.go Outdated Show resolved Hide resolved
pkg/config/parameters.go Show resolved Hide resolved
internal/envoy/v3/accesslog.go Show resolved Hide resolved
Copy link
Member

@stevesloka stevesloka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to do some real tests with this. Would be super nice to have some E2E tests or follow up with some in another PR. The only downside is we need to test against an unreleased version of Envoy at the moment.

@@ -54,6 +54,10 @@ data:
### Logging options
# Default setting
accesslog-format: envoy
# The default access log format is defined by Envoy but it can be customized by setting following variable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might need to be docs on the site. I worry we're getting too many comments in the config file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've shortened the text by removing the link to Envoy docs and the full example (still leaving short example value accesslog-format-string: "...\n")

The config file is bit hard to read in general. Maybe one small improvement would be changing indentation of fields-vs-comments, see e.g. sshd_config

# first comment
#first-field:
# second comment
#second-field:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d'oh, it obviously fails anyways to be readable due to structs having indentations :-D

# first comment
#first-field:
#  sub-field:
# second comment

maybe double-hash for comments:

## first comment
#first-field:
#  sub-field:
## second comment

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this formatting is not great. I filed #3843 to make some improvements here.

@@ -5,7 +5,7 @@ go 1.15
require (
github.com/ahmetb/gen-crd-api-reference-docs v0.3.0
github.com/bombsimon/logrusr v1.0.0
github.com/envoyproxy/go-control-plane v0.9.9-0.20210111201334-f1f47757da33
github.com/envoyproxy/go-control-plane v0.9.10-0.20210614203518-782de910ff04
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've done this in the past as well, but not sure how quickly go-control-plane is releasing. Could even pull this out as a separate commit to verify tests, etc.

Format: &envoy_config_core_v3.SubstitutionFormatString_TextFormatSource{
TextFormatSource: &envoy_config_core_v3.DataSource{
Specifier: &envoy_config_core_v3.DataSource_InlineString{
InlineString: format,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will Envoy blow up if the format is invalid? Or just not log properly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see there's a parser below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the same parser is now used for both JSON and text based logs. If for some reason invalid format would still get through parser, I've tested how Envoy would react, for results see #3576 (comment).

There is also one very common error that could be validated: there needs to be \n at the end of the line.
When it is not included, logs are not flushed until certain amount of logs have been written. And obviously, everything comes as one long line. I think I saw old discussions about automatically adding this, but it is still not done today.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof, yeah, that \n one is a great candidate for us to check, as you probably would never actually want that behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think I'm a 👍 to enforcing/adding the \n.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now added check for \n.

@tsaarni tsaarni force-pushed the access-log-format-string branch 3 times, most recently from a1fcd1f to 0c9e075 Compare June 23, 2021 15:58
@tsaarni tsaarni force-pushed the access-log-format-string branch from 0c9e075 to a28de3e Compare June 29, 2021 12:16
@tsaarni
Copy link
Member Author

tsaarni commented Jun 29, 2021

I believe I have covered current comments for now. Please take a look.

I'll create another issue for updating https://projectcontour.io/guides/structured-logs/ (or some other page) with detailed usage docs.

One question regarding API:
The new parameter accesslog-format-string will be effective when accesslog-format: envoy is set. Maybe value envoy was originally intended to refer to "the format that Envoy defines" and that is why it is called like that. In theory text would make more sense now, as an alternative to json, but is it OK to keep using envoy?

I guess adding third option (envoy = default Envoy format, json = custom JSON logs, text =custom text logs) would be equally confusing as just continuing to use envoy and json, former meaning any text based log, default or custom.

@skriss skriss added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Jun 29, 2021
@skriss
Copy link
Member

skriss commented Jun 29, 2021

One question regarding API:
The new parameter accesslog-format-string will be effective when accesslog-format: envoy is set. Maybe value envoy was originally intended to refer to "the format that Envoy defines" and that is why it is called like that. In theory text would make more sense now, as an alternative to json, but is it OK to keep using envoy?

I guess adding third option (envoy = default Envoy format, json = custom JSON logs, text =custom text logs) would be equally confusing as just continuing to use envoy and json, former meaning any text based log, default or custom.

I think leaving it as envoy is OK - to me it just means the non-JSON format, the exact fields of which can now be customized.

Copy link
Member

@skriss skriss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no further comments from me, LGTM. thanks @tsaarni!

Copy link
Member

@youngnick youngnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, but before we merge, I just need to check: @tsaarni, what version of Envoy is the required extension available in? Do we need to hold this PR until after v1.17.0 to wait for the new Envoy version?

If it works with v1.18.3, then we're good to go, if not, we should hold off merging, since 1.17.0 is due out tomorrow (my time).

@tsaarni
Copy link
Member Author

tsaarni commented Jun 30, 2021

If it works with v1.18.3, then we're good to go, if not, we should hold off merging, since 1.17.0 is due out tomorrow (my time).

Otherwise it will work but if user adds the new command operator extension REQ_WITHOUT_QUERY in a custom format, then Contour will try to enable the Envoy formatter extension and fail until there is Envoy 1.19.0.

@youngnick
Copy link
Member

Hmm, okay, thanks. While I know it's the reason you're making the change, I don't think that we can introduce this feature with a known-not-working field in it, given the current level of Envoy support.

In that case, we have two options:

  • Hold this PR to 1.18, which should be released around the same time Envoy 1.19 is
  • Remove the bits to do REQ_WITHOUT_QUERY for now, and put them back in once Envoy 1.19 is available.

@tsaarni, what do you think?

@tsaarni
Copy link
Member Author

tsaarni commented Jun 30, 2021

  • Hold this PR to 1.18, which should be released around the same time Envoy 1.19 is

Yes, lets wait for the Envoy release first. It will avoid some extra work.

@sunjayBhatia
Copy link
Member

I'll add the do not merge tag for now so we dont accidentally merge this

@sunjayBhatia sunjayBhatia added the do not merge Don't merge this PR until this label is removed. label Jun 30, 2021
@skriss skriss added this to the 1.18.0 milestone Jun 30, 2021
@github-actions
Copy link

Marking this PR stale since there has been no activity for 14 days. It will be closed if there is no activity for another 30 days.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 15, 2021
@skriss skriss removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 15, 2021
@skriss
Copy link
Member

skriss commented Jul 16, 2021

Just merged #3887, so I think this is ready to go now, just need a quick rebase to fix merge conflicts.

@tsaarni tsaarni force-pushed the access-log-format-string branch from a28de3e to be32c62 Compare July 20, 2021 06:18
@tsaarni
Copy link
Member Author

tsaarni commented Jul 20, 2021

I've now rebased and run manual test with the current Envoy version.

I did not look yet how this could be fully automated in the integration tests but I've written manual test procedure here https://gist.github.com/tsaarni/6e8be52179d775bf2a90b6e009e5ef94

@skriss
Copy link
Member

skriss commented Jul 21, 2021

This is ready to merge from my perspective now that we've updated Envoy, cc @youngnick to confirm

@skriss skriss removed the request for review from danehans July 23, 2021 15:46
@skriss
Copy link
Member

skriss commented Jul 23, 2021

Sorry @tsaarni, needs another rebase.

* Adds new configuration option that allows setting the format string for
  Envoy's text based access log.
* Adds support for Envoy's REQ_WITHOUT_QUERY command operator, which can
  be used to avoid writing query string to access log.

Signed-off-by: Tero Saarni <[email protected]>
@tsaarni tsaarni force-pushed the access-log-format-string branch from be32c62 to be9c0a8 Compare July 23, 2021 21:21
@skriss skriss removed the do not merge Don't merge this PR until this label is removed. label Jul 23, 2021
@tsaarni
Copy link
Member Author

tsaarni commented Jul 23, 2021

@skriss no problem, rebase done!

@skriss
Copy link
Member

skriss commented Jul 23, 2021

Awesome, thanks. I''ll merge this first thing next week unless I hear otherwise from @youngnick

Copy link
Member

@youngnick youngnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Envoy 1.19 is out now, so I thnk we are good to go.

@skriss skriss merged commit b335921 into projectcontour:main Jul 26, 2021
tsaarni added a commit to Nordix/contour that referenced this pull request Sep 19, 2021
* configurable access log format

Signed-off-by: Tero Saarni <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note Denotes a PR that will be considered when it comes time to generate release notes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add option to remove query strings from access log
5 participants