Add support for structured logs #624
Comments
Thank you for raising this issue. The access logs are generated directly by Envoy. At the moment we have flags to control their output destination, but their format currently defaults to whatever Envoy uses by default, something akin to an apache2 log: https://github.com/heptio/contour/blob/master/internal/contour/listener.go#L338 Looking at https://www.envoyproxy.io/docs/envoy/v1.6.0/configuration/access_log#config-access-log-format it looks like it's possible to redefine the output of the logging format. I'm wary of exposing the full configuration syntax of this to end users; it's too close to showing how the sausage is made. With that said, when you say structured logging, do you mean JSON? |
Yes, I mean JSON. We recently turned JSON access logs on for Traefik in our old DC/OS clusters and they have been useful as well as capable of including/redacting/excluding headers of requests coming through the ingress making correlation really easy, stuff that normally doesn't get included in apache2 logs for brevity's sake. |
Maybe supporting a log format defined via a ConfigMap would be a good approach, just as nginx-ingress-controller uses the log-format-upstream parameter. Istio's use of logentry is also good practice. |
We're not going to get to this for the 0.7 release. Removing milestone. |
Has there been any agreement on this? Right now this makes Contour a less usable replacement for Nginx. Having Envoy metrics available is great, but getting full access log data would go a long way toward making this a viable option. |
No, there has been no agreement because, apart from a request for JSON-formatted logs, no details have been provided.
|
For our use case it would be nice if we could add an annotation or even just load a ConfigMap that will add a structure like so to every listener filter chain.
It looks like we can probably add it into this function with minimal effort https://github.com/heptio/contour/blob/master/internal/envoy/listener.go#L189 The configuration is defined here (and doesn't support nesting) for the Support for the I've validated this works for the static resources, but for LDS it looks like Contour is the only way to configure it. |
Our general preference is, rather than making things configurable, to choose a reasonable default for all users. Is there a standard for JSON logs in this format? If so, we might just make that the default for everyone.
… On 25 Apr 2019, at 07:34, Abi X Renhart ***@***.***> wrote:
For our use case it would be nice if we could add an annotation or even just load a ConfigMap that will add a structure like so to every listener filter chain.
access_log: [
{
name: "envoy.file_access_log",
config: {
json_format: {
downstream_remote_address: "%DOWNSTREAM_REMOTE_ADDRESS%",
path: "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%",
authority: "%REQ(:AUTHORITY)%",
protocol: "%PROTOCOL%",
upstream_service_time: "%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%",
upstream_local_address: "%UPSTREAM_LOCAL_ADDRESS%",
duration: "%DURATION%",
downstream_local_address: "%DOWNSTREAM_LOCAL_ADDRESS%",
user_agent: "%REQ(USER-AGENT)%",
response_code: "%RESPONSE_CODE%",
response_flags: "%RESPONSE_FLAGS%",
start_time: "%START_TIME%",
method: "%REQ(:METHOD)%",
request_id: "%REQ(X-REQUEST-ID)%",
upstream_host: "%UPSTREAM_HOST%",
x_forwarded_for: "%REQ(X-FORWARDED-FOR)%",
requested_server_name: "%REQUESTED_SERVER_NAME%",
bytes_received: "%BYTES_RECEIVED%",
bytes_sent: "%BYTES_SENT%",
upstream_cluster: "%UPSTREAM_CLUSTER%"
},
path: "/dev/stdout"
},
},
]
|
I agree with that approach. There doesn't seem to be a standard for JSON access logs, but this structure appears to have everything that Envoy reports (other than custom headers).
|
So I went ahead and did a build against the 0.11.0 code base with the changes we discussed. The dynamic listener logs now look like this:
I also added the same format to the static listener configs; this is the log for the health check endpoint:
Let me know what needs to be done to get this approved. I was thinking it might make sense to add a flag to enable or disable JSON access logs, but as you said the preferred approach is to create sane defaults which does seem like a good approach. |
@azuretek is there a reason or restriction for outputting numerical values as strings in the JSON objects? Bytes and duration are things one would want to aggregate and calculate metrics on (assuming the logs end up in Elasticsearch or a similar database). For the response code, numeric values are convenient for range queries (<100, 200-299, 300-399, 400-499, >500). It would be nice to avoid an ETL step for such simple JSON type coercions. EDIT: just found the relevant part of the Envoy docs:
That's too bad :/ |
@bgagnon I definitely ran into that in my PoC. It's annoying, but ultimately it can be handled by adjusting the mappings on the Elasticsearch side. To make it report correctly we'd have to make changes to Envoy (which should probably be done). |
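For readers who cannot adjust mappings on the storage side, the same coercion can be sketched in a small log-processing step. This is only an illustration: `coerce_numeric` and `NUMERIC_FIELDS` are hypothetical names, the field list is taken from the format proposed in this thread, and it assumes unset values arrive as "-" as in the example entry posted later in the thread.

```python
import json

# Fields that the format proposed in this thread emits as numeric strings.
# The selection is an assumption made for this sketch.
NUMERIC_FIELDS = (
    "bytes_received",
    "bytes_sent",
    "duration",
    "response_code",
    "upstream_service_time",
)


def coerce_numeric(entry, fields=NUMERIC_FIELDS):
    """Return a copy of entry with the listed string fields converted to int.

    Values that cannot be parsed as integers (for example "-") become None.
    """
    out = dict(entry)
    for name in fields:
        value = out.get(name)
        if isinstance(value, str):
            try:
                out[name] = int(value)
            except ValueError:
                out[name] = None
    return out


line = (
    '{"response_code": "200", "duration": "1", "bytes_sent": "17", '
    '"bytes_received": "0", "upstream_service_time": "-", "method": "GET"}'
)
entry = coerce_numeric(json.loads(line))

# With an integer status code, range checks become simple arithmetic.
status_class = entry["response_code"] // 100
```

Non-numeric fields such as `method` are left untouched, so the step can run before indexing without changing the rest of the entry.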
Abi,
Can you please detail the changes you require? If you can give complete examples of the expected JSON output, that would be best.
Thanks
Dave
|
@davecheney I need the default access log format to be structured, preferably JSON, so it can be ingested more easily into storage backends like Elasticsearch. This is a requirement for us to move away from using NGINX as our ingress. Currently access logs use the default format string, as defined in the Envoy documentation: https://www.envoyproxy.io/docs/envoy/latest/configuration/access_log#default-format-string
The desired format is JSON, specifically with these keys and values, as defined in the Envoy documentation: https://www.envoyproxy.io/docs/envoy/latest/configuration/access_log#format-dictionaries
{
"@timestamp": "%START_TIME%",
"authority": "%REQ(:AUTHORITY)%",
"bytes_received": "%BYTES_RECEIVED%",
"bytes_sent": "%BYTES_SENT%",
"downstream_local_address": "%DOWNSTREAM_LOCAL_ADDRESS%",
"downstream_remote_address": "%DOWNSTREAM_REMOTE_ADDRESS%",
"duration": "%DURATION%",
"method": "%REQ(:METHOD)%",
"path": "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%",
"protocol": "%PROTOCOL%",
"request_id": "%REQ(X-REQUEST-ID)%",
"requested_server_name": "%REQUESTED_SERVER_NAME%",
"response_code": "%RESPONSE_CODE%",
"response_flags": "%RESPONSE_FLAGS%",
"uber_trace_id": "%REQ(UBER-TRACE-ID)%",
"upstream_cluster": "%UPSTREAM_CLUSTER%",
"upstream_host": "%UPSTREAM_HOST%",
"upstream_local_address": "%UPSTREAM_LOCAL_ADDRESS%",
"upstream_service_time": "%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%",
"user_agent": "%REQ(USER-AGENT)%",
"x_forwarded_for": "%REQ(X-FORWARDED-FOR)%"
}
This is an example of how the access log should look:
{
"upstream_cluster": "default/hello-world/8080/da39a3ee5e",
"downstream_remote_address": "1.2.3.4:49286",
"path": "/?abc=123",
"authority": "hello.test.foo.bar",
"protocol": "HTTP/2",
"upstream_service_time": "1",
"@timestamp": "2019-04-25T06:31:07.162Z",
"upstream_local_address": "-",
"duration": "1",
"downstream_local_address": "10.220.3.107:443",
"response_code": "200",
"user_agent": "curl/7.54.0",
"response_flags": "-",
"method": "GET",
"request_id": "b4ecadf1-8fc0-4e87-89d8-a3aa851196d4",
"upstream_host": "10.141.181.15:8080",
"x_forwarded_for": "1.2.3.4",
"requested_server_name": "hello.test.foo.bar",
"bytes_received": "0",
"uber_trace_id": "-",
"bytes_sent": "17"
} |
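To make the relationship between the format dictionary and the example entry concrete, here is a toy model of the substitution. It is not Envoy's implementation: `render`, `OPERATOR`, and the sample values are hypothetical, and the "-" fallback for unset values is assumed from the example output above.

```python
import json
import re

# Matches command operators such as %DURATION% or %REQ(:AUTHORITY)%.
OPERATOR = re.compile(r"%[A-Z_]+(?:\([^)]*\))?%")


def render(format_dict, values):
    """Render one access log line from a format dictionary.

    values maps an operator (e.g. "%DURATION%") to its per-request value.
    Operators with no value are rendered as "-", and every output value is
    a string, mirroring the example entry in this thread.
    """
    def substitute(match):
        return str(values.get(match.group(0), "-"))

    entry = {
        key: OPERATOR.sub(substitute, template)
        for key, template in format_dict.items()
    }
    return json.dumps(entry, sort_keys=True)


# A subset of the proposed keys, with made-up per-request values.
format_dict = {
    "@timestamp": "%START_TIME%",
    "method": "%REQ(:METHOD)%",
    "response_code": "%RESPONSE_CODE%",
    "uber_trace_id": "%REQ(UBER-TRACE-ID)%",
}
values = {
    "%START_TIME%": "2019-04-25T06:31:07.162Z",
    "%REQ(:METHOD)%": "GET",
    "%RESPONSE_CODE%": 200,
}
line = render(format_dict, values)
```

Note that `response_code` comes out as the string "200" even though the sample value is an integer, which is the limitation raised earlier in the thread.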
Thank you for your reply.
My goal is to change the log output format once, to something structured, without the option to change it.
To do this I need your assistance to know if there is an established standard for JSON-structured HTTP logs to replace the current NCSA-style log format.
… On 4 May 2019, at 13:22, Abi X Renhart ***@***.***> wrote:
@davecheney I need the default access log format to be structured, preferably JSON to more easily ingest into storage backends like Elasticsearch. This is a requirement for us to move away from using NGINX as our ingress.
Currently access logs use the default format string, as defined in the envoy documentation: https://www.envoyproxy.io/docs/envoy/latest/configuration/access_log#default-format-string
[%START_TIME%] "%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%"
%RESPONSE_CODE% %RESPONSE_FLAGS% %BYTES_RECEIVED% %BYTES_SENT% %DURATION%
%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% "%REQ(X-FORWARDED-FOR)%" "%REQ(USER-AGENT)%"
"%REQ(X-REQUEST-ID)%" "%REQ(:AUTHORITY)%" "%UPSTREAM_HOST%"\n
|
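As a rough illustration of why the structured form is easier to ingest, the sketch below parses the same request twice: once from a line in the default format string quoted above, using a hand-written regex, and once from JSON. The regex and the sample line are assumptions written for this example (the line is assembled from the example values posted in this thread, not taken from a real log), and the regex does not handle quotes embedded in header values.

```python
import json
import re

# Hand-written regex for the default format string quoted in this thread.
DEFAULT_FORMAT = re.compile(
    r'\[(?P<start_time>[^\]]*)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<response_code>\S+) (?P<response_flags>\S+) '
    r'(?P<bytes_received>\S+) (?P<bytes_sent>\S+) (?P<duration>\S+) '
    r'(?P<upstream_service_time>\S+) '
    r'"(?P<x_forwarded_for>[^"]*)" "(?P<user_agent>[^"]*)" '
    r'"(?P<request_id>[^"]*)" "(?P<authority>[^"]*)" "(?P<upstream_host>[^"]*)"'
)

# Sample line constructed for this sketch from the thread's example values.
text_line = (
    '[2019-04-25T06:31:07.162Z] "GET /?abc=123 HTTP/2" 200 - 0 17 1 1 '
    '"1.2.3.4" "curl/7.54.0" "b4ecadf1-8fc0-4e87-89d8-a3aa851196d4" '
    '"hello.test.foo.bar" "10.141.181.15:8080"'
)
json_line = '{"method": "GET", "path": "/?abc=123", "response_code": "200"}'

# Text format: every consumer has to maintain a regex like the one above.
from_text = DEFAULT_FORMAT.match(text_line).groupdict()

# JSON format: a standard parser is enough, and adding keys breaks nothing.
from_json = json.loads(json_line)
```

Both paths recover the same fields here, but only the regex has to change whenever a field is added to or reordered in the format.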
Thank you to everyone who has contributed to this issue over the last few weeks. I want to make it clear to all readers that I am not opposed to adding this feature. In fact, I want to see it adopted as the default. What I do not want to see happen is JSON or NCSA becoming selectable, especially if there are options to make each of those formats configurable. If this is the outcome, we will stick with the default NCSA format. Please do not send any PRs until there is agreement on which format will be used. |
#1130 landed in 0.14. Moving to 0.15 to implement. |
Sorry, we ran out of time in 0.15 for this. @youngnick has a plan for how to implement this in beta1. |
Updates projectcontour#624 Signed-off-by: Nick Young <[email protected]>
I've published #1485 with our current design ideas around this feature. I'm hoping to get this reviewed and agreed soon with the aim to land this in beta1. |
After internal review, I've merged #1485 with the team's preferred design for JSON logging. Please review either #1485 or |
Fixes projectcontour#624. Also update a small bug in config file definition that can lead to parsing weirdness. Signed-off-by: Nick Young <[email protected]>
It would be nice to have structured access logs: we use Splunk for log aggregation, and extracting key/value pairs from structured logs is much nicer in Splunk than doing so with regex. Could there be some sort of flag to enable this, or did I just miss that this option already exists somewhere?
Blocked