-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add logging context #273
Add logging context #273
Conversation
Add more details to the logging context, useful for debugging.
Update test to recent changes, verify that logging context is set. Add another test to verify what happens when no cluster id is given
I'm not really satisfied with the solution yet, since we rely on the fact that we are not allowed to run concurrently. Ideally we could use something like thread local, which doesn't exist in go. It is expected to pass the context around as parameters, which is not that easy for the cobra commands I guess. We can also take a look on next iteration at https://github.com/jtolio/gls |
If we take a look at the log, we can see that it is not optimal, since the old context is still available until the next job overwrites it.
|
@oleschoenburg I iterated over it again, and I think I came to a solution that is "good enough" for now. It will help us in debugging failing zbchaos runs I think. Log output: {"level":"info","clusterId":"","jobKey":"2251799813685374","processInstanceKey":"2251799813685354","title":"This is a fake experiment","time":"2022-12-07T10:20:37+01:00","message":"Handle zbchaos job [key: 2251799813685374]"}
{"level":"info","clusterId":"","jobKey":"2251799813685374","processInstanceKey":"2251799813685354","title":"This is a fake experiment","time":"2022-12-07T10:20:37+01:00","message":"Running command with args: [version] "}
{"level":"info","clusterId":"","jobKey":"2251799813685374","processInstanceKey":"2251799813685354","title":"This is a fake experiment","time":"2022-12-07T10:20:37+01:00","message":"zbchaos development (commit: HEAD)"}
{"level":"info","clusterId":"","jobKey":"2251799813685517","processInstanceKey":"2251799813685497","title":"This is a fake experiment","time":"2022-12-07T10:20:42+01:00","message":"Handle zbchaos job [key: 2251799813685517]"}
{"level":"info","clusterId":"","jobKey":"2251799813685517","processInstanceKey":"2251799813685497","title":"This is a fake experiment","time":"2022-12-07T10:20:42+01:00","message":"Running command with args: [version] "}
{"level":"info","clusterId":"","jobKey":"2251799813685517","processInstanceKey":"2251799813685497","title":"This is a fake experiment","time":"2022-12-07T10:20:42+01:00","message":"zbchaos development (commit: HEAD)"}
{"level":"info","clusterId":"","jobKey":"2251799813685517","processInstanceKey":"2251799813685497","title":"This is a fake experiment","time":"2022-12-07T10:20:42+01:00","message":"Instance 2251799813685256 [definition 2251799813685253 ] completed"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good to me 👍
I was wondering if we could set logging.googleapis.com/operation
(docs) to the job key. I think this would give us nice grouping of related logs in StackDriver.
Ah nice @oleschoenburg do you think it makes sense to also use |
Use https://cloud.google.com/logging/docs/agent/logging/configuration#special-fields for the logging context
I just did it, lets see how it is :) |
NEW
Add a logging context, which can be set by the workers to enrich our json logs with more useful information, like process instance key, experiment title etc.
The context is set at the begin of the run zbchaos handler and removed again. See #273 (comment)
OLD
Add a new parameter
--loggingContext
to the root command to set the logging context.This is used by the workers to enrich the logging with details like processInstanceKey, experiment name, etc. This is really useful for debugging since we can group and filter the logging better in stackdriver.
Adjust the tests related to that. Furthermore add a test to verify that the worker can also run without clusterId, which I use in my local tests.
Integration test output with JSON logging now: