-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
driver: allow disabling log collection #6820
Conversation
Operators commonly have docker logs aggregated using various tools and don't need nomad to manage their docker logs. Worse, Nomad uses a somewhat heavy docker api call to collect them and it seems to cause problems when a client runs hundreds of log collections. Here we add a knob to disable log aggregation completely for nomad. When log collection is disabled, we avoid running logmon and docker_logger for the docker tasks in this implementation. The downside here is once disabled, `nomad logs ...` commands and API no longer return logs and operators must corrolate alloc-ids with their aggregated log info. This is meant as a stop gap measure. Ideally, we'd follow up with at least two changes: First, we should optimize behavior when we can such that operators don't need to disable docker log collection. Potentially by reverting to using pre-0.9 syslog aggregation in linux environments, though with different trade-offs. Second, when/if logs are disabled, nomad logs endpoints should lookup docker logs api on demand. This ensures that the cost of log collection is paid sparingly.
12bd02f
to
9438544
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly lgtm
@@ -257,6 +257,8 @@ var ( | |||
hclspec.NewAttr("infra_image", "string", false), | |||
hclspec.NewLiteral(`"gcr.io/google_containers/pause-amd64:3.0"`), | |||
), | |||
|
|||
"disable_log_collection": hclspec.NewAttr("disable_log_collection", "bool", false), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would maybe document that this in code that this may change, unless we're committing to this being a driver option rather than eventually a client one.
@@ -99,6 +103,11 @@ func reattachConfigFromHookData(data map[string]string) (*plugin.ReattachConfig, | |||
func (h *logmonHook) Prestart(ctx context.Context, | |||
req *interfaces.TaskPrestartRequest, resp *interfaces.TaskPrestartResponse) error { | |||
|
|||
if h.isLoggingDisabled() { | |||
h.logger.Info("logging is disabled by driver") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might wanna drop this to debug level? - It's likely to be noisy and very repetitive?
ff05434
to
e82dad7
Compare
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
Operators commonly have docker logs aggregated using various tools and
don't need nomad to manage their docker logs. Worse, Nomad uses a
somewhat heavy docker api call to collect them and it seems to cause
problems when a client runs hundreds of log collections.
Here we add a knob to disable log aggregation completely for nomad.
When log collection is disabled, we avoid running logmon and
docker_logger for the docker tasks in this implementation.
The downside here is once disabled,
nomad logs ...
commands and APIno longer return logs and operators must correlate alloc-ids with their
aggregated log info.
This is meant as a stop gap measure. Ideally, we'd follow up with at
least two changes:
First, we should optimize behavior when we can such that operators don't
need to disable docker log collection. Potentially by reverting to
using pre-0.9 syslog aggregation in linux environments, though with
different trade-offs.
Second, when/if logs are disabled, nomad logs endpoints should lookup
docker logs api on demand. This ensures that the cost of log collection
is paid sparingly.
Implementation overview
This implementation adds a
disable_log_collection
flag for the docker driver. If set to true, we avoid starting docker_logger. Furthermore, it communicates lack of logging to TaskRunner such that TR avoids starting logmon. TR honors aInternalCapabilities()
that's only supported by internal drivers, which would indicate if logs are skipped.I avoided making the flag available for external drivers (through
Capabilities
) as I don't want to commit to interface changes yet, at least until we decide how to proceed with support log fetching on demand.