Send Metrics to Stackdriver Logging + Perf improvements to handle it #117
Conversation
- Move into own package
- Make metrics_buffer.MetricsBuffer act like stackdriver.MetricAdapter
- Change nozzle code to assume a stackdriver.MetricAdapter. It doesn't need to know about MetricsBuffer.
Utilize the AutoCullingMetricsBuffer by placing the router after it in the pipeline. This allows the buffer to do its thing and sample metrics to reduce the amount we log to Stackdriver Logging. If this were in the nozzle or written as part of a sink, we'd need to perform the buffering/culling at that point or maintain two separate methods for doing it.
Background: The mutex on mb.metrics blocks calls to PostMetric(s). This prevents the nozzle from draining messages from loggregator, which drops the connection. This was an issue when we only wrote to Stackdriver Monitoring, but now that we also write to Stackdriver Logging the issue has been magnified.
Fix:
- Only hold the mutex in the critical section where we copy and flush the internal buffer.
Testing:
- Without this change the nozzle consistently folds in ~4 mins on my machine and utilizes 2% CPU.
- With this change it appears more stable and the usage goes up to 5%, which means less blocking!
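A minimal Go sketch of the locking pattern that commit describes; the type names (Metric, MetricAdapter, metricsBuffer) are illustrative stand-ins, not the nozzle's actual code:

```go
package buffer

import "sync"

// Metric and MetricAdapter stand in for the nozzle's real types; this is an
// illustrative sketch of the locking change, not the project's actual code.
type Metric struct {
	Name  string
	Value float64
}

type MetricAdapter interface {
	PostMetrics([]Metric) error
}

type metricsBuffer struct {
	mu      sync.Mutex
	metrics []Metric
	adapter MetricAdapter // slow: writes to Stackdriver
}

// PostMetric holds the lock only long enough to append to the buffer.
func (mb *metricsBuffer) PostMetric(m Metric) {
	mb.mu.Lock()
	mb.metrics = append(mb.metrics, m)
	mb.mu.Unlock()
}

// flush swaps the slice out inside the critical section, then performs the
// slow Stackdriver write outside the lock so PostMetric callers (and thus
// the loggregator drain) are never blocked on the network.
func (mb *metricsBuffer) flush() error {
	mb.mu.Lock()
	toSend := mb.metrics
	mb.metrics = nil
	mb.mu.Unlock()

	if len(toSend) == 0 {
		return nil
	}
	return mb.adapter.PostMetrics(toSend)
}
```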
consolidate these related packages into a single package.
Require event selection for one or the other instead of both. This problem stems from this behavior: strings.Split("", ",") => []string{""}, which asks nozzle.ParseEvents() to determine what event the empty string is. Added tests in the config despite no changes, just to help enforce it.
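For reference, a small runnable snippet showing the strings.Split behavior in question and one way a caller can guard against it (the guard shown is illustrative, not necessarily what this CL does):

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	// strings.Split on an empty string does not return an empty slice;
	// it returns a one-element slice containing the empty string.
	fmt.Printf("%q\n", strings.Split("", ",")) // [""]

	// So an empty config value has to be treated as "no events selected"
	// before the names reach ParseEvents-style code.
	raw := ""
	var names []string
	if raw != "" {
		names = strings.Split(raw, ",")
	}
	fmt.Println(len(names)) // 0
}
```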
Just a few very small things. Otherwise looks great.
Eventually(metricAdapter.GetPostedMetrics).Should(HaveLen(1))

var err error
Eventually(errs).Should(Receive(&err))
Expect(err).To(Equal(expectedErr))
})

Describe("with a slow MetricAdapter", func() {
nice
Name: valueMetric.GetName(),
Type: envelope.GetEventType(),
I probably missed this somewhere else in the CL, but what does the new Type field get us?
I needed it in the router to determine where to send it. I'm not crazy about it because it adds a specific type from loggregator into the internal representation. If you have other ideas I'm all ears.
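For context, here is a rough sketch of how a router can use that Type field to decide which sink(s) a metric goes to; the struct and field names are illustrative stand-ins for the actual MetricRouter in this CL:

```go
package router

import "github.com/cloudfoundry/sonde-go/events"

// Metric mirrors the idea of the internal metric carrying the loggregator
// event type; the field and type names here are illustrative stand-ins.
type Metric struct {
	Name string
	Type events.Envelope_EventType
}

// MetricAdapter is the sink interface the router writes to.
type MetricAdapter interface {
	PostMetrics([]Metric) error
}

// metricRouter sends each metric to Logging and/or Monitoring depending on
// which event types were configured for each sink.
type metricRouter struct {
	loggingEvents    map[events.Envelope_EventType]bool
	monitoringEvents map[events.Envelope_EventType]bool
	logging          MetricAdapter
	monitoring       MetricAdapter
}

func (r *metricRouter) PostMetrics(metrics []Metric) error {
	var toLogging, toMonitoring []Metric
	for _, m := range metrics {
		if r.loggingEvents[m.Type] {
			toLogging = append(toLogging, m)
		}
		if r.monitoringEvents[m.Type] {
			toMonitoring = append(toMonitoring, m)
		}
	}
	if err := r.logging.PostMetrics(toLogging); err != nil {
		return err
	}
	return r.monitoring.PostMetrics(toMonitoring)
}
```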
events := []events.Envelope_EventType{}

for _, name := range names {
	if name == "" {
This would allow the config to contain something like foo,,bar. Do we want that, or should it error instead of continuing?
That's a good point. I don't think we want to explicitly allow that. The alternative is to special-case the empty string here (or move the parsing into the config object). The reason I added this is that strings.Split("", ",") => []string{""}.
	name string
}

func (ie *invalidEvent) Error() string {
nice
Thank you @evandbrown! Nothing too controversial so I'm going to merge.
Background:
Allow users to send metrics to Stackdriver Logging. This was attempted in #99 but the solution would crash after a short time and hang. The hang was fixed in #115. This fixes the performance by taking a new approach.

Changes:
- MetricRouter: This sits near the end of the metric pipeline, after the AutoCullingMetricBuffer but before the final MetricAdapter that writes to Stackdriver. The router decides where the finalized Metric (hydrated with labels, units, ...) gets sent: Stackdriver Logging and/or Monitoring (a rough sketch of this ordering appears below). See: b289b73
- Make AutoCullingMetricBuffer.flush() not block metric arrival. Previously the buffer could not receive any new messages while it was being flushed. See: c352728

Refactoring:
- The stackdriver package now contains only code for writing to Stackdriver. The message (Log/Metric) types are extracted out. The AutoCullingMetricBuffer is extracted out.
- Make MetricsBuffer inherit MetricAdapter and remove knowledge of the buffer from the nozzle.
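To make the ordering concrete, here is a small self-contained sketch of the pipeline shape described above (nozzle -> buffer -> router -> Stackdriver sinks); none of the names below are the project's real constructors or types:

```go
package main

import "fmt"

// Stand-in types to illustrate the pipeline ordering described above
// (nozzle -> AutoCullingMetricBuffer -> MetricRouter -> Stackdriver sinks).

type Metric struct{ Name string }

type MetricAdapter interface {
	PostMetrics([]Metric) error
}

// printAdapter stands in for the Stackdriver Logging/Monitoring writers.
type printAdapter struct{ sink string }

func (p printAdapter) PostMetrics(ms []Metric) error {
	fmt.Printf("%s received %d metrics\n", p.sink, len(ms))
	return nil
}

// router fans a batch out to both sinks (per-event selection elided).
type router struct{ logging, monitoring MetricAdapter }

func (r router) PostMetrics(ms []Metric) error {
	if err := r.logging.PostMetrics(ms); err != nil {
		return err
	}
	return r.monitoring.PostMetrics(ms)
}

// buffer batches metrics and forwards them to the next MetricAdapter. Because
// the router sits behind it, culling/sampling happens once for every sink.
type buffer struct {
	next    MetricAdapter
	pending []Metric
}

func (b *buffer) PostMetrics(ms []Metric) error {
	b.pending = append(b.pending, ms...)
	return nil
}

func (b *buffer) Flush() error {
	batch := b.pending
	b.pending = nil
	return b.next.PostMetrics(batch)
}

func main() {
	r := router{logging: printAdapter{"logging"}, monitoring: printAdapter{"monitoring"}}
	b := &buffer{next: r}
	// The nozzle only ever sees the buffer as a MetricAdapter.
	b.PostMetrics([]Metric{{Name: "cpu"}, {Name: "memory"}})
	b.Flush()
}
```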