This repository has been archived by the owner on Jun 25, 2020. It is now read-only.

Send Metrics to Stackdriver Logging + Perf improvements to handle it #117

Merged · johnsonj merged 11 commits into cloudfoundry-community:develop on Sep 22, 2017

Conversation

@johnsonj (Contributor) commented Sep 19, 2017

Background:
Allow users to send metrics to Stackdriver Logging.

This was attempted in #99, but that solution would crash and hang after a short time. The hang was fixed in #115; this PR fixes the performance by taking a new approach.

Changes:

  • Introduce the MetricRouter: it sits near the end of the metric pipeline, after the AutoCullingMetricBuffer but before the final MetricAdapter that writes to Stackdriver. The router decides where the finalized Metric (hydrated with labels, units, etc.) gets sent: Stackdriver Logging and/or Monitoring (a sketch of the idea follows this list). See: b289b73
  • Make AutoCullingMetricBuffer.flush() not block metric arrival. Previously the buffer could not receive any new messages while it was being flushed. See: c352728
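To make the routing idea concrete, here is a minimal sketch of a router that satisfies the same adapter interface as the writers it fans out to. The MetricAdapter/Metric shapes, field names, and the enable flags below are assumptions for illustration, not the actual code from this PR:

	package sketch

	// Metric is a simplified stand-in for the nozzle's finalized metric
	// (the real one also carries labels, units, the originating event type, ...).
	type Metric struct {
		Name  string
		Value float64
	}

	// MetricAdapter is assumed to be the sink interface used along the pipeline;
	// the Stackdriver Logging and Monitoring writers would both satisfy it.
	type MetricAdapter interface {
		PostMetrics(metrics []Metric) error
	}

	// metricRouter sits after the culling buffer and fans finalized metrics out
	// to Stackdriver Logging and/or Monitoring, depending on configuration.
	type metricRouter struct {
		logging    MetricAdapter
		monitoring MetricAdapter

		toLogging    bool // also write metrics to Stackdriver Logging
		toMonitoring bool // write metrics to Stackdriver Monitoring
	}

	func (r *metricRouter) PostMetrics(metrics []Metric) error {
		if r.toLogging {
			if err := r.logging.PostMetrics(metrics); err != nil {
				return err
			}
		}
		if r.toMonitoring {
			return r.monitoring.PostMetrics(metrics)
		}
		return nil
	}

Because the router itself satisfies MetricAdapter, it can slot into the existing pipeline between the buffer and the final writers without the surrounding code changing.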

Refactoring:

  • stackdriver package now contains only code for writing to Stackdriver. The message (Log/Metric) types and the AutoCullingMetricBuffer are extracted out.
  • Explicitly make MetricsBuffer implement MetricAdapter and remove knowledge of the buffer from the nozzle.

- Move into own package
- Make metrics_buffer.MetricsBuffer act like stackdriver.MetricAdapter
- Change nozzle code to assume a stackdriver.MetricAdapter. It doesn't
  need to know about MetricsBuffer.
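A tiny sketch of what that looks like from the nozzle's side, reusing the MetricAdapter interface assumed in the sketch above; the constructor name is invented for illustration:

	// Nozzle depends only on the MetricAdapter interface. Whether it is handed
	// the buffering metrics_buffer.MetricsBuffer or a bare Stackdriver adapter
	// is the caller's choice; the nozzle code is identical either way.
	type Nozzle struct {
		metricAdapter MetricAdapter
	}

	// NewNozzle is a hypothetical constructor showing the dependency injected
	// as an interface rather than as a concrete buffer type.
	func NewNozzle(adapter MetricAdapter) *Nozzle {
		return &Nozzle{metricAdapter: adapter}
	}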
Utilize the AutoCullingMetricsBuffer by placing the router after it in
the pipeline. This allows the buffer to do its thing and sample metrics
to reduce the amount we log to Stackdriver Logging.

If this were in the nozzle or written as part of a sink we'd need to
perform the buffering/culling at that point or maintain two separate
methods for doing it.

Background:
The mutex on mb.metrics blocks calls to PostMetric(s). This prevents the
nozzle from draining messages from loggregator, which causes loggregator
to drop the connection.

This was an issue when we only wrote to Stackdriver Monitoring, but now
that we also write to Stackdriver Logging the problem is magnified.

Fix:
- Only keep the mutex in the critical section where we copy and flush the
  internal buffer.
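As a rough sketch of that change, reusing the Metric/MetricAdapter shapes assumed earlier; field and method names here are illustrative, not the repository's actual code:

	import "sync"

	type MetricsBuffer struct {
		mu      sync.Mutex
		metrics []Metric      // internal buffer, guarded by mu
		adapter MetricAdapter // the real Stackdriver adapter flushed to
	}

	// PostMetric only holds the lock long enough to append, so the nozzle can
	// keep draining loggregator while a flush is in flight.
	func (mb *MetricsBuffer) PostMetric(m Metric) {
		mb.mu.Lock()
		mb.metrics = append(mb.metrics, m)
		mb.mu.Unlock()
	}

	// flush copies and resets the buffer inside the critical section, then
	// performs the slow network write outside of it.
	func (mb *MetricsBuffer) flush() error {
		mb.mu.Lock()
		toPost := mb.metrics
		mb.metrics = nil
		mb.mu.Unlock()

		if len(toPost) == 0 {
			return nil
		}
		return mb.adapter.PostMetrics(toPost)
	}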

Testing:
- Without this change the nozzle consistently folds in ~4 mins on my
  machine and utilizes 2% CPU.
- With this change it appears more stable and the usage goes up to 5%,
  which means less blocking!
Consolidate these related packages into a single package.

Require event selection for one or the other instead of both. This
problem stems from this behavior:

	strings.Split("", ",") => []string{""}

which asks nozzle.ParseEvents() to determine what event the empty string
is.
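The Split behavior in isolation (runnable as-is):

	package main

	import (
		"fmt"
		"strings"
	)

	func main() {
		// Splitting an empty string does not yield an empty slice; it yields a
		// slice containing a single empty string, which the event parser would
		// then be asked to interpret as an event name.
		fmt.Printf("%q\n", strings.Split("", ","))                         // [""]
		fmt.Printf("%q\n", strings.Split("ValueMetric,CounterEvent", ",")) // ["ValueMetric" "CounterEvent"]
	}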

Added tests in the config despite no changes there, just to help enforce this.
@evandbrown (Contributor) left a comment:

Just a few very small things. Otherwise looks great.


Eventually(metricAdapter.GetPostedMetrics).Should(HaveLen(1))

var err error
Eventually(errs).Should(Receive(&err))
Expect(err).To(Equal(expectedErr))
})

Describe("with a slow MetricAdapter", func() {
@evandbrown (Contributor):
nice

Name: valueMetric.GetName(),
Type: envelope.GetEventType(),
@evandbrown (Contributor):

I probably missed this somewhere else in the CL, but what does the new Type field get us?

@johnsonj (Contributor, Author):

I needed it in the router to determine where to send it. I'm not crazy about it because it adds a specific type from loggregator into the internal representation. If you have other ideas I'm all ears.
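A hedged illustration of how the router can use that field. The events package is loggregator's real sonde-go events package referenced in the diff, but the Metric shape and the routing rule below are invented for the example, not the PR's actual logic:

	import "github.com/cloudfoundry/sonde-go/events"

	// Metric carries the originating loggregator event type, which is the
	// field being discussed above.
	type Metric struct {
		Name  string
		Value float64
		Type  events.Envelope_EventType
	}

	// routeByType sketches the idea: branch on the event type the metric came
	// from to decide which Stackdriver sink(s) should receive it.
	func routeByType(m Metric) (toMonitoring, toLogging bool) {
		switch m.Type {
		case events.Envelope_ValueMetric, events.Envelope_CounterEvent, events.Envelope_ContainerMetric:
			return true, true // numeric metrics: Monitoring, plus Logging if enabled
		default:
			return false, true // everything else only makes sense as a log entry
		}
	}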

events := []events.Envelope_EventType{}

for _, name := range names {
if name == "" {
@evandbrown (Contributor):

This would allow the config to contain something like foo,,bar. Do we want that, or should it error instead of continuing?

@johnsonj (Contributor, Author):

That's a good point. I don't think we want to explicitly allow that. The alternative is to special-case the empty string here (or move the parsing into the config object).

The reason I added this is that strings.Split("", ",") => []string{""}.
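If the stricter behavior were preferred, a sketch of a parse function that rejects empty entries could look like this. It reuses the sonde-go events import from the earlier sketch and the invalidEvent error type that appears in the diff below; everything else is assumed:

	// parseEvents rejects empty names (so "foo,,bar" errors) instead of
	// silently skipping them.
	func parseEvents(names []string) ([]events.Envelope_EventType, error) {
		parsed := []events.Envelope_EventType{}
		for _, name := range names {
			value, ok := events.Envelope_EventType_value[name]
			if name == "" || !ok {
				return nil, &invalidEvent{name: name}
			}
			parsed = append(parsed, events.Envelope_EventType(value))
		}
		return parsed, nil
	}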

name string
}

func (ie *invalidEvent) Error() string {
@evandbrown (Contributor):

nice

@johnsonj (Contributor, Author):

Thank you @evandbrown! Nothing too controversial, so I'm going to merge.

@johnsonj johnsonj merged commit d14a17a into cloudfoundry-community:develop Sep 22, 2017