Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add OTLP metric exporter #835

Merged
merged 22 commits into from
Aug 18, 2020
Merged

Add OTLP metric exporter #835

merged 22 commits into from
Aug 18, 2020

Conversation

ocelotl
Copy link
Contributor

@ocelotl ocelotl commented Jun 17, 2020

Fixes #797

@ocelotl ocelotl requested a review from a team June 17, 2020 03:35
@ocelotl ocelotl self-assigned this Jun 17, 2020
Copy link
Contributor

@codeboten codeboten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good! I haven't had a chance to review the proto for metrics yet to see if there's anything missing there. Change requested in the comments.

@codeboten
Copy link
Contributor

Please add a changelog entry

@ocelotl ocelotl force-pushed the issue_797 branch 3 times, most recently from cfc36e3 to 2807d26 Compare June 18, 2020 18:48
@ocelotl ocelotl requested a review from codeboten June 19, 2020 14:41
@lzchen
Copy link
Contributor

lzchen commented Jun 22, 2020

Could you fix otcollector_example in the example docs as well to use the new exporters?

@ocelotl
Copy link
Contributor Author

ocelotl commented Jun 24, 2020

Could you fix otcollector_example in the example docs as well to use the new exporters?

Fixed

@aabmass
Copy link
Member

aabmass commented Jul 6, 2020

I noticed there is a new release of opentelemetry-proto v0.4.0. Should we upgrade the proto code and then update this PR against the new version or would it be better as a separate PR?

Copy link
Member

@aabmass aabmass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, LGTM!

Copy link
Contributor

@lzchen lzchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

Copy link
Contributor

@lzchen lzchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

docs/getting-started.rst Outdated Show resolved Hide resolved
docs/getting_started/otlpcollector_example.py Outdated Show resolved Hide resolved
Comment on lines +91 to +95
if isinstance(instrument, (Counter, UpDownCounter)):
temporality = MetricDescriptor.Temporality.DELTA
Copy link
Member

@aabmass aabmass Aug 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be based on stateful @lzchen?

If you create a stateful Counter, I believe the exported values would be cumulative since the start of the process?

# interval starts
counter.add(5)
# ...
counter.add(5)

# interval ends, export() passed 10
# next interval starts

counter.add(10)
# interval ends, exporter() passed 20

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes temporality should be based off statefulness. Currently it is a feature of the batcher but this might be changed soon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe OTLP is supposed to describe the metrics on the wire rather than the instruments that created them, so that whatever is receiving the OTLP can make sense of it. For instance, if the collector was receiving DELTAs and then was supposed to export them to prometheus (which doesn't accept deltas for a cumulative metric), it knows to convert the DELTAs to CUMULATIVEs before exporting to prometheus.

So in the example above, if the SDK sends 10, 20 to the collector over OTLP with DELTA, then the collector converts them to cumulatives 10, 30 which is sending incorrect values to prometheus because it was already converted to cumulative in the SDK. That's why I think we should choose based on stateful.

  • The metrics API spec doesn't mention "stateful" at all. I think that's why the table there says that Counter/UpDownCounter would be delta, because it is assuming the SDK just sends off the DELTA is sums for the interval (stateful=False)
  • OTEP #131 (was just merged) changes the default behavior of OTLP exporters to send CUMULATIVES for Counter/UpDownCounter and allows configuration. The python sdk's "stateful" seems like that configuration. Point being, I don't think it's imperative to follow the table in the metrics API spec.

I think we will need to revisit this again very soon since OTLP is changing quickly, so I wouldn't worry too much about it not being perfect yet.

Comment on lines +80 to +84
labels=string_key_values,
value=view_data.aggregator.current,
Copy link
Member

@aabmass aabmass Aug 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about start and end times https://github.com/open-telemetry/opentelemetry-proto/blob/e43e1abc40428a6ee98e3bfd79bec1dfa2ed18cd/opentelemetry/proto/metrics/v1/metrics.proto#L282-L286

// Int64DataPoint is a single data point in a timeseries that describes the time-varying
// values of a int64 metric.
message Int64DataPoint {
  // The set of labels that uniquely identify this timeseries.
  repeated opentelemetry.proto.common.v1.StringKeyValue labels = 1;

  // start_time_unix_nano is the time when the cumulative value was reset to zero.
  // This is used for Counter type only. For Gauge the value is not specified and
  // defaults to 0.
  //
  // The cumulative value is over the time interval (start_time_unix_nano, time_unix_nano].
  // Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
  //
  // Value of 0 indicates that the timestamp is unspecified. In that case the timestamp
  // may be decided by the backend.
  fixed64 start_time_unix_nano = 2;

  // time_unix_nano is the moment when this value was recorded.
  // Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
  fixed64 time_unix_nano = 3;

  // value itself.
  int64 value = 4;
}

@lzchen this is another place we could use the start/end time in the metric SDK 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean by end time? If it's not defined in the proto we can't really change it can we?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lzchen Sorry, updated my previous comment to include the whole message. Specifically this part:

The cumulative value is over the time interval (start_time_unix_nano, time_unix_nano].

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SIG Meeting update: add a TODO and issue to start sending these. Changes are needed in the SDK to support this



def _get_resource_data(
sdk_resource_instrumentation_library_data, resource_class, name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want type hints throughout some of these other functions?

@aabmass aabmass self-requested a review August 5, 2020 14:12
@ocelotl ocelotl force-pushed the issue_797 branch 2 times, most recently from 26fdc1f to 1b4b1aa Compare August 10, 2020 20:59
Copy link
Member

@aabmass aabmass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we can merge this and iterate 🙂 is there an issue to track the outstanding things here?

@codeboten
Copy link
Contributor

I captured the remaining open items in different issues.

@codeboten codeboten merged commit 452be59 into open-telemetry:master Aug 18, 2020
@ocelotl ocelotl deleted the issue_797 branch September 30, 2020 19:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add an OTLP metric exporter
4 participants