-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat!: Use OTel to export Prometheus Metrics #419
Conversation
Codecov Report
@@ Coverage Diff @@
## main #419 +/- ##
==========================================
+ Coverage 64.93% 64.99% +0.05%
==========================================
Files 11 11
Lines 1369 1374 +5
==========================================
+ Hits 889 893 +4
+ Misses 427 426 -1
- Partials 53 55 +2
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Hey @thisthat, thanks for the PR. This looks like a great improvement that will set us up to add trace support in the future. |
Hey @beeme1mr, I am happy to provide a follow up PR with tracing support 😉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not an OTel metrics expert, so I will approve but please consider my approval "low value" 😅 . It all looks right to me though.
I noticed that we haven't documented how to use Prometheus with Flagd. I created a follow up task to address that. |
Hey @thisthat, I spent some time today digging into your PR and had some feedback. It looks like the request/response metric names and properties have changed. Before: After: If possible, it would be nice to keep the original names and values so that we maintain backwards compatibility. Next, it looks like the hitogram buckets have changed in the following ways: Before:
After:
I also noticed that the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please maintain backwards compatibility.
Hey @beeme1mr thanks for looking into that! You are right, I have overlooked the bucket size and the subsystem declaration! I'll fix this!
This is because we don't configure any OTel Resource and the default values of the SDK are used. Here's the link to the SemConv: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/resource/semantic_conventions/README.md#service if you don't mind, I will do that in a follow-up together with the tracing part since they go hand-in-hand |
d9a41e2
to
b22dca9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @thisthat, looks good and works well.
FYI, the bucket property names have changed slightly but they seem to be more semantically correct based on the OTel spec.
I think we should throw a I think we can ignore the codecov on this one. |
3eca105
to
6e9c8b8
Compare
Signed-off-by: Giovanni Liva <[email protected]>
Signed-off-by: Giovanni Liva <[email protected]>
Signed-off-by: Giovanni Liva <[email protected]>
Signed-off-by: Giovanni Liva <[email protected]>
Signed-off-by: Giovanni Liva <[email protected]>
Signed-off-by: Giovanni Liva <[email protected]>
Signed-off-by: Giovanni Liva <[email protected]>
Signed-off-by: Todd Baert <[email protected]>
6cfff7f
to
3a1f43d
Compare
🤖 I have created a release *beep* *boop* --- ## [0.4.0](v0.3.7...v0.4.0) (2023-03-02) ### ⚠ BREAKING CHANGES * Use OTel to export metrics (metric name changes) ([#419](#419)) ### 🧹 Chore * add additional sections to the release notes ([#449](#449)) ([798f71a](798f71a)) * attach image sbom to release artefacts ([#407](#407)) ([fb4ee50](fb4ee50)) * **deps:** update actions/configure-pages digest to fc89b04 ([#417](#417)) ([04014e7](04014e7)) * **deps:** update amannn/action-semantic-pull-request digest to b6bca70 ([#441](#441)) ([ce0ebe1](ce0ebe1)) * **deps:** update docker/login-action digest to ec9cdf0 ([#437](#437)) ([2650670](2650670)) * **deps:** update docker/metadata-action digest to 3343011 ([#438](#438)) ([e7ebf32](e7ebf32)) * **deps:** update github/codeql-action digest to 32dc499 ([#439](#439)) ([f91d91b](f91d91b)) * **deps:** update google-github-actions/release-please-action digest to d3c71f9 ([#406](#406)) ([6e1ffb2](6e1ffb2)) * disable caching tests in CI ([#442](#442)) ([28a35f6](28a35f6)) * fix race condition on init read ([#409](#409)) ([0c9eb23](0c9eb23)) * integration test stability ([#432](#432)) ([5a6a5d5](5a6a5d5)) * integration tests ([#312](#312)) ([6192ac8](6192ac8)) * reorder release note sections ([df7bfce](df7bfce)) * use -short flag in benchmark tests ([#431](#431)) ([e68a6aa](e68a6aa)) ### 🐛 Bug Fixes * **deps:** update kubernetes packages to v0.26.2 ([#450](#450)) ([2885227](2885227)) * **deps:** update module github.com/bufbuild/connect-go to v1.5.2 ([#416](#416)) ([feb7f04](feb7f04)) * **deps:** update module github.com/open-feature/go-sdk-contrib/providers/flagd to v0.1.9 ([#427](#427)) ([42d2705](42d2705)) * **deps:** update module github.com/open-feature/open-feature-operator to v0.2.29 ([#429](#429)) ([b7fae81](b7fae81)) * **deps:** update module github.com/stretchr/testify to v1.8.2 ([#440](#440)) ([ab3e674](ab3e674)) * **deps:** update module golang.org/x/net to v0.7.0 ([#410](#410)) ([c6133b6](c6133b6)) * **deps:** update module sigs.k8s.io/controller-runtime to v0.14.5 ([#454](#454)) ([f907f11](f907f11)) * remove non-error error log from parseFractionalEvaluationData ([#446](#446)) ([34aca79](34aca79)) ### ✨ New Features * add debug logging for merge behaviour ([#456](#456)) ([dc71e84](dc71e84)) * add Health and Readiness probes ([#418](#418)) ([7f2358c](7f2358c)) * Add version to startup message ([#430](#430)) ([8daf613](8daf613)) * introduce flag merge behaviour ([#414](#414)) ([524f65e](524f65e)) * introduce grpc sync for flagd ([#297](#297)) ([33413f2](33413f2)) * refactor and improve K8s sync provider ([#443](#443)) ([4c03bfc](4c03bfc)) * Use OTel to export metrics (metric name changes) ([#419](#419)) ([eb3982a](eb3982a)) ### 📚 Documentation * add .net flagd provider ([73d7840](73d7840)) * configuration merge docs ([#455](#455)) ([6cb66b1](6cb66b1)) * documentation for creating a provider ([#413](#413)) ([d0c099d](d0c099d)) * updated filepaths for schema store regex ([#344](#344)) ([2d0e9d9](2d0e9d9)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Related Issues
In #186, there was a discussion about going with OpenTelemetry or a direct Prometheus implementation.
The agreement was to have a scraping endpoint, i.e.,
pull mode
, and not support thepush
mode.This PR wants to replace the existing direct Prometheus implementation with the vendor-agnostic OpenTelemetry one, maintaining the same feature set.
Example of exposed metrics:
Notes
There are several benefits of using OpenTelemetry. The most prominent one is if we introduce Span supports in flagd, we could get out-of-the-box support for exemplars to pin-point slow requests.
Follow-up Tasks