Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cosmos DB: Operation Level Metrics #1438

Merged

Conversation

sourabh1007
Copy link
Contributor

@sourabh1007 sourabh1007 commented Sep 30, 2024

Introducing operation-level metrics in the .NET Cosmos DB SDK.
This PR includes the registration of semantic conventions in the OpenTelemetry community, ensuring full alignment with OpenTelemetry's established guidelines.

Dimensions

Tag/dimension name Sample value
db.system cosmodb
db.collection.name myCollectionName
db.namespace myDatabaseName
server.address myaccountname.documents.azure.com
server.port 443
db.operation.name query_items
db.cosmosdb.operation_type batch; create; delete
db.response.status_code 200 or 429 etc.
db.cosmos.sub_status_code 1002 etc.
db.cosmos.client_id Guid
db.cosmosdb.consistency_level Eventual, ConsistentPrefix, BoundedStaleness, Strong or Session
db.cosmosdb.machine_id Unique Id for a machine (Dependent of otel generated machine details)

Metrics

Name Unit Type Description
db.cosmosdb.operation.request_charge {request_units} histogram Total request units per operation (sum of RUs for all requested needed when processing an operation) This metric SHOULD be specified with [ExplicitBucketBoundaries] of [ 1, 5, 10, 25, 50, 100, 250, 500, 1000]
db.client.operation.duration s histogram Total end-to-end duration of the operation This metric SHOULD be specified with [ExplicitBucketBoundaries] of [0.001, 0.005, 0.010, 0.050, 0.100, 0.200, 0.500, 1.000, 2.000, 5.000]
db.query.limit {count} histogram For feed operations (query, readAll, readMany, change feed) and batch operations, this meter capture the requested maxItemCount per page/request. This metric SHOULD be specified with [ExplicitBucketBoundaries] of [50, 100, 250, 500, 1000, 2000, 5000]
db.client.response.row_count {count} histogram For feed operations (query, readAll, readMany, change feed) batch operations this meter capture the actual item count in responses from the service. This metric SHOULD be specified with [ExplicitBucketBoundaries] of [10, 50, 100, 250, 500, 1000, 2000, 5000, 10000].
db.cosmosdb.operation.regions_contacted {count} histogram Number of regions contacted when executing an operation. This metric SHOULD be specified with [ExplicitBucketBoundaries] of [1, 2, 3, 5, 10, 20, 50].
db.cosmosdb.client.active_instances {count} updowncounter Number of active SDK client instances.

Added db.cosmosdb.consistency_level in attribute list also, we are going to collect it is traces also
Removed db.cosmosdb.operation_type
Added db.cosmosdb.machine_id in attribute list also, we are going to collect it is traces also

Merge requirement checklist

@sourabh1007 sourabh1007 requested review from a team as code owners September 30, 2024 06:51
@sourabh1007 sourabh1007 force-pushed the users/sourabhjain/otelcosmosmetrics branch from 8106332 to 0e22c02 Compare September 30, 2024 06:54
model/database/common.yaml Outdated Show resolved Hide resolved
model/database/common.yaml Outdated Show resolved Hide resolved
model/database/metrics.yaml Show resolved Hide resolved
model/database/metrics.yaml Outdated Show resolved Hide resolved
model/database/metrics.yaml Outdated Show resolved Hide resolved
model/database/metrics.yaml Outdated Show resolved Hide resolved
model/database/metrics.yaml Outdated Show resolved Hide resolved
model/database/metrics.yaml Outdated Show resolved Hide resolved
model/database/metrics.yaml Outdated Show resolved Hide resolved
model/database/metrics.yaml Outdated Show resolved Hide resolved
@sourabh1007 sourabh1007 force-pushed the users/sourabhjain/otelcosmosmetrics branch from 9fd4001 to 229b496 Compare September 30, 2024 16:07
Copy link
Contributor

@lmolkova lmolkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add markdown for the metrics.

You need to add codegen placeholders like

and metrics will be generated between them.

Please take a look at the existing database metric definitions in https://github.com/open-telemetry/semantic-conventions/blob/v1.27.0/docs/database/database-metrics.md?plain=1 for the examples.

Note that a couple of properties are not yet defined in yaml, but need to be specified in markdown:

Please also document whether cosmos reports any of the standard DB metrics (I assume it does report db.client.operation.duration). If so, please document any difference from the generic metric. E.g. like it's done here - https://github.com/open-telemetry/semantic-conventions/blob/main/docs/dotnet/dotnet-http-metrics.md#metric-httpserverrequestduration

model/database/registry.yaml Outdated Show resolved Hide resolved
model/database/metrics.yaml Outdated Show resolved Hide resolved
model/database/metrics.yaml Outdated Show resolved Hide resolved
model/database/metrics.yaml Outdated Show resolved Hide resolved
model/database/metrics.yaml Outdated Show resolved Hide resolved
model/database/common.yaml Outdated Show resolved Hide resolved
model/database/common.yaml Outdated Show resolved Hide resolved
model/database/common.yaml Outdated Show resolved Hide resolved
model/database/common.yaml Show resolved Hide resolved
Copy link
Contributor

@jcocchi jcocchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few nits, but looks good overall

model/database/cosmosdb-metrics.yaml Outdated Show resolved Hide resolved
model/database/cosmosdb-metrics.yaml Outdated Show resolved Hide resolved
model/database/cosmosdb-metrics.yaml Outdated Show resolved Hide resolved
model/database/registry.yaml Outdated Show resolved Hide resolved
@sourabh1007 sourabh1007 force-pushed the users/sourabhjain/otelcosmosmetrics branch 3 times, most recently from 5db7514 to 01dc811 Compare October 3, 2024 02:15
model/database/registry.yaml Outdated Show resolved Hide resolved
model/database/metrics.yaml Outdated Show resolved Hide resolved
model/database/metrics.yaml Outdated Show resolved Hide resolved
model/database/cosmosdb-metrics.yaml Outdated Show resolved Hide resolved
docs/database/database-metrics.md Outdated Show resolved Hide resolved
docs/database/database-metrics.md Outdated Show resolved Hide resolved
model/database/cosmosdb-metrics.yaml Show resolved Hide resolved
model/database/cosmosdb-metrics.yaml Outdated Show resolved Hide resolved
model/database/cosmosdb-metrics.yaml Outdated Show resolved Hide resolved
@sourabh1007 sourabh1007 force-pushed the users/sourabhjain/otelcosmosmetrics branch 3 times, most recently from 1a632ac to ee1d400 Compare October 10, 2024 06:29
Copy link
Contributor

@lmolkova lmolkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few minor final comments, but it looks good otherwise!

model/database/cosmosdb-metrics.yaml Outdated Show resolved Hide resolved
model/database/cosmosdb-metrics.yaml Outdated Show resolved Hide resolved
model/database/metrics.yaml Outdated Show resolved Hide resolved
model/database/registry.yaml Show resolved Hide resolved
model/database/registry.yaml Show resolved Hide resolved
model/database/registry.yaml Outdated Show resolved Hide resolved
docs/attributes-registry/db.md Outdated Show resolved Hide resolved
@sourabh1007 sourabh1007 force-pushed the users/sourabhjain/otelcosmosmetrics branch from e311a60 to bff773c Compare October 11, 2024 01:57
model/database/registry.yaml Outdated Show resolved Hide resolved
model/database/registry.yaml Outdated Show resolved Hide resolved
model/database/cosmosdb-metrics.yaml Outdated Show resolved Hide resolved
model/database/cosmosdb-metrics.yaml Outdated Show resolved Hide resolved
model/database/metrics.yaml Outdated Show resolved Hide resolved
Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @sourabh1007!

I have a couple of open questions, but I'm ok if we spin those out into separate issues, I'll leave that decision to @open-telemetry/specs-semconv-maintainers (I can create the issues if that is what's decided)

Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened an issue to track one potential follow-up: #1522

Copy link
Contributor

@lmolkova lmolkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for following up!

@lmolkova lmolkova merged commit 63094c6 into open-telemetry:main Oct 28, 2024
13 of 14 checks passed
microsoft-github-policy-service bot pushed a commit to Azure/azure-cosmos-dotnet-v3 that referenced this pull request Nov 24, 2024
#4682)

## Description

1. Added new flag in `CosmosClientTelemetryOptions` i.e.
`IsClientMetricsEnabled`, to enable/disable metrics. By default, it
would be disabled. (inspired from Java SDK
https://github.com/Azure/azure-sdk-for-java/blob/5bc07ca75c7c0520c1098b5a6264258b6e043435/sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/models/CosmosClientTelemetryConfig.java#L61)
2. If `enabled`, collecting below metrics ref.
open-telemetry/semantic-conventions#1438

ref. Java Metric Doumentation,
https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos/docs/Metrics.md
ref. Discussion with open telemetry community
open-telemetry/semantic-conventions#1438

this PR has **Contract Changes**.

## Perf Testing
I haven't observed any performance impact from this change. In this
feature, any performance issues would likely stem from the Exporter
implementation or the aggregation interval. The tests were conducted
using a no-op exporter subscribed to these meters to isolate any
performance impact specifically related to data recording


![image](https://github.com/user-attachments/assets/1a0ab16a-7b1b-44fb-a0ff-2eacd87d2d93)

## Type of change
- [] New feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kiran Kumar Kolli <[email protected]>
Co-authored-by: Debdatta Kunda <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

7 participants