Database metric semantic conventions #1076

Closed
23 commits
9196ab8
Add database metric semantic conventions
justinfoote Oct 6, 2020
330e728
Add detailed connection pooling semantic conventions for metrics
justinfoote Oct 7, 2020
4fbd26f
Add exception.type to database metrics semantic conventions
justinfoote Oct 6, 2020
2ae2bbb
Clean up grammar in database semantic conventions
justinfoote Oct 7, 2020
1c2b9e9
Add net.peer labels to database metrics semantic conventions
justinfoote Oct 7, 2020
c220dfd
Link to trace semantic conventions list of known db systems
justinfoote Oct 8, 2020
9924cff
Replace database statement with operation and table name in metric se…
justinfoote Oct 7, 2020
6b66fd3
Replace Attribute name with Label name in database metric semantic co…
justinfoote Oct 8, 2020
10a3ef4
Reorder connection pooling metric instruments in semantic conventions
justinfoote Oct 8, 2020
e9b0bdb
Add duration database metric examples
justinfoote Oct 8, 2020
506fe6e
Fix markdown-lint errors
justinfoote Oct 8, 2020
b16d67d
Add markdown TOC to database metric semantic conventions
justinfoote Oct 9, 2020
deea045
Update database semantic convention; respond to PR feedback
justinfoote Oct 9, 2020
4d96aab
Update changelog
justinfoote Oct 9, 2020
dc92381
Fix markdown-lint errors in database metric semantic conventions
justinfoote Oct 10, 2020
88c6e1c
Update CHANGELOG.md
justinfoote Oct 23, 2020
7bdc85d
Actually add the TOC to database metric semantic conventions
justinfoote Oct 23, 2020
cf83046
Add db.sql.table to db metric semantic conventions
justinfoote Nov 2, 2020
ec58de4
Add more explanation of db connection pool metrics
justinfoote Nov 2, 2020
9f95457
Update database metric semantic convention in response to PR feedback
justinfoote Nov 11, 2020
b2bebfc
Add technology-specific db metric instruments guidelines
justinfoote Nov 19, 2020
e6acc6a
Fix markdown table layout for db metric labels
justinfoote Nov 29, 2020
efba901
Replace "mongo" with "mongodb" in metrics database semantic conventions
justinfoote Nov 29, 2020
209 changes: 209 additions & 0 deletions specification/metrics/semantic_conventions/database.md
@@ -0,0 +1,209 @@
# Semantic Conventions for Database Metrics

This document contains semantic conventions for database client metrics in
OpenTelemetry. When instrumenting database clients, also consider the
[general metric semantic conventions](README.md#general-metric-semantic-conventions).

## Common

The following labels SHOULD be applied to all database metric instruments.

| Label Name | Description | Example | Required |
|------------------------|--------------|----------|----------|
| `db.system` | An identifier for the database management system (DBMS) product being used. [1] | `other_sql` | Yes |
| `db.connection_string` | The connection string used to connect to the database. It is recommended to remove embedded credentials. | `Server=(localdb)\v11.0;Integrated Security=true;` | No |
| `db.user` | Username for accessing the database. | `readonly_user`<br>`reporting_user` | No |
| `net.transport` | Transport protocol used. See note below. See [general network connection attributes](../../trace/semantic_conventions/span-general.md#general-network-connection-attributes). | `IP.TCP`<br>`Unix` | Conditional [2] |
| `net.peer.ip` | Remote address of the peer (dotted decimal for IPv4 or [RFC5952](https://tools.ietf.org/html/rfc5952) for IPv6) | `127.0.0.1` | No |
| `net.peer.port` | Remote port number. | `80`<br>`8080`<br>`443` | No |
| `net.peer.name` | Remote hostname or similar, see note below. | `example.com` | No |

**[1]:** See the [database trace semantic conventions](../../trace/semantic_conventions/database.md#connection-level-attributes)
for the list of well-known database system values.

**[2]:** Recommended in general, required for in-process databases (`"inproc"`).
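
The recommendation to remove embedded credentials from `db.connection_string` can be illustrated with a minimal Python sketch. It assumes a URL-style connection string (such as `postgresql://user:password@host:port/db`); the helper name `strip_credentials` is hypothetical and not part of any OpenTelemetry API. Key-value connection strings (for example `Server=...;Uid=...;`) are returned unchanged and would need their own handling.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(connection_string: str) -> str:
    """Return a URL-style connection string without its user:password part.

    Hypothetical helper: only URL-style strings are handled; anything
    without embedded URL credentials is returned unchanged.
    """
    parts = urlsplit(connection_string)
    if parts.username is None and parts.password is None:
        return connection_string

    # urlsplit lower-cases the hostname; rebuild the netloc without credentials.
    host = parts.hostname or ""
    if ":" in host:  # IPv6 literal needs its brackets back
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```

For example, `strip_credentials("postgresql://app:s3cret@postgres-server:5432/user_db")` yields a value suitable for the `db.connection_string` label, with the user name and password dropped.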

## Call-level Metric Instruments

The following metric instruments SHOULD be recorded for every database operation.

| Name | Instrument | Units | Description |
|----------------------|---------------|--------------|-------------|
| `db.client.duration` | ValueRecorder | milliseconds | The duration of the database operation. |

Database operations SHOULD include execution of queries, including DDL, DML,
Member

Suggested change
Database operations SHOULD include execution of queries, including DDL, DML,
Measured database operations SHOULD include execution of queries, including DDL, DML,

DCL, and TCL SQL statements (and the corresponding operations in non-SQL
databases), as well as connect operations.
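
As a rough illustration of recording `db.client.duration` around a database operation, here is a minimal Python sketch. `InMemoryValueRecorder` and `timed_db_operation` are hypothetical names used only for this example; the recorder is an in-memory stand-in for a ValueRecorder instrument, not the OpenTelemetry API. The sketch assumes durations are reported in milliseconds and that a raised exception adds the `exception.type` label.

```python
import time
from contextlib import contextmanager


class InMemoryValueRecorder:
    """Hypothetical stand-in for a ValueRecorder; stores (value, labels) pairs."""

    def __init__(self, name, unit):
        self.name = name
        self.unit = unit
        self.measurements = []

    def record(self, value, labels):
        self.measurements.append((value, dict(labels)))


db_client_duration = InMemoryValueRecorder("db.client.duration", "milliseconds")


@contextmanager
def timed_db_operation(labels, recorder=db_client_duration, clock=time.monotonic):
    """Time the wrapped block and record its duration in milliseconds."""
    labels = dict(labels)  # copy so the caller's dict is not modified
    start = clock()
    try:
        yield
    except Exception as exc:
        # Use the dynamic type of the exception, fully qualified.
        labels["exception.type"] = f"{type(exc).__module__}.{type(exc).__qualname__}"
        raise
    finally:
        recorder.record((clock() - start) * 1000.0, labels)
```

A caller would wrap each query or connect call, for example `with timed_db_operation({"db.system": "postgresql", "db.operation": "SELECT"}): cursor.execute(...)`, where `cursor` is whatever client object the instrumented library provides.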

### Labels

In addition to the [common](#common) labels, the following labels SHOULD be
applied to all database call-level metric instruments.

| Label Name | Type | Description | Example | Required |
|------------------|--------|--------------|----------|----------|
| `db.name` | string | If no [tech-specific label](#call-level-labels-for-specific-technologies) is defined, this label is used to report the name of the database being accessed. For commands that switch the database, this should be set to the target database (even if the command fails). [1] | `customers`<br>`main` | Required if applicable. |
| `db.operation` | string | The name of the operation being executed, e.g. the [MongoDB command name](https://docs.mongodb.com/manual/reference/command/#database-operations) such as `findAndModify`. [4][5] | `findAndModify`<br>`HMSET`<br>`SELECT`<br>`CONNECT` | Required if applicable. |
| `db.table` | string | The name of the primary table, collection, segment, etc. that the operation is acting upon. | `user_table` | Required if applicable. |
Member

I'd be in favor of introducing this in the (more elaborate) tracing spec first, as you proposed on #1104, and then just refer to that definition from here to avoid duplication and potentially going out of sync. Once both kinds of semconvs are auto-generated, it should be fine again, however.

Member Author

💯 I've created that PR now (here: #1141). After it has some reviews, I'll update this PR like you suggest.

Member Author

PR #1141 has all the approvals it needs, so I've updated this one to match it.
Just that little bit of manual work to keep these two PRs in sync has reminded me that we really need the semantic convention table generator to work for metrics.

| `exception.type` | string | The type of the exception (its fully-qualified class name, if applicable). The dynamic type of the exception should be preferred over the static type in languages that support it. | `java.sql.SQLException`<br/>`psycopg2.OperationalError` | Required if applicable. |

**[1]:** In some SQL databases, the database name to be used is called "schema name".

**[4]:** For SQL operations, this should be set to the SQL keyword (example: `SELECT` or `INSERT`).

**[5]:** To reduce cardinality, the value for `db.operation` should have parameters
removed or substituted. The resulting value should be a low-cardinality value
representing the statement or operation being executed on the database. It may be
Member

Seems like this is supposed to pertain to a db.statement parameter, which is missing in the table.

Suggested change
**[5]:** To reduce cardinality, the value for `db.operation` should have parameters
removed or substituted. The resulting value should be a low-cardinality value
representing the statement or operation being executed on the database. It may be
**[5]:** To reduce cardinality, the value for `db.operation` should leave in placeholders
and not include any actual parameter values. The resulting value should be low-cardinality
and represent the statement or operation being executed on the database. It may be

Might also be worth adding "it is not recommended to attempt any client-side parsing of db.statement to remove parameters", although users shouldn't be formatting them in anyway. The trace conventions say "the value may be sanitized to exclude sensitive information"; do you think that is mostly just about not substituting values?

Member Author

I like this. I've updated it to be more like these words. What do you think?

Member

The db.operation label wouldn't have any placeholders, I thought, since it's just SELECT, UPDATE, etc. I was thinking that advice would apply more to db.statement, unless it was purposefully left out for cardinality reasons?

Member

Without db.statement, is there enough granularity? In the MySQL example, all SELECT queries to orders would be aggregated into that same metric

{
  "name": "db.client.duration",
  "labels": {
    "db.operation": "SELECT",
    "db.name": "ShopDb",
    "db.system": "mysql",
    "db.connection_string": "Server=shopdb.example.com;Database=ShopDb;Uid=billing_user;TableCache=true;UseCompression=True;MinimumPoolSize=10;MaximumPoolSize=50;",
    "db.user": "billing_user",
    "db.sql.table": "orders",
    "net.peer.name": "shopdb.example.com",
    "net.peer.ip": "192.0.2.12",
    "net.peer.port": "3306",
    "net.transport": "IP.TCP"
  }
}

a stored procedure name (without arguments), operation name, etc.
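
Footnote [4] asks for the SQL keyword as the `db.operation` value. One possible way to derive it is sketched below in Python; `sql_operation` is a hypothetical helper, and it simply assumes the statement starts with its keyword (no leading comments, no `WITH` clause handling). Where the client library already exposes the operation name, using that directly is likely preferable to inspecting the statement text.

```python
def sql_operation(statement: str) -> str:
    """Return the leading SQL keyword, upper-cased, as a low-cardinality value.

    Hypothetical helper: assumes the first whitespace-delimited token of the
    statement is the keyword. Returns an empty string for blank input.
    """
    tokens = statement.split(None, 1)
    return tokens[0].upper() if tokens else ""
```

Because only the first token is kept, parameter values such as `301` in `WHERE user_id = 301` never reach the label, which keeps the label's cardinality bounded by the number of distinct keywords.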

### Call-level labels for specific technologies

| Label Name | Description | Example | Required |
|---------------------------|--------------|----------|----------|
| `db.cassandra.keyspace` | The name of the keyspace being accessed. To be used instead of the generic `db.name` label. | `mykeyspace` | Yes |
| `db.hbase.namespace` | The [HBase namespace](https://hbase.apache.org/book.html#_namespace) being accessed. To be used instead of the generic `db.name` label. | `default` | Yes |
| `db.redis.database_index` | The index of the database being accessed as used in the [`SELECT` command](https://redis.io/commands/select), provided as an integer. To be used instead of the generic `db.name` label. | `0`<br>`1`<br>`15` | Conditional [1] |
| `db.mongodb.collection` | The collection being accessed within the database stated in `db.name`. | `customers`<br>`products` | Yes |

**[1]:** Required, if other than the default database (`0`).
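
The substitution rules in the table above could be applied along the lines of the following Python sketch. The function name `database_name_labels` and the mapping constant are hypothetical, and the `db.system` values `cassandra` and `hbase` are assumed to match the well-known values in the trace conventions. Following the examples in this document, the Redis index is emitted as a string, and it is omitted for the default database (`0`), which note [1] permits.

```python
# Hypothetical mapping from db.system to the label that replaces db.name.
TECH_SPECIFIC_NAME_LABEL = {
    "cassandra": "db.cassandra.keyspace",
    "hbase": "db.hbase.namespace",
    "redis": "db.redis.database_index",
}


def database_name_labels(db_system, name):
    """Return the label that identifies the database for the given system."""
    label = TECH_SPECIFIC_NAME_LABEL.get(db_system, "db.name")
    if db_system == "redis" and str(name) == "0":
        # The label is only required for non-default Redis databases.
        return {}
    return {label: str(name)}
```

MongoDB is deliberately absent from the mapping: `db.mongodb.collection` is used in addition to `db.name`, not instead of it.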

### Examples

#### PostgreSQL SELECT Query

For a client executing a query like this:

```SQL
SELECT * FROM public.user_table WHERE user_id = 301;
```

while connected to a PostgreSQL database named "user_db" running on host
`postgres-server:5432`, the following instrument should result:

```json
{
  "name": "db.client.duration",
  "labels": {
    "db.operation": "SELECT",
    "db.table": "user_table",
    "db.name": "user_db",
    "db.system": "postgresql",
    "db.connection_string": "postgresql://postgres-server:5432/user_db",
    "db.user": "",
    "net.peer.ip": "192.0.10.2",
    "net.peer.port": "5432",
    "net.peer.name": "postgres-server"
  }
}
```

#### MySQL SELECT Query

For a client executing a query like this:

```SQL
SELECT * FROM orders WHERE order_id = 301;
```

while connected to a MySQL database named "ShopDb" running on host
`shopdb.example.com`, the following instrument should result:

```json
{
  "name": "db.client.duration",
  "labels": {
    "db.operation": "SELECT",
    "db.table": "orders",
    "db.name": "ShopDb",
    "db.system": "mysql",
    "db.connection_string": "Server=shopdb.example.com;Database=ShopDb;Uid=billing_user;TableCache=true;UseCompression=True;MinimumPoolSize=10;MaximumPoolSize=50;",
    "db.user": "billing_user",
    "net.peer.name": "shopdb.example.com",
    "net.peer.ip": "192.0.2.12",
    "net.peer.port": "3306",
    "net.transport": "IP.TCP"
  }
}
```

#### Redis HMSET

For a client executing a Redis HMSET command like this:

```redis
HMSET myhash field1 'Hello' field2 'World'
```

while connecting to a Redis instance over a Unix socket, the following instrument
should result:

```json
{
  "name": "db.client.duration",
  "labels": {
    "db.operation": "HMSET",
    "db.table": "myhash",
    "db.system": "redis",
    "db.user": "the_user",
    "net.peer.name": "/tmp/redis.sock",
    "net.transport": "Unix",
    "db.redis.database_index": "15"
  }
}
```

#### MongoDB findAndModify

For a MongoDB client executing the `findAndModify` command like this:

```javascript
{
  findAndModify: "people",
  query: { name: "Tom", state: "active", rating: { $gt: 10 } },
  sort: { rating: 1 },
  update: { $inc: { score: 1 } }
}
```

against the database named "userdatabase" while connected to a MongoDB available
at `mongodb.example.com`, the following instrument should result:

```json
{
  "name": "db.client.duration",
  "labels": {
    "db.operation": "findAndModify",
    "db.table": "people",
    "db.name": "userdatabase",
    "db.system": "mongodb",
    "db.user": "the_user",
    "net.peer.name": "mongodb.example.com",
    "net.peer.ip": "192.0.2.14",
    "net.peer.port": "27017",
    "net.transport": "IP.TCP"
  }
}
```

## Connection Pooling Metric Instruments

The following metric instruments SHOULD be collected. They SHOULD
have all [common](#common) labels applied to them.

| Name | Instrument | Units | Description |
|---------------------------|---------------|--------------|-------------|
| `db.connection_pool.limit` | ValueObserver | {connections} | The total number of database connections available in the connection pool. |
| `db.connection_pool.usage` | ValueObserver | {connections} | The number of database connections _in use_. |

If more detailed information is available, instrumentation MAY collect
the following metric instruments. They SHOULD have all [common](#common) labels
applied to them.

| Name | Instrument | Units | Description |
|---------------------------|------------|---------------|-------------|
| `db.connections.new` | Counter | {connections} | The number of new connections created. |
| `db.connections.taken` | Counter | {connections} | The number of connections taken from the connection pool. |
| `db.connections.returned` | Counter | {connections} | The number of connections returned to the connection pool. |
Member

Is it anticipated that db.connection_pool.usage is equal to db.connections.taken minus db.connections.returned?

If so, what is the benefit in having the additional instruments?

Member Author

Yes, I expect db.connection_pool.usage = db.connections.taken - db.connections.returned. But the breakdown metrics can give much more detail about the use of the connection pool over time.
The usage/limit metrics use a LastValue aggregator by default, meaning you get a periodic snapshot of the number of connections currently in use.
The more detailed metrics allow you to see things like connection churn. With them, a user could differentiate between an application making millions of tiny database operations (and ending a harvest period with 4/5 connections in use) and an application with four very long and slow queries in progress (also ending the harvest period with 4 connections in use).

The connection_pool usage instruments are asynchronous -- they're intended to be polled on a set interval. The more detailed metrics are recording all the actions related to the connection pool in real time.

...but
Having said all of that, I don't have specific use cases to back up the inclusion of these extra instruments.
But, they're already being recorded by go-redis, and I can see the same sort of instruments being possible in a number of other frameworks. And if there's a desire to record them, I think we should spec them so we get consistent names and labels across implementations.

Member Author (justinfoote, Nov 2, 2020)

I tried a few different ways of explaining this. I ultimately ended up with the pretty simple approach in ec58de4.
I'm certainly open to suggestions about how to clarify what these instruments are all about.

| `db.connections.reused` | Counter | {connections} | The number of connections reused. |
Member

What are the criteria for knowing it's an existing connection that's being re-used?

Member Author

This is a good point. I had used this PR as reference, and had assumed that an instrumentation author would know best how to collect these numbers, but now I'm thinking we need definitions of exactly what value needs to be recorded here, or we run the risk of misunderstanding and inconsistency across instrumentation sources.

I'll give some thought to how to describe this.

| `db.connections.closed` | Counter | {connections} | The number of connections closed. |
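
To make the relationship between the snapshot instruments and the detailed counters concrete, here is a minimal Python sketch. `PoolStats` and its method names are hypothetical, not part of any client library or of the OpenTelemetry API. It assumes, as stated in the review thread, that `db.connection_pool.usage` equals connections taken minus connections returned, and it further assumes that a "reused" connection is any taken connection that was not newly created. That second definition is still an open question in this PR, so treat it as one possible reading.

```python
class PoolStats:
    """Hypothetical bookkeeping for connection pool metrics."""

    def __init__(self, limit):
        self.limit = limit   # db.connection_pool.limit
        self.new = 0         # db.connections.new
        self.taken = 0       # db.connections.taken
        self.returned = 0    # db.connections.returned
        self.reused = 0      # db.connections.reused (assumed definition)
        self.closed = 0      # db.connections.closed

    def on_take(self, created):
        """Record a connection handed out; `created` is True if it is new."""
        self.taken += 1
        if created:
            self.new += 1
        else:
            self.reused += 1

    def on_return(self):
        self.returned += 1

    def on_close(self):
        self.closed += 1

    def observe(self):
        """Snapshot for the two observer instruments, polled on an interval."""
        return {
            "db.connection_pool.limit": self.limit,
            "db.connection_pool.usage": self.taken - self.returned,
        }
```

The counters only ever increase, so they can show connection churn within a collection interval, while `observe()` reports only the state at the moment it is polled.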