Add user_agent.synthetic.type attribute #1523

Merged
Changes from 11 commits
31 commits
e579479
Add synthetic source.
JacksonWeber Oct 28, 2024
984fbee
Update model/http/registry.yaml
JacksonWeber Oct 29, 2024
4d1b110
Update model/http/registry.yaml
JacksonWeber Oct 29, 2024
f925b65
Begin refactoring.
JacksonWeber Oct 31, 2024
6b48773
Update md files.
JacksonWeber Oct 31, 2024
5837106
Clarify synthetic meaning.
JacksonWeber Nov 1, 2024
b4686ff
Shorten id value.
JacksonWeber Nov 1, 2024
568cefc
Update user_agent synthetic value.
JacksonWeber Nov 1, 2024
d52bde9
Merge branch 'jacksonweber-sythetic-source' of https://github.com/Jac…
JacksonWeber Nov 1, 2024
40a7d3b
Update docs.
JacksonWeber Nov 1, 2024
48668f3
Add info regarding self-id.
JacksonWeber Nov 19, 2024
b2020f8
Merge branch 'main' into jacksonweber-sythetic-source
JacksonWeber Nov 19, 2024
c4bdf85
Merge branch 'main' into jacksonweber-sythetic-source
JacksonWeber Nov 19, 2024
91ef4ba
Update md files.
JacksonWeber Nov 19, 2024
2d64e96
Generalize about what components could possibly set the synthetic att…
JacksonWeber Nov 19, 2024
ac73e97
Update description wording.
JacksonWeber Nov 19, 2024
c870957
Update docs and add synthetic to client spans.
JacksonWeber Nov 19, 2024
7dac0c5
Update wording on user-agent registry.
JacksonWeber Nov 19, 2024
1b804df
Update requirement level of synthetic.type for client spans.
JacksonWeber Nov 21, 2024
86f3396
Update requirement level of client spans.
JacksonWeber Nov 21, 2024
03c8ea8
Specify the types of synthetic traffic possible.
JacksonWeber Nov 21, 2024
54dcdd3
Make the enum values more clear.
JacksonWeber Nov 21, 2024
5e0bb16
Merge branch 'main' into jacksonweber-sythetic-source
JacksonWeber Nov 21, 2024
a136fbd
Update markdown.
JacksonWeber Nov 21, 2024
48cca0b
Add formatting.
JacksonWeber Nov 21, 2024
8ce8320
Wording update.
JacksonWeber Nov 21, 2024
101e662
Update should vs. may wording.
JacksonWeber Nov 21, 2024
80ae0a6
Merge branch 'main' into jacksonweber-sythetic-source
JacksonWeber Nov 21, 2024
87e6c4e
Update markdown.
JacksonWeber Nov 21, 2024
25cc8f4
Add new line.
JacksonWeber Nov 21, 2024
6c21f04
Update attribute on client spans to opt_in.
JacksonWeber Nov 21, 2024
22 changes: 22 additions & 0 deletions .chloggen/add-synthetic-source.yaml
@@ -0,0 +1,22 @@
# Use this changelog template to create an entry for release notes.
#
# If your change doesn't affect end users you should instead start
# your pull request title with [chore] or use the "Skip Changelog" label.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db)
component: user_agent

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Add the user_agent.synthetic.type attribute to track if spans and metrics are the result of real users, testing, or bots.

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
# The values here must be integers.
issues: [1127]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:
14 changes: 12 additions & 2 deletions docs/attributes-registry/user-agent.md
@@ -14,8 +14,18 @@ Describes user-agent attributes.
 |---|---|---|---|---|
 | <a id="user-agent-name" href="#user-agent-name">`user_agent.name`</a> | string | Name of the user-agent extracted from original. Usually refers to the browser's name. [1] | `Safari`; `YourApp` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
 | <a id="user-agent-original" href="#user-agent-original">`user_agent.original`</a> | string | Value of the [HTTP User-Agent](https://www.rfc-editor.org/rfc/rfc9110.html#field.user-agent) header sent by the client. | `CERN-LineMode/2.15 libwww/2.17b3`; `Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1`; `YourApp/1.0.0 grpc-java-okhttp/1.27.2` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
-| <a id="user-agent-version" href="#user-agent-version">`user_agent.version`</a> | string | Version of the user-agent extracted from original. Usually refers to the browser's version [2] | `14.1.2`; `1.0.0` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| <a id="user-agent-synthetic-type" href="#user-agent-synthetic-type">`user_agent.synthetic.type`</a> | string | Specifies the category of synthetic traffic, such as monitoring, crawler, bot, or another automation. [2] | `bot`; `test` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| <a id="user-agent-version" href="#user-agent-version">`user_agent.version`</a> | string | Version of the user-agent extracted from original. Usually refers to the browser's version [3] | `14.1.2`; `1.0.0` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |

 **[1]:** [Example](https://www.whatsmyua.info) of extracting browser's name from original string. In the case of using a user-agent for non-browser products, such as microservices with multiple names/versions inside the `user_agent.original`, the most significant name SHOULD be selected. In such a scenario it should align with `user_agent.version`

-**[2]:** [Example](https://www.whatsmyua.info) of extracting browser's version from original string. In the case of using a user-agent for non-browser products, such as microservices with multiple names/versions inside the `user_agent.original`, the most significant version SHOULD be selected. In such a scenario it should align with `user_agent.name`
+**[2]:** This flag can primarily be determined by the contents of the `user_agent.original` attribute. Instrumentations should determine what they consider synthetic or bot traffic, and set this attribute accordingly. This attribute is useful for distinguishing between genuine client traffic and synthetic traffic generated by bots or tests.
+
+**[3]:** [Example](https://www.whatsmyua.info) of extracting browser's version from original string. In the case of using a user-agent for non-browser products, such as microservices with multiple names/versions inside the `user_agent.original`, the most significant version SHOULD be selected. In such a scenario it should align with `user_agent.name`
+
+`user_agent.synthetic.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+|---|---|---|
+| `bot` | Bot source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
+| `test` | Synthetic test source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
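
Editor's illustration, not part of the PR: the note on `user_agent.synthetic.type` says the value can primarily be derived from `user_agent.original` and leaves the detection rules to each instrumentation. The sketch below shows one way that derivation might look. The function name `classify_synthetic_type` and both keyword patterns are hypothetical and deliberately naive (a substring match on "bot" will also catch unrelated product names); only the two well-known values `bot` and `test` come from the diff above.

```python
import re
from typing import Optional

# Hypothetical keyword lists; the PR does not prescribe any detection rules.
_TEST_PATTERN = re.compile(r"synthetic|monitor|healthcheck|probe", re.IGNORECASE)
_BOT_PATTERN = re.compile(r"bot|crawler|spider", re.IGNORECASE)


def classify_synthetic_type(user_agent_original: Optional[str]) -> Optional[str]:
    """Return "test", "bot", or None when no synthetic marker is found."""
    if not user_agent_original:
        return None
    # Test/monitoring markers are checked first, so an agent that matches
    # both keyword lists is reported as "test".
    if _TEST_PATTERN.search(user_agent_original):
        return "test"
    if _BOT_PATTERN.search(user_agent_original):
        return "bot"
    # No marker found: leave user_agent.synthetic.type unset.
    return None
```

Under these assumptions, a crawler string such as `Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)` would be classified as `bot`, while the `CERN-LineMode/2.15 libwww/2.17b3` example from the table would leave the attribute unset.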
30 changes: 30 additions & 0 deletions docs/http/http-metrics.md
@@ -89,6 +89,7 @@ of `[ 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10
| [`network.protocol.version`](/docs/attributes-registry/network.md) | string | The actual version of the protocol used for network communication. [7] | `1.0`; `1.1`; `2`; `3` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the local HTTP server that received the request. [8] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`server.port`](/docs/attributes-registry/server.md) | int | Port of the local HTTP server that received the request. [9] | `80`; `8080`; `443` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`user_agent.synthetic.type`](/docs/attributes-registry/user-agent.md) | string | Specifies the category of synthetic traffic, such as monitoring, crawler, bot, or another automation. [10] | `bot`; `test` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |

**[1]:** HTTP request method value SHOULD be "known" to the instrumentation.
By default, this convention defines "known" methods as the ones listed in [RFC9110](https://www.rfc-editor.org/rfc/rfc9110.html#name-methods)
@@ -143,6 +144,8 @@ SHOULD include the [application root](/docs/http/http-spans.md#http-server-defin
> Since this attribute is based on HTTP headers, opting in to it may allow an attacker
> to trigger cardinality limits, degrading the usefulness of the metric.

**[10]:** This flag can primarily be determined by the contents of the `user_agent.original` attribute. Instrumentations should determine what they consider synthetic or bot traffic, and set this attribute accordingly. This attribute is useful for distinguishing between genuine client traffic and synthetic traffic generated by bots or tests.

`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
@@ -164,6 +167,13 @@ SHOULD include the [application root](/docs/http/http-spans.md#http-server-defin
| `PUT` | PUT method. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| `TRACE` | TRACE method. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |

`user_agent.synthetic.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
|---|---|---|
| `bot` | Bot source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
| `test` | Synthetic test source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |

<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- END AUTOGENERATED TEXT -->
@@ -264,6 +274,7 @@ This metric is optional.
| [`network.protocol.version`](/docs/attributes-registry/network.md) | string | The actual version of the protocol used for network communication. [7] | `1.0`; `1.1`; `2`; `3` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the local HTTP server that received the request. [8] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`server.port`](/docs/attributes-registry/server.md) | int | Port of the local HTTP server that received the request. [9] | `80`; `8080`; `443` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`user_agent.synthetic.type`](/docs/attributes-registry/user-agent.md) | string | Specifies the category of synthetic traffic, such as monitoring, crawler, bot, or another automation. [10] | `bot`; `test` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |

**[1]:** HTTP request method value SHOULD be "known" to the instrumentation.
By default, this convention defines "known" methods as the ones listed in [RFC9110](https://www.rfc-editor.org/rfc/rfc9110.html#name-methods)
@@ -318,6 +329,8 @@ SHOULD include the [application root](/docs/http/http-spans.md#http-server-defin
> Since this attribute is based on HTTP headers, opting in to it may allow an attacker
> to trigger cardinality limits, degrading the usefulness of the metric.

**[10]:** This flag can primarily be determined by the contents of the `user_agent.original` attribute. Instrumentations should determine what they consider synthetic or bot traffic, and set this attribute accordingly. This attribute is useful for distinguishing between genuine client traffic and synthetic traffic generated by bots or tests.

`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
@@ -339,6 +352,13 @@ SHOULD include the [application root](/docs/http/http-spans.md#http-server-defin
| `PUT` | PUT method. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| `TRACE` | TRACE method. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |

`user_agent.synthetic.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
|---|---|---|
| `bot` | Bot source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
| `test` | Synthetic test source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |

<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- END AUTOGENERATED TEXT -->
@@ -372,6 +392,7 @@ This metric is optional.
| [`network.protocol.version`](/docs/attributes-registry/network.md) | string | The actual version of the protocol used for network communication. [7] | `1.0`; `1.1`; `2`; `3` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the local HTTP server that received the request. [8] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`server.port`](/docs/attributes-registry/server.md) | int | Port of the local HTTP server that received the request. [9] | `80`; `8080`; `443` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`user_agent.synthetic.type`](/docs/attributes-registry/user-agent.md) | string | Specifies the category of synthetic traffic, such as monitoring, crawler, bot, or another automation. [10] | `bot`; `test` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |

**[1]:** HTTP request method value SHOULD be "known" to the instrumentation.
By default, this convention defines "known" methods as the ones listed in [RFC9110](https://www.rfc-editor.org/rfc/rfc9110.html#name-methods)
@@ -426,6 +447,8 @@ SHOULD include the [application root](/docs/http/http-spans.md#http-server-defin
> Since this attribute is based on HTTP headers, opting in to it may allow an attacker
> to trigger cardinality limits, degrading the usefulness of the metric.

**[10]:** This flag can primarily be determined by the contents of the `user_agent.original` attribute. Instrumentations should determine what they consider synthetic or bot traffic, and set this attribute accordingly. This attribute is useful for distinguishing between genuine client traffic and synthetic traffic generated by bots or tests.

`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
@@ -447,6 +470,13 @@ SHOULD include the [application root](/docs/http/http-spans.md#http-server-defin
| `PUT` | PUT method. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| `TRACE` | TRACE method. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |

`user_agent.synthetic.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
|---|---|---|
| `bot` | Bot source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
| `test` | Synthetic test source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |

<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- END AUTOGENERATED TEXT -->
10 changes: 10 additions & 0 deletions docs/http/http-spans.md
@@ -384,6 +384,7 @@ For an HTTP server span, `SpanKind` MUST be `SERVER`.
| [`network.local.address`](/docs/attributes-registry/network.md) | string | Local socket address. Useful in case of a multi-IP host. | `10.1.2.80`; `/tmp/my.sock` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`network.local.port`](/docs/attributes-registry/network.md) | int | Local socket port. Useful in case of a multi-port host. | `65123` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`network.transport`](/docs/attributes-registry/network.md) | string | [OSI transport layer](https://osi-model.com/transport-layer/) or [inter-process communication method](https://wikipedia.org/wiki/Inter-process_communication). [17] | `tcp`; `udp` | `Opt-In` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`user_agent.synthetic.type`](/docs/attributes-registry/user-agent.md) | string | Specifies the category of synthetic traffic, such as monitoring, crawler, bot, or another automation. [18] | `bot`; `test` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |

**[1]:** HTTP request method value SHOULD be "known" to the instrumentation.
By default, this convention defines "known" methods as the ones listed in [RFC9110](https://www.rfc-editor.org/rfc/rfc9110.html#name-methods)
@@ -452,6 +453,8 @@ The attribute value MUST consist of either multiple header values as an array of

**[17]:** Generally `tcp` for `HTTP/1.0`, `HTTP/1.1`, and `HTTP/2`. Generally `udp` for `HTTP/3`. Other obscure implementations are possible.

**[18]:** This flag can primarily be determined by the contents of the `user_agent.original` attribute. Instrumentations should determine what they consider synthetic or bot traffic, and set this attribute accordingly. This attribute is useful for distinguishing between genuine client traffic and synthetic traffic generated by bots or tests.

The following attributes can be important for making sampling decisions
and SHOULD be provided **at span creation time** (if provided at all):

@@ -496,6 +499,13 @@ and SHOULD be provided **at span creation time** (if provided at all):
| `udp` | UDP | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| `unix` | Unix domain socket | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |

`user_agent.synthetic.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
|---|---|---|
| `bot` | Bot source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
| `test` | Synthetic test source. | ![Experimental](https://img.shields.io/badge/-experimental-blue) |

<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- END AUTOGENERATED TEXT -->
2 changes: 2 additions & 0 deletions model/http/metrics.yaml
@@ -18,6 +18,8 @@ groups:
          > **Warning**
          > Since this attribute is based on HTTP headers, opting in to it may allow an attacker
          > to trigger cardinality limits, degrading the usefulness of the metric.
      - ref: user_agent.synthetic.type
        requirement_level: opt_in
  - id: metric_attributes.http.client
    type: attribute_group
    brief: 'HTTP client attributes'
2 changes: 2 additions & 0 deletions model/http/spans.yaml
@@ -113,3 +113,5 @@ groups:
        requirement_level: opt_in
      - ref: http.response.body.size
        requirement_level: opt_in
      - ref: user_agent.synthetic.type
        requirement_level: opt_in
18 changes: 18 additions & 0 deletions model/user-agent/registry.yaml
@@ -35,3 +35,21 @@ groups:
          using a user-agent for non-browser products, such as microservices with multiple names/versions inside the
          `user_agent.original`, the most significant version SHOULD be selected. In such a scenario it should align
          with `user_agent.name`
      - id: user_agent.synthetic.type
        stability: experimental
        brief: >
          Specifies the category of synthetic traffic, such as monitoring, crawler, bot, or another automation.
        note: >
          This flag can primarily be determined by the contents of the `user_agent.original` attribute. Instrumentations should determine what they consider synthetic or bot traffic,
          and set this attribute accordingly. This attribute can either be set on client spans for self-identification purposes, or on server spans detected to be generated as a result
          of a synthetic request. This attribute is useful for distinguishing between genuine client traffic and synthetic traffic generated by bots or tests.
        type:
          members:
            - id: bot
              value: "bot"
              brief: 'Bot source.'
              stability: experimental
            - id: test
              value: "test"
              brief: 'Synthetic test source.'
              stability: experimental
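
Editor's illustration, not part of the PR: the model files above reference `user_agent.synthetic.type` with `requirement_level: opt_in`, and the generated docs state that a well-known value MUST be used when it applies while a custom value MAY be used otherwise. The sketch below shows how an instrumentation might honor both rules when assembling attributes. The helper `with_synthetic_type`, its `opt_in` parameter, and the case and whitespace normalization are assumptions made for illustration; they are not defined by the PR or by any OpenTelemetry SDK.

```python
from typing import Dict, Optional

ATTRIBUTE_KEY = "user_agent.synthetic.type"
# Well-known values taken from the enum members in the registry diff.
WELL_KNOWN_SYNTHETIC_TYPES = ("bot", "test")


def with_synthetic_type(
    attributes: Dict[str, str],
    synthetic_type: Optional[str],
    opt_in: bool = False,
) -> Dict[str, str]:
    """Return a copy of attributes, adding the synthetic type only if opted in."""
    result = dict(attributes)
    # Opt-in attributes are not recorded unless the user enables them.
    if not opt_in or synthetic_type is None:
        return result
    stripped = synthetic_type.strip()
    if not stripped:
        return result
    # Assumption: matching against well-known values ignores case.
    lowered = stripped.lower()
    # Prefer the well-known spelling; anything else passes through as a custom value.
    result[ATTRIBUTE_KEY] = lowered if lowered in WELL_KNOWN_SYNTHETIC_TYPES else stripped
    return result
```

Returning a copy keeps the caller's attribute set unchanged, which matters when the same base attributes feed both a span and a metric and only one of them has the attribute enabled.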