Status | |
---|---|
Stability | beta: traces, metrics, logs |
Distributions | contrib |
Issues | |
Code Owners | @aabmass, @dashpole, @jsuereth, @punya, @psx95 |
This exporter can be used to send metrics to Google Cloud Monitoring (formerly Stackdriver), traces to Google Cloud Trace, and logs to Google Cloud Logging.
To learn more about instrumentation and observability, including opinionated recommendations for Google Cloud Observability, visit Instrumentation and observability.
In general, authenticating with the Collector exporter works the same way as for any other app, following the steps documented for Application Default Credentials. This section explains the specific use cases relevant to the exporter.
The exporter relies on GCP client libraries to send data to Google Cloud. Use of these libraries requires the caller (the Collector) to be authenticated with a GCP account and project. This should be done using a GCP service account with at minimum the following IAM roles (depending on the type of data you wish to send):
- roles/monitoring.metricWriter (metrics)
- roles/cloudtrace.agent (traces)
- roles/logging.logWriter (logs)
The Compute Engine default service account has all of these permissions by default, but if you are running on a different platform or with a different GCP service account you will need to ensure your service account has these permissions.
Depending on the environment where your Collector is running, you can authenticate one of several ways:
GCE instances
On GCE it is recommended to use the GCP service account associated with your instance. If this is the Compute Engine default service account or another GCP service account with the sufficient IAM permissions, then there is nothing additional you need to do to authenticate the Collector process. Simply run the Collector on your instance, and it will inherit these permissions.
GKE / Workload Identity
On GKE clusters with Workload Identity enabled (including GKE Autopilot), follow the steps to configure a Workload Identity ServiceAccount in your cluster (if you do not already have one). Then, deploy the Collector as you would any other workload, setting the serviceAccountName field in the Collector Pod’s .spec to the WI-enabled ServiceAccount.
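As a rough sketch of the step above, a Deployment that runs the Collector under a Workload Identity-enabled ServiceAccount might look like the following. The names otel-collector and otel-collector-sa are placeholders, not values from this document, and mounting the Collector config is omitted for brevity.

```yaml
# Hypothetical Deployment; every name here is a placeholder.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
spec:
  replicas: 1
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      # Kubernetes ServiceAccount that has been configured for Workload Identity.
      serviceAccountName: otel-collector-sa
      containers:
        - name: otel-collector
          image: otel/opentelemetry-collector-contrib
          # Volume mount for config.yaml intentionally left out of this sketch.
```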
In non-WI clusters, you can use the GCP service account associated with the node the same way as in the instructions for GCE instances above.
Non-GCP (AWS, Azure, on-prem, etc.) or alternative service accounts
In non-GCP environments, a service account key or credentials file is required. The exporter will automatically look for this file using the GOOGLE_APPLICATION_CREDENTIALS environment variable or, if that is unset, one of the other known locations. Note that when using this approach, you may need to explicitly set the project option in the exporter’s config.
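For illustration, setting the project explicitly could look like this minimal fragment. The project ID my-gcp-project is a placeholder.

```yaml
exporters:
  googlecloud:
    # Placeholder project ID; replace it with the project that should receive telemetry.
    project: my-gcp-project
```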
When running the Collector in a Docker container, a credentials file can be passed to the container via volume mounts and environment variables at runtime like so:
docker run \
  --volume ~/service-account-key.json:/etc/otelcol-contrib/key.json \
  --volume $(pwd)/config.yaml:/etc/otelcol-contrib/config.yaml \
  --env GOOGLE_APPLICATION_CREDENTIALS=/etc/otelcol-contrib/key.json \
  --expose 4317 \
  --expose 4318 \
  --rm \
  otel/opentelemetry-collector-contrib
Using gcloud auth application-default login
Using gcloud auth application-default login to authenticate is not recommended for production use. Instead, it’s best to use a GCP service account through one of the methods listed above. The gcloud auth command can be useful for development and testing on a user account, and authenticating with it follows the same approach as the service account key method above.
These instructions are to get you up and running quickly with the GCP exporter in a local development environment. We'll also point out alternatives that may be more suitable for CI or production.
- Obtain a Collector binary. Pull a binary or Docker image for the OpenTelemetry contrib collector, which includes the GCP exporter plugin, through one of the following:
  - Download a binary or package of the OpenTelemetry Collector Contrib that is appropriate for your platform and includes the Google Cloud exporter.
  - Pull a Docker image with docker pull otel/opentelemetry-collector-contrib
  - Create your own main package in Go that pulls in just the plugins you need.
  - Use the OpenTelemetry Collector Builder to generate the Go main package and go.mod.
- Create a configuration file config.yaml. The example below shows a minimal recommended configuration that receives OTLP and sends data to GCP, in addition to verbose logging to help understand what is going on. It uses application default credentials (which we will set up in the next step).

  Note that this configuration includes the recommended memory_limiter and batch plugins, which avoid high latency for reporting telemetry, and ensure that the collector itself will stay stable (not run out of memory) by dropping telemetry if needed.

  receivers:
    otlp:
      protocols:
        grpc:
        http:
  exporters:
    googlecloud:
      log:
        default_log_name: opentelemetry.io/collector-exported-log
  processors:
    memory_limiter:
      check_interval: 1s
      limit_percentage: 65
      spike_limit_percentage: 20
    batch:
    resourcedetection:
      detectors: [gcp]
      timeout: 10s
  service:
    pipelines:
      traces:
        receivers: [otlp]
        processors: [memory_limiter, batch]
        exporters: [googlecloud]
      metrics:
        receivers: [otlp]
        processors: [memory_limiter, batch]
        exporters: [googlecloud]
      logs:
        receivers: [otlp]
        processors: [memory_limiter, batch]
        exporters: [googlecloud]
- Set up credentials.
  - Enable billing in your GCP project.
  - Enable the Cloud Metrics and Cloud Trace APIs.
  - Ensure that your GCP user has (at minimum) roles/monitoring.metricWriter and roles/cloudtrace.agent. You can learn about metric-related and trace-related IAM in the GCP documentation.
  - Obtain credentials using one of the methods in the Authenticating section above.
- Run the collector. The following runs the collector in the foreground, so please execute it in a separate terminal.

  ./otelcol-contrib --config=config.yaml

  Alternatives

  If you obtained OS-specific packages or built your own binary in the first step, you'll need to follow the appropriate conventions for running the collector.
- Gather telemetry. Run an application that can submit OTLP-formatted metrics and traces, and configure it to send them to 127.0.0.1:4317 (for gRPC) or 127.0.0.1:4318 (for HTTP).

  Alternatives

  - Set up the host metrics receiver, which will gather telemetry from the host without needing an external application to submit telemetry.
  - Set up an application-specific receiver, such as the Nginx receiver, and run the corresponding application.
  - Set up a receiver for some other protocol (such as Prometheus, StatsD, Zipkin or Jaeger), and run an application that speaks one of those protocols.
- View telemetry in GCP. Use the GCP metrics explorer and trace overview to view your newly submitted telemetry.
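For the host metrics alternative mentioned under "Gather telemetry", a minimal sketch might look like the fragment below. It assumes the googlecloud exporter and the memory_limiter and batch processors are already defined as in the config.yaml from the earlier step; the 30s interval and the choice of scrapers are arbitrary.

```yaml
receivers:
  hostmetrics:
    # Arbitrary interval chosen for this sketch.
    collection_interval: 30s
    scrapers:
      cpu:
      memory:

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      processors: [memory_limiter, batch]
      exporters: [googlecloud]
```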
The following configuration options are supported:
- project (default = Fetch from Credentials): GCP project identifier.
- destination_project_quota (optional): Counts quota for traces and metrics against the project to which the data is sent (as opposed to the project associated with the Collector's service account), for example when setting project_id or using multi-project export. (default = false)
- user_agent (default = opentelemetry-collector-contrib {{version}}): Override the user agent string sent on requests to Cloud Monitoring (currently only applies to metrics). Specify {{version}} to include the application version number.
- impersonate (optional): Configuration for service account impersonation.
  - target_principal: TargetPrincipal is the email address of the service account to impersonate.
  - subject: (optional) Subject is the sub field of a JWT. This field should only be set if you wish to impersonate a user. This feature is useful when using domain-wide delegation.
  - delegates: (default = []) Delegates are the service account email addresses in a delegation chain. Each service account must be granted roles/iam.serviceAccountTokenCreator on the next service account in the chain.
- metric (optional): Configuration for sending metrics to Cloud Monitoring.
  - prefix (default = workload.googleapis.com): The prefix to add to metrics.
  - endpoint (default = monitoring.googleapis.com): Endpoint where metric data is going to be sent to.
  - compression (optional): Compression format for Metrics gRPC requests. Supported values: [gzip]. Defaults to no compression.
  - grpc_pool_size (optional): Sets the size of the connection pool in the GCP client. Defaults to a single connection.
  - use_insecure (default = false): If true, disables gRPC client transport security. Only has effect if Endpoint is not "".
  - known_domains (default = [googleapis.com, kubernetes.io, istio.io, knative.dev]): If a metric belongs to one of these domains it does not get a prefix.
  - skip_create_descriptor (default = false): If set to true, do not send metric descriptors to GCM.
  - instrumentation_library_labels (default = true): If true, set the instrumentation_source and instrumentation_version labels.
  - create_service_timeseries (default = false): If true, this will send all timeseries using CreateServiceTimeSeries. Implicitly, this sets skip_create_descriptor to true.
  - create_metric_descriptor_buffer_size (default = 10): Buffer size for the channel which asynchronously calls CreateMetricDescriptor.
  - service_resource_labels (default = true): If true, the exporter will copy OTel's service.name, service.namespace, and service.instance.id resource attributes into the GCM timeseries metric labels.
  - resource_filters (default = []): If provided, resource attributes matching any filter will be included in metric labels. Can be defined by prefix, regex, or prefix AND regex.
    - prefix: Match resource keys by prefix.
    - regex: Match resource keys by regex.
  - cumulative_normalization (default = true): If true, normalizes cumulative metrics without start times or with explicit reset points by subtracting the initial point from subsequent points. Since it caches starting points, it may result in increased memory usage.
  - sum_of_squared_deviation (default = false): If true, enables calculation of an estimated sum of squared deviation. It is an estimate, and is not exact.
  - experimental_wal (default = []): If provided, enables use of a write ahead log for time series requests.
    - directory (default = ./): Path to local directory for WAL file.
    - max_backoff (default = 1h): Max duration to retry requests on network errors (UNAVAILABLE or DEADLINE_EXCEEDED).
- trace (optional): Configuration for sending traces to Cloud Trace.
  - endpoint (default = cloudtrace.googleapis.com): Endpoint where trace data is going to be sent to.
  - grpc_pool_size (optional): Sets the size of the connection pool in the GCP client. Defaults to a single connection.
  - use_insecure (default = false): If true, disables gRPC client transport security. Only has effect if Endpoint is not "".
  - attribute_mappings (optional): AttributeMappings determines how to map from OpenTelemetry attribute keys to Google Cloud Trace keys. By default, it changes http and service keys so that they appear more prominently in the UI.
    - key: Key is the OpenTelemetry attribute key.
    - replacement: Replacement is the attribute sent to Google Cloud Trace.
- log (optional): Configuration for sending logs to Cloud Logging.
  - endpoint (default = logging.googleapis.com): Endpoint where log data is going to be sent to.
  - compression (optional): Compression format for Logging gRPC requests. Supported values: [gzip]. Defaults to no compression.
  - grpc_pool_size (optional): Sets the size of the connection pool in the GCP client. Defaults to a single connection.
  - use_insecure (default = false): If true, disables gRPC client transport security. Only has effect if Endpoint is not "".
  - default_log_name (optional): Defines a default name for log entries. If left unset, and a log entry does not have the gcp.log_name attribute set, the exporter will return an error processing that entry.
  - resource_filters (default = []): If provided, resource attributes matching any filter will be included in log labels. Can be defined by prefix, regex, or prefix AND regex.
    - prefix: Match resource keys by prefix.
    - regex: Match resource keys by regex.
  - compression (optional): Enable gzip compression for gRPC requests (valid values: gzip).
- sending_queue (optional): Configuration for how to buffer traces before sending.
  - enabled (default = true)
  - num_consumers (default = 10): Number of consumers that dequeue batches; ignored if enabled is false.
  - queue_size (default = 1000): Maximum number of batches kept in memory before dropping data; ignored if enabled is false. Users should calculate this as num_seconds * requests_per_second, where:
    - num_seconds is the number of seconds to buffer in case of a backend outage.
    - requests_per_second is the average number of requests per second.
Note: The sending_queue is provided (and documented) by the Exporter Helper.
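To show how several of the options above fit together, here is a hedged sketch of an exporter block. The project ID, the service account email, the custom.googleapis.com prefix, and the traffic figures used to size the queue are all assumptions made for illustration.

```yaml
exporters:
  googlecloud:
    # Placeholder project ID.
    project: my-gcp-project
    impersonate:
      # Hypothetical service account to impersonate.
      target_principal: collector@my-gcp-project.iam.gserviceaccount.com
    metric:
      # Assumed prefix for this sketch; the default is workload.googleapis.com.
      prefix: custom.googleapis.com
      compression: gzip
      resource_filters:
        # Copy resource attributes whose keys start with "k8s." into metric labels.
        - prefix: "k8s."
    sending_queue:
      enabled: true
      num_consumers: 10
      # num_seconds * requests_per_second, assuming 60 seconds of buffering
      # at roughly 50 requests per second: 60 * 50 = 3000.
      queue_size: 3000
```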
Beyond the standard YAML configuration outlined above, exporters that leverage the net/http package (all do today) also respect the following proxy environment variables:
- HTTP_PROXY
- HTTPS_PROXY
- NO_PROXY
If set at Collector start time then exporters, regardless of protocol, will or will not proxy traffic as defined by these environment variables.
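One way to supply these variables is through the container environment. The fragment below is a sketch of a Docker Compose service; the service name and the proxy address are placeholders, not values from this document.

```yaml
# Hypothetical Docker Compose service; the proxy address is a placeholder.
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib
    environment:
      HTTPS_PROXY: http://proxy.example.internal:3128
      NO_PROXY: localhost,127.0.0.1
```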
For metrics and logs, this exporter maps the OpenTelemetry Resource to a Google Cloud Logging or Monitoring Monitored Resource.
The complete mapping logic can be found here. That may be the most helpful reference if you want to map to a specific monitored resource.
If running on GCP, using the GCP resource detector, as shown above, will populate the resource attributes required to map to the appropriate monitored resource.
If you are not running on GCP, you still need to choose a GCP zone or region to send telemetry to by setting cloud.availability_zone or cloud.region. In addition, you should use the detector associated with other cloud providers, if applicable.
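Outside GCP, one possible way to set the region is the resource processor, sketched below. The region us-east1 is only an example value, and the processor still has to be added to the relevant pipelines.

```yaml
processors:
  resource:
    attributes:
      # Example region only; choose the GCP region that suits your workload.
      - key: cloud.region
        value: us-east1
        action: upsert
```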
If running on Kubernetes, it is recommended to additionally set k8s.pod.name, k8s.namespace.name, and k8s.container.name using the k8sattributes processor.
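A minimal sketch of that processor configuration follows. It assumes the Collector's ServiceAccount has the RBAC permissions the k8sattributes processor needs to read pod metadata, and it leaves pod association at its defaults. Depending on the Collector version, k8s.container.name may only be resolved when the telemetry already carries a container identifier.

```yaml
processors:
  k8sattributes:
    extract:
      metadata:
        - k8s.pod.name
        - k8s.namespace.name
        - k8s.container.name
```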
If you are getting "duplicate timeseries encountered" errors, it is likely because you are missing a required resource attribute, causing a metric from two different instances of an application to end up with the same monitored resource.
The metrics exporter can add metric labels to timeseries, such as when setting metric.service_resource_labels, metric.instrumentation_library_labels (both on by default), or when using metric.resource_filters to convert resource attributes to metric labels.
However, if your metrics already contain any of these labels they will fail to export to Google Cloud with a Duplicate label key encountered error. Such labels from the default features above include:
service_name
service_namespace
service_instance_id
instrumentation_source
instrumentation_version
(Note that these are the sanitized versions of OpenTelemetry attributes, with . replaced by _ to be compatible with Cloud Monitoring. For example, service_name comes from the service.name resource attribute.)
To prevent this, it's recommended to use the transform processor in your collector config to rename existing metric labels to preserve them, for example:
processors:
  transform:
    metric_statements:
    - context: datapoint
      statements:
      - set(attributes["exported_service_name"], attributes["service_name"])
      - delete_key(attributes, "service_name")
      - set(attributes["exported_service_namespace"], attributes["service_namespace"])
      - delete_key(attributes, "service_namespace")
      - set(attributes["exported_service_instance_id"], attributes["service_instance_id"])
      - delete_key(attributes, "service_instance_id")
      - set(attributes["exported_instrumentation_source"], attributes["instrumentation_source"])
      - delete_key(attributes, "instrumentation_source")
      - set(attributes["exported_instrumentation_version"], attributes["instrumentation_version"])
      - delete_key(attributes, "instrumentation_version")
Note: It is not recommended to use these transformations with the googlecloud exporter in a logging or trace pipeline.
The same method can be used for any resource attributes being filtered to metric labels, or metric labels which might collide with the GCP monitored resource used with resource detection.
Keep in mind that your conflicting attributes may contain dots instead of underscores (e.g., service.name), but these will still collide once all attributes are normalized to metric labels. In this case you will need to update the collector config above appropriately.
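As a sketch of the dotted case, the same rename can be written against the dotted key. Only service.name is shown here; the other attributes would follow the same pattern.

```yaml
processors:
  transform:
    metric_statements:
    - context: datapoint
      statements:
      # Preserve the existing datapoint attribute under a non-colliding name.
      - set(attributes["exported_service_name"], attributes["service.name"])
      - delete_key(attributes, "service.name")
```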
The logging exporter processes OpenTelemetry log entries and exports them to GCP Cloud Logging. Logs can be collected using one of the opentelemetry-collector-contrib log receivers, such as the filelogreceiver.
Log entries must contain any Cloud Logging-specific fields as a matching OpenTelemetry attribute (as shown in examples from the logs data model). These attributes can be parsed using the various log operators available upstream.
For example, the following config parses the HTTPRequest field from Apache log entries saved in /var/log/apache.log. It also parses out the timestamp and inserts a non-default log_name attribute and GCP MonitoredResource attribute.
receivers:
  filelog:
    include: [ /var/log/apache.log ]
    start_at: beginning
    operators:
      - id: http_request_parser
        type: regex_parser
        regex: '(?m)^(?P<remoteIp>[^ ]*) (?P<host>[^ ]*) (?P<user>[^ ]*) \[(?P<time>[^\]]*)\] "(?P<requestMethod>\S+)(?: +(?P<requestUrl>[^\"]*?)(?: +(?P<protocol>\S+))?)?" (?P<status>[^ ]*) (?P<responseSize>[^ ]*)(?: "(?P<referer>[^\"]*)" "(?P<userAgent>[^\"]*)")?$'
        parse_to: attributes["gcp.http_request"]
        timestamp:
          parse_from: attributes["gcp.http_request"].time
          layout_type: strptime
          layout: '%d/%b/%Y:%H:%M:%S %z'
    converter:
      max_flush_count: 100
      flush_interval: 100ms
exporters:
  googlecloud:
    project: my-gcp-project
    log:
      default_log_name: opentelemetry.io/collector-exported-log
processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 65
    spike_limit_percentage: 20
  resourcedetection:
    detectors: [gcp]
    timeout: 10s
  attributes:
    # Override the default log name. `gcp.log_name` takes precedence
    # over the `default_log_name` specified in the exporter.
    actions:
      - key: gcp.log_name
        action: insert
        value: apache-access-log
service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [memory_limiter, resourcedetection, attributes]
      exporters: [googlecloud]
This would parse logs of the following example structure:
127.0.0.1 - - [26/Apr/2022:22:53:36 +0800] "GET / HTTP/1.1" 200 1247
To the following GCP entry structure:
{
  "logName": "projects/my-gcp-project/logs/apache-access-log",
  "resource": {
    "type": "gce_instance",
    "labels": {
      "instance_id": "",
      "zone": ""
    }
  },
  "textPayload": "127.0.0.1 - - [26/Apr/2022:22:53:36 +0800] \"GET / HTTP/1.1\" 200 1247",
  "timestamp": "2022-05-02T12:16:14.574548493Z",
  "httpRequest": {
    "requestMethod": "GET",
    "requestUrl": "/",
    "status": 200,
    "responseSize": "1247",
    "remoteIp": "127.0.0.1",
    "protocol": "HTTP/1.1"
  }
}
The logging exporter also supports the full range of GCP log severity levels, which differ from the available OpenTelemetry log severity levels.
To accommodate this, the following mapping is used to equate an incoming OpenTelemetry SeverityNumber to a matching GCP log severity:
OTel SeverityNumber / Name | GCP severity level |
---|---|
Undefined | Default |
1-4 / Trace | Debug |
5-8 / Debug | Debug |
9-10 / Info | Info |
11-12 / Info | Notice |
13-16 / Warn | Warning |
17-20 / Error | Error |
21-22 / Fatal | Critical |
23 / Fatal | Alert |
24 / Fatal | Emergency |
The upstream severity parser (along with the regex parser) allows for additional flexibility in parsing log severity from incoming entries.
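As an illustration, a regex parser with an embedded severity block might look like the sketch below. The log path, the log line layout, the capture group names, and the custom PANIC mapping are all hypothetical.

```yaml
receivers:
  filelog:
    # Hypothetical log file whose lines look like "WARN something happened".
    include: [ /var/log/app.log ]
    operators:
      - type: regex_parser
        regex: '^(?P<sev>[A-Za-z]+) (?P<msg>.*)$'
        severity:
          # Read the severity text from the attribute captured by the regex above.
          parse_from: attributes.sev
          mapping:
            # Hypothetical custom level mapped onto the OpenTelemetry fatal range.
            fatal: PANIC
```

The resulting OpenTelemetry SeverityNumber is then translated to a GCP severity according to the table above.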
By default, the exporter sends telemetry to the project specified by project in the configuration. This can be overridden on a per-metric basis using the gcp.project.id resource attribute. For example, if a metric has a label project, you could use the groupbyattrs processor to promote it to a resource label, and the resource processor to rename the attribute from project to gcp.project.id.
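The promote-then-rename approach described above can be sketched as follows, assuming the metric label is literally named project. Both processors must be listed in the metrics pipeline, with groupbyattrs ahead of resource.

```yaml
processors:
  groupbyattrs:
    # Move the "project" datapoint attribute up to the resource level.
    keys: [project]
  resource:
    attributes:
      # Copy the promoted attribute to the key the exporter looks for...
      - key: gcp.project.id
        from_attribute: project
        action: insert
      # ...then drop the original so it is not exported as well.
      - key: project
        action: delete
```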
The gcp.project.id label can be combined with the destination_project_quota option to attribute quota usage to the project parsed by the label. This feature is currently only available for traces and metrics. The Collector's default service account will need roles/serviceusage.serviceUsageConsumer IAM permissions in the destination quota project.
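Enabling the option is a single setting on the exporter, as in this minimal fragment:

```yaml
exporters:
  googlecloud:
    # Count quota against the destination project rather than the project
    # associated with the Collector's credentials.
    destination_project_quota: true
```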
Note that this option will not work if a quota project is already defined in your Collector's GCP credentials. In this case, the telemetry will fail to export with a "project not found" error.
To fix this, manually edit your ADC file (if it exists) and remove the quota_project_id entry line.
See the Collector feature gates for an overview of feature gates in the collector.