As mentioned above, the first "event" in each ND-JSON stream contains metadata to fold into subsequent events. The metadata that agents should collect includes is are described in the following sub-sections.
- service metadata
- global labels (requires APM Server 7.2 or greater)
The process for proposing new metadata fields is detailed here.
System metadata relates to the host/container in which the service being monitored is running:
- hostname
- host.id
- architecture
- operating system
- container ID
- kubernetes
- namespace
- node name
- pod name
- pod UID
The hostname value(s) reported by the agent are mapped by APM Server to the ECS
host.hostname
and
host.name
fields.
Agents SHOULD return the lower-cased FQDN whenever possible, which might require a DNS query.
Agents SHOULD implement this hostname discovery algorithm wherever possible:
var hostname;
if os == windows
// https://stackoverflow.com/questions/12268885/powershell-get-fqdn-hostname
// https://learn.microsoft.com/en-us/dotnet/api/system.net.dns.gethostentry
hostname = exec "powershell.exe -NoLogo -NonInteractive -NoProfile -ExecutionPolicy Bypass -Command [System.Net.Dns]::GetHostEntry($env:computerName).HostName" // or any equivalent *
if (hostname == null || hostname.length == 0)
hostname = exec "cmd.exe /c hostname" // or any equivalent *
if (hostname == null || hostname.length == 0)
hostname = env.get("COMPUTERNAME")
else
hostname = exec "hostname -f" // or any equivalent *
if (hostname == null || hostname.length == 0)
hostname = env.get("HOSTNAME")
if (hostname == null || hostname.length == 0)
hostname = env.get("HOST")
if (hostname == null || hostname.length == 0)
hostname = readfile("/etc/hostname") // or any equivalent *
if hostname != null
hostname = hostname.toLowerCase().trim() // see details below **
*
this algorithm is using external commands in order to be OS-specific and language-independent, however these
may be replaced with language-specific APIs that provide the equivalent result.
**
in this case, trim()
refers to the removal of all leading and trailing characters of which codepoint is less-than
or equal to U+0020
(space), the toLowerCase()
refers to the replacement of characters in A-Z
with their a-z
equivalents.
In addition to auto-discovery of the hostname, agents SHOULD also expose the ELASTIC_APM_HOSTNAME
config option that
can be used as a manual fallback.
Up to APM Server 7.4, only the system.hostname
field was used for this purpose. Agents communicating with
APM Server of these versions MUST set system.hostname
with the value of ELASTIC_APM_HOSTNAME
, if such is manually
configured. Otherwise, agents MUST set it with the automatically-discovered hostname.
Since APM Server 7.4, system.hostname
field is deprecated in favour of two newer fields:
system.configured_hostname
- it should only be sent when configured by the user through theELASTIC_APM_HOSTNAME
config option. If provided, it is used by the APM Server as the event's hostname.system.detected_hostname
- the hostname automatically detected by the APM agent. It will be used as the event's hostname ifconfigured_hostname
is not provided.
Agents that are APM-Server-version-aware, or that are compatible only with versions >= 7.4, should use the new fields wherever applicable.
APM agents MAY collect the host.id
as an unique identifier for the host.
If they collect it, it MUST be conformant to the OpenTelemetry SemConv for host.id
.
If the APM agent performs correlation of its spans/transactions with universal profiling data, it MUST send the host.id
(see the profiling integration spec) as part of the metadata. The APM agent MAY solely rely on the host.id
provided by the profiling host agent in that case.
On Linux, the container ID and some of the Kubernetes metadata can be extracted by parsing /proc/self/cgroup
. For each line in the file, we split the line according to the format "hierarchy-ID:controller-list:cgroup-path", extracting the "cgroup-path" part. We then attempt to extract information according to the following algorithm:
-
Split the path into
dirname
andbasename
:- split based on the last occurrence of the colon character, if such exists, in order to support paths of containers
created by containerd-cri, where the path part takes the form:
<dirname>:cri-containerd:<container-ID>
- if colon char is not found within the path, the split is done based on the last occurrence of the slash character
- split based on the last occurrence of the colon character, if such exists, in order to support paths of containers
created by containerd-cri, where the path part takes the form:
-
If the
basename
ends with ".scope", check for a hyphen and remove everything up to and including that. This allows us to match.../docker-<container-id>.scope
as well as.../<container-id>
. -
Attempt to extract the Kubernetes pod UID from the
dirname
by matching one of the following regular expressions:(?:^/kubepods[\\S]*/pod([^/]+)$)
(?:kubepods[^/]*-pod([^/]+)\.slice)
If there is a match to either expression, the capturing group contains the pod ID. We then unescape underscores (
_
) to hyphens (-
) in the pod UID. If we match a pod UID then we set the pod name to the hostname, as that's the default in Kubernetes. Finally, we record the basename as the container ID without any further checks. -
If we did not match a Kubernetes pod UID above, then we check if the basename matches one of the following regular expressions:
^[[:xdigit:]]{64}$
^[[:xdigit:]]{8}-[[:xdigit:]]{4}-[[:xdigit:]]{4}-[[:xdigit:]]{4}-[[:xdigit:]]{4,}$
^[[:xdigit:]]{32}-[[:digit:]]{1,10}$
(AWS ECS/Fargate environments)
If we match, then the basename is assumed to be a container ID.
Sometimes the KUBERNETES_POD_NAME
is set using the Downward API,
so we set the pod name to its value if it exists.
In a similar manner, you can inform the agent of the node name, namespace, and pod UID, using the environment variables KUBERNETES_NODE_NAME
, KUBERNETES_NAMESPACE
, and KUBERNETES_POD_UID
.
With cgroups v2, the /proc/self/cgroup
contains only 0::/
and does not contain the container ID and we have to parse the /proc/self/mountinfo
with the following algorithm as a fallback.
-
filter the line containing
/etc/hostname
to retrieve the file mount that provides the host name to the container. -
split the line on spaces and take the 3rd element containing the host path.
-
extract the container ID from file path by using a regular expression matching a 64 character hexadecimal ID.
Note: container_metadata_discovery.json provides test cases for parsing /self/proc/*
files.
Process level metadata relates to the process running the service being monitored:
- process ID
- parent process ID
- process arguments
- process title (e.g. "node /app/node_")
Service metadata relates to the service/application being monitored:
- service name and version
- environment name ("production", "development", etc.)
- agent name (e.g. "ruby") and version (e.g. "2.8.1")
- language name (e.g. "ruby") and version (e.g. "2.5.3")
- runtime name (e.g. "jruby") and version (e.g. "9.2.6.0")
- framework name (e.g. "flask") and version (e.g. "1.0.2")
For official Elastic agents, the agent name should just be the name of the language for which the agent is written, in lower case.
Services running on AWS Lambda require specific values for some of the above mentioned fields.
Most of the APM Agents can be activated in several ways. Agents SHOULD collect information about the used activation method and send it in the service.agent.activation_method
field within the metadata.
This field MUST be omitted in version 8.7.0
due to a bug in APM server (preventing properly capturing metrics).
This field SHOULD be included when the APM server version is unknown or at least 8.7.1
.
The intention of this field is to drive telemetry so there is a way to know which activation methods are commonly used. This field MUST produce data with very low cardinality, therefore agents SHOULD use one of the values defined below.
If the agent is unable to infer the activation method, it SHOULD send unknown
.
There are some well-known activation methods which can be used by multiple agents. In those cases, agents SHOULD send the following values in service.agent.activation_method
:
aws-lambda-layer
: when the agent was installed as a Lambda layer.k8s-attach
: when the agent is attached via the K8s webhook.env-attach
: when the agent is activated by setting some environment variables. Only use this if there is a single way to activate the agent via an environment variable. If the given runtime offers multiple environment variables to activate the agent, use more specific values to avoid ambiguity.fleet
: when the agent is activated via fleet.
Cross agent activation methods defined above have higher priority than agent specific values below. If none of the above matches the activation method, agents define specific values for specific scenarios.
Node.js:
require
: when the agent is started via CommonJSrequire('elastic-apm-node').start()
orrequire('elastic-apm-node/start')
.import
: when the agent is started via ESM, e.g.import 'elastic-apm-node/start.js'
.preload
: when the agent is started via the Node.js--require
flag, e.g.node -r elastic-apm-node/start ...
, without usingNODE_OPTIONS
.
Java:
javaagent-flag
: when the agent is attached via the-javaagent
JVM flag.apm-agent-attach-cli
: when the agent is attached via theapm-agent-attach-cli
tool.programmatic-self-attach
: when the agent is attached by manually calling theElasticApmAttacher
API in user code.
.NET:
nuget
: when the agent was installed via a NuGet package.profiler
: when the agent was installed via the CLR Profiler.startup-hook
: when the agent relies on theDOTNET_STARTUP_HOOKS
mechanism to install the agent.
Python:
wrapper
: when the agent was invoked with the wrapper script,elasticapm-run
Cloud provider metadata is collected from local cloud provider metadata services:
- availability_zone
- account
- id
- name
- instance
- id
- name
- machine.type
- project
- id
- name
- provider (required)
- region
This metadata collection is controlled by a configuration value,
CLOUD_PROVIDER
. The default is auto
, which automatically detects the cloud
provider. If set to none
, no cloud metadata will be generated. If set to
any of aws
, gcp
, or azure
, metadata will only be generated from the
chosen provider.
Any intake API requests to the APM server should be delayed until this metadata is available.
A sample implementation of this metadata collection is available in the Python agent.
Fetching of cloud metadata for services running as AWS Lambda functions follow a different approach defined in the tracing-instrumentation-aws-lambda spec.
Metadata about an EC2 instance can be retrieved from the internal metadata endpoint, http://169.254.169.254
.
In the case where a proxy is configured on the application, the agents SHOULD attempt to make
the calls to the metadata endpoint directly, without using the proxy.
This is recommended as those HTTP calls could be caller-sensitive and have to be made directly
by the virtual machine where the APM agent executes, also, the 169.254.x.x
IP address range
is reserved for "link-local" addresses that are not routed.
As an example with curl, first, an API token must be created
TOKEN=`curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 300"`
Then, metadata can be retrieved, passing the API token
curl -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data
From the returned metadata, the following fields are useful
Cloud metadata field | AWS Metadata field |
---|---|
account.id |
accountId |
instance.id |
instanceId |
availability_zone |
availabilityZone |
machine.type |
instanceType |
provider |
aws |
region |
region |
Metadata about a GCP machine instance can be retrieved from the metadata service, documented here.
In the case where a proxy is configured on the application, the agents SHOULD attempt to make
the calls to the metadata endpoint directly, without using the proxy.
This is recommended as those HTTP calls could be caller-sensitive and have to be made directly
by the virtual machine where the APM agent executes, also, the 169.254.x.x
IP address range
is reserved for "link-local" addresses that are not routed.
An example with curl
curl -X GET "http://metadata.google.internal/computeMetadata/v1/?recursive=true" -H "Metadata-Flavor: Google"
From the returned metadata, the following fields are useful
Cloud metadata field | GCP Metadata field |
---|---|
instance.id |
instance.id as a string [1] |
instance.name |
instance.name |
project.id |
project.projectId [2] |
availability_zone |
last part of instance.zone , split by / |
machine.type |
last part of instance.machineType , split by / |
provider |
gcp |
region |
last part of instance.zone split by '/', then remove the last '-'-delimited part (e.g., us-west1 from projects/123456789012/zones/us-west1-b ) |
[1]: Beware JSON parsing the instance.id
field from the HTTP response body, because it is formatted as an integer that is larger JavaScript's Number.MAX_SAFE_INTEGER
. It may require native support for or explicit usage of BigInt types.
[2]: Google cloud project identifiers are described here.
(For comparison and consistency, here is the equivalent collection code for beats.)
Metadata about an Azure VM can be retrieved from the internal metadata
endpoint, http://169.254.169.254
.
In the case where a proxy is configured on the application, the agents SHOULD attempt to make
the calls to the metadata endpoint directly, without using the proxy.
This is recommended as those HTTP calls could be caller-sensitive and have to be made directly
by the virtual machine where the APM agent executes, also, the 169.254.x.x
IP address range
is reserved for "link-local" addresses that are not routed.
An example with curl
curl -X GET "http://169.254.169.254/metadata/instance/compute?api-version=2019-08-15" -H "Metadata: true"
From the returned metadata, the following fields are useful
Cloud metadata field | Azure Metadata field |
---|---|
account.id |
subscriptionId |
instance.id |
vmId |
instance.name |
name |
project.name |
resourceGroupName |
availability_zone |
zone |
machine.type |
vmSize |
provider |
azure |
region |
location |
Azure App Services are a PaaS offering within Azure which does not have access to the internal metadata endpoint. Metadata about an App Service can however be retrieved from environment variables
Cloud metadata field | Environment variable |
---|---|
account.id |
first part of WEBSITE_OWNER_NAME , split by + |
instance.id |
WEBSITE_INSTANCE_ID |
instance.name |
WEBSITE_SITE_NAME |
project.name |
WEBSITE_RESOURCE_GROUP |
provider |
azure |
region |
last part of WEBSITE_OWNER_NAME , split by - , trim end "webspace" and anything following |
The environment variable WEBSITE_OWNER_NAME
has the form
{subscription id}+{app service plan resource group}-{region}webspace{.*}
an example of which is f5940f10-2e30-3e4d-a259-63451ba6dae4+elastic-apm-AustraliaEastwebspace
Cloud metadata for Azure App Services is optional; it is up to each agent to determine whether it is useful to implement for their language ecosystem. See azure_app_service_metadata specs for scenarios and expected outcomes.
Azure Functions running within a consumption/premium plan (see Azure Functions hosting options) are a FaaS offering within Azure that do not have access to the internal Azure metadata endpoint. Metadata about an Azure Function can however be retrieved from environment variables. Note: These environment variables slightly differ from those available to Azure App Services.
Cloud metadata field | Environment variable |
---|---|
account.id |
Token {subscription id} from WEBSITE_OWNER_NAME |
instance.name |
WEBSITE_SITE_NAME |
project.name |
WEBSITE_RESOURCE_GROUP (fallback: {resource group} from WEBSITE_OWNER_NAME ) |
provider |
azure |
region |
REGION_NAME (fallback: {region} from WEBSITE_OWNER_NAME ) |
service.name |
functions see the ECS fields doc. |
The environment variable WEBSITE_OWNER_NAME
has the following form:
{subscription id}+{resource group}-{region}webspace{.*}
Example: d2cd53b3-acdc-4964-9563-3f5201556a81+wolfgangfaas_group-CentralUSwebspace-Linux
Events sent by the agents can have labels associated, which may be useful for custom aggregations, or document-level access control. It is possible to add "global labels" to the metadata, which are labels that will be applied to all events sent by an agent. These are only understood by APM Server 7.2 or greater.
Global labels can be specified via the environment variable ELASTIC_APM_GLOBAL_LABELS
, formatted as a comma-separated
list of key=value
pairs.