Document Antrea component metrics
Add the list of metrics which are exposed by Antrea agent and controller
in the integration doc for reference.
ksamoray committed Jun 1, 2020
1 parent 7a04cf3 commit 4a7264e
Showing 1 changed file with 160 additions and 15 deletions.
175 changes: 160 additions & 15 deletions docs/prometheus-integration.md
@@ -49,21 +49,6 @@ rules:
verbs: ["get"]
```
### Antrea Metrics Listener Access
To scrape the metrics from the Antrea Controller and Agent, Prometheus needs the
following permissions:
```yaml
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: prometheus-antrea
rules:
- nonResourceURLs:
  - /metrics
  verbs:
  - get
```
### Antrea Components Scraping configuration
Add the following jobs to the Prometheus scraping configuration to enable metrics
collection from Antrea components:
@@ -106,3 +91,163 @@ The configuration file above can be used to deploy Prometheus Server with
scraping configuration for Antrea services.
To deploy this configuration, run
`kubectl apply -f build/yamls/antrea-prometheus.yml`.

# Antrea Prometheus Metrics
Antrea Controller and Agents expose various metrics. Some are provided by the
Antrea components themselves, and others are provided by third-party components
that the Antrea components use.

Below is a list of the metrics provided by the Antrea components and by third
parties.

## Antrea Controller Metrics
**antrea_controller_address_group_processed:** The total number of
address-groups processed.
**antrea_controller_address_group_sync_duration_milliseconds:** The duration
of syncing an address-group, in milliseconds.
**antrea_controller_applied_to_group_processed:** The total number of
applied-to-groups processed.
**antrea_controller_applied_to_group_sync_duration_milliseconds:** The
duration of syncing an applied-to-group, in milliseconds.
**antrea_controller_length_address_group_queue:** The length of
AddressGroupQueue.
**antrea_controller_length_applied_to_group_queue:** The length of
AppliedToGroupQueue.
**antrea_controller_length_network_policy_queue:** The length of
InternalNetworkPolicyQueue.
**antrea_controller_network_policy_processed:** The total number of
internal-networkpolicies processed.
**antrea_controller_network_policy_sync_duration_milliseconds:** The duration
of syncing an internal-networkpolicy, in milliseconds.
**antrea_controller_runtime_info:** Antrea controller runtime info, defined as
labels. The value of the gauge is always set to 1.

## Antrea Agent Metrics
**antrea_agent_local_pod_count:** Number of pods on local node which are
managed by the Antrea Agent.
**antrea_agent_ovs_flow_table:** OVS flow table flow count.
**antrea_agent_runtime_info:** Antrea agent runtime info, defined as labels.
The value of the gauge is always set to 1.
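
The agent metrics above can be read directly from a scrape of the `/metrics`
endpoint. The following is a minimal sketch, not part of Antrea, that parses a
small sample in the Prometheus text exposition format, sums the
`antrea_agent_ovs_flow_table` samples, and reads the labels of
`antrea_agent_runtime_info` (whose value is always 1, so the information lives
in the labels). The label names (`table_id`, `k8s_nodename`, `k8s_podname`) and
all values in the sample are assumptions made for illustration, and the parser
is deliberately simplified: it ignores timestamps and does not unescape label
values.

```python
import re

# Hypothetical scrape output. Label names and values are assumptions made for
# illustration; check the real /metrics output of your deployment.
SAMPLE = '''\
# TYPE antrea_agent_local_pod_count gauge
antrea_agent_local_pod_count 3
# TYPE antrea_agent_ovs_flow_table gauge
antrea_agent_ovs_flow_table{table_id="0"} 12
antrea_agent_ovs_flow_table{table_id="10"} 7
# TYPE antrea_agent_runtime_info gauge
antrea_agent_runtime_info{k8s_nodename="node-1",k8s_podname="antrea-agent-abcde"} 1
'''

# One sample line: metric name, optional {labels}, then a single value token.
SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
# One label pair inside the braces: name="value".
LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith('#'):
            continue
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # Unsupported line shape (for example, with a timestamp).
        name, raw_labels, value = match.groups()
        labels = dict(LABEL.findall(raw_labels or ''))
        samples.append((name, labels, float(value)))
    return samples


samples = parse_metrics(SAMPLE)

# Sum the per-table flow counts to get the total number of OVS flows.
total_flows = sum(
    value for name, _, value in samples
    if name == 'antrea_agent_ovs_flow_table'
)

# For the runtime info gauge, the labels carry the information.
runtime_info = next(
    labels for name, labels, _ in samples
    if name == 'antrea_agent_runtime_info'
)

print(total_flows)
print(runtime_info)
```

In a real deployment the same aggregation would normally be done with a
Prometheus query rather than by hand; the sketch is only meant to show how the
samples and labels are laid out.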

## Common Metrics Provided by Infrastructure
### Apiserver Metrics
**apiserver_audit_event_total:** Counter of audit events generated and sent to
the audit backend.
**apiserver_audit_requests_rejected_total:** Counter of apiserver requests
rejected due to an error in audit logging backend.
**apiserver_client_certificate_expiration_seconds:** Distribution of the
remaining lifetime on the certificate used to authenticate a request.
**apiserver_current_inflight_requests:** Maximal number of currently used
inflight request limit of this apiserver per request kind in last second.
**apiserver_longrunning_gauge:** Gauge of all active long-running apiserver
requests broken out by verb, group, version, resource, scope and component. Not
all requests are tracked this way.
**apiserver_registered_watchers:** Number of currently registered watchers for
a given resource.
**apiserver_request_count:** (Deprecated) Counter of apiserver requests broken
out for each verb, group, version, resource, scope, component, client, and HTTP
response contentType and code.
**apiserver_request_duration_seconds:** Response latency distribution in
seconds for each verb, dry run value, group, version, resource, subresource,
scope and component.
**apiserver_request_latencies:** (Deprecated) Response latency distribution in
microseconds for each verb, group, version, resource, subresource, scope and
component.
**apiserver_request_latencies_summary:** (Deprecated) Response latency summary
in microseconds for each verb, group, version, resource, subresource, scope and
component.
**apiserver_request_total:** Counter of apiserver requests broken out for each
verb, dry run value, group, version, resource, scope, component, client, and
HTTP response contentType and code.
**apiserver_response_sizes:** Response size distribution in bytes for each
group, version, verb, resource, subresource, scope and component.
**apiserver_storage_data_key_generation_duration_seconds:** Latencies in
seconds of data encryption key (DEK) generation operations.
**apiserver_storage_data_key_generation_failures_total:** Total number of
failed data encryption key (DEK) generation operations.
**apiserver_storage_data_key_generation_latencies_microseconds:** (Deprecated)
Latencies in microseconds of data encryption key (DEK) generation operations.
**apiserver_storage_envelope_transformation_cache_misses_total:** Total number
of cache misses while accessing key decryption key (KEK).
### Authenticated Metrics
**authenticated_user_requests:** Counter of authenticated requests broken out
by username.
### Etcd Metrics
**etcd_helper_cache_entry_count:** (Deprecated) Counter of etcd helper cache
entries. This can be different from etcd_helper_cache_miss_count because two
concurrent threads can miss the cache and generate the same entry twice.
**etcd_helper_cache_entry_total:** Counter of etcd helper cache entries. This
can be different from etcd_helper_cache_miss_count because two concurrent
threads can miss the cache and generate the same entry twice.
**etcd_helper_cache_hit_count:** (Deprecated) Counter of etcd helper cache
hits.
**etcd_helper_cache_hit_total:** Counter of etcd helper cache hits.
**etcd_helper_cache_miss_count:** (Deprecated) Counter of etcd helper cache
misses.
**etcd_helper_cache_miss_total:** Counter of etcd helper cache misses.
**etcd_request_cache_add_duration_seconds:** Latency in seconds of adding an
object to etcd cache.
**etcd_request_cache_add_latencies_summary:** (Deprecated) Latency in
microseconds of adding an object to etcd cache.
**etcd_request_cache_get_duration_seconds:** Latency in seconds of getting an
object from etcd cache.
**etcd_request_cache_get_latencies_summary:** (Deprecated) Latency in
microseconds of getting an object from etcd cache.
### Go Metrics
**go_gc_duration_seconds:** A summary of the GC invocation durations.
**go_goroutines:** Number of goroutines that currently exist.
**go_info:** Information about the Go environment.
**go_memstats_alloc_bytes:** Number of bytes allocated and still in use.
**go_memstats_alloc_bytes_total:** Total number of bytes allocated, even if
freed.
**go_memstats_buck_hash_sys_bytes:** Number of bytes used by the profiling
bucket hash table.
**go_memstats_frees_total:** Total number of frees.
**go_memstats_gc_cpu_fraction:** The fraction of this program's available CPU
time used by the GC since the program started.
**go_memstats_gc_sys_bytes:** Number of bytes used for garbage collection
system metadata.
**go_memstats_heap_alloc_bytes:** Number of heap bytes allocated and still in
use.
**go_memstats_heap_idle_bytes:** Number of heap bytes waiting to be used.
**go_memstats_heap_inuse_bytes:** Number of heap bytes that are in use.
**go_memstats_heap_objects:** Number of allocated objects.
**go_memstats_heap_released_bytes:** Number of heap bytes released to OS.
**go_memstats_heap_sys_bytes:** Number of heap bytes obtained from system.
**go_memstats_last_gc_time_seconds:** Number of seconds since 1970 of last
garbage collection.
**go_memstats_lookups_total:** Total number of pointer lookups.
**go_memstats_mallocs_total:** Total number of mallocs.
**go_memstats_mcache_inuse_bytes:** Number of bytes in use by mcache
structures.
**go_memstats_mcache_sys_bytes:** Number of bytes used for mcache structures
obtained from system.
**go_memstats_mspan_inuse_bytes:** Number of bytes in use by mspan structures.
**go_memstats_mspan_sys_bytes:** Number of bytes used for mspan structures
obtained from system.
**go_memstats_next_gc_bytes:** Number of heap bytes when next garbage
collection will take place.
**go_memstats_other_sys_bytes:** Number of bytes used for other system
allocations.
**go_memstats_stack_inuse_bytes:** Number of bytes in use by the stack
allocator.
**go_memstats_stack_sys_bytes:** Number of bytes obtained from system for
stack allocator.
**go_memstats_sys_bytes:** Number of bytes obtained from system.
**go_threads:** Number of OS threads created.
### HTTP Metrics
**http_request_duration_microseconds:** The HTTP request latencies in
microseconds.
**http_request_size_bytes:** The HTTP request sizes in bytes.
**http_requests_total:** Total number of HTTP requests made.
**http_response_size_bytes:** The HTTP response sizes in bytes.
### Process Metrics
**process_cpu_seconds_total:** Total user and system CPU time spent in
seconds.
**process_max_fds:** Maximum number of open file descriptors.
**process_open_fds:** Number of open file descriptors.
**process_resident_memory_bytes:** Resident memory size in bytes.
**process_start_time_seconds:** Start time of the process since unix epoch in
seconds.
**process_virtual_memory_bytes:** Virtual memory size in bytes.
**process_virtual_memory_max_bytes:** Maximum amount of virtual memory
available in bytes.
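
The process metrics are most useful in combination. As a minimal sketch, the
following derives two common health indicators from them: the fraction of the
file descriptor limit in use (`process_open_fds` divided by `process_max_fds`)
and the process uptime (the current time minus `process_start_time_seconds`).
The function names and the sample values are hypothetical and chosen only for
illustration.

```python
def fd_utilization(open_fds, max_fds):
    """Fraction of the file descriptor limit currently in use.

    open_fds corresponds to process_open_fds and max_fds to process_max_fds.
    """
    if max_fds <= 0:
        raise ValueError("max_fds must be positive")
    return open_fds / max_fds


def uptime_seconds(start_time_seconds, now_seconds):
    """Process uptime, given process_start_time_seconds and the current time.

    Both arguments are seconds since the Unix epoch.
    """
    return now_seconds - start_time_seconds


# Hypothetical sample values, not taken from a real Antrea deployment.
print(fd_utilization(256, 1024))
print(uptime_seconds(1590000000.0, 1590003600.0))
```

A utilization close to 1 suggests the process is about to run out of file
descriptors, and an unexpectedly small uptime suggests a recent restart.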
