docs for txobservers #945

Merged
merged 1 commit into from
Aug 12, 2020
292 changes: 134 additions & 158 deletions docs/vNext/MonitorsAndObservers.md
@@ -9,107 +9,90 @@ order: 5
## Table of Contents

* [Overview](#overview)
* [Monitors](#monitors)
* [Resource](#resource)
* [Process monitor](#process-monitor)
* [Docker monitor](#docker-monitor)
* [Prometheus monitor](#prometheus-monitor)
* [Observers](#observers)
* [Null observer](#null-observer)
* [Local observer](#local-observer)
* [Prometheus observer](#prometheus-observer)
* [Grafana](#grafana-visualization)
* [Transaction](#transaction)
* [Logging](#logging)
* [Prometheus](#prometheus)
* [Resource Charting](#resource-charting)
* [Process charting](#process-charting)
* [Docker charting](#docker-charting)
* [Prometheus charting](#prometheus-charting)


## Overview
Caliper monitors are used to collect statistics on resource utilization during benchmarking. The statistics are collated into a report at the culmination of the benchmark process, and rendered charts may also be output as part of the report. Caliper also enables real time reporting of current transaction status through observers, or enhanced data visualization using Prometheus and Grafana.
Caliper monitoring modules are used to collect resource utilization and transaction statistics during test execution, with the output being collated into the generated reports. Caliper monitors resources and transactions using:
- Resource monitors. Collect statistics on resource utilization during benchmarking, with monitoring reset between test rounds.
- Transaction monitors. Collect worker transaction statistics and provide conditional dispatch actions.

The operational precision of the monitors is set through the default Caliper configuration file, and may be overridden by the user to increase or decrease the numeric precision used in the output reports.

## Monitors
The type of monitoring to be performed during a benchmark is declared in the `benchmark configuration file` through the specification of one or more monitor types in an array under the label `monitor.type`. The interval at which monitors fetch information from their targets is specified as an integer number of seconds under the label `monitor.interval`.
## Resource
The type of resource monitor to be used within a Caliper benchmark is declared in the `benchmark configuration file` through the specification of one or more monitoring modules in an array under the label `monitors.resource`.

Permitted monitors are:
- **None:** The `none` monitor declares that no monitors are to be used during the benchmark.
- **Process:** The `process` monitor enables monitoring of a named process on the host machine, and is most typically used to monitor the resources consumed by the running clients. This monitor will retrieve statistics on: [memory(max), memory(avg), CPU(max), CPU(avg), Network I/O, Disc I/O]
- **Docker:** The `docker` monitor enables monitoring of specified Docker containers on the host or a remote machine, through using the Docker Remote API to retrieve container statistics. This monitor will retrieve statistics on: [memory(max), memory(avg), CPU(max), CPU(avg), Network I/O, Disc I/O]
- **Prometheus:** The `prometheus` monitor enables the retrieval of data from Prometheus. This monitor will only report based on explicit user provided queries that are issued to Prometheus. If defined, the provision of a Prometheus server will cause Caliper to default to using the Prometheus PushGateway.

The following declares the use of no monitors:
```
monitor:
    type:
    - none
```

The following declares the use of docker, process and prometheus monitors:
```
monitor:
    type:
    - docker
    - process
    - prometheus
```

Each declared monitor must be accompanied by a block that describes the required configuration of the monitor.
Each declared resource monitoring module is accompanied by the options required to configure it. A common option for all modules is `interval`, which configures the refresh interval, in seconds, at which resource utilization is measured by the monitor.

### Process Monitor
The process monitor definition consists of an array of `[command, arguments, multiOutput]` key:value pairs.
- command: names the parent process to monitor
- arguments: filters on the parent process being monitored
- multiOutput: enables handling of the discovery of multiple processes and may be one of:
  - avg: take the average of process values discovered under `command/name`
  - sum: sum all process values discovered under `command/name`

The following declares the monitoring of all local `node` processes that match `fabricClientWorker.js`, with the average of all discovered processes being taken.
The process monitoring module options comprise:
- interval: monitor update interval
- processes: an array of `[command, arguments, multiOutput]` key:value pairs.
  - command: names the parent process to monitor
  - arguments: filters on the parent process being monitored
  - multiOutput: enables handling of the discovery of multiple processes and may be one of:
    - avg: take the average of process values discovered under `command/name`
    - sum: sum all process values discovered under `command/name`

The following declares the monitoring of all local `node` processes that match `caliper.js`, with a 3 second update frequency, and the average of all discovered processes being taken.
```
monitor:
    type:
    - process
    process:
        processes:
        - command: node
          arguments: fabricClientWorker.js
          multiOutput: avg
monitors:
    resource:
    - module: process
      options:
        interval: 3
        processes: [{ command: 'node', arguments: 'caliper.js', multiOutput: 'avg' }]
```
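
As a rough illustration of the two `multiOutput` modes, the sketch below combines readings from several matching processes in the way the option is described above. It is written in Python purely for illustration: `combine_process_values` and the sample readings are hypothetical and are not part of Caliper.

```python
from statistics import mean


def combine_process_values(values, multi_output="avg"):
    """Collapse readings from several matching processes into one value.

    Hypothetical helper: mirrors the documented `avg` and `sum` modes only.
    """
    if not values:
        raise ValueError("no matching processes were discovered")
    if multi_output == "avg":
        return mean(values)
    if multi_output == "sum":
        return sum(values)
    raise ValueError(f"unsupported multiOutput mode: {multi_output}")


# Hypothetical memory readings (MB) from three `node caliper.js` processes.
readings = [100.0, 150.0, 200.0]
print(combine_process_values(readings, "avg"))  # 150.0
print(combine_process_values(readings, "sum"))  # 450.0
```

With `avg` the report reflects a typical worker process, while `sum` reflects the total footprint of all matching processes.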
### Docker Monitor
The docker monitor definition consists of an array of container names that may relate to local or remote docker containers that are listed under a name label. If all local docker containers are to be monitored, this may be achieved by providing `all` as a name.
The docker monitoring module options comprise:
- interval: monitor update interval
- containers: an array of container names that may relate to local or remote docker containers to be monitored. If all **local** docker containers are to be monitored, this may be achieved by providing `all` as a name.

The following declares the monitoring of two named docker containers; one local and the other remote.
The following declares the monitoring of two named docker containers, one local and the other remote, with a 5 second update frequency:
```
monitor:
    type:
    - docker
    docker:
        containers:
        - peer0.org1.example.com
        - http://192.168.1.100:2375/orderer.example.com
monitors:
    resource:
    - module: docker
      options:
        interval: 5
        containers:
        - peer0.org1.example.com
        - http://192.168.1.100:2375/orderer.example.com
```

The following declares the monitoring of all local docker containers:
The following declares the monitoring of all local docker containers, with a 5 second update frequency:
```
monitor:
    type:
    - docker
    docker:
        containers:
        - all
monitors:
    resource:
    - module: docker
      options:
        interval: 5
        containers:
        - all
```

### Prometheus Monitor
[Prometheus](https://prometheus.io/docs/introduction/overview/) is an open-source systems monitoring and alerting toolkit that scrapes metrics from instrumented jobs, either directly or via an intermediary push gateway for short-lived jobs. It stores all scraped samples locally and runs rules over this data to either aggregate and record new time series from existing data or generate alerts. [Grafana](https://grafana.com/) or other API consumers can be used to visualize the collected data.

Caliper clients may use a Prometheus [PushGateway](https://prometheus.io/docs/practices/pushing/) in order to publish transaction statistics, as an alternative to reporting statistics locally. By doing so we enable all Caliper clients to publish to a remote URL and gain access to the capabilities of Prometheus, which includes the ability to scrape data from configured targets. If a Prometheus monitor is specified, Caliper will default to publishing all transaction statistics to the Prometheus PushGateway.

All data stored on Prometheus may be queried by Caliper using the Prometheus query [HTTP API](https://prometheus.io/docs/prometheus/latest/querying/api/). At a minimum this may be used to perform aggregate queries in order to report back the transaction statistics, though it is also possible to perform custom queries in order to report back information that has been scraped from other connected sources. Queries issued are intended to generate reports and so are expected to result in either a single value, or a vector that can be condensed into a single value through the application of a statistical routine. It is advisable to create required queries using Grafana to ensure correct operation before transferring the query into the monitor. Please see [Prometheus](https://prometheus.io) and [Grafana](https://grafana.com/grafana) documentation for more information.
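
The idea of condensing a query result into a single reported value can be sketched as follows. This is a minimal Python illustration under assumptions, not Caliper's implementation: `condense_series` is a hypothetical helper, the input is assumed to be already grouped by label value, and the `ignore`, `statistic` and `multiplier` arguments mirror the options of the same names used in the configuration examples in this section. Only the `avg` and `max` statistics that appear in those examples are modelled.

```python
from statistics import mean

# Only the statistics shown in the configuration examples are modelled here.
STATISTICS = {"avg": mean, "max": max}


def condense_series(series, statistic="avg", multiplier=1.0, ignore=()):
    """Reduce each labelled series to one value, skipping ignored labels.

    Hypothetical helper: `series` maps a label value (for example a
    container name) to the list of samples returned for it.
    """
    reduce_fn = STATISTICS[statistic]
    return {
        label: reduce_fn(values) * multiplier
        for label, values in series.items()
        if label not in ignore and values
    }


# Hypothetical container memory samples (bytes), keyed by the `name` label.
samples = {
    "peer0.org1.example.com": [120_000_000, 180_000_000, 150_000_000],
    "cadvisor": [40_000_000, 42_000_000],
}
report = condense_series(
    samples, statistic="max", multiplier=0.000001, ignore=["cadvisor"]
)
print(report)  # one entry, for the peer only, expressed in MB
```

Here the multiplier plays the same role as in the "Max Memory (MB)" example: it rescales raw bytes into the unit named in the report heading.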

#### Configuring The Prometheus Monitor
The prometheus monitor definition consists of:
The prometheus monitoring module options comprise:
- interval: monitor update interval
- url: The Prometheus URL, used for direct queries
- push_url: The Prometheus Push Gateway URL
- metrics: The queries to be run for inclusion within the Caliper report, comprising two keys: `ignore` and `include`.
  - `ignore`: a string array that is used as a blacklist for report results. Any results where the component label matches an item in the list will *not* be included in a generated report.
  - `include`: a series of blocks that describe the queries that are to be run at the end of each Caliper test.
@@ -127,26 +110,26 @@ The `include` block is defined by:

The following declares a Prometheus monitor that will run two bespoke queries between each test within the benchmark:
```
monitor:
    type:
    - prometheus
    prometheus:
        url: "http://localhost:9090"
        push_url: "http://localhost:9091"
        metrics:
            ignore: [prometheus, pushGateway, cadvisor, grafana, node-exporter]
            include:
                Endorse Time (s):
                    query: rate(endorser_proposal_duration_sum{chaincode="marbles:v0"}[5m])/rate(endorser_proposal_duration_count{chaincode="marbles:v0"}[5m])
                    step: 1
                    label: instance
                    statistic: avg
                Max Memory (MB):
                    query: sum(container_memory_rss{name=~".+"}) by (name)
                    step: 10
                    label: name
                    statistic: max
                    multiplier: 0.000001
monitors:
    resource:
    - module: prometheus
      options:
        interval: 5
        url: "http://localhost:9090"
        metrics:
            ignore: [prometheus, pushGateway, cadvisor, grafana, node-exporter]
            include:
                Endorse Time (s):
                    query: rate(endorser_proposal_duration_sum{chaincode="marbles:v0"}[1m])/rate(endorser_proposal_duration_count{chaincode="marbles:v0"}[1m])
                    step: 1
                    label: instance
                    statistic: avg
                Max Memory (MB):
                    query: sum(container_memory_rss{name=~".+"}) by (name)
                    step: 10
                    label: name
                    statistic: max
                    multiplier: 0.000001
```
The two queries above will be listed in the generated report as "Endorse Time (s)" and "Max Memory (MB)" respectively:
- **Endorse Time (s):** Runs the listed query with a step size of 1; filters on return tags using the `instance` label; excludes the result if the instance value matches any of the string values provided in the `ignore` array; if the instance does not match an `ignore` entry, determines the average of all returned results and reports this value under "Endorse Time (s)".
@@ -156,44 +139,41 @@ The two queries above will be listed in the generated report as "Endorse Time (s
#### Obtaining a Prometheus Enabled Network
A sample network that includes a docker-compose file for standing up a Prometheus server, a Prometheus PushGateway and a linked Grafana analytics container, is available within the companion [caliper-benchmarks repository](https://github.com/hyperledger/caliper-benchmarks/tree/master/networks/prometheus-grafana).

## Transaction
Transaction monitors are used by Caliper workers to act on the completion of transactions. They are used internally to aggregate and dispatch transaction statistics to the manager process to enable transaction statistics aggregation for progress reporting via the default observer, and report generation.

## Observers
The type of observer to use during a benchmark is declared in the `benchmark configuration file` through the specification of a supported observer type under the label `observer.type`. The interval at which observers fetch information from their targets is specified as an integer number of seconds under the label `observer.interval`; this is a required property for local and prometheus observers.
The default observer, used for progress reporting by consuming information from the internal transaction monitor, may be updated through configuration file settings:
- `caliper-progress-reporting-enabled`: boolean flag to enable progress reporting, default true
- `caliper-progress-reporting-interval`: numeric value to set the update frequency, in milliseconds (default 5000)

Permitted observers are:
- none
- local
- prometheus
Additional transaction monitoring modules include:
- logging
- prometheus-push

### None Observer
A `none` observer is used to ignore all transaction submissions of all clients. The following specifies the use of a none observer that will omit the console display of any transaction statistics during the benchmark process.
One or more transaction modules may be specified by naming them as modules with an accompanying options block in an array format under `monitors.transaction`.

```
observer:
    type: none
```

### Local Observer
A `local` observer is used to view current transaction submissions of all clients on a local host machine. The following specifies the use of a local observer that collects and reports current transaction status at 1 second intervals.
### Logging
The `logging` transaction module is used to log aggregated transaction statistics at the completion of a test round, within the worker. The following specifies the use of a `logging` transaction observer. No options are required by the module.

```
observer:
    type: local
    interval: 1
monitors:
    transaction:
    - module: logging
```

If a Prometheus monitor is in use, then a Prometheus observer should also be used.

### Prometheus Observer
A `prometheus` observer is used to view current transaction submissions of all clients that are reporting transactions to a Prometheus server. The following specifies the use of a Prometheus observer that collects and reports current transaction status at 5 second intervals.
### Prometheus
The `prometheus-push` transaction module is used to dispatch current transaction submissions of all clients to a Prometheus server, via a push gateway. The following specifies the use of a `prometheus-push` transaction module that sends current transaction statistics to a push gateway located at `http://localhost:9091` at 5 second intervals.

```
observer:
    type: prometheus
    interval: 5
monitors:
    transaction:
    - module: prometheus-push
      options:
        interval: 5
        push_url: "http://localhost:9091"
```

Use of a Prometheus observer is predicated on the availability and use of a Prometheus monitor. The observer will extract required URL information from the relevant sections under the Prometheus monitor specification.
Use of a `prometheus-push` transaction module is predicated on the availability and use of a Prometheus monitor.
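
For readers unfamiliar with the push gateway, the sketch below shows the general shape of a Pushgateway request: samples in the Prometheus text exposition format are sent to a URL of the form `<push_url>/metrics/job/<job>/<label>/<value>`. It is a Python illustration of that HTTP convention only and does not reproduce how Caliper itself pushes statistics; `build_push_request`, the job name, the metric name and the grouping label are all hypothetical.

```python
from urllib.parse import quote


def build_push_request(push_url, job, metrics, labels=None):
    """Return the (url, body) pair for a Pushgateway push; nothing is sent.

    Hypothetical helper that only assembles the request.
    """
    url = f"{push_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    # Grouping labels are appended to the path as /<name>/<value> pairs.
    for name, value in (labels or {}).items():
        url += f"/{quote(name, safe='')}/{quote(str(value), safe='')}"
    # Text exposition format: one `name value` sample per line.
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    return url, body


url, body = build_push_request(
    "http://localhost:9091",
    job="caliper-worker",                  # hypothetical job name
    metrics={"example_tx_submitted": 42},  # hypothetical metric name
    labels={"instance": "worker-0"},       # hypothetical grouping label
)
print(url)   # http://localhost:9091/metrics/job/caliper-worker/instance/worker-0
print(body)  # example_tx_submitted 42
```

The resulting pair would typically be sent with an HTTP `POST` or `PUT`, after which a Prometheus server scraping the gateway can collect the samples.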

### Grafana Visualization
Grafana is an analytics platform that may be used to query and visualize metrics collected by Prometheus. Caliper clients will send the following to the PushGateway:
@@ -239,19 +219,14 @@ charting:
### Process Charting
The process resource monitor exposes the following metrics: Memory(max), Memory(avg), CPU%(max), CPU%(avg).

The following declares the monitoring of any running processes named `fabricClientWorker.js` and `runBenchmarkCommand.js`, with charting options specified to produce bar charts for `all` available metrics. Charts will be produced containing data from all monitored processes:
The following declares the monitoring of any running processes named `caliper.js`, with charting options specified to produce bar charts for `all` available metrics. Charts will be produced containing data from all monitored processes:
```
monitor:
    type:
    - process
    process:
        processes:
        - command: node
          arguments: fabricClientWorker.js
          multiOutput: avg
        - command: node
          arguments: runBenchmarkCommand.js
          multiOutput: avg
monitors:
    resource:
    - module: process
      options:
        interval: 3
        processes: [{ command: 'node', arguments: 'caliper.js', multiOutput: 'avg' }]
        charting:
          bar:
            metrics: [all]
@@ -261,49 +236,50 @@ The docker resource monitor exposes the following metrics: Memory(max), Memory(a

The following declares the monitoring of all local docker containers, with charting options specified to produce bar charts for `Memory(avg)` and `CPU%(avg)`, and polar charts for `all` metrics. Charts will be produced containing data from all monitored containers:
```
monitor:
    type:
    - docker
    docker:
        containers:
        - all
monitors:
    resource:
    - module: docker
      options:
        interval: 5
        containers:
        - all
        charting:
            bar:
                metrics: [Memory(avg), CPU%(avg)]
            polar:
                metrics: [all]
          bar:
            metrics: [Memory(avg), CPU%(avg)]
          polar:
            metrics: [all]
```

### Prometheus Charting
The Prometheus monitor enables user definition of all metrics within the configuration file.

The following declares the monitoring of two user defined metrics, `Endorse Time (s)` and `Max Memory (MB)`. Charting options are specified to produce polar charts filtered on the metric `Max Memory (MB)`, and bar charts of all user defined metrics.
```
monitor:
    type:
    - prometheus
    prometheus:
        push_url: "http://localhost:9091"
        url: "http://localhost:9090"
        metrics:
            ignore: [prometheus, pushGateway, cadvisor, grafana, node-exporter]
            include:
                Endorse Time(s):
                    query: rate(endorser_proposal_duration_sum{chaincode="marbles:v0"}[5m])/rate(endorser_proposal_duration_count{chaincode="marbles:v0"}[5m])
                    step: 1
                    label: instance
                    statistic: avg
                Max Memory(MB):
                    query: sum(container_memory_rss{name=~".+"}) by (name)
                    step: 10
                    label: name
                    statistic: max
                    multiplier: 0.000001
        charting:
            polar:
                metrics: [Max Memory (MB)]
            bar:
                metrics: [all]
monitors:
    resource:
    - module: prometheus
      options:
        interval: 5
        url: "http://localhost:9090"
        metrics:
            ignore: [prometheus, pushGateway, cadvisor, grafana, node-exporter]
            include:
                Endorse Time (s):
                    query: rate(endorser_proposal_duration_sum{chaincode="marbles:v0"}[1m])/rate(endorser_proposal_duration_count{chaincode="marbles:v0"}[1m])
                    step: 1
                    label: instance
                    statistic: avg
                Max Memory (MB):
                    query: sum(container_memory_rss{name=~".+"}) by (name)
                    step: 10
                    label: name
                    statistic: max
                    multiplier: 0.000001
        charting:
          polar:
            metrics: [Max Memory (MB)]
          bar:
            metrics: [all]
```

## License