Add docs for Prometheus #589

Merged
20 changes: 2 additions & 18 deletions docs/vNext/Architecture.md
@@ -82,16 +82,8 @@ monitor:
```
* **test** - defines the metadata of the test, as well as multiple test rounds with specified workload:
* **name&description** : human readable name and description of the benchmark, the value is used by the report generator to show in the testing report.
* **clients** : defines the client type as well as relevant arguments, the 'type' property must be 'local' or 'zookeeper'
* **clients** : defines the client type as well as relevant arguments, the 'type' property must be 'local'
* local: In this case, local processes will be forked and act as blockchain clients. The number of forked clients should be defined by 'number' property.
* zookeeper: In this case, clients could be located on different machines and take tasks from master via zookeeper. Zookeeper server address as well as the number of simulated blockchain clients which launch locally by zookeeper client should be defined. A example of zookeeper configuration defined is as below:
```
"type": "zookeeper",
"zoo" : {
"server": "10.229.42.159:2181",
"clientsPerHost": 5
}
```
* **label** : hint for the test. For example, you can use the transaction name as the label name to tell which transaction is mainly used to test the performance. The value is also used as the context name for *blockchain.getContext()*. For example, developers may want to test performance of different Fabric channels, in that case, tests with different label can be bound to different fabric channels.
* **txNumber** : defines an array of sub-rounds with different transaction numbers to be run in each round. For example, [5000,400] means totally 5000 transactions will be generated in the first round and 400 will be generated in the second.
* **txDuration** : defines an array of sub-rounds with time based test runs. For example [150,400] means two runs will be made, the first test will run for 150 seconds, and the second will run for 400 seconds. If specified in addition to txNumber, the txDuration option will take precedence.
@@ -102,7 +94,7 @@ monitor:
* **monitor** - defines the type of resource monitors and monitored objects, as well as the time interval for the monitoring.
* docker : a docker monitor is used to monitor specified docker containers on local or remote hosts. Docker Remote API is used to retrieve remote container's stats. Reserved container name 'all' means all containers on the host will be watched. In above example, the monitor will retrieve the stats of two containers per second, one is a local container named 'peer0.org1.example.com' and another is a remote container named 'orderer.example.com' located on host '192.168.1.100', 2375 is the listening port of Docker on that host.
* process : a process monitor is used to monitor specified local process. For example, users can use this monitor to watch the resource consumption of simulated blockchain clients. The 'command' and 'arguments' properties are used to specify the processes. The 'multiOutput' property is used to define the meaning of the output if multiple processes are found. 'avg' means the output is the average resource consumption of those processes, while 'sum' means the output is the summing consumption.
* others : to be implemented.
* prometheus : a prometheus monitor uses a configured Prometheus server to publish and query metrics.

### Master

@@ -131,14 +123,6 @@ The client invokes a test module which implements user defined testing logic.The

A local client will only be launched once at beginning of the first test round, and be destroyed after finishing all the tests.

#### Zookeeper Clients

In this mode, multiple zookeeper clients are launched independently. A zookeeper client will register itself after launch and watch for testing tasks. After testing, a znode which contains the result of performance statistics will be created.

A zookeeper client also forks multiple child processes (local clients) to do the actual testing work as described above.

For more details, please refer to [Zookeper Client Design](./Zookeeper_Client_Design.md).

### User Defined Test Module

A test module implements functions that actually generate and submit transactions. By this way, developers can implement their own testing logic and integrate it with the benchmark engine.
2 changes: 1 addition & 1 deletion docs/vNext/CONTRIBUTING.md
@@ -38,7 +38,7 @@ This contains samples that may be run using the caliper-cli, and extended to inc
- network: contains blockchain (network) configuration files

### caliper-cli
This is the Caliper CLI that enables the running of a benchmark and interaction with zookeeper clients/services.
This is the Caliper CLI that enables the running of a benchmark.

### caliper-core
Contains all the Caliper core code. Interested developers can follow the code flow from the above `run-benchmark.js` file, that enters `caliper-flow.js` in the core package.
33 changes: 0 additions & 33 deletions docs/vNext/Getting_Started.md
@@ -74,39 +74,6 @@ Only one `only` flag is permitted to be supplied at a time, and may not be used
caliper benchmark run --caliper-workspace ./packages/caliper-samples --caliper-benchconfig benchmark/simple/config.yaml --caliper-networkconfig network/fabric-v1.4/2org1peercouchdb/fabric-node.yaml --caliper-flow-only-test
```

## Run Benchmark with Distributed Clients (Experimental)

In this way, multiple clients can be launched on distributed hosts to run the same benchmark.

1. Start the ZooKeeper service using the Caliper CLI:
```bash
caliper zooservice start
```
2. Launch a caliper-zoo-client on each target machine using the Caliper CLI:
```bash
caliper zooclient start -w ~/myCaliperProject -a <host-address>:<port> -n my-sut-config.yaml
```

3. Modify the client type setting in configuration file to 'zookeeper'.

Example:
```
"clients": {
"type": "zookeeper",
"zoo" : {
"server": "10.229.42.159:2181",
"clientsPerHost": 5
}
}
```

4. Launch the benchmark on any machine as usual.

> Note:
> * Zookeeper is used to register clients and exchange messages. A launched client will add a new znode under /caliper/clients/. The benchmark checks the directory to learn how many clients are there, and assign tasks to each client according to the workload.
> * There is no automatic time synchronization between the clients. You should manually synchronize time between target machines, for example using 'ntpdate'.
> * The blockchain configuration file must exist on machines which run the client, and the relative path (relative to the caliper folder) of the file must be identical. All referenced files in the configuration must also exist.

## How to Contribute

See [Contributing](./CONTRIBUTING.md)
1 change: 1 addition & 0 deletions docs/vNext/Logging_Control.md
@@ -3,6 +3,7 @@ layout: pageNext
title: "Logging Control"
categories: reference
permalink: /vNext/logging/
order: 6
---

Caliper builds on the [winston](https://github.com/winstonjs/winston) logger module to provide a flexible, multi-target logging mechanism. Using the logging functionality can be split into two aspects:
204 changes: 204 additions & 0 deletions docs/vNext/MonitorsAndObservers.md
@@ -0,0 +1,204 @@
---
layout: pageNext
title: "Monitors and Observers"
categories: reference
permalink: /vNext/caliper-monitors/
order: 5
---

## Table of Contents

* [Overview](#overview)
* [Monitors](#monitors)
  * [Process monitor](#process-monitor)
  * [Docker monitor](#docker-monitor)
  * [Prometheus monitor](#prometheus-monitor)
* [Observers](#observers)
  * [None observer](#none-observer)
  * [Local observer](#local-observer)
  * [Prometheus observer](#prometheus-observer)
  * [Grafana](#grafana-visualization)

## Overview
Caliper monitors collect statistics on resource utilization during benchmarking; these statistics are collated into a report at the end of the benchmark process. Caliper also enables real-time reporting of current transaction status through observers, or enhanced data visualization using Prometheus and Grafana.

## Monitors
The type of monitoring to be performed during a benchmark is declared in the `benchmark configuration file` by specifying one or more monitor types in an array under the label `monitor.type`. The interval at which monitors fetch information from their targets is specified in seconds, as an integer, under the label `monitor.interval`.

Permitted monitors are:
- **None:** The `none` monitor declares that no monitors are to be used during the benchmark.
- **Process:** The `process` monitor enables monitoring of a named process on the host machine, and is most typically used to monitor the resources consumed by the running clients. This monitor will retrieve statistics on: [memory(max), memory(avg), CPU(max), CPU(avg), Network I/O, Disc I/O]
- **Docker:** The `docker` monitor enables monitoring of specified Docker containers on the host or a remote machine, through using the Docker Remote API to retrieve container statistics. This monitor will retrieve statistics on: [memory(max), memory(avg), CPU(max), CPU(avg), Network I/O, Disc I/O]
- **Prometheus:** The `prometheus` monitor enables the retrieval of data from Prometheus. This monitor will only report based on explicit user provided queries that are issued to Prometheus. If defined, the provision of a Prometheus server will cause Caliper to default to using the Prometheus PushGateway.

The following declares the use of no monitors:
```
monitor:
  type:
  - none
```

The following declares the use of docker, process and prometheus monitors:
```
monitor:
  type:
  - docker
  - process
  - prometheus
```

Each declared monitor must be accompanied by a block that describes the required configuration of the monitor.
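
As a rough illustration of this rule, the following Python sketch checks that every declared monitor type has a matching configuration block. The helper name `missing_monitor_blocks` and the dictionary layout are assumptions made for this example; they mirror the YAML shown in this document and are not part of Caliper itself.

```python
def missing_monitor_blocks(monitor_config):
    """Return declared monitor types that lack a configuration block.

    Hypothetical helper: it mirrors the YAML layout used in this document
    and is not Caliper's own validation code.
    """
    declared = monitor_config.get("type", [])
    # The `none` monitor disables monitoring, so it needs no block of its own.
    return [t for t in declared if t != "none" and t not in monitor_config]


# A parsed `monitor` section that declares two monitors but configures one.
config = {
    "type": ["docker", "process"],
    "interval": 1,
    "docker": {"name": ["all"]},
}
print(missing_monitor_blocks(config))  # prints ['process']
```

A check of this kind could be run over a parsed benchmark configuration before starting a benchmark, so that a missing block is caught early.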

### Process Monitor
The process monitor definition consists of an array of `[command, arguments, multiOutput]` key:value pairs.
- command: names the parent process to monitor
- arguments: filters on the parent process being monitored
- multiOutput: enables handling of the discovery of multiple processes and may be one of:
  - avg: take the average of process values discovered under `command/name`
  - sum: sum all process values discovered under `command/name`

The following declares the monitoring of all local `node` processes that match `fabricClientWorker.js`, with the average of all discovered processes being taken.
```
monitor:
  type:
  - process
  process:
  - command: node
    arguments: fabricClientWorker.js
    multiOutput: avg
```
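
The effect of `multiOutput` can be sketched in a few lines of Python. This is only an illustration of the `avg` and `sum` options described above; the function name `combine_process_values` and the sample readings are hypothetical and do not come from Caliper.

```python
from statistics import mean


def combine_process_values(values, multi_output="avg"):
    """Condense per-process readings into a single reported value.

    Hypothetical helper illustrating the `multiOutput` options.
    """
    if not values:
        raise ValueError("no matching processes found")
    if multi_output == "avg":
        return mean(values)
    if multi_output == "sum":
        return sum(values)
    raise ValueError(f"unsupported multiOutput: {multi_output}")


# Assumed CPU readings (percent) for three matching `node` processes.
cpu_percent = [10.0, 20.0, 30.0]
print(combine_process_values(cpu_percent, "avg"))  # prints 20.0
print(combine_process_values(cpu_percent, "sum"))  # prints 60.0
```
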
### Docker Monitor
The docker monitor definition consists of an array of container names, listed under a `name` label, that may refer to local or remote docker containers. To monitor all local docker containers, provide `all` as a name.

The following declares the monitoring of two named docker containers; one local and the other remote.
```
monitor:
  type:
  - docker
  docker:
    name:
    - peer0.org1.example.com
    - http://192.168.1.100:2375/orderer.example.com
```

The following declares the monitoring of all local docker containers:
```
monitor:
  type:
  - docker
  docker:
    name:
    - all
```
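
To make the difference between local and remote entries concrete, the sketch below splits a container name into its host, port and container parts using Python's standard `urllib.parse`. The function `split_container_entry` is a hypothetical helper written for this example; how Caliper itself parses these entries is not shown in this document.

```python
from urllib.parse import urlparse


def split_container_entry(entry):
    """Split a docker monitor name into (host, port, container).

    Hypothetical helper: local containers yield (None, None, name),
    remote entries are expected to look like http://host:port/container.
    """
    if "://" not in entry:
        return None, None, entry
    parsed = urlparse(entry)
    return parsed.hostname, parsed.port, parsed.path.lstrip("/")


print(split_container_entry("peer0.org1.example.com"))
print(split_container_entry("http://192.168.1.100:2375/orderer.example.com"))
```

The first call reports a local container, while the second yields the host `192.168.1.100`, the Docker port `2375` and the container name `orderer.example.com`.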

### Prometheus Monitor
[Prometheus](https://prometheus.io/docs/introduction/overview/) is an open-source systems monitoring and alerting toolkit that scrapes metrics from instrumented jobs, either directly or via an intermediary push gateway for short-lived jobs. It stores all scraped samples locally and runs rules over this data to either aggregate and record new time series from existing data or generate alerts. [Grafana](https://grafana.com/) or other API consumers can be used to visualize the collected data.

Caliper clients may use a Prometheus [PushGateway](https://prometheus.io/docs/practices/pushing/) in order to publish transaction statistics, as an alternative to reporting statistics locally. By doing so we enable all Caliper clients to publish to a remote URL and gain access to the capabilities of Prometheus, which includes the ability to scrape data from configured targets. If a Prometheus monitor is specified, Caliper will default to publishing all transaction statistics to the Prometheus PushGateway.

All data stored on Prometheus may be queried by Caliper using the Prometheus query [HTTP API](https://prometheus.io/docs/prometheus/latest/querying/api/). At a minimum this may be used to perform aggregate queries in order to report back the transaction statistics, though it is also possible to perform custom queries in order to report back information that has been scraped from other connected sources. Queries issued are intended to generate reports and so are expected to result in either a single value, or a vector that can be condensed into a single value through the application of a statistical routine. It is advisable to create required queries using Grafana to ensure correct operation before transferring the query into the monitor. Please see [Prometheus](https://prometheus.io) and [Grafana](https://grafana.com/grafana) documentation for more information.

#### Configuring The Prometheus Monitor
The prometheus monitor definition consists of:
- url: The Prometheus URL, used for direct queries
- push_url: The Prometheus Push Gateway URL
- metrics: The queries to be run for inclusion within the Caliper report, made up of two keys: `ignore` and `include`.
  - `ignore`: a string array that is used as a blacklist for report results. Any result where the component label matches an item in the list will *not* be included in a generated report.
  - `include`: a series of blocks that describe the queries that are to be run at the end of each Caliper test.

The `include` block is defined by:
- query: the query to be issued to the Prometheus server at the end of each test. Note that Caliper will add time bounding for the query so that only results pertaining to the test round are included.
- step: the timing step size to use within the range query
- label: a string to match on the returned query and used when populating the report
- statistic: if multiple values are returned, for instance if looking at a specific resource over a time range, the statistic will condense the values to a single result to enable reporting. Permitted options are:
  - max: return the maximum from all values
  - min: return the minimum from all values
  - avg: return the average from all values
- multiplier: An optional multiplier that may be used to convert exported metrics into a more convenient value (such as converting bytes to GB)
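
The way a query, its time bounds and its step fit together can be sketched against the Prometheus HTTP API, whose range query endpoint `/api/v1/query_range` accepts `query`, `start`, `end` and `step` parameters. The helper `build_range_query_url` and the timestamps below are assumptions for illustration; the exact request that Caliper builds may differ.

```python
from urllib.parse import urlencode


def build_range_query_url(base_url, query, start, end, step):
    """Build a Prometheus range query URL bounded to one test round.

    Hypothetical helper: `start` and `end` are Unix timestamps in seconds
    marking the beginning and end of the round, `step` is in seconds.
    """
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


# Assumed round boundaries: a 60 second round sampled every 10 seconds.
url = build_range_query_url(
    "http://localhost:9090",
    "sum(container_memory_rss) by (name)",
    start=1700000000,
    end=1700000060,
    step=10,
)
print(url)
```

Issuing a GET request to the resulting URL returns one series per label combination, each holding one sample per step within the round.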

The following declares a Prometheus monitor that will run two bespoke queries at the end of each test within the benchmark:
```
monitor:
  type:
  - prometheus
  prometheus:
    url: "http://localhost:9090"
    push_url: "http://localhost:9091"
    metrics:
      ignore: [prometheus, pushGateway, cadvisor, grafana, node-exporter]
      include:
        Endorse Time (s):
          query: rate(endorser_proposal_duration_sum{chaincode="marbles:v0"}[5m])/rate(endorser_proposal_duration_count{chaincode="marbles:v0"}[5m])
          step: 1
          label: instance
          statistic: avg
        Max Memory (MB):
          query: sum(container_memory_rss{name=~".+"}) by (name)
          step: 10
          label: name
          statistic: max
          multiplier: 0.000001
```
The two queries above will be listed in the generated report as "Endorse Time (s)" and "Max Memory (MB)" respectively:
- **Endorse Time (s):** Runs the listed query with a step size of 1; filters the returned results using the `instance` label; excludes any result whose `instance` value matches one of the strings in the `ignore` array; then determines the average of the remaining returned values and reports it under "Endorse Time (s)".
- **Max Memory (MB):** Runs the listed query with a step size of 10; filters the returned results using the `name` label; excludes any result whose `name` value matches one of the strings in the `ignore` array; then determines the maximum of the remaining returned values, multiplies it by the provided multiplier, and reports the result under "Max Memory (MB)".
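
The filtering and condensing steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function `condense`, the shape of `results` (a list of label dictionaries paired with sample values) and the choice to reduce each labelled component separately are all hypothetical, and the sample figures are invented for the example.

```python
def condense(results, label, ignore, statistic, multiplier=1.0):
    """Reduce query results to one value per reported component.

    Hypothetical sketch: `results` is a list of (labels, values) pairs,
    where `labels` is a dict and `values` is a list of numeric samples.
    """
    reducers = {"max": max, "min": min, "avg": lambda v: sum(v) / len(v)}
    reduce_values = reducers[statistic]
    report = {}
    for labels, values in results:
        component = labels.get(label)
        # Skip components on the ignore list and series without samples.
        if component in ignore or not values:
            continue
        report[component] = reduce_values(values) * multiplier
    return report


# Invented memory samples in bytes for two containers.
results = [
    ({"name": "peer0.org1.example.com"}, [100_000_000, 300_000_000]),
    ({"name": "grafana"}, [50_000_000]),
]
report = condense(results, "name", ["grafana"], "max", 0.000001)
print(report)
```

Here the `grafana` series is dropped because it appears in the ignore list, and the remaining series is reduced to its maximum and scaled from bytes to megabytes.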


#### Obtaining a Prometheus Enabled Network
A sample network that includes a docker-compose file for standing up a Prometheus server, a Prometheus PushGateway and a linked Grafana analytics container, is available within the companion [caliper-benchmarks repository](https://github.com/hyperledger/caliper-benchmarks/tree/master/networks/prometheus-grafana).


## Observers
The type of observer to use during a benchmark is declared in the `benchmark configuration file` by specifying a supported observer type under the label `observer.type`. The interval at which observers fetch information from their targets is specified in seconds, as an integer, under the label `observer.interval`; this is a required property for local and prometheus observers.

Permitted observers are:
- none
- local
- prometheus

### None Observer
A `none` observer is used to ignore all transaction submissions of all clients. The following specifies the use of a none observer that will omit the console display of any transaction statistics during the benchmark process.

```
observer:
  type: none
```

### Local Observer
A `local` observer is used to view current transaction submissions of all clients on a local host machine. The following specifies the use of a local observer that collects and reports current transaction status at 1 second intervals.

```
observer:
  type: local
  interval: 1
```

If a Prometheus monitor is in use, then a Prometheus observer should also be used.

### Prometheus Observer
A `prometheus` observer is used to view current transaction submissions of all clients that are reporting transactions to a Prometheus server. The following specifies the use of a Prometheus observer that collects and reports current transaction status at 5 second intervals.

```
observer:
  type: prometheus
  interval: 5
```

Use of a Prometheus observer is predicated on the availability and use of a Prometheus monitor. The observer will extract required URL information from the relevant sections under the Prometheus monitor specification.

### Grafana Visualization
Grafana is an analytics platform that may be used to query and visualize metrics collected by Prometheus. Caliper clients will send the following to the PushGateway:
- caliper_tps
- caliper_latency
- caliper_send_rate
- caliper_txn_submitted
- caliper_txn_success
- caliper_txn_failure
- caliper_txn_pending

Each of the above is sent to the PushGateway, tagged with the following labels:
- instance: the current test label
- round: the current test round
- client: the client identifier that is sending the information

We are currently working on a Grafana dashboard to give you immediate access to the metrics published above, but in the interim please feel free to create custom queries to view the above metrics that are accessible in real time.
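
As a starting point for such custom queries, the sketch below builds a PromQL selector from a metric name and a set of label matchers. The helper `metric_selector` and the label values `open` and `0` are assumptions for illustration; the sketch does not escape quotes inside label values.

```python
def metric_selector(metric, **labels):
    """Build a PromQL selector such as caliper_tps{round="0"}.

    Hypothetical helper: label values are inserted as-is, so they must
    not contain double quotes or backslashes.
    """
    if not labels:
        return metric
    matchers = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return f"{metric}{{{matchers}}}"


# Assumed test label "open" and round "0".
print(metric_selector("caliper_tps", instance="open", round="0"))
# prints caliper_tps{instance="open",round="0"}
```

The resulting string can be pasted into a Grafana panel or the Prometheus expression browser, and wrapped in an aggregation such as `avg(...) by (client)` where a per-client view is wanted.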
1 change: 1 addition & 0 deletions docs/vNext/Rate_Controllers.md
@@ -3,6 +3,7 @@ layout: pageNext
title: "Rate Controllers"
categories: reference
permalink: /vNext/rate-controllers/
order: 4
---

The rate at which transactions are input to the blockchain system is a key factor within performance tests. It may be desired to send transactions at a specified rate or follow a specified profile. Caliper permits the specification of custom rate controllers to enable a user to perform testing under a custom loading mechanism. A user may specify their own rate controller or use one of the default options:
1 change: 1 addition & 0 deletions docs/vNext/Runtime_Configuration.md
@@ -3,6 +3,7 @@ layout: pageNext
title: "Runtime Configuration"
categories: reference
permalink: /vNext/runtime-config/
order: 3
---

Caliper relies on the [nconf](https://github.com/indexzero/nconf) package to provide a flexible and hierarchical configuration mechanism for runtime-related settings. Hierarchical configuration means that a runtime setting can be set or overridden from multiple sources/locations, and there is a priority order among them.
1 change: 1 addition & 0 deletions docs/vNext/Writing_Adapters.md
@@ -3,6 +3,7 @@ layout: pageNext
title: "Writing Adapters"
categories: reference
permalink: /vNext/writing-adaptors/
order: 2
---

## How to write your own blockchain adapter