DOC-353 Prometheus metrics standardization #1508

Open: wants to merge 3 commits into base: main
10 changes: 6 additions & 4 deletions docs/modules/ROOT/pages/list-of-metrics.adoc
@@ -3,17 +3,19 @@

The table below lists the metrics with their explanations, grouped by their relevant subjects.

The metrics are collected per member and specific to the local member from which
The metrics are collected per member and are specific to the local member from which
you collect them. For example, the distributed data structure metrics
reflect the local statistics of that data structure for the portion
held in that member.

Some metrics may store cluster-wide agreed value, that is, they may show the values obtained
Some metrics may store cluster-wide agreed values, that is, they may show the values obtained
by communicating with other members in the cluster. This type of
metrics reflect the member's local view of the cluster (consider split-brain scenarios). The `clusterStartTime` is an example of this type of
metrics, and its value in the local member is obtained by communicating
metric reflects the member's local view of the cluster (consider split-brain scenarios). The `clusterStartTime` is an example of this type of
metric, and its value in the local member is obtained by communicating
with the master member.

NOTE: If you use Management Center to export cluster-wide metrics to Prometheus, Management Center reformats the metrics to align with Prometheus best practice recommendations. See link:https://docs.hazelcast.com/management-center/latest/integrate/prometheus-monitoring[Prometheus Monitoring].

.Streaming Engine Cluster-Wide Metrics
[%collapsible]
====
176 changes: 14 additions & 162 deletions docs/modules/maintain-cluster/pages/monitoring.adoc
@@ -420,8 +420,8 @@ same metric with different values.

The metric collection runs in regular intervals on each member, but note
that the metric collection on different cluster members happens at
different moments in time. So if you try to correlate metrics from
different members, they can be from different moments of time.
different moments in time. If you try to correlate metrics from
different members, they may have different timestamps.

=== Hazelcast Metrics

@@ -530,11 +530,11 @@ they have form further sub-nodes in the resulting tree structure.
==== Prometheus

Prometheus is a popular monitoring system and time series database.
Setting up monitoring via Prometheus consists of two steps. First step
Setting up monitoring via Prometheus consists of two steps. The first step
is exposing an HTTP endpoint with metrics. The second step is setting up
Prometheus server, which pulls the metrics in a specified interval.

The Prometheus javaagent is already part of the Hazelcast
The Prometheus Java agent is already part of the Hazelcast
distribution and just needs to be enabled. Enable the agent and expose
all metrics via an HTTP endpoint by setting the `PROMETHEUS_PORT`
environment variable. You can change the port to any available port:
@@ -551,6 +551,8 @@ Prometheus enabled on port 8080

The metrics are available on `\http://localhost:8080`.

NOTE: You will need to configure Prometheus to scrape each member individually, manage changes from scaling and restarts, and collate the metrics from all members. If you are an Enterprise customer, you can configure Prometheus to scrape metrics from Management Center to easily collect metrics for the whole cluster. See link:https://docs.hazelcast.com/management-center/latest/integrate/prometheus-monitoring[Prometheus Monitoring].
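
As a rough sketch of the per-member scraping that the note describes, a `prometheus.yml` might list every member as a separate target. The job name, host names, and scrape interval below are hypothetical placeholders, and the port assumes each member was started with `PROMETHEUS_PORT=8080` as in the example above. Prometheus requests the `/metrics` path by default, so set `metrics_path` if your endpoint differs.

```yaml
# Sketch only: scrape each Hazelcast member directly.
scrape_configs:
  - job_name: 'hazelcast-members'        # hypothetical job name
    scrape_interval: 15s                 # assumed value, adjust as needed
    static_configs:
      - targets:
          - 'member1.example.com:8080'   # hypothetical host, port set by PROMETHEUS_PORT
          - 'member2.example.com:8080'   # hypothetical host
          - 'member3.example.com:8080'   # hypothetical host
```

With a static list like this, the targets have to be updated by hand whenever members are added, removed, or restarted on a different address, which is the maintenance burden the note mentions.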

For a guide on how to set up Prometheus server go to the
https://prometheus.io/docs/prometheus/latest/getting_started[Prometheus website].

@@ -2485,189 +2487,39 @@ com.hazelcast.sql.HazelcastSqlException: Cannot resolve IMap schema because it d
```


* Prometheus: You can use this 3rd party tool to filter alert metrics. See the
* Prometheus: You can use this third-party tool to filter alert metrics. See the
xref:maintain-cluster:monitoring.adoc#prometheus-2[Prometheus section] for details.

Besides the above channels, you can also benefit from Hazelcast logging mechanism as an indirect
way of getting alerts. See the xref:maintain-cluster:monitoring.adoc#logging[Logging section] for details.

To learn the possible actions on the alerts, see the xref:troubleshoot:remedies-for-alerts.adoc[Actions and Remedies for Alerts section].

== Integrating with 3rd Party Tools
== Integrating with Third-Party Tools
Review comment from the PR author:

Nearly everything here was redundant with the MC docs (mostly an exact copy), so I've replaced the duplicate content with links.


Management Center monitors your entire Hazelcast cluster, making it much easier to integrate with third-party tools without needing to connect to individual members, manage changes from scaling and restarts, or collate metrics from all members.

=== Prometheus

Hazelcast Management Center can expose the metrics collected from cluster members to https://prometheus.io/[Prometheus^]. This
Hazelcast Management Center can expose the metrics collected from cluster members to https://prometheus.io/[Prometheus]. This
feature can be turned on by setting the `hazelcast.mc.prometheusExporter.enabled` system property to `true`.

Prometheus can be configured to scrape Management Center in `prometheus.yml` as follows:

[source,yaml]
----
scrape_configs:
- job_name: 'HZ MC'
# scheme defaults to 'http'.
static_configs:
- targets: ['localhost:8080'] # replace this address with the network address of Hazelcast Management Center
----

After starting Prometheus with this configuration, all metrics will be exported to Prometheus with the `hz_` prefix. The metrics
are also available via the xref:maintain-cluster:monitoring.adoc#jmx-api-per-member[member JMX API].
With the default configuration, Management Center exports all metrics reported by the cluster members. Since it can be overly
verbose for some use cases, the metrics can be filtered with the `hazelcast.mc.prometheusExporter.filter.metrics.included`
or the `hazelcast.mc.prometheusExporter.filter.metrics.excluded` system properties, both being comma-separated lists of
metric names.

Example of starting Management Center with specifying the metrics exported to Prometheus:

[source,bash,subs="attributes+"]
----
java -jar -Dhazelcast.mc.prometheusExporter.enabled=true \
-Dhazelcast.mc.prometheusExporter.filter.metrics.included=hz_topic_totalReceivedMessages,hz_map_totalPutLatency \
-jar hazelcast-management-center-{full-version}.jar
----

Example of starting Management Center with specifying the metrics to be excluded from the Prometheus export:

[source,bash,subs="attributes+"]
----
java -jar -Dhazelcast.mc.prometheusExporter.enabled=true \
-Dhazelcast.mc.prometheusExporter.filter.metrics.excluded=hz_os_systemLoadAverage,hz_memory_freeHeap \
-jar hazelcast-management-center-{full-version}.jar
----

By default, Prometheus connects via the same IP and port as the Management Center web interface. It is possible to
override the port number using the `-Dhazelcast.mc.prometheusExporter.port` system property. Let's say you have
started Management Center as shown below:

[source,bash,subs="attributes+"]
----
java -jar -Dhazelcast.mc.prometheusExporter.enabled=true \
-Dhazelcast.mc.prometheusExporter.port=2222 \
-jar hazelcast-management-center-{full-version}.jar
----

Then, the Prometheus endpoint will be available at `http://localhost:2222/metrics`, which should be reflected by the
Prometheus configuration as below:

[source,yaml]
----
scrape_configs:
- job_name: 'HZ MC'
static_configs:
- targets: ['localhost:2222']
----

NOTE: If you want to visualize the Prometheus metrics using Grafana, then you can start with
https://grafana.com/grafana/dashboards/13183[this dashboard].

This is an Enterprise feature. See link:https://docs.hazelcast.com/management-center/latest/integrate/prometheus-monitoring[Prometheus Monitoring] for details.

=== AppDynamics

You can use the xref:{page-latest-supported-mc}@management-center:jmx:jmx.adoc[Clustered JMX] interface to integrate the <<management-center, Hazelcast Management Center>>
with *AppDynamics*. To perform this integration, attach the AppDynamics
Java agent to the Management Center.

For agent installation, see the
http://docs.appdynamics.com/display/PRO14S/Install%2Bthe%2BApp%2BAgent%2Bfor%2BJava[Install the App Agent for Java] page.

For monitoring on AppDynamics, see the
http://docs.appdynamics.com/display/PRO14S/Monitor%2BJMX%2BMBeans#MonitorJMXMBeans-UsingAppDynamicsforJMXMonitoring[Using AppDynamics for JMX Monitoring] page.

After installing AppDynamics agent, you can start the Management Center as shown below:

[source,bash,subs="attributes+"]
----
java -javaagent:/path/to/javaagent.jar \
-Dhazelcast.mc.jmx.enabled=true \
-Dhazelcast.mc.jmx.port=9999 -jar hazelcast-management-center-{full-version}.jar
----

When the Management Center starts, you should see the logs below:

```
Started AppDynamics Java Agent Successfully.
Hazelcast Management Center starting on port 8080 at path : /
```
This is an Enterprise feature. See link:https://docs.hazelcast.com/management-center/latest/integrate/jmx#integrating-jmx-with-appdynamics[Integrating JMX with AppDynamics] for details.

=== New Relic

You can use the xref:{page-latest-supported-mc}@management-center:jmx:jmx.adoc[Clustered JMX] interface to integrate the <<management-center, Hazelcast Management Center>>
with New Relic. To perform this integration, attach the New Relic Java agent
and provide an extension file that describes which metrics will be sent to New Relic.

See http://docs.newrelic.com/docs/java/custom-jmx-instrumentation-by-yml[Custom JMX instrumentation by YAML]
on the New Relic webpage.

The following is an example Map monitoring `.yml` file for New Relic:

[source,plain]
----
name: Clustered JMX
version: 1.0
enabled: true

jmx:
- object_name: ManagementCenter[clustername]:type=Maps,name=mapname
metrics:
- attributes: PutOperationCount, GetOperationCount, RemoveOperationCount, Hits, BackupEntryCount, OwnedEntryCount, LastAccessTime, LastUpdateTime
- type: simple
- object_name: ManagementCenter[clustername]:type=Members,name="member address in double quotes"
metrics:
- attributes: OwnedPartitionCount
- type: simple
----

Put the `.yml` file in the `extensions` directory in your New Relic
installation. If an `extensions` directory does not exist there, create one.

After you set your extension, attach the New Relic Java agent and
start the Management Center as shown below.

[source,bash,subs="attributes+"]
----
java -javaagent:/path/to/newrelic.jar -Dhazelcast.mc.jmx.enabled=true\
-Dhazelcast.mc.jmx.port=9999 -jar hazelcast-management-center-{full-version}.jar
----

If your logging level is set to `FINER`, you should see the log listing
in the file `newrelic_agent.log`, which is located in the `logs` directory
in your New Relic installation. The following is an example log listing:

```
Jun 5, 2014 14:18:43 +0300 [72696 62] com.newrelic.agent.jmx.JmxService FINE:
JMX Service : querying MBeans (1)
Jun 5, 2014 14:18:43 +0300 [72696 62] com.newrelic.agent.jmx.JmxService FINER:
JMX Service : MBeans query ManagementCenter[dev]:type=Members,
name="192.168.2.79:5701", matches 1
Jun 5, 2014 14:18:43 +0300 [72696 62] com.newrelic.agent.jmx.JmxService FINER:
Recording JMX metric OwnedPartitionCount : 68
Jun 5, 2014 14:18:43 +0300 [72696 62] com.newrelic.agent.jmx.JmxService FINER:
JMX Service : MBeans query ManagementCenter[dev]:type=Maps,name=orders,
matches 1
Jun 5, 2014 14:18:43 +0300 [72696 62] com.newrelic.agent.jmx.JmxService FINER:
Recording JMX metric Hits : 46,593
Jun 5, 2014 14:18:43 +0300 [72696 62] com.newrelic.agent.jmx.JmxService FINER:
Recording JMX metric BackupEntryCount : 1,100
Jun 5, 2014 14:18:43 +0300 [72696 62] com.newrelic.agent.jmx.JmxService FINER:
Recording JMX metric OwnedEntryCount : 1,100
Jun 5, 2014 14:18:43 +0300 [72696 62] com.newrelic.agent.jmx.JmxService FINER:
Recording JMX metric RemoveOperationCount : 0
Jun 5, 2014 14:18:43 +0300 [72696 62] com.newrelic.agent.jmx.JmxService FINER:
Recording JMX metric PutOperationCount : 118,962
Jun 5, 2014 14:18:43 +0300 [72696 62] com.newrelic.agent.jmx.JmxService FINER:
Recording JMX metric GetOperationCount : 0
Jun 5, 2014 14:18:43 +0300 [72696 62] com.newrelic.agent.jmx.JmxService FINER:
Recording JMX metric LastUpdateTime : 1,401,962,426,811
Jun 5, 2014 14:18:43 +0300 [72696 62] com.newrelic.agent.jmx.JmxService FINER:
Recording JMX metric LastAccessTime : 1,401,962,426,811
```

Then you can navigate to your New Relic account and create custom dashboards.
See link:https://docs.newrelic.com/docs/query-your-data/explore-query-data/dashboards/introduction-dashboards/[Get Started with Dashboards].

While you are creating the dashboard, you should see the metrics that
you are sending to New Relic from the Management Center in the **Metrics**
section under the JMX directory.
This is an Enterprise feature. See link:https://docs.hazelcast.com/management-center/latest/integrate/jmx#integrating-jmx-with-new-relic[Integrating JMX with New Relic] for details.