Doc and test updates
FlorianVeaux committed Oct 3, 2019
1 parent 067a410 commit 8b09ff7
Showing 13 changed files with 1,905 additions and 348 deletions.
64 changes: 45 additions & 19 deletions mapr/README.md
@@ -2,22 +2,22 @@

## Overview

This check monitors [MapR][1] through the Datadog Agent.
This check monitors [MapR][1] 6.1+ through the Datadog Agent.

## Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the [Autodiscovery Integration Templates][2] for guidance on applying these instructions.
Follow the instructions below to install and configure this check for an Agent running on a host.

### Installation

The MapR check is included in the [Datadog Agent][2] package. However, additional installation steps are necessary:
The MapR check is included in the [Datadog Agent][2] package but requires additional setup operations.

1. Add `/opt/mapr/lib/` to your `ld.so.conf` file. The Agent uses the `mapr-streams-python` library, which requires access to shared libraries.
2. Create a password for the `dd-agent` user, then add this user to every node of the cluster with the same `UID`/`GID` so it is recognized by MapR. See [Managing users and groups][10] for additional details.
3. Install the Agent on every host you want to monitor.
4. Generate a [long-lived ticket][8] for the `dd-agent` user.
5. Make sure the ticket is readable by the `dd-agent` user.

1. Download and extract the [MapR Client][12].
2. Update `LD_LIBRARY_PATH` and `DYLD_LIBRARY_PATH` as explained in the [MapR documentation][9] (usually with `/opt/mapr/lib/`).
3. Set `JAVA_HOME` (on macOS, install the system Java).
4. Install the [mapr-streams-python][7] library.
5. Create a password for the `dd-agent` user, then add this user to every node of the cluster with the same `UID`/`GID` so it is recognized by MapR. See [Managing users and groups][10] for additional details.
6. If security is enabled on the cluster (recommended), generate a [long-lived ticket][8] for the `dd-agent` user.
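
Before starting the Agent, it can help to confirm that the shared-library directory exists and that the ticket file is readable. The following is a minimal sketch, not part of the integration: `check_prerequisites` is a hypothetical helper, and the default `/opt/mapr/lib` path is taken from the steps above. Run it as the `dd-agent` user, since `os.access` only tests the permissions of the current user.

```python
import os


def check_prerequisites(ticket_path, lib_dir="/opt/mapr/lib"):
    """Return a list of human-readable problems; an empty list means all good.

    Hypothetical helper for illustration only.
    """
    problems = []
    # The MapR shared libraries are expected in this directory.
    if not os.path.isdir(lib_dir):
        problems.append("missing library directory: {}".format(lib_dir))
    # The ticket must exist and be readable by the user running this script.
    if not os.path.isfile(ticket_path):
        problems.append("ticket not found: {}".format(ticket_path))
    elif not os.access(ticket_path, os.R_OK):
        problems.append("ticket not readable by current user: {}".format(ticket_path))
    return problems
```

Pass the path of the ticket you generated for `dd-agent`; any returned message points at the step that still needs attention.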

### Configuration
#### Metric collection
@@ -28,8 +28,35 @@ The MapR check is included in the [Datadog Agent][2] package. However, additiona

#### Log collection

MapR uses fluentD for logs. Use the [fluentd datadog plugin][11] to collect MapR logs.

MapR uses fluentD for logs. Use the [fluent datadog plugin][11] to collect MapR logs.
The following command downloads the plugin and installs it into the correct directory.

`curl https://raw.githubusercontent.com/DataDog/fluent-plugin-datadog/master/lib/fluent/plugin/out_datadog.rb -o /opt/mapr/fluentd/fluentd-<VERSION>/lib/fluentd-<VERSION>-linux-x86_64/lib/app/lib/fluent/plugin/out_datadog.rb`

Then update the `/opt/mapr/fluentd/fluentd-<VERSION>/etc/fluentd/fluentd.conf` with the following section.

```
<match *>
@type copy
<store> # This section is here by default and sends the logs to Elasticsearch for Kibana.
@include /opt/mapr/fluentd/fluentd-<VERSION>/etc/fluentd/es_config.conf
include_tag_key true
tag_key service_name
</store>
<store> # This new section also forwards the logs to Datadog
@type datadog
@id dd_agent
include_tag_key true
dd_source mapr
dd_tags "flo:test"
service <YOUR_SERVICE_NAME>
api_key <YOUR_API_KEY>
</store>
</match>
```

Refer to the [fluent_datadog_plugin][11] documentation for more details about the options you can use.


### Validation

[Run the Agent's status subcommand][5] and look for `mapr` under the Checks section.
@@ -42,30 +69,29 @@ See [metadata.csv][13] for a list of default metrics provided by this integratio

### Service Checks

The MapR check does not include any service checks.
- `mapr.can_connect`:
Returns `CRITICAL` if the Agent fails to connect and subscribe to the MapR monitoring stream topic, otherwise returns `OK`.

### Events

The MapR check does not include any events.




## Troubleshooting

Need help? Contact [Datadog support][6].

[1]: https://mapr.com
[2]: https://docs.datadoghq.com/agent/autodiscovery/integrations
[2]: https://app.datadoghq.com/account/settings#agent
[3]: https://github.com/DataDog/integrations-core/blob/master/mapr/datadog_checks/mapr/data/conf.yaml.example
[4]: https://docs.datadoghq.com/agent/guide/agent-commands/?tab=agentv6#start-stop-and-restart-the-agent
[5]: https://docs.datadoghq.com/agent/guide/agent-commands/?tab=agentv6#agent-status-and-information
[6]: https://docs.datadoghq.com/help
[7]: https://mapr.com/docs/52/MapR_Streams/MapRStreamsPythonExample.html
[8]: https://docs.datadoghq.com/integrations/oracle/
[7]: https://mapr.com/docs/61/MapR_Streams/MapRStreamsPythonExample.html
[8]: https://mapr.com/docs/61/SecurityGuide/GeneratingServiceTicket.html
[9]: https://mapr.com/docs/60/MapR_Streams/MapRStreamCAPISetup.html
[10]: https://mapr.com/docs/61/AdministratorGuide/c-managing-users-and-groups.html
[11]: https://www.rubydoc.info/gems/fluent-plugin-datadog
[12]: https://mapr.com/docs/61/AdvancedInstallation/SettingUptheClient-install-mapr-client.html
[13]: https://github.com/DataDog/integrations-core/blob/master/mapr/metadata.csv

[14]: http://upstart.ubuntu.com/cookbook/#environment-variables
[15]: https://www.freedesktop.org/software/systemd/man/systemd.service.html#Command%20lines
12 changes: 11 additions & 1 deletion mapr/assets/service_checks.json
@@ -1 +1,11 @@
[]
[
{
"agent_version": "6.15.0",
"integration":"mapr",
"check": "mapr.can_connect",
"statuses": ["ok", "critical"],
"groups": ["topic"],
"name": "Can connect and subscribe to mapr topic",
"description": "Returns `CRITICAL` if the agent fails to subscribe to the stream topic, `OK` otherwise."
}
]
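
The entry above declares two statuses (`ok`, `critical`) grouped by a `topic` tag. A minimal sketch of how such a status could be derived is shown below. It assumes a caller-supplied `subscribe` callable that raises on failure; `can_connect_status` and the `topic:<name>` tag format are hypothetical and are not taken from the check's code.

```python
OK, CRITICAL = "ok", "critical"


def can_connect_status(subscribe, topic):
    """Try to subscribe to `topic` and return (status, tags).

    `subscribe` is any callable that raises an exception when the
    subscription fails. Hypothetical helper for illustration only.
    """
    tags = ["topic:{}".format(topic)]
    try:
        subscribe(topic)
    except Exception:
        # Any failure to subscribe is reported as CRITICAL.
        return CRITICAL, tags
    return OK, tags
```

Tagging the result with the topic matches the `groups` field, so each topic gets its own status.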
171 changes: 171 additions & 0 deletions mapr/datadog_checks/mapr/common.py
@@ -0,0 +1,171 @@
# (C) Datadog, Inc. 2019
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)

COUNT_METRICS = {
'mapr.cache.lookups_data',
'mapr.cache.lookups_dir',
'mapr.cache.lookups_inode',
'mapr.cache.lookups_largefile',
'mapr.cache.lookups_meta',
'mapr.cache.lookups_smallfile',
'mapr.cache.lookups_table',
'mapr.cache.misses_data',
'mapr.cache.misses_dir',
'mapr.cache.misses_inode',
'mapr.cache.misses_largefile',
'mapr.cache.misses_meta',
'mapr.cache.misses_smallfile',
'mapr.cache.misses_table',
'mapr.cldb.rpc_received',
'mapr.cldb.rpcs_failed',
'mapr.db.append_bytes',
'mapr.db.append_rpcrows',
'mapr.db.append_rpcs',
'mapr.db.cdc.sent_bytes',
'mapr.db.checkandput_bytes',
'mapr.db.checkandput_rpcrows',
'mapr.db.checkandput_rpcs',
'mapr.db.flushes',
'mapr.db.forceflushes',
'mapr.db.fullcompacts',
'mapr.db.get_bytes',
'mapr.db.get_readrows',
'mapr.db.get_resprows',
'mapr.db.get_rpcs',
'mapr.db.increment_bytes',
'mapr.db.increment_rpcrows',
'mapr.db.increment_rpcs',
'mapr.db.minicompacts',
'mapr.db.put_bytes',
'mapr.db.put_readrows',
'mapr.db.put_rpcrows',
'mapr.db.put_rpcs',
'mapr.db.repl.sent_bytes',
'mapr.db.scan_bytes',
'mapr.db.scan_readrows',
'mapr.db.scan_resprows',
'mapr.db.scan_rpcs',
'mapr.db.table.read_bytes',
'mapr.db.table.read_rows',
'mapr.db.table.resp_rows',
'mapr.db.table.rpcs',
'mapr.db.table.value_cache_hits',
'mapr.db.table.value_cache_lookups',
'mapr.db.table.write_bytes',
'mapr.db.table.write_rows',
'mapr.db.ttlcompacts',
'mapr.db.updateandget_bytes',
'mapr.db.updateandget_rpcrows',
'mapr.db.updateandget_rpcs',
'mapr.db.valuecache_hits',
'mapr.db.valuecache_lookups',
'mapr.drill.queries_completed',
'mapr.fs.bulk_writes',
'mapr.fs.bulk_writesbytes',
'mapr.fs.kvstore_delete',
'mapr.fs.kvstore_insert',
'mapr.fs.kvstore_lookup',
'mapr.fs.kvstore_scan',
'mapr.fs.local_readbytes',
'mapr.fs.local_reads',
'mapr.fs.local_writebytes',
'mapr.fs.local_writes',
'mapr.fs.read_bytes',
'mapr.fs.read_cachehits',
'mapr.fs.read_cachemisses',
'mapr.fs.reads',
'mapr.fs.statstype_create',
'mapr.fs.statstype_lookup',
'mapr.fs.statstype_read',
'mapr.fs.statstype_write',
'mapr.fs.write_bytes',
'mapr.fs.writes',
'mapr.io.write_bytes',
'mapr.io.writes',
'mapr.rpc.bytes_recd',
'mapr.rpc.bytes_sent',
'mapr.rpc.calls_recd',
'mapr.streams.listen_bytes',
'mapr.streams.listen_msgs',
'mapr.streams.listen_rpcs',
'mapr.streams.produce_bytes',
'mapr.streams.produce_msgs',
'mapr.streams.produce_rpcs',
'mapr.volmetrics.read_ops',
'mapr.volmetrics.write_ops',
}

MONOTONIC_COUNTER_METRICS = {
'mapr.cldb.containers_created',
'mapr.process.context_switch_involuntary',
'mapr.process.context_switch_voluntary',
'mapr.process.cpu_time.syst',
'mapr.process.cpu_time.user',
'mapr.process.disk_octets.read',
'mapr.process.disk_octets.write',
'mapr.process.disk_ops.read',
'mapr.process.disk_ops.write',
'mapr.process.page_faults.majflt',
'mapr.process.page_faults.minflt',
}

GAUGE_METRICS = {
'mapr.alarms.alarm_raised',
'mapr.cldb.cluster_cpu_total',
'mapr.cldb.cluster_cpubusy_percent',
'mapr.cldb.cluster_disk_capacity',
'mapr.cldb.cluster_diskspace_used',
'mapr.cldb.cluster_memory_capacity',
'mapr.cldb.cluster_memory_used',
'mapr.cldb.containers',
'mapr.cldb.containers_unusable',
'mapr.cldb.disk_space_available',
'mapr.cldb.nodes_in_cluster',
'mapr.cldb.nodes_offline',
'mapr.cldb.storage_pools_cluster',
'mapr.cldb.storage_pools_offline',
'mapr.cldb.volumes',
'mapr.db.cdc.pending_bytes',
'mapr.db.get_currpcs',
'mapr.db.index.pending_bytes',
'mapr.db.put_currpcs',
'mapr.db.repl.pending_bytes',
'mapr.db.scan_currpcs',
'mapr.db.table.latency',
'mapr.db.valuecache_usedSize',
'mapr.drill.allocator_root_peak',
'mapr.drill.allocator_root_used',
'mapr.drill.blocked_count',
'mapr.drill.count',
'mapr.drill.fd_usage',
'mapr.drill.fragments_running',
'mapr.drill.heap_used',
'mapr.drill.non_heap_used',
'mapr.drill.queries_running',
'mapr.drill.runnable_count',
'mapr.drill.waiting_count',
'mapr.io.read_bytes',
'mapr.io.reads',
'mapr.process.cpu_percent',
'mapr.process.data',
'mapr.process.mem_percent',
'mapr.process.rss',
'mapr.process.vm',
'mapr.status.ok',
'mapr.streams.listen_currpcs',
'mapr.topology.disks_total_capacity',
'mapr.topology.disks_used_capacity',
'mapr.topology.utilization',
'mapr.volmetrics.read_latency',
'mapr.volmetrics.read_throughput',
'mapr.volmetrics.write_latency',
'mapr.volmetrics.write_throughput',
'mapr.volume.logical_used',
'mapr.volume.quota',
'mapr.volume.snapshot_used',
'mapr.volume.total_used',
'mapr.volume.used',
}

ALLOWED_METRICS = GAUGE_METRICS.union(COUNT_METRICS).union(MONOTONIC_COUNTER_METRICS)
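
The three sets lend themselves to a simple lookup that decides how each metric is submitted. A minimal sketch follows, assuming a hypothetical helper `submission_type` that is not part of this commit. The sets are abbreviated here so the example stays self-contained, and the returned strings are illustrative labels only.

```python
# Abbreviated copies of the sets defined above; the real ones are much longer.
GAUGE_METRICS = {'mapr.cldb.volumes', 'mapr.process.rss'}
COUNT_METRICS = {'mapr.fs.reads', 'mapr.fs.writes'}
MONOTONIC_COUNTER_METRICS = {'mapr.process.cpu_time.user'}

ALLOWED_METRICS = GAUGE_METRICS.union(COUNT_METRICS).union(MONOTONIC_COUNTER_METRICS)


def submission_type(metric_name):
    """Return 'gauge', 'count', or 'monotonic_count', or None if not allowed.

    Hypothetical helper for illustration; the check may dispatch differently.
    """
    if metric_name in GAUGE_METRICS:
        return 'gauge'
    if metric_name in COUNT_METRICS:
        return 'count'
    if metric_name in MONOTONIC_COUNTER_METRICS:
        return 'monotonic_count'
    # Anything outside ALLOWED_METRICS is ignored.
    return None
```

Keeping the sets disjoint means each metric resolves to exactly one type, and membership tests on sets stay constant-time even as the lists grow.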
34 changes: 23 additions & 11 deletions mapr/datadog_checks/mapr/data/conf.yaml.example
@@ -1,27 +1,39 @@
init_config:

instances:
## @param hostname - string - required
## The MapR host to monitor.
-
## @param hostname - string - optional - default: `socket.getfqdn()`
## The MapR host to monitor. This is used to find the correct topic to read metrics from.
## https://mapr.com/docs/61/AdministratorGuide/spyglass-on-streams.html
#
- mapr_host: <MAPR_HOST>
# hostname: <FQDN>

## @param topic_path - string - required
## @param stream_path - string - optional - default: /var/mapr/mapr.monitoring/metricstreams
## The MapR topic path.
#
topic_path: /var/mapr/mapr.monitoring/metricstreams
# stream_path: /var/mapr/mapr.monitoring/metricstreams

## @param whitelist - list - required
## List regexes of metrics to collect. They will be prefixed with `mapr.`
## @param streams_count - integer - optional - default: 2
## The MapR setting for the number of monitoring streams.
## If this value does not exactly match what you've configured on MapR,
## the integration will not be able to find the correct topic to read metrics from.
## See https://mapr.com/docs/61/AdministratorGuide/spyglass-on-streams.html for more information.
#
whitelist:
- fs.*
# streams_count: 2

## @param mapr_ticketfile_location - string - optional
## @param metric_whitelist - list - optional
## List of regexes for metrics to collect. Only metrics starting with "mapr." can be collected;
## see https://github.com/DataDog/integrations-core/blob/master/mapr/datadog_checks/mapr/common.py
## for the list of metrics you can collect.
## By default, all MapR metrics are collected.
#
# metric_whitelist: ['.*']

## @param ticket_location - string - optional
## The path to the MapR user ticket. If set, it overrides the MAPR_TICKETFILE_LOCATION environment variable.
## Either the environment variable or this config option needs to be set if security is enabled on the cluster.
#
# mapr_ticketfile_location: <MAPR_TICKETFILE_LOCATION>
# ticket_location: <TICKETFILE_LOCATION>

## @param tags - list of key:value elements - optional
## A list of tags to attach to every metric, event, and service check emitted by this integration.
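
The defaults and precedence rules documented in the options above can be sketched in a few lines. This is an illustration under stated assumptions, not the check's actual code: `resolve_instance` and `should_collect` are hypothetical names, and using `re.search` for whitelist matching is an assumption, since the example file does not say whether patterns are anchored.

```python
import os
import re
import socket

DEFAULT_STREAM_PATH = "/var/mapr/mapr.monitoring/metricstreams"
DEFAULT_STREAMS_COUNT = 2


def resolve_instance(instance, environ=None):
    """Apply the documented defaults to a raw instance dict (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    return {
        # Defaults to the machine's FQDN when not configured.
        "hostname": instance.get("hostname") or socket.getfqdn(),
        "stream_path": instance.get("stream_path", DEFAULT_STREAM_PATH),
        "streams_count": int(instance.get("streams_count", DEFAULT_STREAMS_COUNT)),
        # The config option takes precedence over the environment variable.
        "ticket_location": instance.get("ticket_location")
        or environ.get("MAPR_TICKETFILE_LOCATION"),
        # Collect everything by default.
        "metric_whitelist": [re.compile(p) for p in instance.get("metric_whitelist", [".*"])],
    }


def should_collect(metric_name, patterns):
    """Keep only `mapr.` metrics matching at least one whitelist regex.

    Assumption: unanchored matching via `search`.
    """
    return metric_name.startswith("mapr.") and any(p.search(metric_name) for p in patterns)
```

For example, an instance with `metric_whitelist: ['fs\..*']` would keep `mapr.fs.reads` and drop `mapr.db.flushes` under these assumptions.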