Dashboard for monitoring a node

There are 3 main files:

  • telegraf/telegraf.toml is a configuration file for Telegraf. It uses the Tail plugin to:
    • read log files generated by sn_node
    • extract essential information
    • send it to an InfluxDB database
  • node_monitoring.json is a JSON export of the dashboard from InfluxDB OSS. It can be imported into InfluxDB OSS but not into InfluxDB Cloud.
  • node_monitor.json is a template export of the dashboard from InfluxDB Cloud. I have not managed to re-import it, either into InfluxDB OSS or into InfluxDB Cloud.

A few settings in the Telegraf configuration must be adapted to your environment:

  • node: Name of the node as displayed in the dashboard. To monitor several nodes, run one Telegraf agent per node, each with a different node name in its configuration file. At runtime, the selection list in the dashboard chooses the node you want to monitor.
  • files: The log files watched by the Telegraf agent. For rotating log files the array must contain 2 elements (one ending with 'sn_node.log.*' and the other ending with 'sn_node.log'). Note that the ~ character doesn't work in these paths.
  • urls: InfluxDB URL
  • token: InfluxDB token. It must have write access to the 'SN' bucket.
  • organization: The InfluxDB organization. In InfluxDB OSS you can choose any name; in InfluxDB Cloud it is the user's email address.

The InfluxDB bucket name must be defined as 'SN' (hardcoded in node_monitoring.json).
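
The settings above might be laid out as in the sketch below. This is not a copy of telegraf/telegraf.toml from this repository. The `inputs.tail` and `outputs.influxdb_v2` sections are the standard Telegraf plugin sections, but attaching node as a tag on the input is an assumption, and the paths, URL, token and organization are placeholders. The parser options that extract data from the log lines are left out.

```toml
# Hypothetical sketch: every value below is a placeholder.

[[inputs.tail]]
  # Rotating logs need two entries: the rotated files and the live file.
  # Do not use "~" in these paths.
  files = [
    "/path/to/logs/sn_node.log.*",
    "/path/to/logs/sn_node.log"
  ]
  # Parser settings for the log lines are omitted here.

  # Assumption: the node name is attached as a tag on this input.
  # Sub-tables like this one must come last in the plugin section.
  [inputs.tail.tags]
    node = "node-1"

[[outputs.influxdb_v2]]
  urls = ["http://localhost:8086"]   # InfluxDB URL
  token = "REPLACE_WITH_TOKEN"       # needs write access to the SN bucket
  organization = "my-org"            # user email address on InfluxDB Cloud
  bucket = "SN"                      # name expected by node_monitoring.json
```

To monitor a second node, a second agent would use the same layout with a different node value and its own files entries.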


Remarks on my personal tests in a local Docker network:

There are 2 stacks each with one service:

  • docker-compose-influxdb.yml: InfluxDB database
  • docker-compose-telegraf.yml: Telegraf agent

Useful commands:

  • To relaunch only the Telegraf agent: docker stack rm telegraf && docker stack deploy -c docker-compose-telegraf.yml telegraf && sleep 2 && docker ps -a, then docker logs <container_id> to view the Telegraf agent's own logs
  • To empty the SN bucket: ./empty_bucket.sh
  • Useful date ranges for my static test cases:
    • for logs-local-*: from 2022-02-21 22:51:00 to 2022-02-21 23:02:00
    • for ../docker_tmp/logs/*: from 2022-02-26 10:45:30 to 2022-02-26 10:55:32