Ansible playbook for https://metrics.shields.io

This Ansible playbook can be used to set up monitoring (https://metrics.shields.io) for Shields.io. It installs Prometheus, Telegraf, Grafana, NGINX and Let's Encrypt certificates (using Certbot).

The Prometheus configuration lists all instances (servers) of shields.io. Grafana is provisioned with dashboards and the worldPing plugin.

To change existing Grafana dashboards, update these files (explanation + instruction) and run this role. You can always save changes as a new dashboard: Dashboard settings > Save As ...

worldPing has to be enabled manually. It also requires a Grafana.com API key.

How to use it?

  1. Install Python dependencies in a Python 3 virtual environment:
pip install -r requirements.txt
  2. Prepare an inventory file inventory.ini:
metrics ansible_host=metrics.example.com ansible_port=22 ansible_user=ubuntu ansible_python_interpreter=/usr/bin/python3
  3. Copy an SSH key to the remote server
  4. Install the required Ansible roles:
ansible-galaxy install -r requirements.yml
  5. Define properties in variables.yml:
metrics_domain: metrics.example.com
mertics_grafana_admin_password: !vault |
          $ANSIBLE_VAULT;1.1;AES256
          ...
metrics_grafana_github_client_id: github_client_id
metrics_grafana_github_client_secret: !vault |
          $ANSIBLE_VAULT;1.1;AES256
          ...
mertics_prometheus_password: !vault |
          $ANSIBLE_VAULT;1.1;AES256
          ...
metrics_certbot_email: [email protected]

The e-mail address (metrics_certbot_email) is used by Certbot to send notifications about certificates that are about to expire (doc).

You can encrypt passwords/secrets using Ansible Vault:

ansible-vault encrypt_string --ask-vault-pass --stdin-name 'my_key'
  6. Run the playbook:
ansible-playbook shields-io-metrics.yml -i inventory.ini -e @variables.yml --ask-vault-pass --ask-become-pass
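
The preparation steps (virtual environment, SSH key, roles) can be collected in one helper. This is a minimal sketch, not part of this repository: the names run and bootstrap, the DRY_RUN switch and the .venv directory are hypothetical, and it assumes that requirements.txt installs Ansible into the virtual environment and that ssh-copy-id is available locally. The default user and port mirror the inventory example.

```shell
# Hypothetical helper: print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper covering steps 1, 3 and 4.
# Arguments: host [user] [port]; defaults follow the inventory example.
bootstrap() {
  bs_host="$1"
  bs_user="${2:-ubuntu}"
  bs_port="${3:-22}"
  # Step 1: create a virtual environment (.venv is an assumed name) and install dependencies.
  run python3 -m venv .venv &&
  run .venv/bin/pip install -r requirements.txt &&
  # Step 3: copy the local public SSH key to the remote server.
  run ssh-copy-id -p "$bs_port" "$bs_user@$bs_host" &&
  # Step 4: install the Ansible roles (assumes Ansible was installed into .venv).
  run .venv/bin/ansible-galaxy install -r requirements.yml
}
```

Running DRY_RUN=1 bootstrap metrics.example.com prints the four commands without executing them, which is a cheap way to review them first.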

Component versions

Services

Name The latest version Version used in this playbook
Grafana
Nginx
Prometheus

Ansible roles

Name The latest version Version used in this playbook
cloudalchemy.prometheus
cloudalchemy.grafana
cloudalchemy.node_exporter
cloudalchemy.blackbox-exporter
dj-wasabi.telegraf
nginxinc.nginx

Updating components

Grafana

  • update metrics_grafana_version in the versions.yml file
  • run the playbook with the grafana tag: ansible-playbook shields-io-metrics.yml -i inventory.ini -e @variables.yml -e @versions.yml --ask-vault-pass --ask-become-pass --tags grafana

Prometheus

  • update metrics_prometheus_version in the versions.yml file
  • run the playbook with the prometheus tag: ansible-playbook shields-io-metrics.yml -i inventory.ini -e @variables.yml -e @versions.yml --ask-vault-pass --ask-become-pass --tags prometheus

Nginx

  • update metrics_nginx_version in the versions.yml file
  • run the playbook with the nginx,certbot-nginx tags: ansible-playbook shields-io-metrics.yml -i inventory.ini -e @variables.yml -e @versions.yml --ask-vault-pass --ask-become-pass --tags nginx,certbot-nginx

Node exporter, Blackbox exporter

These components have no pinned version in versions.yml or in shields-io-metrics.yml. The playbook uses the default version defined in the corresponding Ansible role. New versions of these roles are usually released shortly after a new release of the component itself. To update a component, update its Ansible role (instructions below).

Telegraf

⚠️ The apt repository that provides Telegraf contains only the latest version (dj-wasabi/ansible-telegraf#95 (comment), influxdata/telegraf#5685)

  • update metrics_telegraf_version in the versions.yml file
  • run the playbook with the telegraf tag: ansible-playbook shields-io-metrics.yml -i inventory.ini -e @variables.yml -e @versions.yml --ask-vault-pass --ask-become-pass --tags telegraf

Any Ansible role

  • update the role version in requirements.yml
  • run ansible-galaxy install -r requirements.yml --force
  • run the playbook with one of the tags:
    • blackbox-exporter
    • node-exporter
    • prometheus
    • grafana
    • nginx,certbot-nginx
    • telegraf
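
The update commands in this section differ only in the value passed to --tags. A minimal sketch of a wrapper, assuming a POSIX shell and the file names used above; the function name run_tagged and the DRY_RUN switch are hypothetical and not part of this repository:

```shell
# Hypothetical wrapper: run the playbook for a single tag list,
# e.g. "grafana" or "nginx,certbot-nginx".
run_tagged() {
  rt_tags="$1"
  # Rebuild the positional parameters as the full command line.
  set -- ansible-playbook shields-io-metrics.yml \
    -i inventory.ini -e @variables.yml -e @versions.yml \
    --ask-vault-pass --ask-become-pass --tags "$rt_tags"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Only show what would be executed.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

For example, run_tagged grafana after editing versions.yml, or ansible-galaxy install -r requirements.yml --force followed by run_tagged node-exporter after changing a role version.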

Resources

| Resource | Path | Access restrictions |
| --- | --- | --- |
| Grafana | / | public access for all dashboards; administration using username admin and password from mertics_grafana_admin_password variable or using GitHub authentication |
| Prometheus | /prometheus | requires username prometheus and password from mertics_prometheus_password variable |
| Telegraf | /telegraf | requires username telegraf and password from mertics_telegraf_password variable, or username telegraf-staging and password from mertics_telegraf_staging_password variable, or username telegraf-production and password from mertics_telegraf_production_password variable |
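
To check that a protected path answers as expected after a run, a quick request with curl can help. This is a sketch under assumptions: it assumes the protected paths use HTTP basic authentication (not stated above), the function name check_resource and the DRY_RUN switch are hypothetical, and metrics.example.com stands in for your own metrics_domain. Because only the username is passed to -u, curl prompts for the password instead of taking it from the command line.

```shell
# Hypothetical helper: print the HTTP status code returned for a URL.
# Arguments: url username. Assumes HTTP basic authentication.
check_resource() {
  cr_url="$1"
  cr_user="$2"
  # -sS: quiet but still show errors; -o /dev/null: discard the body;
  # -w: print only the status code; -u user: curl asks for the password.
  set -- curl -sS -o /dev/null -w '%{http_code}\n' -u "$cr_user" "$cr_url"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

For example, check_resource https://metrics.example.com/prometheus prometheus.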

https://metrics.shields.io/ runs on a single-core virtual host with 2 GB RAM (VPS SSD 1) running Ubuntu 18.04.

GitHub authentication

Grafana supports authentication with GitHub. At https://metrics.shields.io, maintainers from the core team can log into Grafana via GitHub with the 'Editor' role. The GitHub OAuth application used for Grafana at metrics.shields.io is currently owned by @platan.

Testing/running locally

Vagrant can be used to test the configuration or run monitoring locally (documentation).

  1. Start a virtual server and run the playbook:
# go to repo directory
cd repo-dir
# or `OBJC_DISABLE_INITIALIZE_FORK_SAFETY=true vagrant up` if you get `objc[7750]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.`
vagrant up
  2. Now you can visit:

Credentials are defined in variables-local.yml.

It is also possible to run Ansible manually against the local machine:

ansible-playbook shields-io-metrics.yml --private-key .vagrant/machines/metrics/virtualbox/private_key -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory -e @variables-local.yml -e @versions.yml --tags grafana

Finally, you can stop (vagrant halt) or remove (vagrant destroy) the virtual server.