Describe the bug
Node Exporter service is not working after running the upgrade command. It looks like the old systemd service configuration is still applied.
To check the installed version, the task calls /opt/node_exporter/node_exporter --version. When the binary is called directly on the VM it prints verbose multi-line output, while the Ansible task captures nothing:
root@ec2-xx-xx-xx-xx:~# /opt/node_exporter/node_exporter --version
node_exporter, version 0.16.0 (branch: HEAD, revision: d42bd70f4363dced6b77d8fc311ea57b63387e4f)
build user: root@a67a9bc13a69
build date: 20180515-15:52:42
go version: go1.9.6
2020-09-17T12:56:19.9159473Z 12:56:19 INFO cli.engine.ansible.AnsibleCommand - TASK [upgrade : Node Exporter | Print version] *********************************
2020-09-17T12:56:20.0773448Z 12:56:20 INFO cli.engine.ansible.AnsibleCommand - ok: [ec2-34-247-70-225.eu-west-1.compute.amazonaws.com] => {
2020-09-17T12:56:20.0774876Z 12:56:20 INFO cli.engine.ansible.AnsibleCommand - "msg": [
2020-09-17T12:56:20.0775705Z 12:56:20 INFO cli.engine.ansible.AnsibleCommand - "Installed version: ",
2020-09-17T12:56:20.0781098Z 12:56:20 INFO cli.engine.ansible.AnsibleCommand - "Target version: 1.0.1"
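The empty "Installed version" above suggests the task reads only stdout, while node_exporter 0.16.0 appears to print its --version output to stderr. A minimal sketch of how the installed version could be captured regardless of the stream (task and variable names are illustrative, not the actual upgrade role):

# Sketch only - task and variable names are assumptions, not the existing role
- name: Node Exporter | Get installed version
  command: /opt/node_exporter/node_exporter --version
  register: node_exporter_version_output
  changed_when: false
  failed_when: false

- name: Node Exporter | Extract version number
  set_fact:
    # search both streams, since the old binary seems to write --version to stderr
    node_exporter_installed_version: >-
      {{ (node_exporter_version_output.stdout ~ ' ' ~ node_exporter_version_output.stderr)
         | regex_search('[0-9]+\.[0-9]+\.[0-9]+')
         | default('', true) }}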
When running the epicli upgrade command, the Node Exporter update is always started. In future releases, where Node Exporter will probably not change, this is redundant in my opinion. I think we should compare the installed version with the target version and run the update only when they differ (see the sketch after this paragraph).
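A minimal sketch of such a guard, assuming the fact set in the sketch above and a node_exporter_version variable holding the target version (both names and the included file are assumptions):

# Illustrative only - skip the upgrade tasks when the installed version already matches or exceeds the target
- name: Node Exporter | Upgrade only when the installed version is older than the target
  include_tasks: upgrade-node-exporter.yml
  when: node_exporter_installed_version is version(node_exporter_version, '<')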
To Reproduce
Steps to reproduce the bug:
1. Deploy a new cluster from the v0.7 branch (or use the image epiphanyplatform/epicli:0.7.1). One master VM should be enough to reproduce it.
2. Run epicli upgrade -b /path/to/build/dir from the develop branch.
Expected behavior
Node exporter service is working properly after upgrading to 1.0.1.
Actual behavior
Node exporter service failed to start.
root@ec2-xx-xx-xx-xx:/opt/node_exporter# systemctl status prometheus-node-exporter.service
● prometheus-node-exporter.service - Service that runs Prometheus Node Exporter
Loaded: loaded (/etc/systemd/system/prometheus-node-exporter.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2020-09-17 13:35:07 UTC; 13min ago
Process: 13791 ExecStart=/opt/node_exporter/node_exporter --collector.conntrack --collector.diskstats --collector.entropy --collector.filefd --collector.filesystem --collector.loadavg --collector.mdadm --colle
Main PID: 13791 (code=exited, status=1/FAILURE)
Sep 17 13:35:07 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com systemd[1]: prometheus-node-exporter.service: Main process exited, code=exited, status=1/FAILURE
Sep 17 13:35:07 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com systemd[1]: prometheus-node-exporter.service: Failed with result 'exit-code'.
Sep 17 13:35:07 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com systemd[1]: prometheus-node-exporter.service: Service hold-off time over, scheduling restart.
Sep 17 13:35:07 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com systemd[1]: prometheus-node-exporter.service: Scheduled restart job, restart counter is at 5.
Sep 17 13:35:07 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com systemd[1]: Stopped Service that runs Prometheus Node Exporter.
Sep 17 13:35:07 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com systemd[1]: prometheus-node-exporter.service: Start request repeated too quickly.
Sep 17 13:35:07 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com systemd[1]: prometheus-node-exporter.service: Failed with result 'exit-code'.
Sep 17 13:35:07 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com systemd[1]: Failed to start Service that runs Prometheus Node Exporter.
OS (please complete the following information):
OS: [all]
Cloud Environment (please complete the following information):
Cloud Provider [all]
Logs:
Sep 17 12:55:54 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com prometheus-node-exporter[23143]: time="2020-09-17T12:55:54Z" level=error msg="ERROR: diskstats collector failed after 0.000153s: invalid line for /proc/diskstats for nvme0n1p1" source="collector.go:132"
Sep 17 12:56:09 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com prometheus-node-exporter[23143]: time="2020-09-17T12:56:09Z" level=error msg="Error reading textfile collector directory \"/var/lib/prometheus/node-exporter\": open /var/lib/prometheus/node-exporter: no such file or directory" source="textfile.go:192"
Sep 17 12:56:09 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com prometheus-node-exporter[23143]: time="2020-09-17T12:56:09Z" level=error msg="ERROR: diskstats collector failed after 0.000145s: invalid line for /proc/diskstats for nvme0n1" source="collector.go:132"
Sep 17 12:56:40 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com prometheus-node-exporter[22664]: node_exporter: error: unknown long flag '--collector.netdev.ignored-devices', try --help
Sep 17 12:56:40 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com prometheus-node-exporter[22724]: node_exporter: error: unknown long flag '--collector.netdev.ignored-devices', try --help
Sep 17 12:56:40 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com prometheus-node-exporter[22747]: node_exporter: error: unknown long flag '--collector.netdev.ignored-devices', try --help
Sep 17 12:56:41 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com prometheus-node-exporter[22775]: node_exporter: error: unknown long flag '--collector.netdev.ignored-devices', try --help
Sep 17 12:56:41 ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com prometheus-node-exporter[22801]: node_exporter: error: unknown long flag '--collector.netdev.ignored-devices', try --help