Issue #2747 upgrade node_exporter #2761

Merged: 20 commits, Dec 20, 2021

Conversation

romsok24
Contributor

This PR covers the work scope described in EPI#2747, that is, upgrading Prometheus Node Exporter to the newest version.

What was done:

  • requirements.... file: updated for every distro
  • tested with epicli on Ubuntu 18
  • ran: epicli test -b ....... -g <component, i.e. node_exporter>
  • tested epicli upgrade -b build/dir/ (node_exporter is not in the scope of the epicli upgrade command)

@przemyslavic
Collaborator

Why not upgrade to v1.3.0? https://github.com/prometheus/node_exporter/releases/

@przemyslavic
Collaborator

/azp run

@romsok24
Contributor Author

@przemyslavic
Collaborator

/azp run

@romsok24
Contributor Author

@przemyslavic We have a go for 1.3.0, so I've updated this PR by changing the node-exporter version to 1.3.0.
Could you please run the tests?

@przemyslavic
Collaborator

/azp run

@przemyslavic
Collaborator

  1. I see errors in the logs:
[root@ec21-1-1-1 ~]# journalctl -f -t prometheus-node-exporter
-- Logs begin at Mon 2021-11-29 11:11:55 UTC. --
Nov 29 12:30:27 ec21-1-1-1.eu-west-1.compute.amazonaws.com prometheus-node-exporter[23035]: ts=2021-11-29T12:30:27.527Z caller=node_exporter.go:115 level=info collector=xfs
Nov 29 12:30:27 ec21-1-1-1.eu-west-1.compute.amazonaws.com prometheus-node-exporter[23035]: ts=2021-11-29T12:30:27.527Z caller=node_exporter.go:115 level=info collector=zfs
Nov 29 12:30:27 ec21-1-1-1.eu-west-1.compute.amazonaws.com prometheus-node-exporter[23035]: ts=2021-11-29T12:30:27.527Z caller=node_exporter.go:199 level=info msg="Listening on" address=:9100
Nov 29 12:30:27 ec21-1-1-1.eu-west-1.compute.amazonaws.com prometheus-node-exporter[23035]: ts=2021-11-29T12:30:27.527Z caller=tls_config.go:195 level=info msg="TLS is disabled." http2=false
Nov 29 12:30:42 ec21-1-1-1.eu-west-1.compute.amazonaws.com prometheus-node-exporter[23035]: ts=2021-11-29T12:30:42.517Z caller=textfile.go:208 level=error collector=textfile msg="failed to read textfile collector directory" path="\"/var/lib/prometheus/node-exporter\"" err="open \"/var/lib/prometheus/node-exporter\": no such file or directory"
Nov 29 12:30:57 ec21-1-1-1.eu-west-1.compute.amazonaws.com prometheus-node-exporter[23035]: ts=2021-11-29T12:30:57.523Z caller=textfile.go:208 level=error collector=textfile msg="failed to read textfile collector directory" path="\"/var/lib/prometheus/node-exporter\"" err="open \"/var/lib/prometheus/node-exporter\": no such file or directory"
Nov 29 12:31:12 ec21-1-1-1.eu-west-1.compute.amazonaws.com prometheus-node-exporter[23035]: ts=2021-11-29T12:31:12.519Z caller=textfile.go:208 level=error collector=textfile msg="failed to read textfile collector directory" path="\"/var/lib/prometheus/node-exporter\"" err="open \"/var/lib/prometheus/node-exporter\": no such file or directory"
Nov 29 12:31:27 ec21-1-1-1.eu-west-1.compute.amazonaws.com prometheus-node-exporter[23035]: ts=2021-11-29T12:31:27.516Z caller=textfile.go:208 level=error collector=textfile msg="failed to read textfile collector directory" path="\"/var/lib/prometheus/node-exporter\"" err="open \"/var/lib/prometheus/node-exporter\": no such file or directory"
Nov 29 12:31:42 ec21-1-1-1.eu-west-1.compute.amazonaws.com prometheus-node-exporter[23035]: ts=2021-11-29T12:31:42.521Z caller=textfile.go:208 level=error collector=textfile msg="failed to read textfile collector directory" path="\"/var/lib/prometheus/node-exporter\"" err="open \"/var/lib/prometheus/node-exporter\": no such file or directory"
Nov 29 12:31:57 ec21-1-1-1.eu-west-1.compute.amazonaws.com prometheus-node-exporter[23035]: ts=2021-11-29T12:31:57.516Z caller=textfile.go:208 level=error collector=textfile msg="failed to read textfile collector directory" path="\"/var/lib/prometheus/node-exporter\"" err="open \"/var/lib/prometheus/node-exporter\": no such file or directory"
Nov 29 12:32:12 ec21-1-1-1.eu-west-1.compute.amazonaws.com prometheus-node-exporter[23035]: ts=2021-11-29T12:32:12.518Z caller=textfile.go:208 level=error collector=textfile msg="failed to read textfile collector directory" path="\"/var/lib/prometheus/node-exporter\"" err="open \"/var/lib/prometheus/node-exporter\": no such file or directory"
  2. There is an option to install Node Exporter as a DaemonSet for AKS/EKS and I think that needs to be updated as well:
    https://github.com/epiphany-platform/epiphany/blob/eee05061a291e6084a55587d05164a8232cc0af8/schema/common/defaults/configuration/node-exporter.yml#L11 + requirements.txt

@przemyslavic
Collaborator

/azp run

@przemyslavic
Collaborator

przemyslavic commented Dec 3, 2021

Upgrade fails:

2021-12-03T11:47:04.5966730Z 11:47:04 INFO cli.engine.ansible.AnsibleCommand - TASK [upgrade : Node Exporter as System Service | Update systemd service configuration] ***
2021-12-03T11:47:04.7325943Z 11:47:04 ERROR cli.engine.ansible.AnsibleCommand - An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ansible.errors.AnsibleUndefinedVariable: 'specification' is undefined
2021-12-03T11:47:04.7327711Z 11:47:04 ERROR cli.engine.ansible.AnsibleCommand - fatal: [ci-upgnexazceflannel-monitoring-vm-0]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'specification' is undefined"}
2021-12-03T11:47:04.7546660Z 11:47:04 ERROR cli.engine.ansible.AnsibleCommand - An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ansible.errors.AnsibleUndefinedVariable: 'specification' is undefined
2021-12-03T11:47:04.7548551Z 11:47:04 ERROR cli.engine.ansible.AnsibleCommand - fatal: [ci-upgnexazceflannel-repository-vm-0]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'specification' is undefined"}

and it still complains about the /var/lib/prometheus/node-exporter directory (this is in apply mode)

Dec 03 13:05:04 ec2-1-1-1-1.eu-west-1.compute.amazonaws.com prometheus-node-exporter[3026]: ts=2021-12-03T13:05:04.706Z caller=textfile.go:208 level=error collector=textfile msg="failed to read textfile collector directory" path="\"/var/lib/prometheus/node-exporter\"" err="open \"/var/lib/prometheus/node-exporter\": no such file or directory"
[root@ec2-1-1-1-1 ~]# ls -lh /var/lib/prometheus/
total 0
drwxr-x---. 3 root node_exporter 32 Dec  3 11:29 node-exporter

Maybe this is an issue with how the path is escaped (the directory already exists), or with access permissions.
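
On the access-permissions hypothesis: the `ls -lh` output above shows mode `drwxr-x---` with owner `root` and group `node_exporter`. Below is a minimal sketch for reading that mode string. It assumes the service runs as a non-root user whose group is `node_exporter`, which this thread does not state, and `class_can_list` is a hypothetical helper, not part of epicli.

```python
def class_can_list(mode: str, who: str) -> bool:
    """Return True if `who` ('owner', 'group' or 'other') may list a directory.

    `mode` is the 10-character string printed by `ls -l`, e.g. 'drwxr-x---'.
    Listing a directory needs both read (r) and search (x) permission.
    """
    offsets = {"owner": 1, "group": 4, "other": 7}
    start = offsets[who]
    triplet = mode[start:start + 3]
    # 's' and 't' mean "execute bit set plus setuid/setgid/sticky".
    return triplet[0] == "r" and triplet[2] in ("x", "s", "t")


# Mode copied from the `ls -lh /var/lib/prometheus/` output in this thread
# (the trailing '.' SELinux marker is dropped).
mode = "drwxr-x---"

for who in ("owner", "group", "other"):
    print(who, class_can_list(mode, who))
```

If the group assumption holds, the group bits already allow listing the directory, which would point towards the escaping explanation rather than permissions.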

- '--collector.filesystem.ignored-mount-points=^/(sys|proc|dev|run)($|/)'
- '--collector.netdev.device-blacklist="^$"'
- '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|run)($|/)'
- '--collector.netdev.device-exclude="^$"'
- '--collector.textfile.directory="/var/lib/prometheus/node-exporter"'
Collaborator

this should probably be

Suggested change
- '--collector.textfile.directory="/var/lib/prometheus/node-exporter"'
- '--collector.textfile.directory=/var/lib/prometheus/node-exporter'
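
Why dropping the inner quotes plausibly helps: the logged path `\"/var/lib/prometheus/node-exporter\"` suggests the double quotes reached node_exporter as part of the value. The sketch below, written under that assumption, shows that a path containing literal quote characters names a different, non-existent entry. A temporary directory stands in for the real one, and `can_list` is a hypothetical helper.

```python
import os
import tempfile


def can_list(path: str) -> bool:
    """Return True if `path` can be listed as a directory."""
    try:
        os.listdir(path)
        return True
    except OSError:
        # Covers "no such file or directory" as seen in the log above.
        return False


# A temporary directory stands in for /var/lib/prometheus/node-exporter.
with tempfile.TemporaryDirectory() as real_dir:
    # The value as a process would see it if no shell stripped the quotes.
    quoted = f'"{real_dir}"'
    print("plain :", can_list(real_dir))
    print("quoted:", can_list(quoted))
```

The directory exists in both cases; only the string handed to the process differs.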

Contributor Author

@przemyslavic
Collaborator

/azp run

@przemyslavic
Collaborator

przemyslavic commented Dec 6, 2021

  1. Node exporter v1.3.1 has already been released 😏
  2. The issue with the non-existent path /var/lib/prometheus/node-exporter still occurs after the upgrade; the fix also has to be applied for upgrade mode
  3. There are issues with service definition on Ubuntu. The service starts properly, however it says it's ignoring unknown escape sequences:
[operations@ci-nexazurubuflannel-repository-vm-0 ~]$ sudo systemctl status prometheus-node-exporter.service
● prometheus-node-exporter.service - Service that runs Prometheus Node Exporter
   Loaded: loaded (/etc/systemd/system/prometheus-node-exporter.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2021-12-06 09:49:30 UTC; 30min ago
 Main PID: 14556 (node_exporter)
    Tasks: 4 (limit: 4074)
   CGroup: /system.slice/prometheus-node-exporter.service
           └─14556 /opt/node_exporter/node_exporter --collector.conntrack --collector.diskstats --collector.entropy --collector.filefd --collector.filesyste

Dec 06 09:49:30 ci-nexazurubuflannel-repository-vm-0 prometheus-node-exporter[14556]: ts=2021-12-06T09:49:30.490Z caller=node_exporter.go:115 level=info col
Dec 06 09:49:30 ci-nexazurubuflannel-repository-vm-0 prometheus-node-exporter[14556]: ts=2021-12-06T09:49:30.490Z caller=node_exporter.go:115 level=info col
Dec 06 09:49:30 ci-nexazurubuflannel-repository-vm-0 prometheus-node-exporter[14556]: ts=2021-12-06T09:49:30.490Z caller=node_exporter.go:199 level=info msg
Dec 06 09:49:30 ci-nexazurubuflannel-repository-vm-0 prometheus-node-exporter[14556]: ts=2021-12-06T09:49:30.492Z caller=tls_config.go:195 level=info msg="T
Dec 06 09:50:19 ci-nexazurubuflannel-repository-vm-0 systemd[1]: /etc/systemd/system/prometheus-node-exporter.service:7: Ignoring unknown escape sequences:
Dec 06 09:50:19 ci-nexazurubuflannel-repository-vm-0 systemd[1]: /etc/systemd/system/prometheus-node-exporter.service:7: Ignoring unknown escape sequences:
Dec 06 09:50:19 ci-nexazurubuflannel-repository-vm-0 systemd[1]: /etc/systemd/system/prometheus-node-exporter.service:7: Ignoring unknown escape sequences:
Dec 06 09:50:19 ci-nexazurubuflannel-repository-vm-0 systemd[1]: /etc/systemd/system/prometheus-node-exporter.service:7: Ignoring unknown escape sequences:
Dec 06 09:50:20 ci-nexazurubuflannel-repository-vm-0 systemd[1]: /etc/systemd/system/prometheus-node-exporter.service:7: Ignoring unknown escape sequences:
Dec 06 09:50:20 ci-nexazurubuflannel-repository-vm-0 systemd[1]: /etc/systemd/system/prometheus-node-exporter.service:7: Ignoring unknown escape sequences:
  4. Shouldn't we also upgrade the Node Exporter release to the new chart version when executing the upgrade command?
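
On the "Ignoring unknown escape sequences" warning: the log lines above are truncated, so the exact sequences systemd objected to are not known. As a rough sketch, one could scan the `ExecStart` value for backslash sequences outside the C-style escapes listed in systemd.service(5), as I read it. `unknown_escapes`, the sample command line, and the `--collector.example.exclude` flag are all hypothetical and not taken from the unit file in this PR.

```python
import re

# C-style escapes that systemd documents for command lines (systemd.service(5)):
# \a \b \f \n \r \t \v \\ \" \' \s, plus \xNN (hex) and \NNN (octal).
KNOWN_SINGLE = set("abfnrtv\\\"'s")
OCTAL_DIGITS = set("01234567")


def unknown_escapes(command_line: str) -> list:
    """Return backslash sequences that fall outside the documented set."""
    found = []
    for match in re.finditer(r"\\(.)", command_line):
        char = match.group(1)
        if char in KNOWN_SINGLE or char == "x" or char in OCTAL_DIGITS:
            continue
        found.append(match.group(0))
    return found


# Hypothetical ExecStart value, NOT the line from the unit file in this PR.
sample = r"/opt/node_exporter/node_exporter --collector.example.exclude=^veth\d+\.tmp$"
print(unknown_escapes(sample))
```

Running a check like this against the rendered unit file would show which regex fragments need their backslashes doubled or removed.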

@@ -27,7 +27,7 @@ Note that versions are default versions and can be changed in certain cases thro
| Logstash OSS | 7.12.0 | https://github.com/elastic/logstash | [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
| Prometheus | 2.10.0 | https://github.com/prometheus/prometheus | [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
| Grafana | 7.3.5 | https://github.com/grafana/grafana | [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
-| Node Exporter | 1.0.1 | https://github.com/prometheus/node_exporter | [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
+| Node Exporter | 1.3.0 | https://github.com/prometheus/node_exporter | [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
Contributor

It's 1.3.1 now

@przemyslavic
Collaborator

As discussed yesterday with @romsok24, we still need to update the Helm chart.
Another thing: when upgrading, we should keep the user's configuration and not render the template from scratch (which would lose the user's configuration).
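
One possible way to keep the user's settings during an upgrade is to migrate the existing flag list instead of re-rendering it from defaults. The sketch below is only an illustration, not how epicli implements it: `migrate_flags` is a hypothetical helper, the old-to-new flag pairs are inferred from the changed lines quoted earlier in this thread, and the sample configuration is made up.

```python
# Old-to-new flag names, inferred from the changed lines quoted earlier in this thread.
RENAMED = {
    "--collector.filesystem.ignored-mount-points": "--collector.filesystem.mount-points-exclude",
    "--collector.netdev.device-blacklist": "--collector.netdev.device-exclude",
}


def migrate_flags(user_flags):
    """Rename changed flags while keeping the user's own values and order."""
    migrated = []
    for flag in user_flags:
        # Flags without '=' (e.g. --collector.conntrack) pass through unchanged.
        name, sep, value = flag.partition("=")
        name = RENAMED.get(name, name)
        migrated.append(f"{name}{sep}{value}")
    return migrated


# Hypothetical pre-upgrade configuration with a user-customised regex.
existing = [
    "--collector.conntrack",
    "--collector.filesystem.ignored-mount-points=^/(sys|proc|dev|run|mnt)($|/)",
    "--collector.textfile.directory=/var/lib/prometheus/node-exporter",
]
for flag in migrate_flags(existing):
    print(flag)
```

The user's custom regex survives the rename, which is the property the comment above asks for.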

@romsok24 romsok24 marked this pull request as ready for review December 10, 2021 09:36
cicharka previously approved these changes Dec 10, 2021
cicharka previously approved these changes Dec 13, 2021
@seriva seriva requested review from sbbroot and cicharka December 15, 2021 09:21
seriva previously approved these changes Dec 16, 2021
sbbroot previously approved these changes Dec 16, 2021
@przemyslavic
Collaborator

/azp run

@przemyslavic
Collaborator

@romsok24 please resolve conflicts in COMPONENTS.md. Then I can approve (already tested ✔️ :) )

przemyslavic previously approved these changes Dec 17, 2021
@romsok24 romsok24 dismissed stale reviews from przemyslavic, sbbroot, seriva, and cicharka via 1bbfb8c December 20, 2021 07:25
@romsok24
Contributor Author

Issue #2747: upgrade node_exporter

@romsok24 romsok24 merged commit 31f165a into develop Dec 20, 2021
@romsok24 romsok24 deleted the feature/upgr-node-eporter-2747 branch December 20, 2021 08:21
5 participants