return 500 status when any node from cluster is unavailable #30
Comments
Thank you for taking the time to report this issue. Since version 1.2.1 pve exporter writes stack traces to |
Looking through the changes, I suspect that this might be a problem introduced in #22 . It is possible that the lxc and qemu config cannot be accessed if a node is down. If this is the case, then we'd need to filter out nodes which are down after calling |
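The filtering suggested here can be sketched as follows. This is a minimal illustration, not the exporter's actual code: `online_nodes`, `collect_guest_configs`, and the `fetch_config` callback are hypothetical names, and the node entries are assumed to be dicts with `node` and `status` keys, similar to what the PVE node listing returns.

```python
def online_nodes(nodes):
    """Return the names of nodes whose status is 'online'."""
    return [n["node"] for n in nodes if n.get("status") == "online"]


def collect_guest_configs(nodes, fetch_config):
    """Call fetch_config(node_name) only for online nodes.

    fetch_config is a hypothetical callback standing in for the
    per-node request that would fail when the node is unreachable.
    """
    return {name: fetch_config(name) for name in online_nodes(nodes)}


# Example data (made up for illustration): node3 is down.
nodes = [
    {"node": "node1", "status": "online"},
    {"node": "node2", "status": "online"},
    {"node": "node3", "status": "offline"},
]
configs = collect_guest_configs(nodes, lambda name: {"queried": name})
print(sorted(configs))
```

With this approach the offline node is never queried, so a single unreachable node should not abort the whole scrape.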
I opened PR #31, which might fix the problem. I also attached the source distribution and a Python wheel, so it is easier for you to test whether the fix works.
Hello, thanks for the fast answer!
And 595 Errors during connection establishment, proxy handshake: No route to host - b'' |
I also tried installing from the sources in PR #31 on a local dev machine. |
I just published 1.2.2, which should fix this problem. Thanks again for the report.
Hello, I have a similar problem: when a node is offline, the exporter sometimes returns a 500 error.
I changed the file to print the
The offline node (node3) doesn't have the |
Thanks @gigelu for the trace. I will have a look later.
It looks like the PVE docs used to stress the importance of adding names and ips of all nodes to |
@gigelu opened PR #41 which drops |
Sorry for the late response.
No, I didn't (this is a test cluster). But adding them didn't fix the problem.
I wasn't able to install from those files (the source files were missing). Yes, changing the L.E.: from reading the linked proxmox file, shouldn't you remove the |
Good point. The cheap answer: I never had reports about The reason why I am tempted to remove the
From
The rest of the labels denoted by |
I understand now, thanks for the explanation.
Fix in #41 is part of |
I have a Proxmox cluster with similar nodes, all on version 5.4-13.
I used exporter version 1.1.2 and everything worked.
With the URL 'http://host:9221/pve' I get the summary cluster status and node info.
But when I tried to update the exporter to version 1.2.0, I ran into trouble.
When one of the cluster nodes is unavailable, the exporter returns a 500 status page and no metrics at all.
Also, the page http://host:9221/pve?target=proxmox-08 (for example) says '595 Errors during connection establishment'.
So, when one of my cluster nodes is unavailable, the exporter does not show any metrics, only an error page.
The previous version 1.1.2 works fine and shows metrics. It reports the unavailable node with a 'pve_up' metric of '0'.
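The behaviour expected here (an unreachable node shows up as 'pve_up' 0 rather than turning the whole scrape into an error page) can be sketched roughly as below. This is only an illustration under assumptions, not how the exporter actually computes 'pve_up': `collect_up_metrics` and the `probe` callback are hypothetical names, and the node names other than proxmox-08 are made up.

```python
def collect_up_metrics(node_names, probe):
    """Return {node_name: 1 or 0}.

    probe is a hypothetical callable that raises when a node cannot
    be reached; a failure is recorded as 0 instead of propagating.
    """
    result = {}
    for name in node_names:
        try:
            probe(name)
            result[name] = 1
        except Exception:
            # One unreachable node must not break the other metrics.
            result[name] = 0
    return result


def fake_probe(name):
    # Stand-in for a real connection attempt; proxmox-08 is "down".
    if name == "proxmox-08":
        raise ConnectionError("No route to host")


up = collect_up_metrics(["proxmox-07", "proxmox-08"], fake_probe)
for name, value in sorted(up.items()):
    print(f'pve_up{{id="node/{name}"}} {value}')
```

The label format in the printed lines is illustrative only.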