Kubernetes input plugin not working (deprecated /stats/summary endpoint?) #6959
Comments
We should put together some documentation about what needs to be done to switch to the replacement, and any way we can smooth the transition. I could definitely use some help from the community on this. I am assuming similar metrics can be captured with the prometheus input plugin. It would be good to gather a list of the new metrics, because switching over will likely change all metrics and break dashboards/alerts. It also looks like it should be possible to use the |
Hello @danielnelson, thank you for your reply. Support for the cadvisor endpoint will be removed in Kubernetes 1.19 (kubernetes/kubernetes#76660), so I would recommend using the |
@danielnelson for managed Kubernetes, I'm not sure you can ask to add this flag, so even as a temporary fix it won't work for many (most?) people. @masual: so would that mean we need to deploy the metrics server first in order to use this plugin? Or should we use only the kube_inventory plugin? |
I got it working with the help of @rawkode. As the endpoint, you need:
Be sure to have:
And as ClusterRole (I use ClusterRoleAggregations):
Tested on k8s 1.17.0 on OVH K8S Managed Service |
... and available soon as a Helm chart for deploying Telegraf as a DaemonSet => influxdata/helm-charts#16 |
I have the same problem. I followed these recommendations, but I get the same error: Is there any solution, or any other documentation, to fix the problem? I checked that I have RBAC permissions configured; this is the output:
I have this config applied in my YAML files:
My Pod uses this, plus a token via secrets applied in the configMap; other plugins like kube_inventory work fine with this:
|
@jmorcar have a look at what we did for the telegraf-ds chart, as we got it working => https://github.com/influxdata/helm-charts/tree/master/charts/telegraf-ds |
I think the plugin is expecting a URL to the Node's API, not the API server's API. So the Telegraf container runs on every node, in a DaemonSet, configured with something like |
I have just checked with the node IP variable (here HOSTIP, captured via fieldPath: status.hostIP), but the answer is Forbidden:
Whereas if I use the previous command I posted, the query is permitted and returns data:
(Both queries are executed inside the Telegraf container and use the service account created in the YAML definition.) To create the service account, telegraf-reader, I followed the guide posted for the kube_inventory plugin on GitHub. I checked that telegraf-reader has privileges to query resources like /api/v1/namespaces/default/pods... For that, I created a ClusterRole and role bindings. Before that, every resource query was answered with Forbidden, but not anymore, so the URL should be the problem. I checked that "kubernetes.default.svc" is the same as the short name "kubernetes"; both point to the default ClusterIP of the Kubernetes cluster. I will have to check the source code of the Telegraf kubernetes input plugin to find the exact query that returns "404 Not Found". |
I didn't find the ClusterRole or role binding definitions in the chart templates, so I think the deployment will hit the Forbidden error. I posted a suggestion to include this documentation in the charts, because a YAML definition that references the service account is not sufficient if you haven't created the RBAC permissions beforehand. |
Here are the role and rolebinding. The telegraf-ds chart works fine for me - did you try it on your cluster? |
Thanks! I have applied it now... and I get the same problem: 2020-04-03T17:21:20Z E! [inputs.kubernetes] Error in plugin: https://kubernetes/stats/summary returned HTTP status 404 Not Found |
@jmorcar if you are going through the Kubernetes API, you need the proxy endpoint. It's usually best to go through the NODEIP from the downwardAPI. I see mentions of that above, but I couldn't work out what problem you had with that approach. By any chance are you on GKE? They do block access to the Kubelet this way (last time I checked) |
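For readers skimming the thread, the two URL shapes discussed here can be sketched as follows. This is a minimal illustration in Python, not the plugin's code: the function names are hypothetical, port 10250 is the kubelet port mentioned in this thread, and the proxy path assumes the standard `/api/v1/nodes/<name>/proxy/` route on the API server.

```python
from urllib.parse import quote

# Kubelet HTTPS port as mentioned in this thread (NODEIP:10250).
KUBELET_PORT = 10250


def kubelet_summary_url(node_ip: str, port: int = KUBELET_PORT) -> str:
    """Direct kubelet endpoint, for a pod that knows its own node's IP
    (for example from the downward API field status.hostIP)."""
    # IPv6 literals must be wrapped in brackets inside a URL.
    host = f"[{node_ip}]" if ":" in node_ip else node_ip
    return f"https://{host}:{port}/stats/summary"


def apiserver_proxy_summary_url(api_server: str, node_name: str) -> str:
    """Same data, reached through the API server's node proxy route."""
    base = api_server.rstrip("/")
    return f"{base}/api/v1/nodes/{quote(node_name, safe='')}/proxy/stats/summary"


if __name__ == "__main__":
    # Example values only; the IP and node name are made up.
    print(kubelet_summary_url("10.0.0.5"))
    print(apiserver_proxy_summary_url("https://kubernetes.default.svc", "node-1"))
```

The URL in the error log above (`https://kubernetes/stats/summary`) matches neither shape: it asks the API server itself for `/stats/summary`, which is consistent with the 404 reported in this thread.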
Thanks all, I found the problem: I was using a Deployment definition instead of a DaemonSet. A related problem when you change to a DaemonSet is, as @alanjcastonguay and @rawkode commented, that you have to use NODEIP:10250, like this:
So I have switched from my own YAML to the official Helm chart, as @nsteinmetz recommended, because I would have had to change/add too many params in my YAML. The official chart is OK: deploy it in the namespace you need and it collects all metrics. Conclusion: https://github.com/influxdata/helm-charts/tree/master/charts/telegraf-ds |
Try creating a Service Account and ClusterRoleBinding for telegraf using the yaml configuration below. Mind the namespace.
I faced a similar issue; after applying the YAML, Telegraf was able to authenticate to the cluster and scrape the metrics. |
I am using the telegraf-ds chart but getting the error below in the pod logs. 2021-02-11T17:32:50Z W! [inputs.kubernetes] Collection took longer than expected; not complete after interval of 10s |
It worked fine.
|
Closing, from the discussion it seems this issue is resolved (there have been significant changes to the k8 input plugin and dependencies updated) and also has a viable workaround by using the official helm chart. Please re-open if this isn't the case. |
Relevant telegraf.conf:
System info:
Ubuntu 18.04
k3s v1.17.2+k3s1
Telegraf image: telegraf:1.12.2
Steps to reproduce:
Configure the Kubernetes input plugin in a Telegraf container.
Expected behavior:
The plugin should collect the Kubernetes metrics.
Actual behavior:
The Telegraf plugin log shows that the Kubernetes API server returned a 403 Forbidden error code. After adding the following rules to the RBAC Service Account of the pod:
the error is 404. No metrics are being collected.
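As background on the 403: when a client talks to the kubelet directly and the kubelet delegates authorization to the API server, each request path is checked against a subresource of `nodes`. The sketch below encodes that mapping as described in the Kubernetes kubelet authorization documentation. The function name is hypothetical, and the table may not cover every Kubernetes version.

```python
# Path prefix -> RBAC resource checked by the kubelet (assumed from the
# Kubernetes kubelet authentication/authorization documentation).
_KUBELET_PREFIXES = {
    "/stats": "nodes/stats",
    "/metrics": "nodes/metrics",
    "/logs": "nodes/log",
    "/spec": "nodes/spec",
}


def kubelet_rbac_resource(path: str) -> str:
    """Return the RBAC resource a GET on this kubelet path would need."""
    for prefix, resource in _KUBELET_PREFIXES.items():
        # Match the prefix itself or anything below it, but not "/statsfoo".
        if path == prefix or path.startswith(prefix + "/"):
            return resource
    # Every other kubelet path falls back to the proxy subresource.
    return "nodes/proxy"


if __name__ == "__main__":
    print(kubelet_rbac_resource("/stats/summary"))
```

Under that assumption, a ClusterRole for direct kubelet access would need `get` on `nodes/stats`, while a request routed through the API server's node proxy would need `get` on `nodes/proxy`.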
Additional info:
The kube_inventory input plugin seems to be working just fine, but the kubernetes plugin is not able to obtain any metrics, as described. Looking at the code, the kubernetes input plugin calls the /stats/summary endpoint on the Kubernetes API server.
The /stats/summary endpoint was planned to be deprecated (kubernetes/kubernetes#68522), but it seems that it has already been removed.