eks-prow-build-cluster: Monitoring solution #5165
Comments
/milestone v1.28 |
@xmudrii Can I pick this up? |
@memetics19 As far as I know @wozniakjan is already working on this. |
Hey, yeah, thanks for assigning me. I haven't found the time yet, but next week I should finally have the capacity to move this forward. |
|
I guess exposing it as documented in https://repost.aws/knowledge-center/eks-kubernetes-services-cluster should be sufficient, or do we want to get a dedicated domain for it as well? |
@wozniakjan I think we can get a dedicated domain for it. |
@xmudrii, @ameukam is this the GKE monitoring stack mentioned above? https://github.com/kubernetes/k8s.io/tree/main/infra/gcp/terraform/k8s-infra-monitoring I will take a look at whether there are any tools to aggregate GCP Cloud Monitoring with the self-hosted Prometheus we use in AWS. Judging from the wording of the task, the goal is to display GCP metrics in the AWS Prometheus, not the other way around. |
This is inaccurate. The link you provided covers only specific resources and is not related to the build clusters. We already aggregate metrics in https://monitoring.prow.k8s.io. You can find the resources for this monitoring stack here: https://github.com/kubernetes/test-infra/tree/master/config/prow/cluster/monitoring |
that is perfect, thank you very much! |
I browsed the monitoring stack for the GKE Prow build clusters, and I think that before connecting both stacks it makes sense to reach parity between the dashboards. I am currently checking how many of the original dashboards make sense for the EKS build cluster in #5324. After that, my idea is to leverage the Prometheus remote-write capability: one of the Prometheus instances would provide a single pane of glass and the other would expose its metrics for scraping. I considered Prometheus agent mode, but I think it is not that important in this case. Having the remote-writing Prometheus also able to serve its own metrics (which wouldn't be possible in agent mode) is worth more than the resource optimization, especially since we are already exposing it through its own Grafana. |
Increased parity between the GKE and EKS Grafana instances is getting merged in #5324. However, after #5316 merged, a side quest popped up: there is a desire to restrict public access to some boards. Namely, https://monitoring-eks.prow.k8s.io/d/node-exporter-full/node-exporter-full is considered to be potentially oversharing. There was also a valid opinion that this dashboard could be very useful and that we shouldn't get rid of it entirely. @pkprzekwas and I had the following idea:
|
disabling sensitive boards in #5387 |
@pkprzekwas @xmudrii, I have a much simpler idea for monitoring integration than the Prometheus remote-write capability: how about we just embed Grafana dashboards as iframes? Given that anonymous (read-only) access is enabled on both, we could set this on the exporting dashboard:

    [security]
    allow_embedding = true

and on the importing dashboard (the single pane of glass) set this:

    [panels]
    disable_sanitize_html = true

The independent panels can then be integrated as iframes, for example:

    {
      "type": "text",
      "content": "<iframe src=\"https://monitoring-eks.prow.k8s.io/d-solo/g4Okc0_4k/boskos-server-dashboard?orgId=1&panelId=2\" width=\"450\" height=\"200\" frameborder=\"0\"></iframe>",
      "mode": "html"
    }

We wouldn't be able to query across multiple clusters, but it could be an OK first step to just have a common place to see what is going on. |
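For anyone who wants to try the iframe approach on more than one panel, here is a rough sketch of how the text-panel JSON from the comment above could be generated instead of written by hand. The helper names `solo_panel_url` and `iframe_text_panel` are made up for illustration and are not part of Grafana or of this repository; the dashboard UID, slug, and panel ID are the ones from the example above. The only assumption beyond the thread is that the URL is HTML-escaped inside the `src` attribute, so `&` becomes `&amp;`, which browsers decode back.

```python
import json
from html import escape
from urllib.parse import urlencode


def solo_panel_url(base_url, dashboard_uid, slug, panel_id, org_id=1):
    """Build a Grafana single-panel ("d-solo") URL for one panel."""
    query = urlencode({"orgId": org_id, "panelId": panel_id})
    return f"{base_url.rstrip('/')}/d-solo/{dashboard_uid}/{slug}?{query}"


def iframe_text_panel(src, width=450, height=200):
    """Return a text panel (as a dict) that embeds `src` in an iframe.

    Hypothetical helper: it mirrors the JSON snippet from the thread and
    escapes the URL so it is safe inside the HTML attribute.
    """
    iframe = (
        f'<iframe src="{escape(src, quote=True)}" '
        f'width="{width}" height="{height}" frameborder="0"></iframe>'
    )
    return {"type": "text", "content": iframe, "mode": "html"}


if __name__ == "__main__":
    url = solo_panel_url(
        "https://monitoring-eks.prow.k8s.io",
        "g4Okc0_4k",
        "boskos-server-dashboard",
        panel_id=2,
    )
    # Paste the printed JSON into the importing dashboard's panel list.
    print(json.dumps(iframe_text_panel(url), indent=2))
```

This only produces the panel definition; it still relies on `allow_embedding` being enabled on the exporting side and `disable_sanitize_html` on the importing side, as described above.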
That's decent low-hanging fruit. As our Grafana instances are public and read-only, there shouldn't be much difference between interacting with the original ones and with a facade of embedded iframes. |
kubernetes/test-infra#29920 proposes allowing dashboard embedding; let's see how that goes. |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
Most of the tasks are done. We have yet to come up with a single-pane-of-glass monitoring solution, but I'll create a new issue to track that. |
@xmudrii: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
We have a simple monitoring solution based on Prometheus and Grafana in eks-prow-build-cluster. However, that monitoring stack is not exposed at all and we should look into unifying monitoring for GKE and EKS clusters.
Tasks
/priority important-longterm