Feedback related to metrics #5497

ckdarby · 2020-10-09T16:04:18Z

While rolling out Kubernetes scaling for Presto, we ran into a few aspects we weren't thrilled about, and we couldn't find anything in the docs that would make the suggestions below moot.

These are the notes I made while we went through the experience.

Metrics should be exposed as REST endpoints.

We'd rather avoid running the JMX exporter for Prometheus and keeping ports open for JMX. With REST endpoints, it is straightforward to funnel everything through an ingress controller and enforce authentication, logging, and auditing.

Coordinator should offer a single endpoint to pull in all node metrics

We'd rather send a single request to the coordinator and have it pull in the individual worker metrics than build tooling that asks k8s for all existing workers and makes the individual requests itself.
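For context, the client-side tooling described above might look roughly like the following. This is only a sketch of the workaround, not anything Presto provides: the `/metrics` path, the function names, and the idea that worker URLs have already been discovered (for example by listing pods through the Kubernetes API) are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable
from urllib.request import urlopen


def fetch_metrics(base_url: str, timeout: float = 5.0) -> str:
    """Fetch raw metrics text from one node.

    The "/metrics" path is an assumption for this sketch.
    """
    with urlopen(f"{base_url}/metrics", timeout=timeout) as response:
        return response.read().decode("utf-8")


def collect_from_workers(
    worker_urls: Iterable[str],
    fetch: Callable[[str], str] = fetch_metrics,
) -> Dict[str, str]:
    """Return {worker_url: metrics_text}; unreachable nodes map to ""."""
    urls = list(worker_urls)

    def safe_fetch(url: str) -> str:
        try:
            return fetch(url)
        except OSError:
            # URLError, HTTPError, and socket timeouts are all OSError subclasses.
            return ""

    # Query the workers concurrently; pool.map preserves input order.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return dict(zip(urls, pool.map(safe_fetch, urls)))
```

Every team running on k8s would need to write and maintain something like this, plus the worker discovery step, which is the overhead a coordinator-side endpoint would remove.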

It appears someone else in the community asked for a similar feature, here.

Cluster level metrics vs node level metrics

There are times when we want to monitor the health of each individual node, and other times when we want metrics across the whole cluster, such as total memory in use across all queries or total queued queries. With JMX, when pointing at a worker node it is very confusing whether it is reporting stats for what it is executing itself, stats across the whole cluster, or stats it has requested from the coordinator.
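To illustrate the distinction, here is a minimal sketch of deriving a cluster-level total from node-level values. It assumes each node reports only its own numbers as plain `name value` lines (Prometheus-style text), which is exactly the property that is unclear with JMX today. The metric names in the usage note are hypothetical.

```python
from typing import Dict


def parse_samples(metrics_text: str) -> Dict[str, float]:
    """Parse unlabeled `name value` samples, skipping comments and blanks."""
    samples: Dict[str, float] = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 2 or "{" in parts[0]:
            continue  # labeled samples are ignored in this sketch
        try:
            samples[parts[0]] = float(parts[1])
        except ValueError:
            continue  # skip lines whose value is not numeric
    return samples


def cluster_total(per_node_text: Dict[str, str], metric: str) -> float:
    """Sum one metric across nodes; nodes missing the metric contribute 0."""
    return sum(
        parse_samples(text).get(metric, 0.0)
        for text in per_node_text.values()
    )
```

For example, with `{"w1": "queued_queries 3", "w2": "queued_queries 4"}`, `cluster_total(..., "queued_queries")` gives 7.0. Summing only makes sense for additive metrics, and it double counts if a node is already reporting cluster-wide values, which is why the node-level vs cluster-level scope needs to be explicit.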

hashhar · 2024-02-26T08:02:59Z

There's now OpenMetrics support available. There are no docs yet (in progress), but for now instructions can be found at #1581 (comment).

It should address the first and last points; the middle point, however, is not addressed by it.
