Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metrics for AWS API calls #4179

Merged
merged 1 commit into from
Aug 9, 2021

Conversation

bpineau
Copy link
Contributor

@bpineau bpineau commented Jul 5, 2021

Provide metrics for AWS API calls; helps identifying slowness, throttling causes, and errors bursts.
That's similar to what GCE provider does.

Provide metrics for AWS API calls; helps identifying slowness,
throttling causes, and errors surges.
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 5, 2021
@bpineau
Copy link
Contributor Author

bpineau commented Jul 30, 2021

@gjtempleton may I request your input on that one?
/assign @gjtempleton

Copy link
Member

@gjtempleton gjtempleton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, this looks really useful!

One small query for you.

Namespace: caNamespace,
Name: "aws_request_duration_seconds",
Help: "Time taken by AWS requests, by method and status code, in seconds",
Buckets: []float64{0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 2.0, 5.0, 10.0, 20.0, 30.0, 60.0},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very slightly worried about the number of histogram buckets here, though the low number of endpoint label values is making me less concerned.

Do we have any idea of the potential cardinality of the status label?

Copy link
Contributor Author

@bpineau bpineau Aug 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I run (a slightly older version of) this patch for over a year on all clusters, managing several thousand ASGs. Got those 10 distinct status over last 12 months:

image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for that.

@gjtempleton
Copy link
Member

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 9, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bpineau, gjtempleton

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 9, 2021
@k8s-ci-robot k8s-ci-robot merged commit 7e5f8f0 into kubernetes:master Aug 9, 2021
evansheng pushed a commit to airbnb/autoscaler that referenced this pull request Mar 24, 2022
jiancheung pushed a commit to airbnb/autoscaler that referenced this pull request Jul 29, 2022
akirillov pushed a commit to airbnb/autoscaler that referenced this pull request Oct 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/cluster-autoscaler cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants