[cdk_infra] Add node auto scaling to EKS clusters #1059

Open
bryan-aguilar opened this issue Jan 26, 2023 · 0 comments
Labels
EKS EKS related issues

Comments

@bryan-aguilar
Contributor

During the Collector CI, multiple EKS test cases may run against a single cluster at the same time. This can over-utilize node resources and leave pods unschedulable. So far we have only hit caps caused by CPU requests. The current resource settings (requests and limits) for deployments can be found by searching for limits = in the terraform directory. Example here. Currently, in most cases no CPU request is set, but the CPU limit is set to .2.
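
To make the capacity problem concrete, here is a minimal sketch. It relies on standard Kubernetes behavior: when a container sets a CPU limit but no request, the request defaults to the limit (unless something like a LimitRange supplies a different default). So a limit of .2 counts as a 200m request for scheduling. The node figures in the usage note (1930m allocatable, 330m already requested by system pods) are hypothetical and not taken from our clusters.

```python
from typing import Optional


def effective_cpu_request_m(request_m: Optional[int], limit_m: Optional[int]) -> int:
    """CPU request in millicores that the scheduler counts for a container.

    If only a limit is set, Kubernetes copies the limit into the request.
    """
    if request_m is not None:
        return request_m
    if limit_m is not None:
        return limit_m
    return 0


def pods_that_fit(allocatable_m: int, already_requested_m: int, per_pod_request_m: int) -> int:
    """How many more pods fit on one node, looking at CPU requests only."""
    if per_pod_request_m <= 0:
        raise ValueError("per_pod_request_m must be positive")
    free_m = max(allocatable_m - already_requested_m, 0)
    return free_m // per_pod_request_m


# Hypothetical node: 1930m allocatable, 330m already requested by system pods.
per_pod = effective_cpu_request_m(request_m=None, limit_m=200)  # limit of .2 CPU
capacity = pods_that_fit(allocatable_m=1930, already_requested_m=330, per_pod_request_m=per_pod)
```

With these assumed numbers, one node has room for 8 such pods. Any pods beyond that stay Pending until more nodes are available.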

EKS clusters should be set up so that they do not restrict how many tests can run in parallel. We should also not have to keep tweaking requests and limits based on how many test cases may be running in parallel. To better accommodate this, we could set up a node autoscaler that can handle the increased test load on the clusters.
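
As a rough illustration of what the autoscaler would do for us, the sketch below estimates how many nodes a given number of parallel test cases needs, clamped to a node group's minimum and maximum size. This is a back-of-the-envelope model, not the algorithm of any particular autoscaler: it only looks at CPU requests and ignores memory, DaemonSets, and bin-packing effects. All parameter names and the numbers in the usage note are hypothetical.

```python
import math


def desired_node_count(
    parallel_tests: int,
    pods_per_test: int,
    cpu_request_m_per_pod: int,
    free_cpu_m_per_node: int,
    min_nodes: int,
    max_nodes: int,
) -> int:
    """Estimate the node count needed for the given test load (CPU requests only)."""
    if free_cpu_m_per_node <= 0:
        raise ValueError("free_cpu_m_per_node must be positive")
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")
    total_request_m = parallel_tests * pods_per_test * cpu_request_m_per_pod
    needed = math.ceil(total_request_m / free_cpu_m_per_node)
    # Stay within the node group's configured bounds.
    return max(min_nodes, min(needed, max_nodes))


# Hypothetical load: 10 parallel tests, 3 pods each, 200m per pod, 1600m free per node.
nodes = desired_node_count(10, 3, 200, 1600, min_nodes=2, max_nodes=6)
```

With these assumed inputs the estimate is 4 nodes. A single test would stay at the minimum of 2, and 40 parallel tests would be capped at the maximum of 6. Scaling between those bounds on demand is what removes the need to retune requests and limits by hand.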

A temporary solution would be to increase the minimum number of nodes in the managed node group. This comes with a cost tradeoff and should not be considered a long-term solution.

@bryan-aguilar bryan-aguilar added the EKS EKS related issues label Jan 26, 2023