fix: Updating the vmsize for e2e cilium to avoid resource scarcity #2014
Conversation
Force-pushed from dbbab4b to 2550c63
/azp run
Azure Pipelines successfully started running 2 pipeline(s).
Force-pushed from 2550c63 to 81108b4
@@ -32,7 +32,7 @@ steps:
  mkdir -p ~/.kube/
  echo "Create AKS Overlay cluster"
  make -C ./hack/swift azcfg AZCLI=az REGION=$(REGION_OVERLAY_CLUSTER_TEST)
- make -C ./hack/swift overlay-no-kube-proxy-up AZCLI=az REGION=$(REGION_OVERLAY_CLUSTER_TEST) SUB=$(SUB_AZURE_NETWORK_AGENT_TEST) CLUSTER=${{ parameters.clusterName }}-$(make revision)
+ make -C ./hack/swift overlay-no-kube-proxy-up AZCLI=az REGION=$(REGION_OVERLAY_CLUSTER_TEST) SUB=$(SUB_AZURE_NETWORK_AGENT_TEST) CLUSTER=${{ parameters.clusterName }}-$(make revision) VM_SIZE=Standard_DS4_v2
Standard_DS4_v2 is a HUGE jump up in VM size and cost (12x!). I originally picked B2s as the cheapest viable option - did you evaluate any other SKUs? Maybe we could start at Standard_B2ms, which has double the mem at only double the cost?
I tried/tested with Standard_D4_v3 for the perf dashboard (which uses goldpinger), which is why I chose a similar SKU for our e2e tests. I will test with Standard_B2ms and see if that works. Thanks!
Standard_B2ms worked without the resource limit, @rbtr. Thanks for the help! I will do another run to be sure.
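To make the outcome concrete, here is a rough sketch of the cluster-creation step with the cheaper SKU applied. This is an illustration only: the surrounding YAML is abbreviated, and it assumes the hack/swift Makefile forwards VM_SIZE to the AKS node VM size (for example via az aks create --node-vm-size), which is not shown in this excerpt.

```yaml
# Sketch only: abbreviated pipeline step with the smaller SKU from this thread.
# Assumes hack/swift passes VM_SIZE through to the AKS node VM size.
steps:
  - script: |
      mkdir -p ~/.kube/
      echo "Create AKS Overlay cluster"
      make -C ./hack/swift azcfg AZCLI=az REGION=$(REGION_OVERLAY_CLUSTER_TEST)
      make -C ./hack/swift overlay-no-kube-proxy-up \
        AZCLI=az REGION=$(REGION_OVERLAY_CLUSTER_TEST) \
        SUB=$(SUB_AZURE_NETWORK_AGENT_TEST) \
        CLUSTER=${{ parameters.clusterName }}-$(make revision) \
        VM_SIZE=Standard_B2ms
    displayName: Create AKS Overlay cluster
```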
@@ -76,3 +76,10 @@ spec:
  port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
+ resources:
The resources block is omitted intentionally so that the goldpinger pods will overprovision on the nodes. When you add it, only (node memory)/100Mi goldpinger pods can be scheduled on a node.
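For reference, the kind of block being discussed would look roughly like the sketch below. The 100Mi figure is taken from the comment above; the actual values added in this PR are not visible in this excerpt, so treat these as placeholders.

```yaml
# Illustrative only: a memory request on the goldpinger container caps the
# number of goldpinger pods per node at roughly (allocatable memory) / 100Mi.
resources:
  requests:
    memory: 100Mi
  limits:
    memory: 100Mi
```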
So if the node memory is 8 GB (in the case of Standard_B2ms), the number of pods that can be scheduled on that node would be 80, and we need to scale the pods to 100. Do you recommend adding a smaller limit to accommodate approximately 110 pods, or removing the limit altogether so that we can overprovision on that node?
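If the smaller-request option were chosen, a hypothetical sizing (not taken from this PR) could look like the following: an 8 GiB node exposes somewhat less than 8192Mi as allocatable, so a request around 64Mi would leave room for roughly 110+ goldpinger pods, subject to other per-node limits such as the kubelet max-pods setting.

```yaml
# Hypothetical alternative to removing the block entirely: shrink the request.
# 8192Mi / 64Mi ≈ 128 scheduling slots before system overhead, above the ~110
# pods the test scales to. Values are illustrative, not from this change.
resources:
  requests:
    memory: 64Mi
```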
Force-pushed from 7dccfc5 to 13e9c22
/azp run
Azure Pipelines successfully started running 2 pipeline(s).
Force-pushed from 13e9c22 to 4b495eb
fix: Updating the vmsize for e2e cilium to avoid resource scarcity (#2014) CI: Testing the e2e test for cilium
* fix: assume invalid semver CNI has the required dump state command (#2078)
* fix: Updating the vmsize for e2e cilium to avoid resource scarcity (#2014) CI: Testing the e2e test for cilium
---------
Co-authored-by: Vipul Singh <[email protected]>
Reason for Change:
Issue Fixed:
Requirements:
Notes: