-
Notifications
You must be signed in to change notification settings - Fork 430
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use region specific API calls with VMSS #1850
Use region specific API calls with VMSS #1850
Conversation
success 1, 20mins /test pull-cluster-api-provider-azure-e2e-exp |
success 2, 22mins /test pull-cluster-api-provider-azure-e2e-exp |
success 3, 23mins /test pull-cluster-api-provider-azure-e2e-exp |
if err != nil { | ||
return "", errors.Wrap(err, "failed to parse the base URI of client") | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice
success 4, 23mins /test pull-cluster-api-provider-azure-e2e-exp |
23 min is right around what I’d expect for the total duration of this test. Looking good! |
Actually, with the latest bits to reconcile pools all together on cluster create, I wonder if we can’t squeeze that lower. Delete is kind of slow though, so we’ll see what makes sense (unrelated to this PR) |
5 passes, 5th 23mins |
c87eec2
to
d47efd7
Compare
@CecileRobertMichon the PR history is now solid green for the exp job. I think we are solid. So solid that we may want to consider adding the AKS related tests to the regular pr e2e test. |
I think with the transient error / requeue stuff that I added to the AzureManagedMachinePool reconciler, I think we are close to as good as we can get. The initial reconcile of AMMP only starts when control plane transitions to ready. Once ready, the reconciler will try to reach goal state persistently with requeues if it reaches what is deemed a transient error, a 404 for the VMSS other similar errors that will be resolved as infra comes online. I don't think this is the most elegant solution, but it seems to be resilient. |
success 7?, 22mins. |
return a.aliasAuth.BaseURI() | ||
} | ||
|
||
sansScheme := path.Join(fmt.Sprintf("%s.%s", a.Region, a.parsedURL.Host), a.parsedURL.Path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we actually want path.join() (os filepath separator) here, versus a hardcoded slash? Wouldn’t this now be platform dependent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nvm, I might be thinking filepath.join()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are good: https://cs.opensource.google/go/go/+/refs/tags/go1.17.3:src/path/path.go;l=174
Before the PR which reconciled all AMMP with cluster creation, we would create the cluster, wait for it to reconcile fully, then trigger creation of nodepools which takes ~5 min. With that approach I would expect to see around 25min runs based on my previous testing. since we don’t need that extra 5min anymore (all pools should be created), I wonder if we have room to squeeze the times lower somewhere, or if that’s just as good as it gets. |
Nice work with the URL scheme/host/path parsing /lgtm |
I like where this is going! I think we need to do some tinkering now that tests are more stable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great, just one comment about using the Location() func
return &azureManagedMachinePoolService{ | ||
scope: scope, | ||
agentPoolsSvc: agentpools.New(scope), | ||
scaleSetsSvc: scalesets.NewClient(scope), | ||
} | ||
scaleSetsSvc: scalesets.NewClient(authorizer), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice
d47efd7
to
613a2ab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: CecileRobertMichon The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/cherry-pick release-1.0 |
@CecileRobertMichon: #1850 failed to apply on top of branch "release-1.0":
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Ok that's fair @ cherry-pick bot :) @devigned do you think we should cherry-pick this to release-1.0? |
With the next release coming very soon and managed clusters being experimental, I lean toward not dealing with the cherry pick merge conflict. |
/cherry-pick release-0.5 |
@jackfrancis: #1850 failed to apply on top of branch "release-0.5":
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
What type of PR is this?
/kind bug
What this PR does / why we need it:
When listing VMSSes in a resource group, Azure responds with values it has in the regional cache. Unfortunately, there are sometimes spikes in regional replication for these caches. To avoid cross region replication issues, this PR introduces a region specific client for AzureManageMachinePools by transforming the client baseURI from something like
https://management.azure.com
tohttps://{region}.management.azure.com
to ensure the request is directed to the region where the VMSSes should be.fixes #1720
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
TODOs:
Release note: