[Azure AKS] 403s received from Azure when listing VMSS instances #1674
aks-engine 0.27.0
kubernetes 1.13.1
cluster autoscaler v1.13.1

The cluster autoscaler periodically receives 403s from Azure, which causes the pod to restart; it then resumes normal behaviour until it receives a further 403.

E0208 08:54:43.057787 1 azure_scale_set.go:199] VirtualMachineScaleSetVMsClient.List failed for k8s-agentpool1-39472669-vmss: compute.VirtualMachineScaleSetVMsClient#List: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client '6053ad54-5340-4d95-8842-9f2ac12c4566' with object id '6053ad54-5340-4d95-8842-9f2ac12c4566' does not have authorization to perform action 'Microsoft.Compute/virtualMachineScaleSets/virtualMachines/read' over scope '/subscriptions/xxx-xxx-xxx-xxx/resourceGroups/k8s/providers/Microsoft.Compute/virtualMachineScaleSets/k8s-agentpool1-39472669-vmss'."
F0208 08:54:43.057854 1 azure_cloud_provider.go:139] Failed to create Azure Manager: compute.VirtualMachineScaleSetVMsClient#List: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client 'xxx-xxx-xxx-xxx' with object id 'xxx-xxx-xxx-xxx' does not have authorization to perform action 'Microsoft.Compute/virtualMachineScaleSets/virtualMachines/read' over scope '/subscriptions/xxx-xxx-xxx-xxx/resourceGroups/k8s/providers/Microsoft.Compute/virtualMachineScaleSets/k8s-agentpool1-39472669-vmss'."

Comments
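For anyone who wants to reproduce this outside the autoscaler: the failing request is a plain VMSS instance List at the scale set scope. Below is a minimal Go sketch of that call, assuming the 2018-10-01 compute SDK and credentials supplied through the standard AZURE_* environment variables; this is an illustration, not the autoscaler's actual wiring.

```go
// Sketch: reproduce the VMSS instance listing the autoscaler performs, to
// check whether the client identity holds
// Microsoft.Compute/virtualMachineScaleSets/virtualMachines/read on the
// scale set. SDK version and env-var wiring are assumptions.
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	"github.com/Azure/azure-sdk-for-go/services/compute/mgmt/2018-10-01/compute"
	"github.com/Azure/go-autorest/autorest/azure/auth"
)

func main() {
	subscriptionID := os.Getenv("AZURE_SUBSCRIPTION_ID")

	// NewAuthorizerFromEnvironment reads AZURE_CLIENT_ID, AZURE_CLIENT_SECRET
	// and AZURE_TENANT_ID (or falls back to MSI) to build a bearer authorizer.
	authorizer, err := auth.NewAuthorizerFromEnvironment()
	if err != nil {
		log.Fatalf("failed to create authorizer: %v", err)
	}

	client := compute.NewVirtualMachineScaleSetVMsClient(subscriptionID)
	client.Authorizer = authorizer

	// The same call that returns 403 in the autoscaler logs above.
	page, err := client.List(context.Background(), "k8s", "k8s-agentpool1-39472669-vmss", "", "", "")
	if err != nil {
		log.Fatalf("List failed: %v", err) // a 403 here confirms the RBAC gap
	}
	for page.NotDone() {
		for _, vm := range page.Values() {
			fmt.Println(*vm.Name)
		}
		if err := page.Next(); err != nil {
			log.Fatalf("paging failed: %v", err)
		}
	}
}
```

A 403 from this standalone call would confirm the permission problem independently of CA.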
From the logs, the client '6053ad54-5340-4d95-8842-9f2ac12c4566' hasn't been authorized for the VMSS APIs. Could you configure the AAD permissions for it?
@feiskyer yes, that's the point. The client has permission and the autoscaler works, then it loses permission (perhaps during token refresh?) and the autoscaler restarts. This occurs very frequently.
@danmassie Over what period have you observed the issue? Actually, the token refresh happens automatically whenever the token expires. I'm wondering whether there are other potential issues.
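For context on the automatic refresh mentioned here: in go-autorest, the service principal token is refreshed on demand via EnsureFresh before requests are sent. A minimal sketch of that mechanism, assuming the adal package; the tenant, client ID, and secret values are placeholders:

```go
// Sketch of go-autorest service principal token refresh: EnsureFresh
// re-acquires the token only when it is expired or about to expire,
// otherwise the cached token is reused. Credential values are placeholders.
package main

import (
	"fmt"
	"log"

	"github.com/Azure/go-autorest/autorest/adal"
	"github.com/Azure/go-autorest/autorest/azure"
)

func main() {
	env := azure.PublicCloud
	oauthConfig, err := adal.NewOAuthConfig(env.ActiveDirectoryEndpoint, "<tenant-id>")
	if err != nil {
		log.Fatal(err)
	}

	spt, err := adal.NewServicePrincipalToken(
		*oauthConfig,
		"<client-id>",
		"<client-secret>",
		env.ResourceManagerEndpoint, // the resource the token is scoped to
	)
	if err != nil {
		log.Fatal(err)
	}

	// Refreshes only if needed; repeated calls are cheap.
	if err := spt.EnsureFresh(); err != nil {
		log.Fatalf("token refresh failed: %v", err)
	}
	fmt.Println("token acquired; length:", len(spt.OAuthToken()))
}
```

Note the refresh re-acquires a token for the same client, so a refresh by itself should not change what the client is authorized to do; a 403 after refresh points at the role assignment, not the token.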
Having the same issue. AKS-Engine 0.35.1, Kubernetes version 1.13.5.
@oronboni Could you share the logs of CA? Which version of CA are you using?
@feiskyer additional data: autoscaler version 1.13.2. Failed to get nodes from apiserver: Get https://10.0.0.1:443/api/v1/nodes: dial tcp 10.0.0.1:443: i/o timeout. After deleting the pod I get the following error: When I gave the user identity permissions on the VMSS, the error stopped, but the pods stayed in Pending status: When changing the number of nodes manually, the pods start. If additional logs are required for the investigation, please mention which pods or log files are needed.
@oronboni thanks for the information; so the issue is actually different from Dan's. An identity is required for CA to operate the VMSS.
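Since the discussion here turns to the user identity on the VMSS: when CA authenticates through a managed identity rather than a service principal secret, the token comes from the node's MSI endpoint. A sketch of that path with the adal package; the identity client ID is a placeholder:

```go
// Sketch: acquiring an ARM token through the VM's managed identity (MSI)
// endpoint, the path used when relying on a user-assigned identity instead
// of a service principal secret. The client ID below is a placeholder.
package main

import (
	"fmt"
	"log"

	"github.com/Azure/go-autorest/autorest/adal"
	"github.com/Azure/go-autorest/autorest/azure"
)

func main() {
	msiEndpoint, err := adal.GetMSIVMEndpoint()
	if err != nil {
		log.Fatal(err)
	}

	// For a user-assigned identity, pass its client ID explicitly.
	spt, err := adal.NewServicePrincipalTokenFromMSIWithUserAssignedID(
		msiEndpoint,
		azure.PublicCloud.ResourceManagerEndpoint,
		"<identity-client-id>",
	)
	if err != nil {
		log.Fatal(err)
	}
	if err := spt.EnsureFresh(); err != nil {
		log.Fatalf("MSI token acquisition failed: %v", err)
	}
	fmt.Println("managed-identity token acquired")
}
```

The identity still needs the right role assignments on the VMSS resource, which is the gap oronboni hit above.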
@feiskyer my max nodes is configured in AKS-Engine to 50, but the current node count is 5. Log from the autoscaler: Also, I think Dan's problem is the same, because I had the same error before I manually gave the user identity permissions; with the additional permissions this error disappeared.
@oronboni So is your cluster being scaled up by CA now? You can run
Dan has claimed
@feiskyer thank you for your quick reply. I gave the following permissions:
Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="LinkedAuthorizationFailed"
@oronboni Surprised the Owner role is still not authorized. Could you open a support ticket on the Azure portal?
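One way to see which principals actually hold permissions at the scale set scope is to list the role assignments there. A sketch using the 2015-07-01 authorization SDK; the SDK version, env-var wiring, and scope string are assumptions for illustration:

```go
// Sketch: list role assignments visible at the scale set scope, to verify
// which identities hold which roles there. Scope and subscription values
// are placeholders.
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	"github.com/Azure/azure-sdk-for-go/services/authorization/mgmt/2015-07-01/authorization"
	"github.com/Azure/go-autorest/autorest/azure/auth"
)

func main() {
	authorizer, err := auth.NewAuthorizerFromEnvironment()
	if err != nil {
		log.Fatal(err)
	}

	client := authorization.NewRoleAssignmentsClient(os.Getenv("AZURE_SUBSCRIPTION_ID"))
	client.Authorizer = authorizer

	scope := "/subscriptions/<sub-id>/resourceGroups/k8s/providers/Microsoft.Compute/virtualMachineScaleSets/k8s-agentpool1-39472669-vmss"
	page, err := client.ListForScope(context.Background(), scope, "")
	if err != nil {
		log.Fatal(err)
	}
	for page.NotDone() {
		for _, ra := range page.Values() {
			// RoleDefinitionID identifies the role (Owner, Contributor, ...);
			// PrincipalID is the object ID of the identity holding it.
			fmt.Printf("principal=%s role=%s\n", *ra.Properties.PrincipalID, *ra.Properties.RoleDefinitionID)
		}
		if err := page.Next(); err != nil {
			log.Fatal(err)
		}
	}
}
```

Comparing the principal IDs here against the client ID in the 403 message shows whether the autoscaler's identity is assigned at this scope at all.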
I checked permissions in old clusters that were created with AKS-Engine 0.31.1. I removed the Owner role from all the resources, added Contributor on the resource group, and the scale set added 3 instances. Second problem (seems to be a K8s issue): more instances are required (pods are in Pending status). The autoscaler log:
@oronboni Does CA work now? The addon config above looks good to me.
@feiskyer thank you for your assistance. Yes, after the change CA works (the above configuration worked in the past, but by checking the CA YAML I saw that the nodes configuration was: - --nodes=1:5). From my side everything works, but there are issues I think you should check in AKS-Engine. Issue 2: with the configuration above, CA set the node bounds to 1:5 (it took the default values and ignored the JSON configuration file); changing to max-nodes/min-nodes solved the issue (the old configuration worked in the past).
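To make the 1:5 cap concrete: the --nodes flag value is what fixes the scaling bounds, following a min:max[:name] convention, so a default of 1:5 stops scale-up at 5 nodes regardless of what the apimodel said. A minimal parsing sketch; this mirrors the flag convention but is illustrative, not CA's actual code:

```go
// Sketch of how a --nodes value like "1:5" or "1:50:vmss-name" maps to
// scaling bounds. Illustrative only; the real autoscaler has its own parser.
package main

import (
	"fmt"
	"log"
	"strconv"
	"strings"
)

type nodeGroupSpec struct {
	min, max int
	name     string
}

func parseNodesSpec(spec string) (nodeGroupSpec, error) {
	parts := strings.SplitN(spec, ":", 3)
	if len(parts) < 2 {
		return nodeGroupSpec{}, fmt.Errorf("want min:max[:name], got %q", spec)
	}
	min, err := strconv.Atoi(parts[0])
	if err != nil {
		return nodeGroupSpec{}, err
	}
	max, err := strconv.Atoi(parts[1])
	if err != nil {
		return nodeGroupSpec{}, err
	}
	if max < min {
		return nodeGroupSpec{}, fmt.Errorf("max %d below min %d", max, min)
	}
	out := nodeGroupSpec{min: min, max: max}
	if len(parts) == 3 {
		out.name = parts[2]
	}
	return out, nil
}

func main() {
	// With --nodes=1:5 the group is capped at 5 nodes, which is why pending
	// pods never triggered a scale-up past 5 in the report above.
	spec, err := parseNodesSpec("1:5")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("min=%d max=%d\n", spec.min, spec.max)
}
```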
@oronboni Glad to see it works now, and thanks for providing the details. I think those two issues should be fixed in aks-engine; I'll get them involved.
/close
@feiskyer: Closing this issue. In response to this: