[BUG] DefaultAzureCredential failed to retrieve a token from the included credentials #19974
Comments
This bug impacts the reliability of our production service. Unlike a classic web service, where the credentials are retrieved once on startup, we have a bunch of standalone executables that are constantly launching. It seems that the IMDS server does not throttle real HTTP requests: if we keep curling the IMDS endpoint, the response always succeeds in about 130 ms. But as soon as a TCP probe is made before the actual HTTP request, the IMDS server reliably throttles a request every so often. The issue occurred after switching from …
Thank you for your feedback. Tagging and routing to the team member best able to assist.
Hi @wumingcp
I'm not sure why your workaround fails, because the MSI result is only cached with the instance of a …
Alternatively, if you'd like to gate the initialization of your container on the availability of the endpoint, you could try this pod that waits for a 200 from the endpoint:
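The snippet referred to here is not reproduced in this thread. As a rough stand-in for the same idea, the sketch below polls the IMDS token endpoint from C# at startup, rather than from a separate pod, and exits with code 0 once it sees a 200. The class name, retry count, delays, and the storage resource in the URL are assumptions for illustration only.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class WaitForImds
{
    // IMDS managed identity token endpoint. The resource (Azure Storage) is an assumption.
    private const string TokenUrl =
        "http://169.254.169.254/metadata/identity/oauth2/token" +
        "?api-version=2018-02-01&resource=https%3A%2F%2Fstorage.azure.com%2F";

    public static async Task<int> Main()
    {
        using (var http = new HttpClient { Timeout = TimeSpan.FromSeconds(5) })
        {
            // Hypothetical limits: 30 attempts, 2 seconds apart.
            for (int attempt = 1; attempt <= 30; attempt++)
            {
                try
                {
                    using (var request = new HttpRequestMessage(HttpMethod.Get, TokenUrl))
                    {
                        // IMDS rejects requests that lack this header.
                        request.Headers.Add("Metadata", "true");
                        using (var response = await http.SendAsync(request))
                        {
                            if (response.StatusCode == HttpStatusCode.OK)
                            {
                                Console.WriteLine($"Endpoint returned 200 on attempt {attempt}");
                                return 0;
                            }
                        }
                    }
                }
                catch (Exception ex) when (ex is HttpRequestException || ex is TaskCanceledException)
                {
                    // Endpoint not reachable yet, or the request timed out; try again.
                }

                await Task.Delay(TimeSpan.FromSeconds(2));
            }
        }

        Console.WriteLine("Endpoint did not return 200 in time");
        return 1;
    }
}
```

A non-zero exit code lets whatever launches the container decide whether to wait longer or fail fast.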
Hi, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!
Describe the bug
We are hitting an issue in production. We use DefaultAzureCredential to get an access token via MSI and then use that token to access Azure Blob storage, but acquiring the token fails intermittently. Here is the call stack:
Unhandled exception. Azure.Identity.CredentialUnavailableException: DefaultAzureCredential failed to retrieve a token from the included credentials.
---> System.AggregateException: Multiple exceptions were encountered while attempting to authenticate. (EnvironmentCredential authentication unavailable. Environment variables are not fully configured.) (ManagedIdentityCredential authentication unavailable. No Managed Identity endpoint found.) (SharedTokenCacheCredential authentication unavailable. No accounts were found in the cache.) (Visual Studio Token provider can't be accessed at C:\Users\ContainerAdministrator\AppData\Local\.IdentityService\AzureServiceAuth\tokenprovider.json) (Stored credentials not found. Need to authenticate user in VSCode Azure Account.) (Please run 'az login' to set up account)
---> Azure.Identity.CredentialUnavailableException: EnvironmentCredential authentication unavailable. Environment variables are not fully configured.
at Azure.Identity.CredentialDiagnosticScope.FailWrapAndThrow(Exception ex)
at Azure.Identity.EnvironmentCredential.GetTokenImplAsync(Boolean async, TokenRequestContext requestContext, CancellationToken cancellationToken)
at Azure.Identity.EnvironmentCredential.GetTokenAsync(TokenRequestContext requestContext, CancellationToken cancellationToken)
at Azure.Identity.DefaultAzureCredential.GetTokenFromSourcesAsync(TokenCredential[] sources, TokenRequestContext requestContext, Boolean async, CancellationToken cancellationToken)
--- End of inner exception stack trace ---
---> (Inner Exception #1) Azure.Identity.CredentialUnavailableException: ManagedIdentityCredential authentication unavailable. No Managed Identity endpoint found.
at Azure.Identity.ManagedIdentityClient.AuthenticateAsync(Boolean async, TokenRequestContext context, CancellationToken cancellationToken)
at Azure.Identity.ManagedIdentityCredential.GetTokenImplAsync(Boolean async, TokenRequestContext requestContext, CancellationToken cancellationToken)
at Azure.Identity.CredentialDiagnosticScope.FailWrapAndThrow(Exception ex)
at Azure.Identity.ManagedIdentityCredential.GetTokenImplAsync(Boolean async, TokenRequestContext requestContext, CancellationToken cancellationToken)
at Azure.Identity.ManagedIdentityCredential.GetTokenAsync(TokenRequestContext requestContext, CancellationToken cancellationToken)
at Azure.Identity.DefaultAzureCredential.GetTokenFromSourcesAsync(TokenCredential[] sources, TokenRequestContext requestContext, Boolean async, CancellationToken cancellationToken)<---
....
Expected behavior
We expect DefaultAzureCredential to always be able to get a token via MSI.
Environment:
Our production service runs as a pod on AKS, and we use the Windows OS.
Actual behavior (include Exception or Stack Trace)
We also investigated to try to find the root cause, and found that it is caused by TcpClient.
Using a PowerShell script, we ran the code on the pod 100 times.
Result: the issue reproduces.
PS C:\run> 0..100 | ForEach-Object {.\MSITest.exe}
Download file to rawlog-7040a57e-f4f5-451f-beed-07750e50b2fc.txt
Download file to rawlog-3fd3b9d0-d284-47d0-9318-695671232a2d.txt
....
Download file to rawlog-c17b7789-a03b-4dd8-9baa-ce7d47372859.txt
Download file to rawlog-3be4a0cc-952a-4aa7-9698-2ba8b8278035.txt
Unhandled exception. Azure.Identity.CredentialUnavailableException: DefaultAzureCredential failed to retrieve a token from the included credentials.
azure-sdk-for-net/sdk/identity/Azure.Identity/src/ImdsManagedIdentitySource.cs
Line 40 in 2a0a498
Since it uses TcpClient to try to connect to MSI, we used the following code to test again.
Using PowerShell, we ran this code on the pod 100 times.
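The test code itself is not shown in this thread. A minimal sketch of this kind of probe, assuming it only opens a TCP connection to the IMDS address and times it, could look like the following. The class name and port 80 are assumptions; the output format mirrors the log lines in the result.

```csharp
using System;
using System.Diagnostics;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class ImdsConnectProbe
{
    // Link-local address of the Azure Instance Metadata Service (IMDS).
    private const string ImdsHost = "169.254.169.254";
    // Assumed port: IMDS is served over plain HTTP.
    private const int ImdsPort = 80;

    public static async Task Main()
    {
        var stopwatch = Stopwatch.StartNew();
        using (var client = new TcpClient())
        {
            // Only a TCP connection is opened; no HTTP request is sent.
            await client.ConnectAsync(ImdsHost, ImdsPort);
            stopwatch.Stop();
        }

        Console.WriteLine($"Connect to {ImdsHost}, elapsed time {stopwatch.ElapsedMilliseconds}ms");
    }
}
```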
Result:
PS C:\run> 0..100 | ForEach-Object {.\MSITest.exe}
Connect to 169.254.169.254, elapsed time 29ms
Connect to 169.254.169.254, elapsed time 23ms
Connect to 169.254.169.254, elapsed time 24ms
Connect to 169.254.169.254, elapsed time 23ms
Connect to 169.254.169.254, elapsed time 23ms
...
Connect to 169.254.169.254, elapsed time 3046ms
Connect to 169.254.169.254, elapsed time 23ms
Connect to 169.254.169.254, elapsed time 29ms
Connect to 169.254.169.254, elapsed time 23ms
Connect to 169.254.169.254, elapsed time 24ms
..
Connect to 169.254.169.254, elapsed time 49ms
Connect to 169.254.169.254, elapsed time 24ms
Connect to 169.254.169.254, elapsed time 25ms
Connect to 169.254.169.254, elapsed time 38ms
Connect to 169.254.169.254, elapsed time 3025ms
Connect to 169.254.169.254, elapsed time 24ms
Connect to 169.254.169.254, elapsed time 25ms
Connect to 169.254.169.254, elapsed time 25ms
...
Conclusion:
DefaultAzureCredential uses a 1 s timeout in the source code, and according to the test, TcpClient can take about 3 s to connect to MSI. In that case, DefaultAzureCredential treats MSI as unavailable and cannot get an access token.
azure-sdk-for-net/sdk/identity/Azure.Identity/src/ImdsManagedIdentitySource.cs
Line 20 in 2a0a498
Result:
PS C:\run> 0..100 | ForEach-Object {.\MSITest.exe}
Download file to rawlog-461b232a-1363-46d3-861d-e4037044ae04.txt
Download file to rawlog-e9922d48-1866-4bba-a0b4-1ae6c64a9aae.txt
...
Download file to rawlog-e75064ab-3cb6-49d6-b966-d6c30f4191e5.txt
Get CredentialUnavailableException and re-create a new DefaultAzureCredential
Get CredentialUnavailableException and re-create a new DefaultAzureCredential
Get CredentialUnavailableException and re-create a new DefaultAzureCredential
Get CredentialUnavailableException and re-create a new DefaultAzureCredential
Get CredentialUnavailableException and re-create a new DefaultAzureCredential
Unhandled exception. Azure.Identity.CredentialUnavailableException: DefaultAzureCredential failed to retrieve a token from the included credentials.
But this doesn't work, because DefaultAzureCredential caches the TCP connection result: it always uses the first connection result and doesn't retry connecting to MSI.
azure-sdk-for-net/sdk/identity/Azure.Identity/src/ManagedIdentityClient.cs
Line 15 in 2a0a498
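For reference, the retry pattern whose output appears in the log above can be sketched roughly as follows. This is a reconstruction under assumptions, not the reporter's actual code: the class and method names, the storage scope, and the attempt limit are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Azure.Core;
using Azure.Identity;

public static class TokenRetrySketch
{
    public static async Task<AccessToken> GetTokenWithRetryAsync(int maxAttempts)
    {
        // Assumed scope for Azure Blob storage access.
        var context = new TokenRequestContext(new[] { "https://storage.azure.com/.default" });

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                // A fresh credential instance is created on every attempt.
                var credential = new DefaultAzureCredential();
                return await credential.GetTokenAsync(context, CancellationToken.None);
            }
            catch (CredentialUnavailableException) when (attempt < maxAttempts)
            {
                // On the last attempt the filter is false, so the exception propagates.
                Console.WriteLine("Get CredentialUnavailableException and re-create a new DefaultAzureCredential");
            }
        }
    }
}
```

According to the report, every retry failed the same way at commit 2a0a498, which is what led to the conclusion that the probe result is shared across credential instances.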