ClientModel: ClientRetryPolicy should consider retry-after header #44222
Comments
Thank you for your feedback. Tagging and routing to the team member best able to assist.
Sorry to bump this old issue - I've been using Azure OpenAI, and I notice that it uses SCM's ClientRetryPolicy.
Hi @Pilchie - a fix for this issue was released in SCM 1.1.0-beta.6. If you reference this version, do you still see the issue?
Hi @annelo-msft - thanks for looking at this, and sorry for the delay. Today I tried pinning SCM 1.1.0-beta.7 to see what happened. Unfortunately, with the default retry policy, this doesn't actually help. Looking at the code for
Nevermind - I am getting an I think there might still be an issue with
Hi @Pilchie --
This is interesting, and if there's a bug in SCM, I'd love to know about it so we can get it fixed! Would you be willing to share a quick repro case for this?
Interesting - I'm not sure how the delay got set to 1 day. Maybe both headers were sent. As for the repro, I was using the code below, with an Azure OpenAI endpoint set to the minimum rate limit:

```csharp
using Azure.AI.OpenAI;
using Azure.Identity;
using OpenAI.Embeddings;
using System.ClientModel;
using System.ClientModel.Primitives;

const string endpointEnvironmentVariableName = "AZURE_OPENAI_ENDPOINT";
const string apiKeyEnvironmentVariableName = "AZURE_OPENAI_APIKEY";

var endpoint = Environment.GetEnvironmentVariable(endpointEnvironmentVariableName) ?? throw new Exception($"Set '{endpointEnvironmentVariableName}'.");
var apiKey = Environment.GetEnvironmentVariable(apiKeyEnvironmentVariableName) ?? throw new Exception($"Set '{apiKeyEnvironmentVariableName}'.");

var client = new AzureOpenAIClient(
    new Uri(endpoint),
    new ApiKeyCredential(apiKey),
    new AzureOpenAIClientOptions
    {
        RetryPolicy = new ClientRetryPolicy(),
    });

var embeddingClient = client.GetEmbeddingClient("text-embedding-3-large");

const int count = 100;
var inputs = new List<string>(count);
for (int i = 0; i < count; i++)
{
    inputs.Add($"This is a string to embed. It is number {i}.");
}

var result = embeddingClient.GenerateEmbeddings(inputs);
Console.WriteLine(result.GetRawResponse().Status);
```

Note that I was unable to get a breakpoint to hit inside the retry method, or to step into it, so it's possible something else was happening - I could only inspect the state once it was in the 1-day delay.
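As an aside for anyone reproducing this, one way to confirm what the service actually sent is to look at the Retry-After header on the raw response when the call ultimately fails. Below is a minimal sketch building on the repro above; it relies on the public System.ClientModel surface (ClientResultException.GetRawResponse, PipelineResponseHeaders.TryGetValue), and whether Azure OpenAI also sends a retry-after-ms header is not confirmed here.

```csharp
// Addition to the repro above (add `using System.ClientModel;` to the usings):
// when the call ultimately fails, inspect the raw response to see what the
// service sent in its Retry-After header.
try
{
    var result = embeddingClient.GenerateEmbeddings(inputs);
    Console.WriteLine(result.GetRawResponse().Status);
}
catch (ClientResultException ex)
{
    PipelineResponse? response = ex.GetRawResponse();
    if (response is not null &&
        response.Headers.TryGetValue("Retry-After", out string? retryAfter))
    {
        Console.WriteLine($"Status {response.Status}, Retry-After: {retryAfter}");
    }
}
```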
Confirmed that
Hello! Thank you for sharing the details of your use case. I was able to reproduce the case you are seeing using the sample you shared. It sounds like you are seeing, and I also see, that Azure OpenAI is returning a retry-after value of one day.
And actually, since this appears to be the correct behavior that this issue was tracking, I am going to close this issue, but @Pilchie, feel free to open another issue in the repo if you believe there is a bug in the ClientRetryPolicy, or to request details on how to customize the retry policy in your application. Thanks!
I think it's technically correct, but practically infeasible. Having an application unable to make a request for 24 hours as soon as it nudges over the quota isn't workable. I get that the issue is actually with Azure OpenAI returning an impractical value in the retry-after header.
@Pilchie, I see your point about hanging an application that gets a retry-after value like this.
I agree that this doesn't put you in a good position - it would be nicer if you could rely on services to provide useful values for retry-after.
Yes, I was thinking about this as well. We have MaxDelay in Azure.Core, but the implementation will use the delay from a service retry-after header even if it is larger than the user-specified value of MaxDelay. That makes me think it would be better to provide a different option for specifying a maximum retry-after value, or some timeout threshold for retry delays. The question still stands, though -- if this threshold is exceeded, how does the client communicate that to the caller? The contract for the method you're calling in your sample,
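For illustration, a capped-delay policy along the lines discussed here might look like the sketch below. This is a hypothetical subclass, not an existing SCM type: the CappedDelayRetryPolicy name is invented, and it assumes ClientRetryPolicy exposes a protected virtual GetNextDelay(PipelineMessage, int) hook.

```csharp
using System;
using System.ClientModel.Primitives;

// Hypothetical policy illustrating a "maximum retry delay" option: whatever
// delay the base policy computes (including one derived from a service
// Retry-After header) is clamped to a caller-chosen ceiling.
public class CappedDelayRetryPolicy : ClientRetryPolicy
{
    private readonly TimeSpan _maxDelay;

    public CappedDelayRetryPolicy(TimeSpan maxDelay) => _maxDelay = maxDelay;

    protected override TimeSpan GetNextDelay(PipelineMessage message, int tryCount)
    {
        TimeSpan delay = base.GetNextDelay(message, tryCount);
        return delay > _maxDelay ? _maxDelay : delay;
    }
}
```

Plugging it in would then be a matter of setting RetryPolicy = new CappedDelayRetryPolicy(TimeSpan.FromSeconds(30)) in the client options, as in the repro above; it does not answer the open question of how the client should signal that the cap was hit.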
Well, my experience with Azure OpenAI is that even if they return a retry-after value like that, the service will actually accept requests again well before that wait has elapsed.
Ah, I see. I did not realize that you are looking to retry the request before the service's recommended wait time has completed. I don't think SCM types should make this easy for users. Services typically set the retry-after header for a reason.

It sounds like there are a couple things happening here, which would need to be addressed independently:

1. Azure OpenAI is returning an impractical retry-after value - that is a service issue to report.
2. Retrying before the recommended wait time has completed is possible today by passing a cancellation token.
Thanks!
I understand why you don't want to make it easy, but practically, what do you expect users to do in the case of buggy services like this, where they clearly will respond well before the RetryAfter they send? I expect that as soon as this ships and is taken up by customers, many of them will want the same. It's good that it's possible via cancellation token, and we'll put that in place, but I expect you'll want to document it as well. Do you have any pointers on how/where to report an issue with the service for item 1?
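For reference, the cancellation-token approach mentioned above could look roughly like this. It assumes a GenerateEmbeddings overload that accepts a CancellationToken (the parameter name is a guess) and reuses embeddingClient and inputs from the earlier repro.

```csharp
using System;
using System.Threading;

// Sketch: bound the total time the call may spend waiting (including any
// Retry-After-driven delays) by passing a cancellation token with a timeout.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(90));

try
{
    var result = embeddingClient.GenerateEmbeddings(inputs, cancellationToken: cts.Token);
    Console.WriteLine(result.GetRawResponse().Status);
}
catch (OperationCanceledException)
{
    // Cancellation during a retry delay (or an in-flight attempt) surfaces here.
    Console.WriteLine("Gave up after 90 seconds instead of waiting out the Retry-After.");
}
```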
Great!
I sent you some offline. Thanks for your engagement on this, and thanks for helping us make our Azure SDK clients great! |
Original issue description: Azure.Core's retry policy uses the default DelayStrategy implementation to check and wait to retry based on the response RetryAfter header. Since we don't have DelayStrategy in SCM, we did not implement this logic in ClientRetryPolicy. We should implement this, though, and may be able to address it as part of the addition of LRO abstractions to SCM in #42114. We could have an internal type that implements exponential backoff and respects the RetryAfter header by default, and make it overridable if someone specifies a fixed polling interval. Subtypes of the retry policy or LRO implementations from TypeSpec could then override this behavior if needed for their service-specific implementations.
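To make the proposed behavior concrete, here is a rough sketch of the kind of delay computation the issue description talks about - exponential backoff by default, overridden by a service retry-after-ms or Retry-After header when present. It is illustrative only, not the actual SCM implementation; the RetryDelayHelper type and baseDelay parameter are invented for the example.

```csharp
using System;
using System.ClientModel.Primitives;
using System.Globalization;

// Illustrative only - not the actual SCM implementation.
internal static class RetryDelayHelper
{
    public static TimeSpan GetNextDelay(PipelineResponse? response, int tryCount, TimeSpan baseDelay)
    {
        // Respect the service's hint if it sent one.
        if (response is not null)
        {
            if (response.Headers.TryGetValue("retry-after-ms", out string? ms) &&
                double.TryParse(ms, NumberStyles.Any, CultureInfo.InvariantCulture, out double milliseconds))
            {
                return TimeSpan.FromMilliseconds(milliseconds);
            }

            // Note: Retry-After may also be an HTTP date; a real implementation
            // would handle that form too. Only the seconds form is handled here.
            if (response.Headers.TryGetValue("Retry-After", out string? seconds) &&
                double.TryParse(seconds, NumberStyles.Any, CultureInfo.InvariantCulture, out double secs))
            {
                return TimeSpan.FromSeconds(secs);
            }
        }

        // Otherwise fall back to exponential backoff: baseDelay * 2^(tryCount - 1).
        return TimeSpan.FromMilliseconds(baseDelay.TotalMilliseconds * Math.Pow(2, tryCount - 1));
    }
}
```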