ClientModel: ClientRetryPolicy should consider retry-after header #44222

Closed
annelo-msft opened this issue May 23, 2024 · 18 comments · Fixed by #45078

Comments

@annelo-msft (Member)

Azure.Core's retry policy uses the default DelayStrategy implementation to determine how long to wait before retrying, based on the response's Retry-After header. Since we don't have DelayStrategy in SCM, we did not implement this logic in ClientRetryPolicy. We should implement it, though, and we may be able to address this issue as part of adding LRO abstractions to SCM in #42114. We could have an internal type that implements exponential backoff and respects the Retry-After header by default, and make it overridable if someone specifies a fixed polling interval. Subtypes of the retry policy, or LRO implementations from TypeSpec, could then override this behavior if needed for their service-specific implementations.
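A minimal sketch of the kind of delay logic described above -- respect Retry-After when the service sends it, otherwise fall back to exponential backoff. The helper name and base delay are illustrative, not SCM API, and only the delta-seconds form of Retry-After is handled (the HTTP-date form is ignored here):

using System;
using System.ClientModel.Primitives;
using System.Globalization;

// Illustrative only: honor the service's Retry-After (delta-seconds form) when
// present, otherwise fall back to exponential backoff from an assumed base delay.
internal static class RetryDelaySketch
{
    private static readonly TimeSpan BaseDelay = TimeSpan.FromSeconds(0.8); // assumed base delay

    public static TimeSpan GetRetryDelay(PipelineResponse? response, int tryCount)
    {
        if (response is not null &&
            response.Headers.TryGetValue("Retry-After", out string? value) &&
            double.TryParse(value, NumberStyles.Float, CultureInfo.InvariantCulture, out double seconds))
        {
            return TimeSpan.FromSeconds(seconds);
        }

        // Exponential backoff: BaseDelay * 2^(tryCount - 1), e.g. 0.8s, 1.6s, 3.2s, ...
        return TimeSpan.FromMilliseconds(BaseDelay.TotalMilliseconds * Math.Pow(2, Math.Max(0, tryCount - 1)));
    }
}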

@annelo-msft annelo-msft added Client This issue points to a problem in the data-plane of the library. Azure.Core System.ClientModel Base Core library labels May 23, 2024
@annelo-msft annelo-msft added this to the 2024-06 milestone May 23, 2024
@github-actions (bot)

Thank you for your feedback. Tagging and routing to the team member best able to assist.

@github-actions github-actions bot added the needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team label May 23, 2024
@annelo-msft annelo-msft removed the needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team label May 23, 2024
@jsquire jsquire modified the milestones: 2024-06, Backlog Jun 19, 2024
@annelo-msft annelo-msft modified the milestones: Backlog, 2024-08 Jul 18, 2024
@Pilchie (Member) commented Jul 29, 2024

Sorry to bump this old issue - I've been using Azure OpenAI, and I notice that it uses SCM's ClientPipelinePolicy, but Azure OpenAI's rate limiting does return a Retry-After header. Any chance you know of an implementation of an SCM retry policy that does take this into account?

@annelo-msft (Member, Author)

Hi @Pilchie - a fix for this issue was released in SCM 1.1.0-beta.6. If you reference this version, do you still see the issue?

@Pilchie (Member) commented Sep 10, 2024

Hi @annelo-msft - thanks for looking at this, and sorry for the delay. Today I tried pinning SCM 1.1.0-beta.7 to see what happened. Unfortunately, with the default retry policy, this doesn't actually help. Looking at the code for GetNextDelay, I think the issue is that tryCount is 0, so nextDelayMilliseconds ends up being 1 day, and retryAfter.TotalMilliseconds is smaller than nextDelayMilliseconds.

@Pilchie Pilchie reopened this Sep 10, 2024
@Pilchie (Member) commented Sep 10, 2024

Never mind - I am getting an x-ms-retry-after of 1 day, so that seems OK.

I think there might still be an issue with message.RetryCount being zero leading to a negative time to wait, which means there will be an immediate retry the first time, but that's not as big an issue to me.

@annelo-msft (Member, Author)

Hi @Pilchie -- x-ms-retry-after is an Azure-specific header, so it is not supported by SCM by default; SCM provides general HTTP support but not Azure-specific features. For Azure.AI.OpenAI, @trrwilson was adding capabilities to that client to support the Azure-specific features -- I don't know the current status of that work, i.e. whether that PR has been merged and which client version includes it. @trrwilson, could you share that info?

I think there might still be an issue with message.RetryCount being zero leading to a negative time to wait, which means there will be an immediate retry the first time, but that's not as big an issue to me.

This is interesting, and if there's a bug in SCM, I'd love to know about it so we can get it fixed! Would you be willing to share a quick repro case for this?

@Pilchie (Member) commented Sep 11, 2024

Interesting - I'm not sure how the delay got set to 1 day. Maybe both headers were sent. As for the repro, I was using the code below, with an Azure OpenAI endpoint set to the minimum rate limit:

using Azure.AI.OpenAI;
using OpenAI.Embeddings;
using System.ClientModel;
using System.ClientModel.Primitives;

const string endpointEnvironmentVariableName = "AZURE_OPENAI_ENDPOINT";
const string apiKeyEnvironmentVariableName = "AZURE_OPENAI_APIKEY";

var endpoint = Environment.GetEnvironmentVariable(endpointEnvironmentVariableName) ?? throw new Exception($"Set '{endpointEnvironmentVariableName}'.");
var apiKey = Environment.GetEnvironmentVariable(apiKeyEnvironmentVariableName) ?? throw new Exception($"Set '{apiKeyEnvironmentVariableName}'.");

// Use the SCM ClientRetryPolicy explicitly so its Retry-After handling is in play.
var client = new AzureOpenAIClient(
    new Uri(endpoint),
    new ApiKeyCredential(apiKey),
    new AzureOpenAIClientOptions
    {
        RetryPolicy = new ClientRetryPolicy(),
    });

var embeddingClient = client.GetEmbeddingClient("text-embedding-3-large");

const int count = 100;
var inputs = new List<string>(count);
for (int i = 0; i < count; i++)
{
    inputs.Add($"This is a string to embed. It is number {i}.");
}
var result = embeddingClient.GenerateEmbeddings(inputs);

Console.WriteLine(result.GetRawResponse().Status);

Note that I was unable to get a breakpoint to hit inside the retry method, or to step into it, so it's possible something else was happening - I could only inspect the state once it was in the 1-day Wait call.

@Pilchie (Member) commented Sep 11, 2024

Confirmed that Retry-After was the header that was sent (in addition to x-ratelimit-reset-tokens and x-ratelimit-remaining-requests).

@annelo-msft (Member, Author)

Hello! Thank you for sharing the details of your use case. I was able to reproduce what you are seeing using the sample you shared: Azure OpenAI is returning a Retry-After header with a value of 86400. This will indeed cause the default ClientRetryPolicy to wait 86400 seconds before retrying the request, and I believe that is the correct behavior according to https://www.rfc-editor.org/rfc/rfc7231#section-7.1.3. If you believe the behavior is incorrect, would you be able to share the specific behavior you expect to see in this case? And/or, if this behavior is problematic for your application, I'm happy to share approaches to customizing the System.ClientModel retry policy to better suit your needs -- let me know if that would be helpful. Thanks!
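For anyone wanting to confirm what the service sends in this scenario, one approach is to turn retries off so the 429 surfaces immediately and then read the header off the raw response. This is a rough sketch only: it assumes ClientRetryPolicy has a max-retry-count constructor and that the throttled call throws ClientResultException -- check both against your package versions. (86400 seconds is 24 hours, which is why the default policy waits a full day.)

// Disable retries so the throttled response surfaces, then inspect Retry-After.
var noRetryClient = new AzureOpenAIClient(
    new Uri(endpoint),
    new ApiKeyCredential(apiKey),
    new AzureOpenAIClientOptions { RetryPolicy = new ClientRetryPolicy(maxRetries: 0) });

try
{
    noRetryClient.GetEmbeddingClient("text-embedding-3-large").GenerateEmbeddings(inputs);
}
catch (ClientResultException ex) when (ex.Status == 429)
{
    if (ex.GetRawResponse()?.Headers.TryGetValue("Retry-After", out string? retryAfter) == true)
    {
        Console.WriteLine($"Retry-After: {retryAfter}"); // e.g. "86400" (seconds)
    }
}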

@annelo-msft (Member, Author) commented Oct 11, 2024

And actually, since this appears to be the correct behavior that this issue was tracking, I am going to close this issue. @Pilchie, feel free to open another issue in the repo if you believe there is a bug in ClientRetryPolicy, or to request details on how to customize the retry policy in your application. Thanks!

@Pilchie (Member) commented Oct 12, 2024

I think it's technically correct, but practically infeasible. Having an application unable to make a request for 24 hours as soon as it edges over the quota isn't workable.

I get that the issue is really with Azure OpenAI returning an impractical value in Retry-After, but I think if you ship a version of the Azure OpenAI SDK that honors this value, you will effectively break many applications that are working today.

@annelo-msft (Member, Author)

@Pilchie, I see your point about hanging an application that gets a Retry-After value of 24 hours from a service. I agree that clients should somehow be robust to services returning very long Retry-After values. What would you expect the client behavior to be in this case?

The model in these clients is typically a simple request/response paradigm -- i.e. a service method does not return until the service has returned a response that the client can deserialize into a valid return type. It feels like the only option we have to notify the caller that they must either suspend the client thread for 24 hours or try a different approach is to throw an exception. If we threw, we could notify callers that some threshold had been exceeded, or would be exceeded if they decided to wait. This would be a breaking change if we introduced the threshold without an explicit opt-in, so I believe we would want to let the caller specify the threshold somehow as a way of opting in to that behavior. Would that be consistent with your expectations for client behavior? Or is there a different behavior you would expect to see in such a case?

@Pilchie (Member) commented Oct 14, 2024

I agree that this doesn't put you in a good position - it would be nicer if you could rely on services to provide useful values for Retry-After. I also don't think that throwing is a great developer experience. Perhaps an alternative would be to allow the user to optionally specify a maximum time to wait before retrying, or a maximum Retry-After value to consider valid?

@annelo-msft (Member, Author) commented Oct 14, 2024

Yes, I was thinking about this as well. We have MaxDelay in Azure.Core, but that implementation will use the delay from a service's Retry-After header even if it is larger than the user-specified MaxDelay, which makes me think it would be better to provide a separate option for specifying a maximum Retry-After value, or some timeout threshold for retry delays.

The question still stands, though -- if this threshold is exceeded, how does the client communicate that to the caller? The contract for the method you're calling in your sample, GenerateEmbeddings, is that it returns a valid ClientResult<OpenAIEmbedding> when it completes. If the service hasn't sent the response content to deserialize by the time the threshold is exceeded, I think the only thing the client can do is throw an exception.
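As an illustration of the kind of cap being discussed, a subclass of ClientRetryPolicy could clamp whatever delay the base policy computes (including a delay derived from Retry-After) to a caller-chosen maximum. This is a sketch only: it relies on the protected virtual GetNextDelay hook mentioned earlier in the thread, so verify the exact signature against your System.ClientModel version, and the type name is made up.

using System;
using System.ClientModel.Primitives;

// Sketch: cap all retry delays, including ones derived from Retry-After.
public class CappedDelayRetryPolicy : ClientRetryPolicy
{
    private readonly TimeSpan _maxDelay;

    public CappedDelayRetryPolicy(TimeSpan maxDelay) => _maxDelay = maxDelay;

    protected override TimeSpan GetNextDelay(PipelineMessage message, int tryCount)
    {
        TimeSpan delay = base.GetNextDelay(message, tryCount);
        return delay > _maxDelay ? _maxDelay : delay;
    }
}

// Usage: options.RetryPolicy = new CappedDelayRetryPolicy(TimeSpan.FromMinutes(1));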

@Pilchie (Member) commented Oct 14, 2024

Well, my experience with Azure OpenAI is that even if they return a Retry-After of 1 day, the call works if you retry after 1 minute. So I would expect that if a hypothetical new maximum were exceeded, you would retry anyway, even though the full day hasn't elapsed, and thus the app would continue to work because the next retry would be successful.

@annelo-msft (Member, Author) commented Oct 15, 2024

even if they return a Retry-After of 1 day, the call works if you retry after 1 minute

Ah, I see. I did not realize that you are looking to retry the request before the service's recommended wait time has completed.

I don't think SCM types should make this easy for users. Services typically set the Retry-After value for a reason, and retrying before that time is expected to fail. Moreover, retrying before that time could be harmful to the service and is not recommended.

It sounds like there are a couple things happening here, which would need to be addressed independently:

  1. If you believe the Azure OpenAI service is returning an incorrect Retry-After header value, please file an issue with the service team so they can address it. Let me know if you need help tracking down the right way to file such an issue.
  2. If you would like to cancel the service call before the service returns a success response, I recommend passing a CancellationToken to the service method (GenerateEmbeddings) and cancelling the operation as needed to prevent your app from hanging; a sketch of this follows below.
     - If you want to encode a different retry behavior in your client based on undocumented service behaviors, SCM-based clients will not prevent this -- you can always write a custom retry policy and replace the client's default by setting an instance of your policy on client options. Again, this is not recommended; it's just a note on what's possible.
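A minimal sketch of the CancellationToken approach from item 2, reusing embeddingClient and inputs from the earlier repro; it assumes GenerateEmbeddings exposes an overload accepting a CancellationToken, as suggested above:

// Bound how long the call (including any retry waits) is allowed to take.
using var cts = new CancellationTokenSource(TimeSpan.FromMinutes(2));

try
{
    var result = embeddingClient.GenerateEmbeddings(inputs, cancellationToken: cts.Token);
    Console.WriteLine(result.GetRawResponse().Status);
}
catch (OperationCanceledException)
{
    Console.WriteLine("Cancelled after waiting 2 minutes for the service.");
}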

Thanks!

@Pilchie (Member) commented Oct 15, 2024

I understand why you don't want to make it easy, but practically, what do you actually expect users to do in the case of buggy services like this, which clearly respond well before the Retry-After they send? I expect that as soon as this ships and is taken up by customers, many of them will want the same. It's good that it's possible via cancellation token, and we'll put that in place, but I expect you'll want to document it as well.

Do you have any pointers on how/where to report an issue with the service for 1?

@annelo-msft (Member, Author)

It's good that it's possible via cancellation token

Great!

Do you have any pointers on how/where to report an issue with the service for 1?

I sent you some offline. Thanks for your engagement on this, and thanks for helping us make our Azure SDK clients great!

@github-actions github-actions bot locked and limited conversation to collaborators Jan 14, 2025