Feature suggestion: Retry mechanism #2

doggy8088 · 2024-03-20T17:02:17Z

It because Google AI Gemini API is error prone. Every GenAI app should implement retry mechanism. Would you consider implement retry mechanism as a built-in feature in your library?

generative-ai/src/Mscc.GenerativeAI/GenerativeModel.cs

Lines 509 to 517 in 8c33634

    
           public async Task<GenerateContentResponse> GenerateContent(GenerateContentRequest? request) 
        
           { 
        
               if (request == null) throw new ArgumentNullException(nameof(request)); 
        
               var url = ParseUrl(Url, Method); 
        
               string json = Serialize(request); 
        
               var payload = new StringContent(json, Encoding.UTF8, MediaType); 
        
               var response = await Client.PostAsync(url, payload); 
        
               response.EnsureSuccessStatusCode();

jochenkirstaetter · 2024-03-21T09:03:56Z

Hi @doggy8088
Thanks for the suggestion. Yes, I was thinking about a retry mechanism already and considering something like Polly. However, right now I'd like to keep dependencies on other packages as small as possible and this package supports .NET Framework 4.7.2, .NET Standard 2.0, and .NET 6 or higher.

Do you have suggestions on how to implement a "native" retry mechanism?

doggy8088 · 2024-03-21T09:35:16Z

I think a built-in caching mechanism would be more convenient. It could be a optional argument to enable/disable this feature. Or provide a standalone method to do that. It could be a helper method or static method or extension method.

For the retry mechanism, this should be abstract to the developer. The method only throw exception when it reached max retry limit.

jochenkirstaetter · 2024-03-21T10:54:48Z

I'm going to look into that and open a feature branch.
The retry pattern for asynchronous calls in C# seems to be an interesting approach to solve this.

In case that you have code snippets or solid code samples, please feel free to post. That would be very much appreciated.

doggy8088 · 2024-03-21T14:23:14Z

Here is my code snippet for retry mechanism:

public async Task<string> Run(double temperature = -1, int retries = 5, string[] stopSequences = null)
{
    string endpoint = $"https://generativelanguage.googleapis.com/v1beta/models/gemini-1.0-pro-001:generateContent?key={this.ApiKey}";

    var request = new GeminiRequest()
    {
        Contents = this.Contents.ToArray(),
        GenerationConfig = this.DefaultGenerationConfig,
        SafetySettings = this.DefaultSafetySettings
    };

    if (temperature >= 0)
    {
        request.GenerationConfig.Temperature = temperature;
    }

    if (stopSequences is not null)
    {
        request.GenerationConfig.StopSequences = stopSequences.ToList();
    }

    var json = request.ToJson();

    using (HttpClient client = new HttpClient())
    {
        var content = new StringContent(json, Encoding.UTF8, "application/json");
        var message = await client.PostAsync(endpoint, content);

        responseContent = await message.Content.ReadAsStringAsync();
        this.LogResponseJson(responseContent, logId: logId);

        if (!message.IsSuccessStatusCode)
        {
            if (retries <= 0)
            {
                throw new Exception($"Gemini Error: {message.StatusCode}\n\n{message.Content.ReadAsStringAsync().Result}");
            }

            Thread.Sleep(5000);

            return await Run(temperature, retries - 1, stopSequences);
        }
    }

    GeminiResponse response = GeminiResponse.FromJson(responseContent);

    var firstCandidate = response.Candidates.First();

    bool status = false;
    switch (firstCandidate.FinishReason)
    {
        case FinishReason.SAFETY:
            // Util.HorizontalRun(true, request, response).Dump();
            break;
        case FinishReason.MAX_TOKENS:
            // Util.HorizontalRun(true, request, response).Dump();
            break;
        case FinishReason.RECITATION:
            // Util.HorizontalRun(true, request, response).Dump();
            break;
        case FinishReason.OTHER:
            // If you see BlockedReason.OTHER, the query or response may violate the terms of service or be otherwise unsupported.
            // https://ai.google.dev/docs/troubleshooting#safety_issues
            // Util.HorizontalRun(true, request, response).Dump();
            break;
        case FinishReason.UNSPECIFIED:
            // Util.HorizontalRun(true, request, response).Dump();
            break;
        case FinishReason.STOP:
            status = true;
            break;
    }

    if (!status)
    {
        if (retries <= 0)
        {
            throw new FinishReasonException($"Error: {firstCandidate.FinishReason}");
        }

        Thread.Sleep(5000);

        return await Run(temperature, retries - 1);
    }

    return firstCandidate.Content.Parts.First().Text;
}

jochenkirstaetter · 2024-03-25T10:31:49Z

Dear @doggy8088

Thanks for the code snippet. I'm trying to get my head around several things here.

When shall a retry be conducted? Depending on the returned HTTP status code there are not many options available or making sense to attempt a retry. The main reasons would be an HttpIoException or timeout related issues. As for any HTTP 4xx or HTTP 5xx I don't quite see the justification for a retry though. Perhaps an HTTP 429 could be of interest.
Do you see other reasons?
The handling of FinishReason seems to be contrary to the intended purpose (while reading the Python sources). However, I agree that the current implementation needs improvement in that area.
Your approach seems to work for non-streaming requests only. Or am I missing something?

As mentioned before, I like the idea to add the retry design pattern to the package. I'm currently a bit undecided about the actual how and the implementation.

Again, thanks for your contribution.

PS: The way how HttpClient is used can cause trouble regarding port exhaustion on the client. There are safer ways.

doggy8088 · 2024-03-25T10:34:50Z

In my personal experience, almost every circumstance needs retry. There are so many false positives error returned from Google AI Gemini API. The FinishReason could lie. In my current implementation, I retry everything when I get non-200 response.

I never tried streaming till now.

jochenkirstaetter · 2024-03-25T13:14:37Z

I see.
In case of an HTTP 400 Bad Request or an HTTP 404 Not Found, I wouldn't go for a retry. Same in case of an HTTP 500 Server error or similar.

I get your reasoning, and I'm giving it a bit more thought. In general, I like the retry design pattern, and yes I'd like to implement it as well into the package. Bear with me a little.

doggy8088 · 2024-03-25T18:35:34Z

I agreed with you about the HTTP 400 and 404. But for HTTP 500, that definitely need implement retry with backoff mechanism.

jochenkirstaetter added enhancement New feature or request help wanted Extra attention is needed labels Mar 21, 2024

jochenkirstaetter mentioned this issue Mar 26, 2024

The original usage seems broken: new GenerativeModel(...) #5

Closed

jochenkirstaetter mentioned this issue Jan 14, 2025

InvalidOperationException when setting Timeout #57

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature suggestion: Retry mechanism #2

Feature suggestion: Retry mechanism #2

doggy8088 commented Mar 20, 2024

jochenkirstaetter commented Mar 21, 2024

doggy8088 commented Mar 21, 2024

jochenkirstaetter commented Mar 21, 2024

doggy8088 commented Mar 21, 2024

jochenkirstaetter commented Mar 25, 2024 •

edited

Loading

doggy8088 commented Mar 25, 2024 •

edited

Loading

jochenkirstaetter commented Mar 25, 2024

doggy8088 commented Mar 25, 2024

Feature suggestion: Retry mechanism #2

Feature suggestion: Retry mechanism #2

Comments

doggy8088 commented Mar 20, 2024

jochenkirstaetter commented Mar 21, 2024

doggy8088 commented Mar 21, 2024

jochenkirstaetter commented Mar 21, 2024

doggy8088 commented Mar 21, 2024

jochenkirstaetter commented Mar 25, 2024 • edited Loading

doggy8088 commented Mar 25, 2024 • edited Loading

jochenkirstaetter commented Mar 25, 2024

doggy8088 commented Mar 25, 2024

jochenkirstaetter commented Mar 25, 2024 •

edited

Loading

doggy8088 commented Mar 25, 2024 •

edited

Loading