"String to long" when calling AskAsync. #793
-
I should mention that SearchAsync returns results from the index without issue.
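For clarity, this is roughly how I'm calling the two APIs (the query text and minRelevance value are just placeholders). SearchAsync only retrieves matching partitions, while AskAsync also sends the retrieved text to the chat model, which seems to be where the failure happens:

```csharp
// Retrieval only (no chat-model call) -- this works fine.
var search = await memory.SearchAsync("my question", minRelevance: 0.0);
Console.WriteLine($"Matches: {search.Results.Count}");

// Retrieval plus a chat-model call that includes the retrieved text -- this is what fails.
var answer = await memory.AskAsync("my question", minRelevance: 0.0);
Console.WriteLine(answer.Result);
```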
-
The chat model is GPT-4o, with MaxTokenTotal set to 128000, and I am also setting the textTokenizer to GPT4Tokenizer. The embedding model is text-embedding-ada-002. If I drop the chat MaxTokenTotal to something like 50000, it works fine. GPT-4o is listed with a maximum input of 124000 tokens, so my initial number was off, but even with MaxTokenTotal set to only 100000 I still get the error.
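For reference, a minimal sketch of the relevant wiring (the endpoint, auth type, and deployment names here are placeholders rather than the exact values from my app):

```csharp
// Chat model: GPT-4o with GPT4Tokenizer; lowering MaxTokenTotal to ~50000 avoids the error.
var chatConfig = new AzureOpenAIConfig
{
    Endpoint = "https://<resource>.openai.azure.com/",
    Deployment = "gpt-4o",
    Auth = AzureOpenAIConfig.AuthTypes.AzureIdentity,
    MaxTokenTotal = 128000
};

// Embedding model: text-embedding-ada-002.
var embeddingConfig = new AzureOpenAIConfig
{
    Endpoint = "https://<resource>.openai.azure.com/",
    Deployment = "text-embedding-ada-002",
    Auth = AzureOpenAIConfig.AuthTypes.AzureIdentity,
    MaxTokenTotal = 8191
};

var memory = new KernelMemoryBuilder()
    .WithAzureOpenAITextGeneration(chatConfig, new GPT4Tokenizer())
    .WithAzureOpenAITextEmbeddingGeneration(embeddingConfig, new GPT4Tokenizer())
    .Build<MemoryServerless>();
```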
-
Could you share some more details:
-
That's similar to one of my setups, and I haven't seen this issue before. I will try to reproduce it. If you could share how you're using the builder, that would help me reproduce with exactly the same settings.
-
I tried to reproduce using a large file, but the code works. There might be something special about the model you are using, the content, or some settings. Here's my code:

```csharp
var azureOpenAITextConfig = new AzureOpenAIConfig
{
    Endpoint = "https://....openai.azure.com/",
    Deployment = "gpt-4o",
    Auth = AzureOpenAIConfig.AuthTypes.AzureIdentity,
    MaxTokenTotal = 128000,
    MaxRetries = 1
};

var azureOpenAIEmbeddingConfig = new AzureOpenAIConfig
{
    Endpoint = "https://....openai.azure.com/",
    Deployment = "text-embedding-ada-002",
    Auth = AzureOpenAIConfig.AuthTypes.AzureIdentity,
    MaxTokenTotal = 8191,
    MaxEmbeddingBatchSize = 30,
    MaxRetries = 1,
};

var searchClientConfig = new SearchClientConfig
{
    MaxAskPromptSize = -1,
    MaxMatchesCount = 2000,
    AnswerTokens = 4096
};

var builder = new KernelMemoryBuilder()
    .Configure(b => b.Services.AddLogging(l =>
    {
        l.SetMinimumLevel(LogLevel.Trace);
        l.AddSimpleConsole(c =>
        {
            c.SingleLine = true;
            c.UseUtcTimestamp = false;
            c.TimestampFormat = "[HH:mm:ss.fff] ";
        });
    }))
    .WithSimpleVectorDb(SimpleVectorDbConfig.Persistent)
    .WithSimpleFileStorage(SimpleFileStorageConfig.Persistent)
    .WithAzureOpenAITextGeneration(azureOpenAITextConfig, new GPT4oTokenizer())
    .WithAzureOpenAITextEmbeddingGeneration(azureOpenAIEmbeddingConfig, new GPT4Tokenizer())
    .WithSearchClientConfig(searchClientConfig);

var memory = builder.Build<MemoryServerless>();

const string Id = "g10800";
const string URL = "https://www.gutenberg.org/cache/epub/10800/pg10800-images.html";

if (!await memory.IsDocumentReadyAsync(Id))
{
    Console.WriteLine($"Uploading {URL}");
    await memory.ImportWebPageAsync(URL, documentId: Id);
}
else
{
    Console.WriteLine($"{URL} already uploaded.");
}

Console.WriteLine("\n====================================\n");

var question = "explain";
Console.WriteLine($"Question: {question}");

var answer = await memory.AskAsync(question, minRelevance: 0.0);
Console.WriteLine($"\nAnswer: {answer.Result}");
```
-
I created an index of web pages in KernelMemory by using ImportWebPageAsync. All of the web pages are stored in a single index. When I try to use AskAsync I am getting the following error:
```
Invalid 'messages[0].content': string too long. Expected a string with maximum length 1048576,
but got a string with length 1315928 instead.
Status: 400 (model_error)
ErrorCode: string_above_max_length
Content: {
  "error": {
    "message": "Invalid 'messages[0].content': string too long. Expected a string with maximum length 1048576, but got a string with length 1315928 instead.",
    "type": "invalid_request_error",
    "param": "messages[0].content",
    "code": "string_above_max_length"
  }
}
```
The actual question being sent to AskAsync is a short sentence.
It feels like, under the covers, AskAsync is doing something that generates follow-up AI messages exceeding the model's token limit, but maybe I am wrong. Any ideas on how to dig into this more? Since the error is thrown by AskAsync, I am not sure where to go with this.
I have AskAsync working on a different index for uploaded documents and a small number of web pages. For this specific index, I have added over 50 web pages.
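Since the error complains about characters (1,315,928 versus a 1,048,576 cap) rather than tokens, one way I'm thinking of digging in (a sketch; it assumes the same GPT4Tokenizer the chat model is configured with, and uses SearchAsync to approximate what AskAsync would retrieve) is to measure the size of the retrieved text directly:

```csharp
// Approximate the amount of text retrieval would hand to the chat model.
var question = "the short question I normally pass to AskAsync"; // placeholder
var tokenizer = new GPT4Tokenizer();

var search = await memory.SearchAsync(question, minRelevance: 0.0);
var facts = string.Join("\n", search.Results.SelectMany(c => c.Partitions).Select(p => p.Text));

Console.WriteLine($"Matches:    {search.Results.Count}");
Console.WriteLine($"Characters: {facts.Length}");                 // the 400 error fires above 1,048,576
Console.WriteLine($"Tokens:     {tokenizer.CountTokens(facts)}"); // compare against MaxTokenTotal
```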