"String to long" when calling AskAsync. #793
-
I should mention that SearchAsync returns results from the index without issue.
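For clarity, this is roughly how I'm calling the two APIs (the query text and minRelevance value are just placeholders). SearchAsync only retrieves matching partitions, while AskAsync also sends the retrieved text to the chat model, which seems to be where the failure happens:

```csharp
// Retrieval only (no chat-model call) -- this works fine.
var search = await memory.SearchAsync("my question", minRelevance: 0.0);
Console.WriteLine($"Matches: {search.Results.Count}");

// Retrieval plus a chat-model call that includes the retrieved text -- this is what fails.
var answer = await memory.AskAsync("my question", minRelevance: 0.0);
Console.WriteLine(answer.Result);
```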
-
The chat model is GPT-4o, with MaxTokenTotal set to 128000, and I am also setting the textTokenizer to GPT4Tokenizer. The embedding model is text-embedding-ada-002. If I drop the chat MaxTokenTotal to something like 50000, it works fine. GPT-4o is listed with a maximum input of 124000 tokens, so my initial number was off, but even with MaxTokenTotal set to only 100000 I still get the error.
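For reference, a minimal sketch of the relevant wiring (the endpoint, auth type, and deployment names here are placeholders rather than the exact values from my app):

```csharp
// Chat model: GPT-4o with GPT4Tokenizer; lowering MaxTokenTotal to ~50000 avoids the error.
var chatConfig = new AzureOpenAIConfig
{
    Endpoint = "https://<resource>.openai.azure.com/",
    Deployment = "gpt-4o",
    Auth = AzureOpenAIConfig.AuthTypes.AzureIdentity,
    MaxTokenTotal = 128000
};

// Embedding model: text-embedding-ada-002.
var embeddingConfig = new AzureOpenAIConfig
{
    Endpoint = "https://<resource>.openai.azure.com/",
    Deployment = "text-embedding-ada-002",
    Auth = AzureOpenAIConfig.AuthTypes.AzureIdentity,
    MaxTokenTotal = 8191
};

var memory = new KernelMemoryBuilder()
    .WithAzureOpenAITextGeneration(chatConfig, new GPT4Tokenizer())
    .WithAzureOpenAITextEmbeddingGeneration(embeddingConfig, new GPT4Tokenizer())
    .Build<MemoryServerless>();
```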
-
Could you share some more details:
-
That's similar to one of my setups, and I haven't seen this issue before. I will try to reproduce it. If you could share how you're using the builder, that would help me reproduce with exactly the same settings.
-
I tried to reproduce using a large file, but the code works. There might be something special about the model you are using, the content, or some settings. Here's my code:

```csharp
var azureOpenAITextConfig = new AzureOpenAIConfig
{
    Endpoint = "https://....openai.azure.com/",
    Deployment = "gpt-4o",
    Auth = AzureOpenAIConfig.AuthTypes.AzureIdentity,
    MaxTokenTotal = 128000,
    MaxRetries = 1
};

var azureOpenAIEmbeddingConfig = new AzureOpenAIConfig
{
    Endpoint = "https://....openai.azure.com/",
    Deployment = "text-embedding-ada-002",
    Auth = AzureOpenAIConfig.AuthTypes.AzureIdentity,
    MaxTokenTotal = 8191,
    MaxEmbeddingBatchSize = 30,
    MaxRetries = 1,
};

var searchClientConfig = new SearchClientConfig
{
    MaxAskPromptSize = -1,
    MaxMatchesCount = 2000,
    AnswerTokens = 4096
};

var builder = new KernelMemoryBuilder()
    .Configure(b => b.Services.AddLogging(l =>
    {
        l.SetMinimumLevel(LogLevel.Trace);
        l.AddSimpleConsole(c =>
        {
            c.SingleLine = true;
            c.UseUtcTimestamp = false;
            c.TimestampFormat = "[HH:mm:ss.fff] ";
        });
    }))
    .WithSimpleVectorDb(SimpleVectorDbConfig.Persistent)
    .WithSimpleFileStorage(SimpleFileStorageConfig.Persistent)
    .WithAzureOpenAITextGeneration(azureOpenAITextConfig, new GPT4oTokenizer())
    .WithAzureOpenAITextEmbeddingGeneration(azureOpenAIEmbeddingConfig, new GPT4Tokenizer())
    .WithSearchClientConfig(searchClientConfig);

var memory = builder.Build<MemoryServerless>();

const string Id = "g10800";
const string URL = "https://www.gutenberg.org/cache/epub/10800/pg10800-images.html";

if (!await memory.IsDocumentReadyAsync(Id))
{
    Console.WriteLine($"Uploading {URL}");
    await memory.ImportWebPageAsync(URL, documentId: Id);
}
else
{
    Console.WriteLine($"{URL} already uploaded.");
}

Console.WriteLine("\n====================================\n");

var question = "explain";
Console.WriteLine($"Question: {question}");

var answer = await memory.AskAsync(question, minRelevance: 0.0);
Console.WriteLine($"\nAnswer: {answer.Result}");
```
-
I created an index of web pages in KernelMemory by using ImportWebPageAsync. All of the web pages are stored in a single index. When I try to use AskAsync I am getting the following error:
```
Invalid 'messages[0].content': string too long. Expected a string with maximum length 1048576,
but got a string with length 1315928 instead.
Status: 400 (model_error)
ErrorCode: string_above_max_length
Content: {
  "error": {
    "message": "Invalid 'messages[0].content': string too long. Expected a string with maximum length 1048576, but got a string with length 1315928 instead.",
    "type": "invalid_request_error",
    "param": "messages[0].content",
    "code": "string_above_max_length"
  }
}
```
The actual question being sent to AskAsync is a short sentence.
It feels like, under the covers, AskAsync is doing something that generates follow-up AI messages exceeding the model's token limit, but maybe I am wrong. Any ideas on how to dig into this more? Since the error is thrown by AskAsync, I am not sure where to go with this.
I have AskAsync working on a different index for uploaded documents and a small number of web pages. For this specific index, I have added over 50 web pages.
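Since the error complains about characters (1,315,928 versus a 1,048,576 cap) rather than tokens, one way I'm thinking of digging in (a sketch; it assumes the same GPT4Tokenizer the chat model is configured with, and uses SearchAsync to approximate what AskAsync would retrieve) is to measure the size of the retrieved text directly:

```csharp
// Approximate the amount of text retrieval would hand to the chat model.
var question = "the short question I normally pass to AskAsync"; // placeholder
var tokenizer = new GPT4Tokenizer();

var search = await memory.SearchAsync(question, minRelevance: 0.0);
var facts = string.Join("\n", search.Results.SelectMany(c => c.Partitions).Select(p => p.Text));

Console.WriteLine($"Matches:    {search.Results.Count}");
Console.WriteLine($"Characters: {facts.Length}");                 // the 400 error fires above 1,048,576
Console.WriteLine($"Tokens:     {tokenizer.CountTokens(facts)}"); // compare against MaxTokenTotal
```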