SQL restriction of 256 chars for Tags #856
Replies: 6 comments
-
I think this is somewhat related to #781. In particular, we probably should add indexes to As we can read at https://learn.microsoft.com/en-us/sql/sql-server/maximum-capacity-specifications-for-sql-server#-objects, regarding index size:
So, if we want to make the |
Beta Was this translation helpful? Give feedback.
-
Could you elaborate why you're using tags to store this kind of information? It doesn't look like a typical scenario, e.g. tags are designed to store key-values for filtering. If you are filtering by hyperlinks, paths, descriptions, you could consider storing a SHA256 hash instead (32 bytes), saving a lot of space. |
Beta Was this translation helpful? Give feedback.
-
Hello @dluc, sure. And thank you for looking into it. We use SQL as our data store for the front-end (FE). This is our only SQL model (no custom SQL tables). This is intentional, to avoid overcomplicating the structure and prevent duplicate entries between native and our own tables. While operations might be more costly, at the current stage, this is less of a concern for us compared to database complexity. With your suggestion, we would need to create a separate table to store real values and map them to hashes. Additionally, not all tags are used for searching; some are simply meant as additional information that might be used for synchronization or front-end controls (e.g., the link takes you to the source URL directly, not meant for search, same for the description). I understand that this adds clutter to the Search Service, but again, this is not a major concern for us now. As the tool is used for managing many different indexes—each with different tags, RAG filters, control tags, and support tags—it’s beneficial to have one SQL source handling everything. Also in the tool users can define any new tag they want, fully from FE. Finally, the reason I considered this a semi-"bug" is that neither the storage account nor the Search Service has this restriction, so it’s only SQL performance improvement that is "artificially" enforcing this length limit. That’s why I'm not suggesting we just set MAX, but rather, it would be great if we had the ability to override it ourselves to a bigger value. PS: If SQL is being reworked as @marcominerva mentioned, a length of 850 would be sufficient for us (if that is length of one tag key-value pair, not length for the array of all values for that one key). I would say 256 should be enough for 90% of values, 512 for 98%, and 850 for 99.99%. |
Beta Was this translation helpful? Give feedback.
-
I appreciate the need to simplify local infrastructure, but this approach involves compromises and does not align well with the intended use cases. Tags are specifically designed for filtering, making this scenario more suitable as a feature request to include file metadata during document uploads. An alternative approach I would consider, is using the "Request Context" to pass custom values and implementing a custom handler at the end of the pipeline to store this information in SQL storage, such as in the For clarity, KM memory storage is designed as a search index, containing only the subset of data necessary for RAG. Storing additional data may solve one problem but introduces new challenges like increased latency, higher CPU and storage usage, and increased costs. |
Beta Was this translation helpful? Give feedback.
-
Hello @dluc, Thank you for your suggestion. It makes sense, but I'm not sure if the effort is worth it at the moment. Additionally, it's not something we would like to take on ourselves, as we prefer to use libraries rather than creating custom implementations that need regular updates with your changes (and if custom, with very minor changes that merges nicely) . Regarding the context duplication, that's true. We face also other intended "duplications" because we use both SQL, which is easier for custom operations in our management app, and Search Service, which is better suited for final gen AI apps. We test both approaches. Our end goal is to make an Ask or Search request to KM memory and get everything directly for the found chunks, without needing to call any external source for metadata (whether it's SQL, Azure AI search, or anything else). Cause it enforces us to make extra calls that lower the performance as well. What we are trying to achieve is a bit of consistency within the current solution. If AzureAiSearch memory has no problem with text longer than a specific length in tags, I would expect SQL to be okay as well. If there are hard stops like text length preventing an index, that makes sense as a restriction because it affects the whole flow. But if the only restriction is that SQL execution is faster with 256 than with 512, and that's the only reason we're enforcing it, I'm not sure it's worth it. Especially if there is no indexing or anything on the columns, I would say the effect is negligible. Again, not all RAGs have such a number of items that need to be 100% optimized everywhere yet. Especially if you have many user build Gen AI tools/skills with various Tags and RAG used only on few hundreds files. I would say it's much easier to increase the column size than to introduce a whole new feature. This also doesn't fix the problem if you have a field longer than 256 that you want to search in and don't want to use a hash, as it creates the need to obtain the real value of that hash after the search, adding again more calls and dependencies. To really consider trying to optimize this would really need to see that it affects performance dramatically, and whether the optimization is really optimization, and not something that will complicate the solution significantly, and really doesn't help at all (just saving milliseconds and few cents) I hope this makes sense to you, how we look at the problem at hand. Cheers |
Beta Was this translation helpful? Give feedback.
-
And still there is the case of #781 that @marcominerva mentioned. |
Beta Was this translation helpful? Give feedback.
-
Context / Scenario
Hello,
Currently you are hardcoding 256 char limit for values in SQL tags
[value] NVARCHAR(256) NOT NULL
This is problematic for tags longer that this restriction, in our case hyperlink, but can be path, short description, anything really.
Not sure if this restriction is present in other MemoryDb, or whether it is SQL specific.
For a quick fix, I manually adjusted the column size in SQL, and it seems it works fine after.
This could work, but we need the ability to create new indexes in our app, so we need this feature to be supported directly by you (if if does not break anything else).
Would it be possible to enhance code along with SqlServerConfig and give us the ability to adjust the varchar size?
Either give ability to write our own number, or a bool/enum value to select between 256 (512,1024,2048...) or MAX.
example of your code that would need adjustments:
What happened?
What happens now:
If tag is longer than 256 nvarchar, the ingestion fails in unfinished state, preventing file to be saved to the SQL table, which blocks further management.
Also causes KernelMemory to go trough full retry cycle, unnecessarily spending processing resources.
error:
String or binary data would be truncated in table 'ADS_GENAI.dbo.KMMemoriesTags_botone', column 'value'. Truncated value: 'https://firm.sharepoint.com/sites/BontactOneCS/BontactOne%20services%20and%20applications/Complia'.
The statement has been terminated.
Expected behavior:
It will properly process file with tags longer than 256 chars without any failures.
Importance
a fix would make my life easier
Platform, Language, Versions
C#, net core 8
Relevant log output
String or binary data would be truncated in table 'ADS_GENAI.dbo.KMMemoriesTags_botone', column 'value'. Truncated value: 'https://firm.sharepoint.com/sites/BontactOneCS/BontactOne%20services%20and%20applications/Complia'.
The statement has been terminated.
Microsoft.Data.SqlClient.SqlException:
at Microsoft.Data.SqlClient.SqlConnection.OnError (Microsoft.Data.SqlClient, Version=5.0.0.0, Culture=neutral, PublicKeyToken=23ec7fc2d6eaa4a5)
at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning (Microsoft.Data.SqlClient, Version=5.0.0.0, Culture=neutral, PublicKeyToken=23ec7fc2d6eaa4a5)
at Microsoft.Data.SqlClient.TdsParser.TryRun (Microsoft.Data.SqlClient, Version=5.0.0.0, Culture=neutral, PublicKeyToken=23ec7fc2d6eaa4a5)
at Microsoft.Data.SqlClient.SqlCommand.FinishExecuteReader (Microsoft.Data.SqlClient, Version=5.0.0.0, Culture=neutral, PublicKeyToken=23ec7fc2d6eaa4a5)
at Microsoft.Data.SqlClient.SqlCommand.CompleteAsyncExecuteReader (Microsoft.Data.SqlClient, Version=5.0.0.0, Culture=neutral, PublicKeyToken=23ec7fc2d6eaa4a5)
at Microsoft.Data.SqlClient.SqlCommand.InternalEndExecuteNonQuery (Microsoft.Data.SqlClient, Version=5.0.0.0, Culture=neutral, PublicKeyToken=23ec7fc2d6eaa4a5)
at Microsoft.Data.SqlClient.SqlCommand.EndExecuteNonQueryInternal (Microsoft.Data.SqlClient, Version=5.0.0.0, Culture=neutral, PublicKeyToken=23ec7fc2d6eaa4a5)
at Microsoft.Data.SqlClient.SqlCommand.EndExecuteNonQueryAsync (Microsoft.Data.SqlClient, Version=5.0.0.0, Culture=neutral, PublicKeyToken=23ec7fc2d6eaa4a5)
at System.Threading.Tasks.TaskFactory
1.FromAsyncCoreLogic (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e) at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e) at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e) at System.Runtime.CompilerServices.ConfiguredTaskAwaitable
1+ConfiguredTaskAwaiter.GetResult (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)at Microsoft.KernelMemory.MemoryDb.SQLServer.SqlServerMemory+d__14.MoveNext (Microsoft.KernelMemory.MemoryDb.SQLServer, Version=0.73.0.0, Culture=neutral, PublicKeyToken=f300afd708cefcd3)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.KernelMemory.MemoryDb.SQLServer.SqlServerMemory+d__14.MoveNext (Microsoft.KernelMemory.MemoryDb.SQLServer, Version=0.73.0.0, Culture=neutral, PublicKeyToken=f300afd708cefcd3)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Threading.Tasks.Sources.ManualResetValueTaskSourceCore
1.ThrowForFailedGetResult (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e) at System.Threading.Tasks.Sources.ManualResetValueTaskSourceCore
1.GetResult (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)at Microsoft.KernelMemory.MemoryDb.SQLServer.SqlServerMemory+d__13.MoveNext (Microsoft.KernelMemory.MemoryDb.SQLServer, Version=0.73.0.0, Culture=neutral, PublicKeyToken=f300afd708cefcd3)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.KernelMemory.MemoryDb.SQLServer.SqlServerMemory+d__13.MoveNext (Microsoft.KernelMemory.MemoryDb.SQLServer, Version=0.73.0.0, Culture=neutral, PublicKeyToken=f300afd708cefcd3)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.KernelMemory.Handlers.SaveRecordsHandler+d__16.MoveNext (Microsoft.KernelMemory.Core, Version=0.73.0.0, Culture=neutral, PublicKeyToken=f300afd708cefcd3)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.KernelMemory.Handlers.SaveRecordsHandler+d__13.MoveNext (Microsoft.KernelMemory.Core, Version=0.73.0.0, Culture=neutral, PublicKeyToken=f300afd708cefcd3)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.KernelMemory.Pipeline.DistributedPipelineOrchestrator+d__8.MoveNext (Microsoft.KernelMemory.Core, Version=0.73.0.0, Culture=neutral, PublicKeyToken=f300afd708cefcd3)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.KernelMemory.Pipeline.DistributedPipelineOrchestrator+<>c__DisplayClass5_0+<b__0>d.MoveNext (Microsoft.KernelMemory.Core, Version=0.73.0.0, Culture=neutral, PublicKeyToken=f300afd708cefcd3)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
at Microsoft.KernelMemory.Orchestration.AzureQueues.AzureQueuesPipeline+<>c__DisplayClass20_0+<b__0>d.MoveNext (Microsoft.KernelMemory.Orchestration.AzureQueues, Version=0.73.0.0, Culture=neutral, PublicKeyToken=f300afd708cefcd3)
Beta Was this translation helpful? Give feedback.
All reactions