Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Constants.Sql.MaxParameterCount and update query logic #11369

Merged
merged 5 commits into from
Oct 22, 2021

Conversation

ronaldbarendse
Copy link
Contributor

Prerequisites

  • I have added steps to test this contribution in the description below

If there's an existing issue for this PR then this fixes #10383.

Description

The Umbraco codebase contains a lot of places where parameterized SQL queries might require more than the maximum amount of 2100 parameters that is supported by SQL Server (see documentation). This PR adds a slightly lower value of 2000 as Constants.Sql.MaxParameterCount, so there's room for some extra 'static' parameters, and uses that instead of an arbitrary number.

I've tried to find all instances where 2000/2100 was used or related extension methods like InGroupsOf()/FetchByGroups() and updated those to use the constant. In doing so, I've also fixed some issues where the maximum parameter count was still exceeded (and thereby fix the linked issue and make PR #10677 obsolete).

@@ -768,7 +768,7 @@ private void CopyTagData(int? sourceLanguageId, int? targetLanguageId, IReadOnly
// note: important to use SqlNullableEquals for nullable types, cannot directly compare language identifiers

var whereInArgsCount = propertyTypeIds.Count + (contentTypeIds?.Count ?? 0);
if (whereInArgsCount > 2000)
if (whereInArgsCount > Constants.Sql.MaxParameterCount)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might be able to rewrite this to support more property/content types by introducing the grouping/batching, but that will require more testing (something for another PR).

Comment on lines +1097 to +1110
// lookup all matching rows in umbracoDocumentCultureVariation
// fetch in batches to account for maximum parameter count (distinct languages can't exceed 2000)
var languageIds = editedLanguageVersions.Keys.Select(x => x.langId).Distinct().ToArray();
var nodeIds = editedLanguageVersions.Keys.Select(x => x.nodeId).Distinct();
var docCultureVariationsToUpdate = nodeIds.InGroupsOf(Constants.Sql.MaxParameterCount - languageIds.Length)
.SelectMany(group =>
{
var sql = Sql().Select<DocumentCultureVariationDto>().From<DocumentCultureVariationDto>()
.WhereIn<DocumentCultureVariationDto>(x => x.LanguageId, languageIds)
.WhereIn<DocumentCultureVariationDto>(x => x.NodeId, group);

return Database.Fetch<DocumentCultureVariationDto>(sql);
})
.ToDictionary(x => (x.NodeId, (int?)x.LanguageId), x => x); //convert to dictionary with the same key type
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in #10677 (comment), we can only retrieve valid results by always including either all language or node IDs. Because we'll have less languages than nodes in almost all cases, we can group the node IDs into batches of Constants.Sql.MaxParameterCount - languageIds.Length.

Because we're using distinct language and node IDs, the dictionary key should always contain unique values.

{
var sqlClause = GetBaseQuery(false)
.Where<DictionaryDto>(x => x.Parent != null)
.Where($"{SqlSyntax.GetQuotedColumnName("parent")} IN (@parentIds)", new { parentIds = @group });
.WhereIn<DictionaryDto>(x => x.Parent, group);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using WhereIn() should do the same thing, but using a cleaner syntax.

}
else
{
foreach (var group in entityIds.InGroupsOf(Constants.Sql.MaxParameterCount - groupIds.Length))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly to the language/node IDs above, this always includes all group IDs and batches the entity IDs.

@@ -37,8 +37,9 @@ protected override IRedirectUrl PerformGet(Guid id)

protected override IEnumerable<IRedirectUrl> PerformGetAll(params Guid[] ids)
{
if (ids.Length > 2000)
throw new NotSupportedException("This repository does not support more than 2000 ids.");
if (ids.Length > Constants.Sql.MaxParameterCount)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also fix this by using grouping/batching.

Comment on lines 187 to +189
// can't query more than 2000 ids at a time... but if someone is really querying 2000+ entities,
// the additional overhead of fetching them in groups is minimal compared to the lookup time of each group
const int maxParams = 2000;
if (ids.Length <= maxParams)
if (ids.Length <= Constants.Sql.MaxParameterCount)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is a bit confusing, as this is probably not done to prevent the overhead of grouping, but to return all entities in case ids.Length == 0. Grouping would otherwise result in not returning anything at all...

@p-m-j p-m-j merged commit c61eaa2 into v8/dev Oct 22, 2021
@p-m-j p-m-j deleted the v8/bugfix/maxparametercount branch October 22, 2021 20:08
@umbrabot
Copy link

Hi there @ronaldbarendse,

First of all: A big #H5YR for making an Umbraco related contribution during Hacktoberfest! We are very thankful for the huge amount of PRs submitted, and all the amazing work you've been doing 🥇

Due to the amazing work you and others in the community have been doing, we've had a bit of a hard time keeping up. 😅 While all of the PRs for Hacktoberfest might not have been merged yet, you still qualify for receiving some Umbraco swag, congratulations! 🎉

In the spirit of Hacktoberfest we've prepared some exclusive Umbraco swag for all our contributors - including you!

As an alternative choice this year, you can opt-out of receiving anything and ask us to help improve the planet instead by planting a tree on your behalf. 🌳

Receive your swag or plant a tree! 👈 Please follow this link to fill out and submit the form, before December 31st, 2021.

Following this date we'll be sending out all the swag, but please note that it might not reach your doorstep for a few months, so please bear with us and be patient 🙏

We have blogged about this year's hacktoberfest in a recap post, have a read about at all the achievements for this year!

The only thing left to say is thank you so much for participating in Hacktoberfest! We really appreciate the help!

Kind regards,
The various Umbraco Teams

@jreurink
Copy link

@ronaldbarendse we've recently upgraded to v8.18.3 to see if the issue was fixed by this PR, but sadly, it is still there. The error now occurs on this piece of code:

//Now bulk update the umbracoDocument table
foreach (var editValue in editedDocument.GroupBy(x => x.Value))
{
    Database.Execute(Sql().Update<DocumentDto>(u => u.Set(x => x.Edited, editValue.Key))
        .WhereIn<DocumentDto>(x => x.NodeId, editValue.Select(x => x.Key)));
}

The variable 'editedDocument' can contain more than Constants.Sql.MaxParameterCount, which causes the call to Database.Execute to fail. We've implemented a work-around by using InGroupsOf on the dictionary, but I'm not sure if that could introduce issues for others.

@ronaldbarendse
Copy link
Contributor Author

@jreurink That's both good and bad news, but thankfully something you have found a work-around for 😇

Looks like both these bulk updates will need to be fixed, as the editValue.Select(...) can indeed result in overflowing the SQL parameter count:

//Now bulk update the table DocumentCultureVariationDto, once for edited = true, another for edited = false
foreach (var editValue in toUpdate.GroupBy(x => x.Edited))
{
Database.Execute(Sql().Update<DocumentCultureVariationDto>(u => u.Set(x => x.Edited, editValue.Key))
.WhereIn<DocumentCultureVariationDto>(x => x.Id, editValue.Select(x => x.Id)));
}
//Now bulk update the umbracoDocument table
foreach (var editValue in editedDocument.GroupBy(x => x.Value))
{
Database.Execute(Sql().Update<DocumentDto>(u => u.Set(x => x.Edited, editValue.Key))
.WhereIn<DocumentDto>(x => x.NodeId, editValue.Select(x => x.Key)));
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants