Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the ability to change the SQL Write Lock TimeOut #9744

Merged
merged 2 commits into from
Feb 2, 2021

Conversation

nul800sebastiaan
Copy link
Member

When acquiring a SQL write lock, we currently have hardcoded this to 1.8 seconds (1800ms). This has been the case since version 8.4.0 - #6688

In some cases, people seem to be getting the error Lock request time out period exceeded.. From what I gather from Shannon this indicates the followin:

Has to do with distributed locks when writing/reading from the database at the same time as another thread/process. This means that while something else is holding a write lock, something else is waiting for the readlock and it cannot acquire it because it waited too long. Normally do to too many write locks trying to be used or queries executing slowly.

From what I understand from conversations at HQ, the value of 1800ms is rather arbitrary and there's no objection to increasing it to 5000. This PR is a first step towards that, we have seen various issues with locks being exceeded.

With this PR merged, it will be possible to at least experiment with the lock timeout and alleviate some pain for now. It is still a good idea to investigate what's holding locks and why the lock time is not enough for some operations.

Copy link

@SimonAntony SimonAntony left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me and nice to be configurable. Happy to test this release when ready on a site I have timeout issues on.

Copy link
Contributor

@OwainWilliams OwainWilliams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested with running the default values and also adding a new key to the web.config. Values were applied and published worked without any issues. Tested with a plain text string in a document type and also with Nest Content elements.
Value set to -1 to ensure that the lock timeout was being activated.

@OwainWilliams OwainWilliams merged commit 37402c5 into v8/contrib Feb 2, 2021
@OwainWilliams OwainWilliams deleted the v8/feature/sqlwritelocktimeout branch February 2, 2021 15:18
Copy link
Contributor

@Shazwazza Shazwazza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have left some comments on the code.

Before this is released, we need to test what happens with high timeout values compared to what the db connection and command timeout is - because we aren't sure what happens if the configured lock timeout exceeds either of those, we may very well end up with other different more vague errors which would be worse.

{
try
{
return int.Parse(ConfigurationManager.AppSettings[Constants.AppSettings.SqlWriteLockTimeOut]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be tryparse and not have a try/catch, it's very easy to do error checking here without any need for try/catch. This should also store the result of this in a backing field so that it's not re-reading from config and re-parsing this value on every single usage of this - which will be a lot because this will be accessed for every read/write lock (lots).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just repeating the code above! 😅
Good point in adding a backing field here, I was under the assumption these settings were cached somewhere. 😄

}
catch
{
return 1800;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default should be changed to perhaps 5000.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept it the same so that it will not change any behavior for now unless specifically configured.

@nul800sebastiaan
Copy link
Member Author

Before this is released, we need to test what happens with high timeout values compared to what the db connection and command timeout is - because we aren't sure what happens if the configured lock timeout exceeds either of those, we may very well end up with other different more vague errors which would be worse.

Should we limit it to a max value so people can't inflict too much pain on themselves?

nul800sebastiaan added a commit that referenced this pull request Feb 3, 2021
Default value is now 5000
There's a limit to the range of the timeout values
Introduced backing field so as not to keep looking up the same value over and over again from the config
@afinn-tech
Copy link

Sounds like a great , we struggled with the time , cant wait for this fix - great work guys

@stefankip
Copy link

We're running into this issue a lot with Umbraco Cloud. Error logs getting spammed with System.Data.SqlClient.SqlException (0x80131904): Lock request time out period exceeded. The statement has been terminated. messages.
Looking forward to 8.12

@afinn-tech
Copy link

@nul800sebastiaan any chance of RC of 8.12 ? so we can test this function ?

@nul800sebastiaan
Copy link
Member Author

@stefanoberetta
Copy link

Is the Lock request time out period exceeded. error supposed to be triggered when I have two websites (one running localhost and the other one on a dev server -NOT Azure-) sharing the same database?

@Shazwazza
Copy link
Contributor

@stefanoberetta this lock is a distributed lock that uses the database. If you have 2 websites pointing to the same database you are Load Balancing with more than one main server which is not explicitly supported (but people do it). This error isn't specific to load balancing, this error will occur on a single server or multiple servers where a write lock is held for too long while another request or process is trying to acquire another read/write lock with the same ID but it times out.

here's a description: #9594 (comment)

@timmather
Copy link

We are having a similar problem with a 8.6+ build (now on 8.12). Single server setup iis and sql on the same box. After 6/7 saves we get this error, worse with multiple editors, nothing special running at all, tried increasing timeout using writelock setting, content started to disappear across the site. It’s a big problem we have two sites waiting to go live this week with 20 editors and they can’t use the site. Never been a problem before on over 80 deployments.

@stefankip
Copy link

@timmather is it using Vendr perhaps?

@timmather
Copy link

Unfortunately not, it’s actually a very simple build apart from having around 2k nodes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants