-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sqlite: set busy timeout to 100s #20838
Conversation
Only one process can write to the sqlite db at the same time, if another process tries to use it at that time it fails and a database is locked error is returned. If this happens sqlite should keep retrying until it can write. To do that we can just set the _busy_timeout option. A 100s timeout should be enough even on slower systems but not to much in case there is a deadlock so it still returns in a reasonable time. [NO NEW TESTS NEEDED] I think we strongly need to consider some form of parallel stress testing to catch bugs like this. Fixes containers#20809 Signed-off-by: Paul Holzinger <[email protected]>
@mheon @vrothberg @edsantiago PTAL |
Now testing in #17831 PS thank you |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believed that one of the options above took care of setting a default busy value already but I may be wrong.
Changes LGTM. Did you manage to manually reproduce and validate the fix is solving it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: giuseppe, Luap99 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Yes using the reproducer from the issue. |
I'm going to drop a Even if this does turn out to not be a complete fix for some reason I see no harm in merging it. |
/hold cancel |
One question though -- what "writing to the database" mean exactly? Is it like single update statement, single transaction, single "connection"? In other words what needs to fit within those 100s? If I have a big container with lots of files and I do |
From the SQLite locking model, this should be a single transaction. Connections are allowed to be concurrent, locks are only taken on write transactions. |
To be clear here the lock is only taken for db writes, those should generally be fast. We do not have the db locked while we delete the container storage for example. The db stores only "metadata" so the data should not be to big and operations should be fast. |
Only one process can write to the sqlite db at the same time, if another process tries to use it at that time it fails and a database is locked error is returned. If this happens sqlite should keep retrying until it can write. To do that we can just set the _busy_timeout option. A 100s timeout should be enough even on slower systems but not to much in case there is a deadlock so it still returns in a reasonable time.
[NO NEW TESTS NEEDED] I think we strongly need to consider some form of parallel stress testing to catch bugs like this.
Fixes #20809
Does this PR introduce a user-facing change?