The current locking mechanism is susceptible to getting stuck locked if the process gets killed while checking migrations. We've had this happen a couple of times now and it's kinda bad because it deadlocks our whole system.
This PR also looks related to this problem, even if its solution is rather brute-force: #169
Further ideas we've had on how to solve this:
1. Don't acquire the lock when checking migrations. I believe it's sound to check for unapplied scripts first and only acquire the lock if that check finds any. This would significantly decrease the likelihood of problems, since in 99% of runs the lock wouldn't be needed. The issue could still happen when migrations have to be applied, but that is rare and also doesn't happen on a weekend when no one's around.
2. Instead of just true or false, store a timestamp with the lock. This would at least give some info on whether the lock is stale.
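To make the two ideas concrete, here is a minimal sketch, not the project's actual code. `InMemoryHistory`, `pending`, `lock_is_stale`, and `migrate` are hypothetical names invented for illustration, and the in-memory store stands in for the real history store, which would need an atomic compare-and-set for `try_lock`. Note the re-check after the lock is acquired: another process may have applied the scripts between the unlocked check and the lock.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class InMemoryHistory:
    """Hypothetical stand-in for the migration history store."""
    applied: set = field(default_factory=set)
    # Idea 2: store a timestamp instead of a boolean. None means unlocked.
    locked_at: Optional[float] = None

    def try_lock(self, now: float) -> bool:
        # Assumption: a real store would do this as an atomic compare-and-set.
        if self.locked_at is not None:
            return False
        self.locked_at = now
        return True

    def unlock(self) -> None:
        self.locked_at = None


def pending(history, scripts):
    """Read-only check: which scripts have not been applied yet?"""
    return [s for s in scripts if s not in history.applied]


def lock_is_stale(history, now, max_age_seconds):
    """True if a lock exists and is older than max_age_seconds."""
    return history.locked_at is not None and now - history.locked_at > max_age_seconds


def migrate(history, scripts, apply, clock=time.time):
    # Idea 1: check first without locking; most runs end here.
    if not pending(history, scripts):
        return "up-to-date"
    if not history.try_lock(clock()):
        return "locked"
    try:
        # Re-check under the lock: another process may have migrated meanwhile.
        for script in pending(history, scripts):
            apply(script)
            history.applied.add(script)
    finally:
        history.unlock()
    return "migrated"
```

With this shape, a lock left behind by a killed process no longer blocks runs that have nothing to apply, and `lock_is_stale` gives an operator (or a later automatic cleanup) a signal to act on. It does not by itself release a stuck lock.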
Unless there are objections to the approach, I'll go ahead and implement option 1, since it looks relatively simple and like it would reduce occurrences significantly.
We are running into the same issue:
Kubernetes kills a container after it has locked the history index but before it is able to remove the lock.
This solution would help us run into the problem less frequently: 99% of the time, a container does not need to execute upgrade scripts when starting.