Lock gets stuck if app is killed at a bad time #172

Closed
felixkrull-neuland opened this issue Dec 16, 2022 · 3 comments · Fixed by #173
Labels
enhancement New feature or request

Comments

@felixkrull-neuland
Contributor

The current locking mechanism can get stuck in the locked state if the process is killed while checking migrations. We've had this happen a couple of times now, and it's pretty bad because it deadlocks our whole system.

This PR also looks related to this problem, even if its solution is rather brute-force: #169

Further ideas we've had on how to solve this:

  1. Don't acquire the lock when checking migrations. I believe it's sound to check for unapplied scripts first and then acquire the lock only if that check finds any. This would significantly decrease the likelihood of problems, since in 99% of runs the lock wouldn't be needed. The issue could still happen when migrations have to be applied, but that is rare and also doesn't happen on a weekend when no one's around.
  2. Instead of just true or false, store a timestamp with the lock. This would at least give some indication of whether the lock is stale.
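
To make the two ideas concrete, here is a minimal sketch in Python. It is not the library's actual code: `InMemoryHistory`, `migrate`, `try_lock`, and `lock_age` are hypothetical names, and the in-memory object merely stands in for the real history store. The re-check after acquiring the lock is an assumption of this sketch, added so that two processes racing on the same pending script don't both apply it.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class InMemoryHistory:
    """Hypothetical stand-in for the migration history store."""
    applied: List[str] = field(default_factory=list)
    # Idea 2: store when the lock was taken instead of a plain boolean.
    locked_at: Optional[float] = None

    def pending(self, scripts: List[str]) -> List[str]:
        return [s for s in scripts if s not in self.applied]

    def try_lock(self, now: float) -> bool:
        # A real store would need an atomic compare-and-set here.
        if self.locked_at is not None:
            return False
        self.locked_at = now
        return True

    def lock_age(self, now: float) -> Optional[float]:
        # Seconds the lock has been held, or None if it is free.
        return None if self.locked_at is None else now - self.locked_at

    def unlock(self) -> None:
        self.locked_at = None


def migrate(history: InMemoryHistory,
            scripts: List[str],
            apply: Callable[[str], None],
            now: Callable[[], float] = time.time) -> List[str]:
    # Idea 1: check for unapplied scripts WITHOUT taking the lock.
    if not history.pending(scripts):
        return []  # the common case: nothing to do, lock never touched

    if not history.try_lock(now()):
        age = history.lock_age(now())
        raise RuntimeError(f"migration lock has been held for {age} seconds")
    try:
        # Re-check under the lock (assumption of this sketch): another
        # process may have applied the scripts in the meantime.
        todo = history.pending(scripts)
        for script in todo:
            apply(script)
            history.applied.append(script)
        return todo
    finally:
        history.unlock()
```

With this shape, a process that gets killed can only leave a stuck lock during a run that actually has scripts to apply, and the stored timestamp lets an operator see how long a lock has been held before deciding what to do about it.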
@felixkrull-neuland
Contributor Author

Unless there are objections to the approach, I'll go ahead and implement option 1, since it looks relatively simple and should reduce occurrences significantly.

@lennehendrickx

We are running into the same issue:
Kubernetes kills a container that has locked the history index before the container is able to remove the lock.

This solution would help us run into this less frequently. 99% of the time, a container does not need to execute upgrade scripts when starting.

@xtermi2 xtermi2 added the enhancement New feature or request label Jan 11, 2023
@felixkrull-neuland
Contributor Author

🎉 And thanks for making the release, now we can remove our monkeypatched backport.
