Improve multi-threaded repo locking #2344
I'm sure there's probably a data structure that's more appropriate for this, but this is the idea I came up with. The actual lock file descriptor is stored in the `OstreeRepo` structure. With a single thread you can rely on lock and unlock calls nesting in order, but for multiple threads that's not possible since they can act in any order. The way I thought of to handle this is to maintain a stack per thread in thread local storage, like now, so that when an individual thread pops a lock state it knows what type of lock state it's dropping. In the shared mutex guarded code, there would just be 2 counters for how many shared and exclusive states are on the stack. This maintains the latching aspect of it, i.e. the lock stays exclusive until all exclusive states have been popped. When all shared and exclusive states have been popped, the lock is unlocked.
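Here's a minimal sketch of that bookkeeping in GLib-style C. The names (`RepoLock`, `push_lock_state`, `pop_lock_state`) are hypothetical and error handling is omitted; this illustrates the counter/stack idea, not ostree's actual code:

```c
#include <glib.h>
#include <sys/file.h>

typedef enum { LOCK_STATE_SHARED, LOCK_STATE_EXCLUSIVE } LockState;

typedef struct {
  GMutex mutex;           /* guards the counters and flock transitions */
  int    lock_fd;         /* the single lock fd, shared by all threads */
  guint  shared_count;    /* shared states pushed, across all threads */
  guint  exclusive_count; /* exclusive states pushed, across all threads */
} RepoLock;

/* Per-thread stack, so each thread knows which kind of state it pops. */
static GPrivate lock_stack_key = G_PRIVATE_INIT ((GDestroyNotify) g_array_unref);

static GArray *
get_thread_stack (void)
{
  GArray *stack = g_private_get (&lock_stack_key);
  if (stack == NULL)
    {
      stack = g_array_new (FALSE, FALSE, sizeof (LockState));
      g_private_set (&lock_stack_key, stack);
    }
  return stack;
}

static void
push_lock_state (RepoLock *lock, LockState state)
{
  g_array_append_val (get_thread_stack (), state);

  g_mutex_lock (&lock->mutex);
  if (state == LOCK_STATE_EXCLUSIVE)
    {
      /* The first exclusive state anywhere upgrades the file lock. */
      if (lock->exclusive_count++ == 0)
        flock (lock->lock_fd, LOCK_EX); /* error handling omitted */
    }
  else
    {
      /* A shared state only needs LOCK_SH if nothing is held yet; an
       * existing exclusive hold already covers it (the latching). */
      if (lock->shared_count++ == 0 && lock->exclusive_count == 0)
        flock (lock->lock_fd, LOCK_SH);
    }
  g_mutex_unlock (&lock->mutex);
}

static void
pop_lock_state (RepoLock *lock)
{
  GArray *stack = get_thread_stack (); /* assumes balanced push/pop */
  LockState state = g_array_index (stack, LockState, stack->len - 1);
  g_array_set_size (stack, stack->len - 1);

  g_mutex_lock (&lock->mutex);
  if (state == LOCK_STATE_EXCLUSIVE)
    {
      /* Stay exclusive until the last exclusive state is popped. */
      if (--lock->exclusive_count == 0 && lock->shared_count > 0)
        flock (lock->lock_fd, LOCK_SH); /* downgrade */
    }
  else
    lock->shared_count--;

  if (lock->shared_count == 0 && lock->exclusive_count == 0)
    flock (lock->lock_fd, LOCK_UN); /* fully unlocked */
  g_mutex_unlock (&lock->mutex);
}
```

The per-thread stack exists only so that each pop knows which counter to decrement; the actual `flock()` state is derived entirely from the two shared counters.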
OK from your original commit. But...the patch Alexander had here originally https://bugzilla.gnome.org/show_bug.cgi?id=759442#c2 was way simpler, and by virtue of being file based (like the sysroot lock) it also works across processes, which is I think what we actually want. Then you proposed this: #813 (comment) And we merged it, but it seems to me we really want a simple read/write lock, i.e. what you get in-process with a GRWLock and out of process with `flock()`. In other words, I think the flaw originally here was thinking of the "read" lock as truly restricting to read operations, whereas I think we should interpret it to mean "can write new data, just not delete existing data".
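For reference, a minimal sketch of the file-based approach, assuming a hypothetical `.lock` file in the repo directory: `flock(2)` already provides exactly these read/write semantics, and because the lock lives on a file it works across processes as well as threads:

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Returns a locked fd, or -1 on error. Pass LOCK_SH for "may write new
 * data" and LOCK_EX for operations that delete data (e.g. prune).
 * Closing the fd releases the lock. */
static int
acquire_repo_lock (int repo_dfd, int flock_op)
{
  int fd = openat (repo_dfd, ".lock", O_CREAT | O_RDWR | O_CLOEXEC, 0644);
  if (fd < 0)
    return -1;
  if (flock (fd, flock_op) < 0) /* blocks; OR in LOCK_NB for a try-lock */
    {
      close (fd);
      return -1;
    }
  return fd;
}
```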
The lock I implemented is recursive, which is one source of the complexity. Likewise, since we want to use 2 states, there needs to be some tracking of what state the lock is in. This is particularly important when unwinding the recursion and figuring out what state to go back to. This is the other source of the complexity. Now let's consider some of the locking types out there.
So, what I want is the combination of the two: `GRWLock` semantics within a process and `flock` semantics across processes. I hadn't given the in-process/out-of-process distinction much thought before.
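For the in-process half, GLib's `GRWLock` gives reader/writer semantics between threads. A minimal sketch (the function names are illustrative) of how the two states would map onto it:

```c
#include <glib.h>

static GRWLock repo_rwlock; /* would live on the shared OstreeRepo */

static void
write_new_objects (void)
{
  g_rw_lock_reader_lock (&repo_rwlock); /* "shared": add data */
  /* ... commit objects ... */
  g_rw_lock_reader_unlock (&repo_rwlock);
}

static void
prune_objects (void)
{
  g_rw_lock_writer_lock (&repo_rwlock); /* "exclusive": delete data */
  /* ... prune ... */
  g_rw_lock_writer_unlock (&repo_rwlock);
}
```

Note that `GRWLock` knows nothing about other processes and is not recursive, which is why combining it with `flock()` and the recursion tracking above is the hard part.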
The code always refers to it by the flock names: shared and exclusive. As did Alex's patch. But I agree there has been some confusion about what those mean. However, I don't think it matters what the states are called, just that you have more than 1 state and you know what they refer to. In ostree we could refer to this as a creator/destroyer lock or something.
Previously each thread maintained its own lock file descriptor regardless of whether the thread was using the same `OstreeRepo` as another thread. This was very safe but it made certain multithreaded procedures difficult. For example, if a main thread took an exclusive lock and then spawned worker threads, it would deadlock if one of the worker threads tried to acquire the lock. This moves the file descriptor from thread local storage to the `OstreeRepo` structure so that threads using the same `OstreeRepo` can share the lock. Some bookkeeping is needed to keep track of the lock state when recursing and a mutex guards against threads altering the state concurrently. The required GLib is bumped to 2.44 to allow usage of `GMutexLocker`. Fixes: ostreedev#2344
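As an illustration of the `GMutexLocker` pattern the commit message refers to (available since GLib 2.44), with hypothetical struct and field names: the locker releases the mutex automatically on every return path, which keeps the state-juggling code correct:

```c
#include <glib.h>

typedef struct {
  GMutex lock_mutex; /* guards the fields below */
  guint  shared_count;
  guint  exclusive_count;
} RepoLockState;

static void
repo_lock_push (RepoLockState *state, gboolean exclusive)
{
  /* Unlocks automatically when `locker` goes out of scope. */
  g_autoptr(GMutexLocker) locker = g_mutex_locker_new (&state->lock_mutex);

  if (exclusive)
    state->exclusive_count++;
  else
    state->shared_count++;
  /* ... adjust the flock(2) state to match the counters ... */
}
```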
Currently the repo locking is entirely thread local, which is very safe but makes it difficult to use in a multi-threaded application. The canonical issue would be if you took an exclusive lock and then spawned some threads that tried to take a shared lock. This would deadlock because the threads could never acquire the lock even though the application holds an exclusive lock on the repo.
This isn't theoretical. Back when I was working on #1292 I wanted to make `ostree_repo_prune_object` take an exclusive lock. However, it's actually used deep within the pull code to try to clean up a tombstone commit (IIRC). The main thread acquires a shared repo lock when it starts a transaction and then the pull code spawns a bunch of threads. When a pull thread hits this path it deadlocks because it actually has no connection to the main thread lock. If we're to make the repo locking public, it's pretty likely that someone would take an exclusive lock and then run the pull code. I know I've done that many times in our tooling when doing some maintenance task.
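A minimal standalone sketch of the failure mode (not ostree code; `repo.lock` is an arbitrary path): because each thread opens its own file description, the worker's `flock()` conflicts with the main thread's even though both belong to the same application:

```c
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>

static void *
worker (void *arg)
{
  (void) arg;
  /* Thread-local behavior: the worker opens its own descriptor. */
  int fd = open ("repo.lock", O_RDWR | O_CLOEXEC);
  printf ("worker: waiting for shared lock...\n");
  flock (fd, LOCK_SH); /* blocks forever: main holds LOCK_EX */
  printf ("worker: never reached\n");
  close (fd);
  return NULL;
}

int
main (void)
{
  int fd = open ("repo.lock", O_RDWR | O_CREAT | O_CLOEXEC, 0644);
  flock (fd, LOCK_EX); /* main thread takes the exclusive lock */

  pthread_t t;
  pthread_create (&t, NULL, worker, NULL);
  pthread_join (&t, NULL); /* never returns: the classic deadlock */
  return 0;
}
```

Sharing a single lock fd on the `OstreeRepo` (as the commit message above describes) is what makes the worker's shared request compatible with the exclusive lock the application already holds.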