Deadlock between multiple container creates with similar volumes #20313
Comments
@mheon PTAL
This is strange. @vrothberg @Luap99 Mind taking a look here, in case my brain is failing me?
It's always possible that I've misidentified the cause of the hang, but it does still hang somewhere near the listed code. I can add further logging or run another debug build if you like, but catching this problem is maddeningly rare. Spitballing: does `defer` cause all of the deferred functions to be called upon leaving the nearest closure? Does each for-loop iteration count as leaving the closure, or do all the deferred funcs run as soon as possible after leaving the loop (in separate threads? Eh? Eh?)
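For reference, here is a minimal, self-contained Go example (not Podman code) answering the `defer` question above: deferred calls are queued per function, not per loop iteration, and they all run in LIFO order on the same goroutine when the surrounding function returns.

```go
package main

import "fmt"

func main() {
	fmt.Println("loop start")
	for i := 0; i < 3; i++ {
		// Deferred calls are queued for the *function*, not the loop iteration:
		// none of these run until main returns.
		defer fmt.Println("deferred for iteration", i)
		fmt.Println("iteration", i, "done")
	}
	fmt.Println("loop end")
	// Output: loop start, iteration 0..2 done, loop end, then the deferred
	// prints in LIFO order (2, 1, 0), all on the same goroutine.
}
```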
Thanks for reporting, @UniversalSuperBox! Can you share the stack traces of the deadlocked processes? You can get them by sending SIGABRT to the podman process; this will dump the stack trace. I currently do not see how an ABBA deadlock can happen between the two, but I am on my first mug of coffee and may be missing something. Stack traces will help prevent speculation.
It isn't what you see per your logs, but one big problem is that we can definitely deadlock in setupContainer() when locking the volumes. It does a simple loop over an (AFAICT unsorted!) array while locking each volume; I think the order is how it was set on the cli. So when two containers are created at the same time with a different order of volumes, this can deadlock if both processes end up in this loop at the same time. IMO locking like this in a loop is, with rare exception, never a good idea, as it easily ends up deadlocking. However, I think there is something else going on in your case, so the stack traces of both processes should help to see where the hang is exactly.
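As a minimal sketch of the usual mitigation when locking in a loop cannot be avoided (hypothetical `volume` type and `lockAll` helper, not libpod's actual API): acquire the locks in one global order, e.g. sorted by volume name, so two concurrent callers can never hold them in opposite orders.

```go
package main

import (
	"sort"
	"sync"
)

// volume is a stand-in for the real libpod volume type; the embedded mutex is hypothetical.
type volume struct {
	name string
	mu   sync.Mutex
}

// lockAll locks every volume in a deterministic (sorted-by-name) order and
// returns a function that releases all of the locks.
func lockAll(vols []*volume) (unlock func()) {
	sorted := append([]*volume(nil), vols...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].name < sorted[j].name })
	for _, v := range sorted {
		v.mu.Lock()
	}
	return func() {
		for i := len(sorted) - 1; i >= 0; i-- {
			sorted[i].mu.Unlock()
		}
	}
}

func main() {
	a, b := &volume{name: "vol1"}, &volume{name: "vol2"}
	// Callers may pass the volumes in any order (e.g. CLI order); they are locked identically.
	unlock := lockAll([]*volume{b, a})
	defer unlock()
}
```

As the later comments note, the fix that actually landed drops these per-volume locks entirely rather than ordering them.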
I am starting to wonder if locking all volumes on container create is strictly necessary. The database is robust enough to ensure that we cannot add to a volume that is already gone, so I am thinking we can drop the volume locks entirely and rely on the database to ensure safety in that case. |
The plot thickens... I haven't been able to reproduce this problem with two instances of Podman, only when three are running. This could be because I'm super unlucky, but I've had two instances running for some twenty minutes without hitting the deadlock. I've attempted to retrieve stack traces by running https://gist.github.com/UniversalSuperBox/75564012d098d0f4c79bcf7d5dc878f6. Worth noting that I'm running Podman 4.7.0 inside
I'll try to take a look tomorrow at the stack traces. As for a debugger, I use https://github.com/go-delve/delve, then just
It was a little fiddly to get Podman's debug symbols and delve installed in. Stack traces, and the commands used to get the system to deadlock, are here: https://gist.github.com/UniversalSuperBox/1e855b07bdec0f4d79f4c117c09da353. Potentially worth noting: "Aside", which is the "odd one out", is in
After thinking about it more, I'm now confident that we can remove the volume locks from
Given your reproducer and stack traces, I think it is what I said in #20313 (comment). The two processes are stuck in setupContainer() while locking the volumes; the third one is just also waiting in the prepare call, as it only wants access to its volume but never gets it due to the deadlock caused by the other processes. I agree with @mheon here, the locking is unnecessary in this place. The db logic should prevent this from happening, and if the locks were really needed then this is already mostly useless, as the volume can be removed before it is locked.
Agreed. Removing such objects in parallel to trying to use them is always racy and not something Podman can prevent. +1 to removing the locking there.
When containers are created with a named volume it can deadlock because the create logic tried to lock all volumes in a loop. This is fine if it only ever creates a single container at any given time, but because multiple containers can be created at the same time, they can cause a deadlock between the volumes. This is because the order of the loop is not stable; in fact it is based on the order in which the volumes were specified on the cli. So if you create two containers at the same time, one with `-v vol1:/dir1 -v vol2:/dir2` and the other with `-v vol2:/dir2 -v vol1:/dir1`, then there is a chance of a deadlock.

Now one solution could be to order the volumes to prevent the issue, but the reason for holding the lock is dubious. The goal was to prevent the volume from being removed in the meantime. However, that could still have happened before we acquired the lock, so it didn't protect against that.

Both boltdb and sqlite already prevent us from adding a container with volumes that do not exist due to their internal consistency checks. Sqlite even uses FOREIGN KEY relationships, so the schema will prevent us from doing anything wrong. The create code currently first checks if the volume exists and, if not, creates it. I have checked that the db will guarantee that this will not work:

Boltdb: `no volume with name test2 found in database when adding container xxx: no such volume`
Sqlite: `adding container volume test2 to database: FOREIGN KEY constraint failed`

Keep in mind that this error is normally not seen; only if the volume is removed between the volume-exists check and adding the container in the db will this message be seen, which is an acceptable race and a pre-existing condition anyway.

[NO NEW TESTS NEEDED] Race condition, hard to test in CI.

Fixes containers#20313

Signed-off-by: Paul Holzinger <[email protected]>
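Schematically, the flow that commit message describes looks something like the sketch below (hypothetical `Store` interface and method names, not the actual libpod code): the volumes are checked/created up front, no volume locks are taken, and the database's own consistency checks reject the container if a volume disappears in between.

```go
package create

import "fmt"

// Store is a hypothetical stand-in for the libpod state/database layer.
type Store interface {
	VolumeExists(name string) (bool, error)
	CreateVolume(name string) error
	AddContainer(ctrID string, volNames ...string) error
}

// createContainerWithVolumes is a sketch of the lock-free create flow described above.
func createContainerWithVolumes(db Store, ctrID string, volNames []string) error {
	for _, name := range volNames {
		exists, err := db.VolumeExists(name)
		if err != nil {
			return err
		}
		// Create missing volumes; nothing stops another process from removing
		// one of them between this point and AddContainer below.
		if !exists {
			if err := db.CreateVolume(name); err != nil {
				return err
			}
		}
	}
	// If a volume was removed in the meantime, the database rejects the insert
	// (boltdb: "no volume with name ... found in database"; sqlite: FOREIGN KEY
	// constraint failure), which is the acceptable, pre-existing race.
	if err := db.AddContainer(ctrID, volNames...); err != nil {
		return fmt.Errorf("adding container %s: %w", ctrID, err)
	}
	return nil
}
```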
#20329 should fix it.
I've updated the issue title and added a note to the description to help future travelers. For context on how we found this and why I was confused: we have a server which runs jobs which include. Thanks to everyone for your quick response!
MH: Backported to v4.6.1-rhel per RH Jira RHEL-20910 and RHEL-20911; the commit message is the same as containers#20329 above. Signed-off-by: Matt Heon <[email protected]>
Cherry pick from containers#20329 (same commit message as above). Addresses: https://issues.redhat.com/browse/RHEL-14744 and https://issues.redhat.com/browse/RHEL-14743. Signed-off-by: Paul Holzinger <[email protected]> Signed-off-by: tomsweeneyredhat <[email protected]>
Don't be fooled by this issue description! The actual issue was setupContainer() deadlocking with itself across multiple simultaneous container creates which used the same volumes in different orders on the command line.
Issue Description
Podman can deadlock when two (or more) containers are being set up using the same volume at the same time. When this condition is triggered, most `podman` commands hang, including `ps`, `info`, `volumes`, and basically anything else of any use. The deadlock can be cleared by ending the hung podman processes, but of course this is not an ideal solution.

For example, `podman ps` hangs with the following last lines (full log of that run can be found at https://gist.github.com/UniversalSuperBox/162ef8e6c76bee4b8d3cbae1709efa9e):
Upstream logging indicates that the first container is stuck in the following loop: podman/libpod/container_internal.go, lines 1727 to 1742 at d90fdfc.
Meanwhile, the second container is stuck in the following loop. I know this from my added logging, where `AAAA done setting up shm` comes directly before the indicated lines and `AAAA locked named volumes` comes directly after: podman/libpod/runtime_ctr.go, lines 568 to 581 at d90fdfc.
The latter code has been around since e563f41 in 2019, and while it did get a small update in 0f637e0 to prevent this deadlock in a single-container scenario, there is no accounting for multiple podman binaries (or maybe even multiple requests to the same daemon binary?) trying to take the lock at the same time.
This seems like a classic deadlock. The first container is probably waiting on the second container to release a lock on one of the named volumes, and the second is waiting on the first. My first reaction is that only one process should be able to mutate any named volume at a time, but I'm hoping that someone with more experience with the code has a better idea.
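For illustration, here is a self-contained Go program (not Podman code) that reproduces the same ABBA pattern with two mutexes standing in for the volume locks. Because every goroutine ends up blocked, the Go runtime aborts it with "fatal error: all goroutines are asleep - deadlock!"; in the Podman case the lock holders are separate processes rather than goroutines in one program, so nothing detects the cycle and they simply hang.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	var vol1, vol2 sync.Mutex // stand-ins for two named-volume locks

	// "Container A" locks vol1 then vol2; "container B" locks vol2 then vol1.
	go func() {
		vol1.Lock()
		fmt.Println("A: locked vol1, waiting for vol2")
		time.Sleep(100 * time.Millisecond) // give B time to grab vol2 first
		vol2.Lock()                        // blocks forever
	}()
	go func() {
		vol2.Lock()
		fmt.Println("B: locked vol2, waiting for vol1")
		time.Sleep(100 * time.Millisecond) // give A time to grab vol1 first
		vol1.Lock()                        // blocks forever
	}()

	// With every goroutine blocked, the Go runtime aborts the program with
	// "fatal error: all goroutines are asleep - deadlock!".
	select {}
}
```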
I know I'm using podman 4.6.0, but I don't believe any changes between 4.6.0 and the current 4.7.1 release would mitigate the issue.
Steps to reproduce the issue
Run two containers (`podman run`) which use the same named volume at the same time.

Describe the results you received
Neither container ends up running. With debug logging (and a bit more logging added by yours truly), one container's log ends at:
(full log is SideA.log in this gist)
The other container's log ends:
(full log is SideB.log in this gist)
Describe the results you expected
Both containers run to completion
podman info output
Podman in a container: No
Privileged Or Rootless: Rootless
Upstream Latest Release: No
Additional environment details: No response
Additional information: No response