Deadlock with Postgres on 2.0.1 #11463
Comments
Does this happen if you unmount the source of the clone and then re-mount it before spinning up the WAL writer threads on the clone itself?
This deadlock behavior is very similar to what I was seeing in #11476, also on CentOS 7.9, and also resolved by downgrading to 2.0.0.
@sempervictus The WAL directory is moved to ext4 before Postgres is started. So WAL writers are not in the picture, I think.
I did git bisect on this and I got:
Unfortunately, that one is quite large.
Testing methodology:
Interestingly, with the full 2.0.1 release I first see 2 CPUs spinning at 100%, and after some time a third joins them. But with this commit only one CPU is at 100%. The others never join.
Skimming over the commit the
courtesy of
postgresql-server-9.2 running on top of ZFS.
in /var/lib/pgsql/data/postgresql.conf if that even matters. Reproduced running
How do I build kernel modules to include enough debug information to decode hex offsets to line numbers? I've tried adding
@dkacar-oradian thanks for narrowing this down, I'll see if I can reproduce it on CentOS 7 with the provided reproducer. While the patch is fairly large and some code was refactored to deal with the 5.10 kernel change, what the code does should be identical for CentOS 7. Clearly something is a little different, though; we just need to identify it.
@sempervictus yes, there's more churn than I would have liked but it was necessary to support both 3.10 and 5.10 kernels. I'm surprised we didn't encounter this issue in the master branch before backporting the change. We manually ran a ton of regression tests as part of making this change. I considered the
I've opened PR #11484 with a proposed fix for this regression. Any additional testing would be welcome. Thus far I've only verified it resolves the provided Postgres test case on CentOS 7.9 with the 3.10.0-1160.11.1 kernel. @Do-while-Bacon I'd appreciate any help in verifying that it also resolves the docker container issue (it should). The patch applies cleanly to the 2.0.1 tag.
I've applied that patch on top of 2.0.1 and things work fine with starting Postgres on a clone (three in a row) and deleting them. Reboot also works fine.
As part of commit 1c2358c the custom uio_prefaultpages() code was removed in favor of using the generic kernel provided iov_iter_fault_in_readable() interface. Unfortunately, it turns out that up until the Linux 4.7 kernel the function would only ever fault in the first iovec of the iov_iter. The result being uiomove_iov() may hang waiting for the page.

This commit effectively restores the custom uio_prefaultpages() code for Linux 4.9 and earlier kernels which contain the troublesome version of iov_iter_fault_in_readable().

Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#11463
Closes openzfs#11484
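For anyone following along, here is a rough userspace sketch of the difference the commit message describes. It is not the OpenZFS or kernel code: the helper names (touch_pages, prefault_first_segment, prefault_all_segments) are made up for illustration, and "prefaulting" is modelled simply as reading one byte from each page of a buffer. The first function mimics a prefault that stops after the first iovec, the second walks every segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>    /* struct iovec */

/*
 * Read one byte from every page overlapped by [base, base + len).
 * Returns the number of pages touched. page_size must be a power of two.
 * Hypothetical helper, for illustration only.
 */
static size_t touch_pages(const void *base, size_t len, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    if (len == 0)
        return 0;

    uintptr_t addr = (uintptr_t)base;
    uintptr_t last = addr + len - 1;    /* last byte of the range */
    size_t touched = 0;

    for (;;) {
        (void)*(const volatile char *)addr;    /* forces the page in */
        touched++;

        /* Start of the next page after the one containing addr. */
        uintptr_t next = (addr & ~(uintptr_t)(page_size - 1)) + page_size;
        if (next > last || next < addr)        /* done, or wrapped */
            break;
        addr = next;
    }
    return touched;
}

/* Models the problematic behaviour: only the first segment is touched. */
static size_t prefault_first_segment(const struct iovec *iov, int iovcnt,
                                     size_t page_size)
{
    if (iovcnt <= 0)
        return 0;
    return touch_pages(iov[0].iov_base, iov[0].iov_len, page_size);
}

/* Models the restored behaviour: every segment is touched. */
static size_t prefault_all_segments(const struct iovec *iov, int iovcnt,
                                    size_t page_size)
{
    size_t touched = 0;

    for (int i = 0; i < iovcnt; i++)
        touched += touch_pages(iov[i].iov_base, iov[i].iov_len, page_size);
    return touched;
}
```

With a multi-segment request (e.g. a writev() with two buffers), the first variant leaves the second buffer untouched, which is the situation in which the commit message says uiomove_iov() may end up waiting on a page.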
The fix has been merged to the master branch and I've opened #11493 to make sure it gets applied for OpenZFS 2.0.2.
System information
Describe the problem you're observing
I have an easily reproducible deadlock on CentOS 7.9 with OpenZFS 2.0.1 (kmod kernel package from the official RPM repo) which doesn't happen on 2.0.0.
I have this in top:
At first there are only 2 CPUs spinning at 100%, but after some time a third joins in. It stays that way, and I have to power off the VM (running in VMware). I suppose those are deadlocked spinlocks in the kernel or something like that.
Debug log shows:
And nothing new appears.
Describe how to reproduce the problem
What I did was: create 2 Postgres slaves with ZFS 2.0.0, each on its own ZFS file system. Then I upgraded to 2.0.1, created a snapshot and a clone on one of them, and tried to start a primary Postgres instance on the clone. It applies a certain number of WAL files and then gets stuck like this. Reproducible every time, so far.
Include any warning/errors/backtraces from the system logs
However, this data isn't particularly useful for debugging the issue. When I had similar problems with BTRFS there were stack traces for stuck processes obtainable with dmesg, due to:
But there's absolutely nothing after the boot messages with ZFS.
So how do I find out relevant data to debug this?