zfs receive deadlocks when zstdcat piped to it #13571
Comments
You and #13309 should be friends, including the backstory in #13232. But briefly, Linux has a bug, they ignored a patch to fix it, and nobody particularly cares enough to try again because LKML tends to vomit fire and worse things at anyone who mentions ZFS around them, so nobody can have larger pipe sizes on Linux.
Wow, yes indeed. I spent a lot of time searching but didn't manage to turn up either of those, somehow. I am patching out the F_SETPIPE_SZ in 2.0.3 and will observe whether that appears to fix it. I'll let it bake for a few days and report back here.
The system in question is a slower x86_64 one, a Core i5-5200U, which is moderately popular as a sort of small PC in a fanless configuration. It is perfect for receiving ZFS backups, which is its primary purpose for me.
I am still at a loss as to why I never saw this bug when the pipeline was being kicked off by the shell, but did when it was being kicked off by Filespooler; perhaps, since it seems to be a race, the faster pace at which my Rust-based program was able to work through the queue may have had something to do with it.
Interestingly, I had inserted cat into the pipeline, which significantly reduced, but did not eliminate, the incidence of this. I at first thought maybe cat was reblocking, but after inspecting its source and strace output, I don't believe it was. Perhaps it had something to do with helping to avoid triggering the race.
I believe, if I understand the bug correctly, it only triggers if you F_SETPIPE_SZ when the writer has put nonzero but not a full unit's worth in yet, which is why the world isn't on fire screaming about this. You need to have either a very slow but nonzero or an otherwise very strange write pattern to hit it, which is why it doesn't come up in, say, the CI or most of my testbeds, but my poor little SPARC (440 MHz, 1c1t) and Raspberry Pis were not so fortunate.
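For readers who want to see the sequence of calls being described, here is a minimal Python sketch, assuming Linux. It writes a small amount (well under one page) into a pipe and then resizes the pipe from the read end while that data is still pending. It is not a reproducer for the deadlock, and on a fixed kernel it simply succeeds; the function name and the fallback constant values (1031 and 1032, the Linux values for these fcntl commands) are my own additions, not anything from ZFS.

```python
import fcntl
import os

# Linux-specific fcntl commands; older Python versions lack the named constants.
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)


def resize_pipe_with_pending_data(pending: bytes, requested: int):
    """Write a little data into a pipe, then resize it from the read end.

    `pending` should be small (well below the pipe capacity) so the write
    does not block. Returns (size_before, size_after, data_read_back).
    """
    read_fd, write_fd = os.pipe()
    try:
        # The writer has put in a nonzero amount, but far less than one page.
        os.write(write_fd, pending)
        size_before = fcntl.fcntl(read_fd, F_GETPIPE_SZ)
        try:
            # The reader asks for a larger buffer while data is pending.
            fcntl.fcntl(read_fd, F_SETPIPE_SZ, requested)
        except OSError:
            # The kernel may refuse (e.g. per-user limits); keep the old size.
            pass
        size_after = fcntl.fcntl(read_fd, F_GETPIPE_SZ)
        # Close the write end so the read below cannot block forever.
        os.close(write_fd)
        write_fd = -1
        data = os.read(read_fd, len(pending))
        return size_before, size_after, data
    finally:
        if write_fd >= 0:
            os.close(write_fd)
        os.close(read_fd)
```

Calling `resize_pipe_with_pending_data(b"x" * 100, 128 * 1024)` exercises the "nonzero but not a full unit" case; the kernel rounds the requested size up, so the reported size may be larger than what was asked for.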
This could very well explain why I never saw it before I switched to processing data with Filespooler. Previously, the pipeline was roughly
Now, Filespooler invokes
I suspect this increases the likelihood of the condition you described, because now the gpg/zstdcat pipeline will already have data ready to be read by the time zfs receive is invoked, rather than those two programs forking and initializing at about the same time as zfs receive.
Edit: Also I am very impressed at you running ZFS on a 440MHz SPARC!
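The ordering difference suspected here can be sketched in Python as follows. This is only an illustration under my own assumptions: the two child processes are trivial stand-ins (not gpg, zstdcat, or zfs receive), the function name is hypothetical, and the delay merely gives the producer time to put data into the pipe before the consumer exists.

```python
import subprocess
import sys
import time


def run_with_delayed_consumer(payload: str, delay: float = 0.2) -> str:
    """Start a producer, wait, then start a consumer reading its output."""
    # Stand-in for the upstream side of the pipeline: writes payload and exits.
    producer_cmd = [sys.executable, "-c",
                    "import sys; sys.stdout.write(sys.argv[1])", payload]
    # Stand-in for the downstream side: copies stdin to stdout.
    consumer_cmd = [sys.executable, "-c",
                    "import sys; sys.stdout.write(sys.stdin.read())"]

    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    # By the time the consumer starts, data is already sitting in the pipe.
    time.sleep(delay)
    consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout,
                                stdout=subprocess.PIPE, text=True)
    # Drop our copy of the read end so only the consumer holds it.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output
```

With a shell pipeline, all stages fork at roughly the same time; with this ordering, the consumer always finds pending data when it first touches its stdin, which is the situation described above.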
I have not experienced any deadlocks since I patched out F_SETPIPE_SZ. I think this is the proper resolution - thank you!
Any word on when this might be merged? Thanks!
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
At some point, the fix was merged; 2.2.2 no longer has this issue, and contains this code:
So, unless ZFS_SET_PIPE_MAX is given, the behavior will be correct. That is, it's correct by default. I can confirm deadlocks have gone away in 2.2.2. I don't know when this was merged; it was still there in 2.1.11.
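To illustrate the opt-in behavior described above, here is a hedged Python sketch of the general pattern: leave the pipe alone unless a variable is set. This is not the ZFS source (which is C and is not reproduced in this thread). Treating ZFS_SET_PIPE_MAX as an environment variable whose mere presence enables the resize is my assumption, as are the function name and return convention; /proc/sys/fs/pipe-max-size is the standard Linux location of the unprivileged pipe size limit.

```python
import fcntl
import os

# Linux-specific fcntl commands; older Python versions lack the named constants.
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)


def maybe_set_pipe_max(fd: int, env=None) -> bool:
    """Grow the pipe behind `fd` to the system maximum, but only on opt-in.

    Returns True if the opt-in variable was present and no error occurred,
    False if the pipe was left untouched or the resize failed.
    """
    env = os.environ if env is None else env
    # Default path: do nothing, so the pipe keeps its kernel-default size.
    if "ZFS_SET_PIPE_MAX" not in env:  # assumed semantics: presence enables
        return False
    try:
        with open("/proc/sys/fs/pipe-max-size") as f:
            max_size = int(f.read().strip())
        # Only ever grow the pipe, never shrink it.
        if max_size > fcntl.fcntl(fd, F_GETPIPE_SZ):
            fcntl.fcntl(fd, F_SETPIPE_SZ, max_size)
    except (OSError, ValueError):
        return False
    return True
```

The point of the design is that the risky call is skipped entirely by default, so users on affected kernels are safe without having to patch anything.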
It's also a little moot because torvalds/linux@e95aada got merged, which should theoretically make that obsolete, eventually.
PS: a30927f was the commit in master, and 2.1.8 had e84a2ed. So you shouldn't be able to hit that on 2.1.11...
System information
Describe the problem you're observing
When piping data to zfs receive from zstdcat, there is an issue that manifests itself approximately 1/1000 of the time in which the pipeline deadlocks. Additionally, attaching to the zfs receive process with strace -p causes the zfs receive process to exit with "cannot receive incremental stream: incomplete stream" a few seconds later.
Describe how to reproduce the problem
I wrote a blog post going into detail about the situation and my investigation into it.
I note that it appears to me that the zfs process is not reading from stdin itself, but rather is delegating this work to kernel_read within the kernel. I believe that zfs send is (was?) doing the same; for instance, #11445 described an issue with zfs send not working piped to /dev/null, and #13133 for using a wrapper thread for zfs send (at least when things aren't being sent to a pipe).
I wonder if there is something about how libzfs_set_pipe_max, calling fcntl with F_SETPIPE_SZ, interacts with the kernel code.
Include any warning/errors/backtraces from the system logs
I checked and there are none.