ZVOL ops hanging #6888
With #6294, this is still happening, albeit with a new stack trace:
I'm disabling SPL dynamic taskq threads (setting the parameter to 0) and deploying another from-master build. Annoying. Storage virtualization with block devices is such a useful function, and it's been a rolling nightmare on ZoL since 0.6.4.
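For reference, roughly how I'm turning the dynamic threads off; a minimal sketch, assuming the spl_taskq_thread_dynamic module parameter is the right knob and is writable at runtime on this build:

```sh
# Check the current value, then disable dynamic taskq threads at runtime.
cat /sys/module/spl/parameters/spl_taskq_thread_dynamic
echo 0 > /sys/module/spl/parameters/spl_taskq_thread_dynamic

# Persist it across reboots via modprobe options.
echo "options spl spl_taskq_thread_dynamic=0" > /etc/modprobe.d/spl.conf
```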
Adding insult to injury, ZFS appears able to perform this same operation to a file in the ZPL. ZVOLs are definitely broken if the ZPL, a more complex construct, can execute this operation with all the attributes and structures it carries that don't exist in the bitwise blockdev world. Ping @behlendorf: are there any plans to fix the zvol nightmare? It's been in varying states of "broken" to "super-broken" for well over a year. IIRC @ryao was working on this for an employer, but I've not seen hide nor hair of him since that discussion, making me think that code won't be implemented, or published if it is. Maybe it makes more sense to write a Linux-specific implementation from the ground up, given the threading and block-access differences, instead of the "adapt the Illumos code as much as we can" approach. We used to get >1GB/s on iSCSI before 0.6.4.8 or thereabouts (the zvol rewrite removing half of the blockdev pipe); now we can't even use a zvol locally as a dd target.
Searching for "zvol" in the issue tracker produces pages of results; this has clearly been a major problem area for some time. Just a few recent examples of seemingly the same issue (ops blocking for over two minutes): #6890, #6736, #6330, #6265, and pages more going back to 2013. I figure the LLNL use case is Lustre, which accesses the DMU directly, so there's no direct priority for this issue. However, any company providing services on ZoL is not likely to publish the code that gives them their ops differentiator for the service they provide (which is the sort of thing I believe ryao was working on when last we spoke). As a result, we might be a bit stuck unless someone like Datto comes around, who seem more than willing to open-source their internal magic for interop and sanity in the ecosystem, or one of us finds the time to learn the Linux block IO subsystem well enough to implement these virtual block devices in an optimal and safe way for this OS.
@sempervictus I am still around, although very backlogged. I have not hit this particular issue at work. Is it possible to get stack traces from all threads when this happens?
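If it helps, one way to capture stacks from every thread when the hang occurs (assuming sysrq is enabled on the box) is roughly:

```sh
# Enable all sysrq functions, then dump stacks for every task into the
# kernel log and save them.
echo 1 > /proc/sys/kernel/sysrq
echo t > /proc/sysrq-trigger
dmesg > /tmp/all-thread-stacks.txt

# Blocked (D-state) tasks only, usually the interesting ones for hung IO:
echo w > /proc/sysrq-trigger
```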
Interesting; are you sure it's not hardware related? I've hit this, but only after a disk failed and after tweaking NCQ. Increasing the queue depth would wake it up. I assumed it was a bug in the kernel somewhere.
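For context, the NCQ tweak was along these lines (a rough sketch; the device name is just a placeholder):

```sh
# Check and raise the NCQ queue depth on the affected SATA disk.
cat /sys/block/sda/device/queue_depth
echo 31 > /sys/block/sda/device/queue_depth
```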
@ryao: so this is still happening using your 0.7.3 tree:
^^ Pages upon pages of that when sending from one pool to another via zfs send -Rvec and zfs recv -vFu. I see this all over the place when sending anything with zvols. Afterwards ZFS starts "hiccuping" and transfer rates drop from 200MB/s to 50-100MB/s every few seconds, with nothing in between.
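The transfer in question is essentially this shape (pool, dataset, and snapshot names are placeholders):

```sh
# Replicated, verbose, embedded-block, compressed send piped into a verbose,
# forced, no-mount receive on the destination pool.
zfs send -R -v -e -c sourcepool/tank@migrate | zfs recv -v -F -u destpool/tank
```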
@sempervictus by "setting the volmode parameter system-wide has no effect", did you mean it has no effect in this context (fixing this particular issue), or that the volmode parameter doesn't work as it is supposed to? I'm assuming the "system-wide" option is the zvol_volmode module parameter; if you found a bug using the volmode property, that's worth reporting separately.
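For clarity, the two knobs I have in mind are roughly these (dataset name is a placeholder, and this assumes a build that exposes zvol_volmode):

```sh
# Per-dataset property (values on Linux: default, full, dev, none).
zfs set volmode=dev pool/vol
zfs get volmode pool/vol

# System-wide default, used when the property is left at "default"
# (1 = full, 2 = dev, 3 = none).
echo 2 > /sys/module/zfs/parameters/zvol_volmode
```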
@sempervictus could you post the contents of …?
@sempervictus the stack from #6888 (comment) shows that, for some reason, there are I/Os which aren't getting completed. When this happens to I/Os issued by the …
I've been loosely following this and other related threads, and I suppose it might be time to mention the little patch in dweeezil/zfs@b8f3110fb. That patch, along with a much shorter setting for …
@behlendorf: the faulty-hardware theory is a bit tough to swallow; the same ops work just fine on file targets. There's something in the zvol IO pipeline causing this.
@dweeezil: no dice. I built from master with that commit (it needed a minor merge fix, since the line above the delta changed after you made it) and got:
while doing a dd to a zvol on a raidz2 pool with a mirrored SSD SLOG.
It won't let me set one under ZFS these days (so none). We also run all vdevs on dm-crypt, adding a layer of indirection via the DM tier. I've wondered about this too: why did ZoL force us to stop using the native scheduling facilities? Older versions were much faster and more stable in executing actual IO; they did allow setting one's own scheduler, and they completed the IOs issued to zvols :-). bfq was actually a life saver for that. Assuming that the entire IO pipeline can be handled by ZFS is a bit silly if there are Linux-native IOs to service as a result (DM or what not).
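To illustrate what I mean about the native scheduling facilities (zd0 is a placeholder for the zvol's block device):

```sh
# On current ZoL the zvol request queue bypasses the elevator, so this
# reports "none" and attempts to change it are rejected.
cat /sys/block/zd0/queue/scheduler
echo bfq > /sys/block/zd0/queue/scheduler   # used to work on older releases
```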
Yeah, I don't think I buy it either. Let me see about putting together a patch to properly extend the deadman logic so we can get better debugging information about the hung I/Os.
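In the meantime, the existing deadman tunables can at least surface stalled I/Os sooner; a rough sketch, assuming the stock 0.7-era parameter names:

```sh
# Ensure the deadman is enabled and lower its trigger threshold (ms) so
# stalled zios get reported sooner.
echo 1     > /sys/module/zfs/parameters/zfs_deadman_enabled
echo 60000 > /sys/module/zfs/parameters/zfs_deadman_synctime_ms

# Deadman reports show up in the internal debug log and the kernel log.
tail -n 50 /proc/spl/kstat/zfs/dbgmsg
dmesg | grep -i deadman
```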
@sempervictus does the problem persist when using direct I/O on the target zvol? For example, executing …
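Presumably something along these lines, using O_DIRECT writes to the zvol via dd (device and dataset names are placeholders):

```sh
# Bypass the page cache on the zvol side with O_DIRECT.
dd if=/dev/sdX of=/dev/zvol/raidz2/vol bs=64M oflag=direct status=progress
```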
Unknown; direct requests might add up to the volume needed to cause this on NVMe, but SATA/SAS don't seem to do it. I don't actually have a good reproducer right now either, though I guess I could try a ramdisk-to-ramdisk dd.
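If I get to it, the ramdisk-to-ramdisk attempt would look roughly like this (sizes and names are arbitrary, and this assumes the brd ramdisk driver is available):

```sh
# Two ram-backed block devices (rd_size is in KiB), a throwaway pool, and a zvol.
modprobe brd rd_nr=2 rd_size=$((4 * 1024 * 1024))   # two 4 GiB ramdisks
zpool create ramtest /dev/ram0
zfs create -V 2G ramtest/vol

# Large sequential writes from the second ramdisk into the zvol.
dd if=/dev/ram1 of=/dev/zvol/ramtest/vol bs=64M count=30 oflag=direct status=progress

# Clean up.
zpool destroy ramtest
rmmod brd
```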
Is this still an issue?
Haven't seen it in about six months, but we greatly reduced snapshot frequency across the board and moved to kernel 4.14 as our base.
Alright, well then let's close it out for now. If we're able to reproduce it, we can always open a new issue.
System information
Describe the problem you're observing
While running a dd if=ssd of=raidz2/zvol bs=64M I'm getting consistent crashes and hangs which look like:
This prevents the operation from completing and requires a full system reboot. I pulled all patches and changes, built straight from master, and am still seeing the issue.
This is a very common stack trace (or similar to many others I've seen over the years), and I'd love to never see it again... :)
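For completeness, the reproduction is essentially the following (device and dataset names abbreviated from my setup):

```sh
# Large sequential copy from an SSD block device onto a zvol that lives on a
# raidz2 pool with a mirrored SSD SLOG.
dd if=/dev/ssd of=/dev/zvol/raidz2/zvol bs=64M status=progress
```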