ZFS Swap Lockup #1274
Setting sync=always on the zvol being used for swap might help. Without it, you could have the ZIL keeping stuff in memory longer than is necessary.
The following options were set:
I can reliably reproduce this on my openSUSE box with rc14.
@devZer0 Do you think you can either attach a serial console to the machine or reproduce it in a VM? We really need to get a backtrace from the system when it locks up to see what has happened.
I have it running in Oracle VirtualBox. I hope this is what you want; if not, tell me what to do.
[ 62.132611] Adding 1048572k swap on /dev/zd448. Priority:-2 extents:1 across:1048572k SS
I found that I had compression and dedup on, but disabling them makes no difference; the problem still happens.
After a brief chat with @rlaager in IRC, I have revised my swap recommendation to be:
The actual volume size should vary depending on how much RAM you have. Using 4K blocks avoids read-modify-write overhead. Would someone suffering from this problem try using a swap zvol with the attributes that I recommend and see if the problem persists?
@devZer0 You may need to recreate your pool, or create a new pool for swap, if you've had dedup enabled in the past. The original dedup table likely still remains even if new blocks aren't being deduped, and that could be compounding the issue.
Good point. To make sure that no compression or dedup was involved, I recreated the pool, and the issue is still reproducible: eatmem freezes the system as soon as the process reaches the physical memory limit, whereas eatmem gets killed by the OOM killer at ~2100 MB when there is no swap mounted on a zvol.
I've been experiencing this too, using a swap file on a zvol. I'm using latest git as of today. Before I found this bug, I was using a similar program to quickly chew up memory. When the swap file on the zvol was enabled, the system would immediately hang (sometimes it would reboot) as soon as memory was consumed and swapping started. The machine has 24 GB of memory. With swap off, the OOM killer kicks in, or malloc fails and the process ends. With a swap file on a non-ZFS disk, the swap fills up fine, then the OOM killer kicks in. I never had compression or dedup on, so that was not an issue.
Initially, I was using sync=standard, primarycache=none, and secondarycache=none. Changing sync to always as recommended above, and primarycache to metadata, did not help. I noticed that after about 14k bytes were written to swap, the load on the box skyrocketed to 60+ (i7-3770K CPU), swap stopped filling up, and the computer became completely unresponsive. This is very reproducible, and I can get more info as needed or custom-compile any code. Using netconsole, I've captured the hangs (4 parts and long). zvol info:
After the load jumped, I dumped the blocked tasks:
Looks like my prior message got truncated. Here's after the kernel hung task monitor kicked in:
Finally, I dumped the threads:
These dumps are getting truncated (I'm using the triple-backtick code tag?). Anyway, I can email the full dumps out if anyone wants them.
This problem was easy to replicate in a bare Debian VM with a 3.4.41 kernel (3.8/3.2 kernels exhibit the same problem, but I'm sticking with the 3.4 kernel to be consistent). I set the zfs module options zvol_threads=1 zfs_zevent_console=1, set up a small 500 MB zvol swap, and gave the VM a single CPU. Running the mem program to chew up memory results in immediate hangs.
Here's the SYSRQ-W:
A deadlock was accidentally introduced by commit e95853a which can occur when the system is under memory pressure. What happens is that while the txg_quiesce thread is holding the tx->tx_cpu locks it enters memory reclaim. In the context of this memory reclaim it then issues synchronous I/O to a ZVOL swap device. Because the txg_quiesce thread is holding the tx->tx_cpu locks a new txg cannot be opened to handle the I/O. Deadlock. The fix is straightforward. Move the memory allocation outside the critical region where the tx->tx_cpu locks are held. And for good measure change the offending allocation to KM_PUSHPAGE to ensure it never attempts to issue I/O during reclaim. Signed-off-by: Brian Behlendorf <[email protected]> Issue openzfs#1274
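For readers unfamiliar with the pattern the commit message describes, here is a minimal sketch, not the actual ZFS patch; the lock name and helper function are hypothetical. The point is that the allocation happens before the per-CPU lock is taken, and KM_PUSHPAGE keeps the allocator from issuing filesystem I/O during reclaim.

```c
/*
 * Illustrative sketch only -- not the real txg_quiesce code.
 * quiesce_example() and tx_cpu_lock are made-up names.
 */
#include <sys/kmem.h>	/* SPL: kmem_alloc(), kmem_free(), KM_PUSHPAGE */
#include <sys/mutex.h>	/* SPL: kmutex_t, mutex_enter(), mutex_exit() */

static void
quiesce_example(kmutex_t *tx_cpu_lock, size_t size)
{
	/* Allocate outside the critical region with KM_PUSHPAGE so the
	 * allocation can never block on I/O to the swap zvol. */
	void *buf = kmem_alloc(size, KM_PUSHPAGE);

	mutex_enter(tx_cpu_lock);
	/* ... critical region: no allocations that can enter reclaim ... */
	mutex_exit(tx_cpu_lock);

	kmem_free(buf, size);
}
```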
Looks like some progress. I tried the patch, and it got past that deadlock, but it looks like it hit another deadlock in txg_sync as soon as swapping started. Here's the sysrq-w:
@mgmartin I'm able to reproduce the second issue, but only when
Thanks for the info. I set zvol_threads back to the default of 32. It gets a little further: as I slowly fill the swap, a few iterations in, another lockup occurs (maybe a known issue too).
With openzfs/spl#474, I am no longer able to cause deadlocks with swap on zvols on recent kernels.
Illumos does not have direct reclaim and code run inside taskq worker threads is not designed to deal with it. Allowing direct reclaim inside a worker thread can therefore deadlock. We set PF_MEMALLOC_NOIO through memalloc_noio_save() to indicate to the kernel's reclaim code that we are inside a context where memory allocations cannot be allowed to block on filesystem activity. Signed-off-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue openzfs/zfs#1274 Issue openzfs/zfs#2390 Closes #474
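As a rough illustration of the PF_MEMALLOC_NOIO approach the commit message describes, here is a sketch only, not the actual SPL change; taskq_thread_body() is a hypothetical stand-in for the real worker function.

```c
/*
 * Sketch of the PF_MEMALLOC_NOIO pattern. On 3.x kernels
 * memalloc_noio_save()/restore() live in <linux/sched.h>
 * (newer kernels moved them to <linux/sched/mm.h>).
 */
#include <linux/sched.h>

static void
taskq_thread_body(void *arg)
{
	unsigned int noio_flags;

	(void)arg;

	/* Tell the reclaim code that allocations made in this context
	 * must not block on filesystem I/O (they are treated as GFP_NOIO). */
	noio_flags = memalloc_noio_save();

	/* ... execute the queued work item ... */

	memalloc_noio_restore(noio_flags);
}
```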
Closing as stale; refer to the swap section of the wiki for additional configuration information.
On 04.02.2013 at 07:03, Darik Horn wrote:
```
datakanja@multi-os-host:~$ zfs get all SSD/swap
NAME PROPERTY VALUE SOURCE
SSD/swap type volume -
SSD/swap creation Mo Feb 4 16:11 2013 -
SSD/swap used 17,0G -
SSD/swap available 17,2G -
SSD/swap referenced 20K -
SSD/swap compressratio 1.00x -
SSD/swap reservation none default
SSD/swap volsize 16G local
SSD/swap volblocksize 4K -
SSD/swap checksum on default
SSD/swap compression off default
SSD/swap readonly off default
SSD/swap copies 1 default
SSD/swap refreservation 17,0G local
SSD/swap primarycache none local
SSD/swap secondarycache all default
SSD/swap usedbysnapshots 0 -
SSD/swap usedbydataset 20K -
SSD/swap usedbychildren 0 -
SSD/swap usedbyrefreservation 17,0G -
SSD/swap logbias latency default
SSD/swap dedup off default
SSD/swap mlslabel none default
SSD/swap sync always local
SSD/swap refcompressratio 1.00x -
SSD/swap written 20K -
SSD/swap com.sun:auto-snapshot false local
```
But even here, as soon as the regular swap partition ran out of space and swapping to ZFS kicked in, the machine locked up again.
This is the test code to eat the RAM:
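The original program was not preserved in this extract of the thread; a minimal memory-eater sketch, assuming it simply allocates and touches memory until the OOM killer (or a hang) intervenes, could look like this:

```c
/*
 * Hypothetical reconstruction of an "eatmem"-style test program:
 * allocate memory in 1 MiB chunks and touch every byte so the pages
 * are actually backed, until malloc fails or the system intervenes.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK (1024 * 1024)	/* 1 MiB per allocation */

int main(void)
{
	size_t mib = 0;

	for (;;) {
		char *p = malloc(CHUNK);
		if (p == NULL) {
			fprintf(stderr, "malloc failed after %zu MiB\n", mib);
			return 1;
		}
		memset(p, 0xA5, CHUNK);	/* force the pages to be resident */
		mib++;
		printf("allocated %zu MiB\r", mib);
		fflush(stdout);
	}
}
```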