This repository has been archived by the owner on Feb 26, 2020. It is now read-only.

Disable direct reclaim in taskq worker threads on Linux 3.9+ #474

Closed

ryao
Contributor

@ryao ryao commented Sep 7, 2015

Illumos does not have direct reclaim and code run inside taskq worker
threads is not designed to deal with it. Allowing direct reclaim inside
a worker thread can therefore deadlock. We set PF_MEMALLOC_NOIO through
memalloc_noio_save() to indicate to the kernel's reclaim code that we
are inside a context where memory allocations cannot be allowed to block
on filesystem activity.

Signed-off-by: Richard Yao [email protected]
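
The save/restore pattern described above can be sketched as a small userspace model. This is not the kernel code and not the patch itself: `model_task`, `model_noio_save()`, `model_noio_restore()`, `model_taskq_run()` and `model_record_flags()` are hypothetical names, and the bit value is a placeholder rather than the kernel's real `PF_MEMALLOC_NOIO`. Wrapping each dispatched function is also an assumption made for illustration; the patch may set the flag at a different point in the worker thread.

```c
#include <assert.h>

/* Placeholder for the kernel's PF_MEMALLOC_NOIO bit (value is illustrative). */
#define MODEL_PF_MEMALLOC_NOIO 0x1u

/* Hypothetical stand-in for the per-task flags word (current->flags). */
struct model_task {
    unsigned int flags;
};

/* Set the NOIO bit and return its previous state so it can be restored. */
static unsigned int model_noio_save(struct model_task *t)
{
    unsigned int old = t->flags & MODEL_PF_MEMALLOC_NOIO;

    t->flags |= MODEL_PF_MEMALLOC_NOIO;
    return old;
}

/* Put the NOIO bit back to the saved state; other bits are left untouched. */
static void model_noio_restore(struct model_task *t, unsigned int saved)
{
    /* Only the NOIO bit may be carried in the saved value. */
    assert((saved & ~MODEL_PF_MEMALLOC_NOIO) == 0);
    t->flags = (t->flags & ~MODEL_PF_MEMALLOC_NOIO) | saved;
}

typedef void (*model_task_func_t)(struct model_task *self, void *arg);

/* Hypothetical worker step: the dispatched function runs with NOIO set. */
static void model_taskq_run(struct model_task *self, model_task_func_t fn,
    void *arg)
{
    unsigned int saved = model_noio_save(self);

    fn(self, arg);
    model_noio_restore(self, saved);
}

/* Example dispatched function: records the flags it observed into *arg. */
static void model_record_flags(struct model_task *self, void *arg)
{
    *(unsigned int *)arg = self->flags;
}
```

Returning the previous bit from the save step keeps nested sections safe: an inner restore cannot clear a flag that an outer caller had already set.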

@ryao
Contributor Author

ryao commented Sep 7, 2015

@behlendorf I realize that you want to try a more surgical approach. I am opening this to give myself an easy way to supply a patch to people who need swap working right now, and to have a place to track progress on this.

That said, I have revised this since the earlier iteration, which set PF_FSTRANS. I realized that, in addition to setting PF_MEMALLOC_NOIO on Linux 3.9+, PF_FSTRANS changes the kmem allocator so that KM_SLEEP becomes KM_PUSHPAGE. All that we should need (in theory) is PF_MEMALLOC_NOIO. Anything that is KM_SLEEP in a taskq on Illumos should be safe in a taskq on Linux.
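
As a rough illustration of why the task flag alone may be enough, the sketch below models how a per-task NOIO flag can narrow an allocation's permissions at allocation time while leaving the caller's request (and its ability to sleep) alone. It is a userspace model under stated assumptions: every `MODEL_*` name and `model_effective_gfp()` are hypothetical, the bit values are placeholders, and clearing the filesystem bit along with the I/O bit is an assumption that may not match every kernel version.

```c
#include <assert.h>

/* Placeholder task flag and allocation-permission bits (illustrative values). */
#define MODEL_PF_MEMALLOC_NOIO 0x1u
#define MODEL_GFP_WAIT 0x1u /* allocation may sleep */
#define MODEL_GFP_IO   0x2u /* reclaim may start physical I/O */
#define MODEL_GFP_FS   0x4u /* reclaim may call into the filesystem */
#define MODEL_GFP_KERNEL (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)

/*
 * Return the permissions an allocation effectively gets. When the task
 * carries the NOIO flag, the I/O and filesystem bits are stripped; the
 * request may still sleep.
 */
static unsigned int model_effective_gfp(unsigned int task_flags,
    unsigned int gfp)
{
    if (task_flags & MODEL_PF_MEMALLOC_NOIO)
        gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
    return gfp;
}

/* Small self-check showing both cases. */
static void model_selfcheck(void)
{
    /* Without the task flag the request is unchanged. */
    assert(model_effective_gfp(0, MODEL_GFP_KERNEL) == MODEL_GFP_KERNEL);
    /* With it, only the permission to sleep remains. */
    assert(model_effective_gfp(MODEL_PF_MEMALLOC_NOIO, MODEL_GFP_KERNEL) ==
        MODEL_GFP_WAIT);
}
```

In this model a sleeping allocation stays a sleeping allocation; only what reclaim may do on its behalf is restricted, which is the property the comment above relies on.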

@kernelOfTruth

@ryao Great!

I had swap support on my mind the last few days and wanted to ask about its state. Glad it'll work with this 👍

Thanks

@behlendorf
Contributor

@ryao I don't think this is an unreasonable approach since it specifically targets IO during direct reclaim and leaves all the other reclaim strategies intact. It would be great to get confirmation that this resolves the outstanding swap issues.

@behlendorf behlendorf added this to the 0.7.0 milestone Sep 8, 2015
@ryao
Contributor Author

ryao commented Sep 8, 2015

@behlendorf I will try to do some stress tests on my laptop today and let you know.

@ryao
Contributor Author

ryao commented Sep 9, 2015

Preliminary testing of swap on a zvol on my laptop (Linux 4.1.3) with this patch applied to HEAD strongly suggests that this prevents deadlocks with swap on zvols.

The methodology is as follows:

  1. Set up a zvol with volblocksize=4k primarycache=metadata secondarycache=none sync=always and logbias=throughput as swap so that paged out memory is evicted quickly.
  2. Start 8 instances of python -c 'print 2**10**100' & to gobble up system memory. The number 8 is the number of virtual cores shown in /proc/cpuinfo.
  3. Wait for system memory to be exhausted such that page out occurs and page faults effectively lock up Xorg.
  4. Attempt to regain control of the system via the Magic SysRq key by killing processes, either through manual OOM execution or by killing all processes.
  5. Verify that things still work without rebooting.

If we deadlock, step 3 is not possible because the ZIO pipeline has stalled and the only solution is to force a reboot. Xorg and other things lock up because the python processes starve interactive processes of unpaged memory by consuming memory faster than pageout and accessing it all in a loop.

Prior to this change, I had never conducted this test without step 3 deadlocking almost immediately.

@behlendorf
Contributor

No unexpected issues were observed during testing. Because this is a minimal change designed to improve the robustness of the system, it has been merged as:

d4bf6d8 Disable direct reclaim in taskq worker threads on Linux 3.9+
