Handle kthread_create() failures properly in taskq_create(). #340

dweeezil · 2014-03-26T14:38:26Z

The return value should be checked with IS_ERR() to deal with
possible -ENOMEM or other errors.

behlendorf · 2014-03-26T20:54:36Z

Thanks. This looks good.

tuxoko · 2014-04-02T17:00:12Z

Hi this won't solve the issue.
See my comment in openzfs/zfs#2230

dweeezil · 2014-04-02T18:07:44Z

I would suggest that whomever is seeing this problem add a printk() to see the value coming back from kthread_create(). The -ENOMEM return from kthread_create_on_node() can, in fact, mean out of memory or it can also mean SIGKILL.

In either case, the somewhat larger problem is that there are places throughout the ZFS code base that don't properly handle a failure of taskq_create().

It sounds like additional debugging is in order from someone who can reproduce the problem. I suspect we'd like to see the return value from kthread_create() and also maybe the TIF flags.

tuxoko · 2014-04-03T03:57:34Z

For short term hack, we could probably temporary mask the SIGKILL in pending signal and retry.
Though I'm not sure if this is possible or has any side effect.

dweeezil · 2014-04-06T17:24:20Z

I was finally able to create a set rig to reliably reproduce the interrupted kthread_create() and have pushed dweeezil/spl@0d36f37 to handle it. My only tests so far have been to interrupt pool import but I will be trying to test other potential failure modes as well.

dweeezil · 2014-04-07T03:01:59Z

Thanks for looking at this @tuxoko. I just pushed dweeezil/spl@5ac44ed, and as I mentioned will try to test some more of these cases.

I'm sure you noticed by now that openzfs/zfs#2243 is likely cause by this as well. I'm afraid we're going to get a flood of these now that the big distros are starting to roll out 3.13.

tuxoko · 2014-04-07T07:18:48Z

I wrote a reliable script to test SIGKILL on import. Note that the sleep time may not apply to everyone.

#!/bin/sh
POOL=p1
T=1.2
zpool import $POOL -R /mnt &
sleep $T
kill -9 $!

@dweeezil Your patch seems OK to me.
Though I have to wonder if we should re-enable the signal after kthread_create.
I added set_thread_flag(TIF_SIGPENDING); after successful retry.
Running with this test script, the pool imported fine, but the filesystems are not mounted. Which is expected since the zpool command is killed.

Anyway, I just want to ask if we should re-enable the signal or not.

dweeezil · 2014-04-07T15:22:50Z

@tuxoko I was wondering the same thing but noticed that without re-enabling it, my test program (zpool import) was, in fact terminated. FYI, my testing rig uses an IOPS-limited KVM guest to give me more windows in which to send signals. The system was otherwise too fast and/or my test pools too simply for a script like you outlined to work.

behlendorf · 2014-04-07T17:06:41Z

@dweeezil @tuxoko Thank you for working on this, I suspect more people are starting to see this so I really appreciate you guys running it to ground.

The proposed approach to catch and handle the signal looks reasonable to me. But I think we should move this logic in to a spl_kthread_create() function which preserves the previous kthread_create behavior. Then we can update all callers to use the new interface and ensure we fix all instances of this. All but one caller are in the SPL and the single ZFS caller is in zpios which appears to handle the failure cleanly. This buys us a few things.

A stable interface. We can update spl_kthread_create() as needed to maintain the behavior we expect.
The signal and ENOMEM handling logic can be written once so other callers won't need to worry about getting this right.
We can clearly document in a block comment why this special case handling is needed. What changed in linux-3.13 and how we have to handle it.

For example (untested, please rework as needed),

#define spl_kthread_create(func, data, fmt, ...)                                   \
{                                                                       \
        struct task_struct *__tsk;                                      \
                                                                        \
        while (1) {                                                     \
                __tsk = kthread_create(func, data, fmt, ## __VA_ARGS__); \
                if (IS_ERR(__tsk)) {                                    \
                        if (signal_pending(current)) {                  \
                                clear_thread_flag(TIF_SIGPENDING);      \
                                continue;                               \
                        }                                               \
                                                                        \
                        if (PTR_ERR(tsk) == -ENOMEM)                    \
                                continue;                               \
                                                                        \
                        return (NULL);                                  \
                } else {                                                \
                        return (__tsk);                                 \
                }                                                       \
        }                                                               \
}

dweeezil · 2014-04-07T17:31:11Z

@behlendorf Sounds like a good idea. I'll re-work it as a kthread_create() wrapper.

dweeezil · 2014-04-07T18:41:44Z

I just pushed dweeezil/spl@f3dee6f which provides the spl_kthread_create() wrapper function. I gave it light testing on a stock 3.13 kernel and it seems to properly handle interrupted imports.

The other consumers of kthread_create() have been updated to use the new interface.

EDIT: Tweaked the comments.

behlendorf · 2014-04-07T19:34:33Z

This looks good to me. I've queued it up for testing.

behlendorf · 2014-04-07T23:10:28Z

@dweeezil @tuxoko The testing for your latest patch looks good. If you're both happy with it I'd like to merge it tomorrow, this issue is starting to impact more people.

Provide spl_kthread_create() as a wrapper to the kernel's kthread_create() to provide pre-3.13 semantics. Re-try if the call is interrupted or if it would have returned -ENOMEM. Otherwise return NULL.

dweeezil · 2014-04-08T03:35:08Z

@tuxoko Thanks for looking at this. I've re-pushed dweeezil/spl@4a0984b which uses kthread_create() rather than kthread_create_on_node() which appears to have been added between 2.8.38 and 2.8.39. I was able to build install and use under CentOS 6.4 with its 2.6.32 kernel.

gitboy · 2014-04-08T06:58:46Z

@dweezil Hi Tim could you please give me a high level idea of what symptoms this should address? I'm attempting to run on a box that will have sustained CPU and IO load, is that going to be safe/wise?

dweeezil · 2014-04-08T12:03:28Z

@gitboy This patch should cause taskq_create() to never return a NULL. The symptoms come about because the ZFS code does not properly handle a NULL pointer return in most cases so the usual result is a "NULL pointer dereference". Sometimes, however, the symptom can be that a process spins in the kernel and you'll see the message indicating a task has run for too long. The failures typically occur either when a pool is first imported or when a ZVOL is opened subsequent to pool import as those are the two most common times at which ZFS creates new taskqs.

I have also created fixes for most of the places within the main ZFS code where these failures aren't handled and will be working up a separate patch for those, however, with this fix, they should be no more likely to occur than they did pre-3.13.

fling- · 2014-04-09T03:10:12Z

When will the patch be merged? I want to solve my issue openzfs/zfs#2238

dweeezil · 2014-04-09T05:41:18Z

@fling- It was merged today as 17a527c.

behlendorf · 2014-04-10T03:16:53Z

Closing issue, the patch has been merged to master.

dweeezil mentioned this pull request Apr 4, 2014

NULL pointer dereference hardlock openzfs/zfs#2238

Closed

dweeezil mentioned this pull request Apr 7, 2014

zfs deadlock openzfs/zfs#2243

Closed

behlendorf mentioned this pull request Apr 7, 2014

High CPU Load caused machine to stop. openzfs/zfs#2208

Closed

prometheanfire mentioned this pull request Apr 7, 2014

looping in zfs_unlinked_drain? openzfs/zfs#2240

Closed

Support post-3.13 kthread_create() semantics.

4a0984b

Provide spl_kthread_create() as a wrapper to the kernel's kthread_create() to provide pre-3.13 semantics. Re-try if the call is interrupted or if it would have returned -ENOMEM. Otherwise return NULL.

behlendorf closed this Apr 10, 2014

behlendorf added this to the 0.6.3 milestone Apr 10, 2014

tuxoko mentioned this pull request Apr 11, 2014

kthread name corrupted #347

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Handle kthread_create() failures properly in taskq_create(). #340

Handle kthread_create() failures properly in taskq_create(). #340

dweeezil commented Mar 26, 2014

behlendorf commented Mar 26, 2014

tuxoko commented Apr 2, 2014

dweeezil commented Apr 2, 2014

tuxoko commented Apr 3, 2014

dweeezil commented Apr 6, 2014

dweeezil commented Apr 7, 2014

tuxoko commented Apr 7, 2014

dweeezil commented Apr 7, 2014

behlendorf commented Apr 7, 2014

dweeezil commented Apr 7, 2014

dweeezil commented Apr 7, 2014

behlendorf commented Apr 7, 2014

behlendorf commented Apr 7, 2014

dweeezil commented Apr 8, 2014

gitboy commented Apr 8, 2014

dweeezil commented Apr 8, 2014

fling- commented Apr 9, 2014

dweeezil commented Apr 9, 2014

behlendorf commented Apr 10, 2014

Handle kthread_create() failures properly in taskq_create(). #340

Handle kthread_create() failures properly in taskq_create(). #340

Conversation

dweeezil commented Mar 26, 2014

behlendorf commented Mar 26, 2014

tuxoko commented Apr 2, 2014

dweeezil commented Apr 2, 2014

tuxoko commented Apr 3, 2014

dweeezil commented Apr 6, 2014

dweeezil commented Apr 7, 2014

tuxoko commented Apr 7, 2014

dweeezil commented Apr 7, 2014

behlendorf commented Apr 7, 2014

dweeezil commented Apr 7, 2014

dweeezil commented Apr 7, 2014

behlendorf commented Apr 7, 2014

behlendorf commented Apr 7, 2014

dweeezil commented Apr 8, 2014

gitboy commented Apr 8, 2014

dweeezil commented Apr 8, 2014

fling- commented Apr 9, 2014

dweeezil commented Apr 9, 2014

behlendorf commented Apr 10, 2014