Skip to content
This repository has been archived by the owner on Feb 26, 2020. It is now read-only.

Handle kthread_create() failures properly in taskq_create(). #340

Closed
wants to merge 1 commit into from

Conversation

dweeezil
Copy link
Contributor

The return value should be checked with IS_ERR() to deal with
possible -ENOMEM or other errors.

@behlendorf
Copy link
Contributor

Thanks. This looks good.

@tuxoko
Copy link
Contributor

tuxoko commented Apr 2, 2014

Hi this won't solve the issue.
See my comment in openzfs/zfs#2230

@dweeezil
Copy link
Contributor Author

dweeezil commented Apr 2, 2014

I would suggest that whomever is seeing this problem add a printk() to see the value coming back from kthread_create(). The -ENOMEM return from kthread_create_on_node() can, in fact, mean out of memory or it can also mean SIGKILL.

In either case, the somewhat larger problem is that there are places throughout the ZFS code base that don't properly handle a failure of taskq_create().

It sounds like additional debugging is in order from someone who can reproduce the problem. I suspect we'd like to see the return value from kthread_create() and also maybe the TIF flags.

@tuxoko
Copy link
Contributor

tuxoko commented Apr 3, 2014

For short term hack, we could probably temporary mask the SIGKILL in pending signal and retry.
Though I'm not sure if this is possible or has any side effect.

@dweeezil
Copy link
Contributor Author

dweeezil commented Apr 6, 2014

I was finally able to create a set rig to reliably reproduce the interrupted kthread_create() and have pushed dweeezil/spl@0d36f37 to handle it. My only tests so far have been to interrupt pool import but I will be trying to test other potential failure modes as well.

@dweeezil
Copy link
Contributor Author

dweeezil commented Apr 7, 2014

Thanks for looking at this @tuxoko. I just pushed dweeezil/spl@5ac44ed, and as I mentioned will try to test some more of these cases.

I'm sure you noticed by now that openzfs/zfs#2243 is likely cause by this as well. I'm afraid we're going to get a flood of these now that the big distros are starting to roll out 3.13.

@dweeezil dweeezil mentioned this pull request Apr 7, 2014
@tuxoko
Copy link
Contributor

tuxoko commented Apr 7, 2014

I wrote a reliable script to test SIGKILL on import. Note that the sleep time may not apply to everyone.

#!/bin/sh
POOL=p1
T=1.2
zpool import $POOL -R /mnt &
sleep $T
kill -9 $!

@dweeezil Your patch seems OK to me.
Though I have to wonder if we should re-enable the signal after kthread_create.
I added set_thread_flag(TIF_SIGPENDING); after successful retry.
Running with this test script, the pool imported fine, but the filesystems are not mounted. Which is expected since the zpool command is killed.

Anyway, I just want to ask if we should re-enable the signal or not.

@dweeezil
Copy link
Contributor Author

dweeezil commented Apr 7, 2014

@tuxoko I was wondering the same thing but noticed that without re-enabling it, my test program (zpool import) was, in fact terminated. FYI, my testing rig uses an IOPS-limited KVM guest to give me more windows in which to send signals. The system was otherwise too fast and/or my test pools too simply for a script like you outlined to work.

@behlendorf
Copy link
Contributor

@dweeezil @tuxoko Thank you for working on this, I suspect more people are starting to see this so I really appreciate you guys running it to ground.

The proposed approach to catch and handle the signal looks reasonable to me. But I think we should move this logic in to a spl_kthread_create() function which preserves the previous kthread_create behavior. Then we can update all callers to use the new interface and ensure we fix all instances of this. All but one caller are in the SPL and the single ZFS caller is in zpios which appears to handle the failure cleanly. This buys us a few things.

  • A stable interface. We can update spl_kthread_create() as needed to maintain the behavior we expect.
  • The signal and ENOMEM handling logic can be written once so other callers won't need to worry about getting this right.
  • We can clearly document in a block comment why this special case handling is needed. What changed in linux-3.13 and how we have to handle it.

For example (untested, please rework as needed),

#define spl_kthread_create(func, data, fmt, ...)                                   \
{                                                                       \
        struct task_struct *__tsk;                                      \
                                                                        \
        while (1) {                                                     \
                __tsk = kthread_create(func, data, fmt, ## __VA_ARGS__); \
                if (IS_ERR(__tsk)) {                                    \
                        if (signal_pending(current)) {                  \
                                clear_thread_flag(TIF_SIGPENDING);      \
                                continue;                               \
                        }                                               \
                                                                        \
                        if (PTR_ERR(tsk) == -ENOMEM)                    \
                                continue;                               \
                                                                        \
                        return (NULL);                                  \
                } else {                                                \
                        return (__tsk);                                 \
                }                                                       \
        }                                                               \
}

@dweeezil
Copy link
Contributor Author

dweeezil commented Apr 7, 2014

@behlendorf Sounds like a good idea. I'll re-work it as a kthread_create() wrapper.

@dweeezil
Copy link
Contributor Author

dweeezil commented Apr 7, 2014

I just pushed dweeezil/spl@f3dee6f which provides the spl_kthread_create() wrapper function. I gave it light testing on a stock 3.13 kernel and it seems to properly handle interrupted imports.

The other consumers of kthread_create() have been updated to use the new interface.

EDIT: Tweaked the comments.

@behlendorf
Copy link
Contributor

This looks good to me. I've queued it up for testing.

@behlendorf
Copy link
Contributor

@dweeezil @tuxoko The testing for your latest patch looks good. If you're both happy with it I'd like to merge it tomorrow, this issue is starting to impact more people.

Provide spl_kthread_create() as a wrapper to the kernel's kthread_create()
to provide pre-3.13 semantics.  Re-try if the call is interrupted or if it
would have returned -ENOMEM.  Otherwise return NULL.
@dweeezil
Copy link
Contributor Author

dweeezil commented Apr 8, 2014

@tuxoko Thanks for looking at this. I've re-pushed dweeezil/spl@4a0984b which uses kthread_create() rather than kthread_create_on_node() which appears to have been added between 2.8.38 and 2.8.39. I was able to build install and use under CentOS 6.4 with its 2.6.32 kernel.

@gitboy
Copy link

gitboy commented Apr 8, 2014

@dweezil Hi Tim could you please give me a high level idea of what symptoms this should address? I'm attempting to run on a box that will have sustained CPU and IO load, is that going to be safe/wise?

@dweeezil
Copy link
Contributor Author

dweeezil commented Apr 8, 2014

@gitboy This patch should cause taskq_create() to never return a NULL. The symptoms come about because the ZFS code does not properly handle a NULL pointer return in most cases so the usual result is a "NULL pointer dereference". Sometimes, however, the symptom can be that a process spins in the kernel and you'll see the message indicating a task has run for too long. The failures typically occur either when a pool is first imported or when a ZVOL is opened subsequent to pool import as those are the two most common times at which ZFS creates new taskqs.

I have also created fixes for most of the places within the main ZFS code where these failures aren't handled and will be working up a separate patch for those, however, with this fix, they should be no more likely to occur than they did pre-3.13.

@fling-
Copy link

fling- commented Apr 9, 2014

When will the patch be merged? I want to solve my issue openzfs/zfs#2238

@dweeezil
Copy link
Contributor Author

dweeezil commented Apr 9, 2014

@fling- It was merged today as 17a527c.

@behlendorf
Copy link
Contributor

Closing issue, the patch has been merged to master.

@behlendorf behlendorf closed this Apr 10, 2014
@behlendorf behlendorf added this to the 0.6.3 milestone Apr 10, 2014
@tuxoko tuxoko mentioned this pull request Apr 11, 2014
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants