kernel deadlock when `zfs recv` is interrupted by a signal #785

jesboat · 2021-02-08T23:55:32Z

The user thread doing the ioctl for zfs recv eventually gets (on the kernel side) to dmu_recv_stream which calls bqueue_enqueue for each block. bqueue_enqueue has:

	while (q->bq_size + item_size > q->bq_maxsize) {
		cv_wait_sig(&q->bq_add_cv, &q->bq_lock);
	}

cv_wait_sig is defined by (debug ifdefs elided)

#define cv_wait_sig(cvp, mp)        \
	spl_cv_wait((cvp), (mp), PRIBIO|PCATCH, #cvp)

void
spl_cv_wait(kcondvar_t *cvp, kmutex_t *mp, int flags, const char *msg)
{
    if (msg != NULL && msg[0] == '&')
        ++msg;  /* skip over '&' prefixes */

	mp->m_owner = NULL;
    (void) msleep(cvp, (lck_mtx_t *)&mp->m_lock, flags, msg, 0);
    mp->m_owner = current_thread();
}

This percolates down through xnu until it reaches lck_mtx_sleep (e.g. https://opensource.apple.com/source/xnu/xnu-2050.18.24/osfmk/kern/locks.c.auto.html) which does

	res = assert_wait(event, interruptible);
	if (res == THREAD_WAITING) {
		lck_mtx_unlock(lck);
		res = thread_block(THREAD_CONTINUE_NULL);
		if (!(lck_sleep_action & LCK_SLEEP_UNLOCK)) {
			if ((lck_sleep_action & LCK_SLEEP_SPIN))
				lck_mtx_lock_spin(lck);
			else
				lck_mtx_lock(lck);
		}
	}
	else
	if (lck_sleep_action & LCK_SLEEP_UNLOCK)
		lck_mtx_unlock(lck);

	KERNEL_DEBUG(MACHDBG_CODE(DBG_MACH_LOCKS, LCK_MTX_SLEEP_CODE) | DBG_FUNC_END, (int)res, 0, 0, 0, 0);

	return res;

If the current thread has already received a signal, then assert_wait returns THREAD_INTERRUPTED, lck_mtx_sleep never releases the mutex and returns, eventually all the way back to bqueue_enqueue's call to cv_wait_sig. But, since the mutex was never released, no bqueue_dequeue calls will have been able to make progress, and bqueue_enqueue will repeat this process forever.

I think the correct fix here would be to modify bqueue_enqueue (and bqueue_dequeue, which has the same problem) to use the uninterruptible versions; since there's nothing they can do (without changing their API) except retry, they might as well use the uninterruptible versions.

I have not looked to see if there are other uses of cv_wait_interruptible/cv_timedwait_interruptible/cv_wait_sig/cv_timedwait_sig which have issues.

The text was updated successfully, but these errors were encountered:

lundman · 2021-02-09T07:25:05Z

Thanks for the issue and the details of it. We pass PCATCH to msleep() so we can detect the user hitting ^C for the signal, and have the kernel parts undo. I don't recall specifically which part detects that, but I don't think it was bqueue_dequeue(), so potentially what you suggest could work.

Potentially we could also detect THREAD_INTERRUPTED and do a little manual mutex release for progress.

lundman · 2021-03-12T08:36:37Z

OK, so yes, you can definitely just change it to cv_wait() (without the _sig). But presumably upstream picked the _sig for a reason relevant to their platform.

I can also add, at the end of spl_cv_wait():

+    if (result == EINTR &&
+        (mp->m_waiters > 0 || mp->m_sleepers > 0)) {
+        printf("avoiding hang\n");
+        mutex_exit(mp);
+        (void) thread_block(THREAD_CONTINUE_NULL);
+        mutex_enter(mp);
+    }

    /*
     * 1 - condvar got cv_signal()/cv_broadcast()
     * 0 - received signal (kill -signal)
     */
    return (result == EINTR ? 0 : 1);

I added waiters, and sleepers just to avoid doing it everytime. Now to decide which fix to go with.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

kernel deadlock when `zfs recv` is interrupted by a signal #785

kernel deadlock when `zfs recv` is interrupted by a signal #785

jesboat commented Feb 8, 2021 •

edited

Loading

lundman commented Feb 9, 2021

lundman commented Mar 12, 2021

kernel deadlock when zfs recv is interrupted by a signal #785

kernel deadlock when zfs recv is interrupted by a signal #785

Comments

jesboat commented Feb 8, 2021 • edited Loading

lundman commented Feb 9, 2021

lundman commented Mar 12, 2021

kernel deadlock when `zfs recv` is interrupted by a signal #785

kernel deadlock when `zfs recv` is interrupted by a signal #785

jesboat commented Feb 8, 2021 •

edited

Loading