You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The user thread doing the ioctl for zfs recv eventually gets (on the kernel side) to dmu_recv_stream which calls bqueue_enqueue for each block. bqueue_enqueue has:
while (q->bq_size + item_size > q->bq_maxsize) {
cv_wait_sig(&q->bq_add_cv, &q->bq_lock);
}
res = assert_wait(event, interruptible);
if (res == THREAD_WAITING) {
lck_mtx_unlock(lck);
res = thread_block(THREAD_CONTINUE_NULL);
if (!(lck_sleep_action & LCK_SLEEP_UNLOCK)) {
if ((lck_sleep_action & LCK_SLEEP_SPIN))
lck_mtx_lock_spin(lck);
else
lck_mtx_lock(lck);
}
}
else
if (lck_sleep_action & LCK_SLEEP_UNLOCK)
lck_mtx_unlock(lck);
KERNEL_DEBUG(MACHDBG_CODE(DBG_MACH_LOCKS, LCK_MTX_SLEEP_CODE) | DBG_FUNC_END, (int)res, 0, 0, 0, 0);
return res;
If the current thread has already received a signal, then assert_wait returns THREAD_INTERRUPTED, lck_mtx_sleep never releases the mutex and returns, eventually all the way back to bqueue_enqueue's call to cv_wait_sig. But, since the mutex was never released, no bqueue_dequeue calls will have been able to make progress, and bqueue_enqueue will repeat this process forever.
I think the correct fix here would be to modify bqueue_enqueue (and bqueue_dequeue, which has the same problem) to use the uninterruptible versions; since there's nothing they can do (without changing their API) except retry, they might as well use the uninterruptible versions.
I have not looked to see if there are other uses of cv_wait_interruptible/cv_timedwait_interruptible/cv_wait_sig/cv_timedwait_sig which have issues.
The text was updated successfully, but these errors were encountered:
Thanks for the issue and the details of it. We pass PCATCH to msleep() so we can detect the user hitting ^C for the signal, and have the kernel parts undo. I don't recall specifically which part detects that, but I don't think it was bqueue_dequeue(), so potentially what you suggest could work.
Potentially we could also detect THREAD_INTERRUPTED and do a little manual mutex release for progress.
OK, so yes, you can definitely just change it to cv_wait() (without the _sig). But presumably upstream picked the _sig for a reason relevant to their platform.
The user thread doing the ioctl for
zfs recv
eventually gets (on the kernel side) to dmu_recv_stream which calls bqueue_enqueue for each block. bqueue_enqueue has:cv_wait_sig is defined by (debug ifdefs elided)
This percolates down through xnu until it reaches lck_mtx_sleep (e.g. https://opensource.apple.com/source/xnu/xnu-2050.18.24/osfmk/kern/locks.c.auto.html) which does
If the current thread has already received a signal, then assert_wait returns THREAD_INTERRUPTED, lck_mtx_sleep never releases the mutex and returns, eventually all the way back to bqueue_enqueue's call to cv_wait_sig. But, since the mutex was never released, no bqueue_dequeue calls will have been able to make progress, and bqueue_enqueue will repeat this process forever.
I think the correct fix here would be to modify bqueue_enqueue (and bqueue_dequeue, which has the same problem) to use the uninterruptible versions; since there's nothing they can do (without changing their API) except retry, they might as well use the uninterruptible versions.
I have not looked to see if there are other uses of cv_wait_interruptible/cv_timedwait_interruptible/cv_wait_sig/cv_timedwait_sig which have issues.
The text was updated successfully, but these errors were encountered: