This repository has been archived by the owner on Feb 26, 2020. It is now read-only.

Make taskq_wait() block until the queue is empty #453

Closed

Conversation

behlendorf
Contributor

Under Illumos taskq_wait() returns when there are no more tasks
in the queue. This behavior differs from ZoL and FreeBSD where
taskq_wait() returns when all the tasks in the queue at the
beginning of the taskq_wait() call are complete. New tasks
added whilst taskq_wait() is running will be ignored.

This difference in semantics makes it possible that new subtle
issues could be introduced when porting changes from Illumos.
To avoid that possibility the taskq_wait() function is being
updated such that it blocks until the queue is empty. The
previous behavior is still available via the taskq_wait_all()
interface.

Signed-off-by: Chris Dunlop [email protected]
Signed-off-by: Brian Behlendorf [email protected]

@behlendorf
Contributor Author

@chrisrd what do you think about this? We update the taskq_wait() implementation to have the illumos semantics. My only slight concern is that it might end up having a performance impact. However, that could be fairly easily mitigated: we could also update critical areas to use taskq_wait_all() to get the previous semantics.

@chrisrd
Contributor

chrisrd commented May 14, 2015

@behlendorf the code looks good.

The taskq_wait_all() name seems at odds with its functionality: to my mind taskq_wait() (now) waits for "all" tasks to complete, whereas the renamed taskq_wait_all() waits only for the tasks outstanding when the function was called, so possibly taskq_wait_outstanding()?

Performance-wise, what's the worst that could happen? I guess we could potentially not return for a very long time if tasks are constantly being queued, e.g. due to memory pressure. If we have any queuing of tasks in the memory-freeing path, that could end up as a real or apparent deadlock. That would be pretty bad for performance!

@behlendorf
Contributor Author

@chrisrd agreed, I'm not too happy with taskq_wait_all() either; it was a poor choice of name. The good news is that nothing in ZFS uses it, since it was primarily added for testing purposes in SPLAT. So if we're going to change it, now is the time to do it. I like taskq_wait_outstanding(), let me refresh this accordingly.

As for performance, I don't think we'll encounter anything too serious, because that most likely would already have been exposed under illumos. But it's something to keep in mind and I wanted to mention it! It's probably more likely that we'll just get some rare, hard-to-explain bugs fixed.

Under Illumos taskq_wait() returns when there are no more tasks
in the queue.  This behavior differs from ZoL and FreeBSD where
taskq_wait() returns when all the tasks in the queue at the
beginning of the taskq_wait() call are complete.  New tasks
added whilst taskq_wait() is running will be ignored.

This difference in semantics makes it possible that new subtle
issues could be introduced when porting changes from Illumos.
To avoid that possibility the taskq_wait() function is being
updated such that it blocks until the queue is empty.

The previous behavior remains available through the
taskq_wait_outstanding() interface.  Note that this function
was previously called taskq_wait_all() but has been renamed
to avoid confusion.

Signed-off-by: Chris Dunlop <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
@behlendorf
Contributor Author

@chrisrd refreshed. Locally I've run the updated SPL through the buildbot and everything passed. I'll test the refreshed patch again, but assuming nothing comes up I think we can merge this. Can you look over the refreshed version? It only includes the function name change.

@chrisrd
Contributor

chrisrd commented May 16, 2015

Looks good to me!

@behlendorf
Contributor Author

Interestingly, this patch appears to cause the occasional cannot export 'tank': pool is busy error in the automated zimport.sh tests. I'm fairly sure retrying the command would succeed so I wouldn't categorize this as a regression. I suspect we're just blocking slightly longer now with a reference on the spa, which arguably is safer. I'll try adjusting the test script to account for this and verify it's a transient problem before merging this.

@behlendorf
Contributor Author

Adding a delay between the immediate import/export appears to avoid the spurious busy error.

@ryao
Contributor

ryao commented May 20, 2015

@behlendorf If we import the Illumos taskq code, we can avoid regressions like this by getting the exact Illumos behavior. I have a proof of concept in a local branch that I can push if you are interested. It depended on the kmem-rework, which has since been merged.

@behlendorf
Contributor Author

@ryao I'm not sure I'd categorize this as a regression. The documentation was a bit ambiguous about this, and FreeBSD implemented it the same way we did. Frankly, I'd prefer to keep the existing behavior, which is probably better for performance, but I don't like the risk of accidental bugs being introduced due to the behavior difference.

I'm open to the idea of moving to the Illumos implementation but I'd need to be convinced of a few things. Off the top of my head:

  • It performs as well, or better, than the existing implementation.
  • taskq_dispatch() can be safely called in an atomic context.

behlendorf referenced this pull request in behlendorf/zfs Jun 2, 2015
As described in the comment above arc_adapt_thread() it is critical
that the arc_adapt_thread() function never sleep while holding a hash
lock.  This behavior was possible in the Linux implementation because
the arc_prune() logic was implemented to be synchronous.  Under
illumos the analogous dnlc_reduce_cache() function is asynchronous.

To address this the arc_do_user_prune() function has been reworked
into two new functions as follows:

* arc_prune_async() is an asynchronous implementation which dispatches
the prune callback to be run by the system taskq.  This makes it
suitable to use in the context of the arc_adapt_thread().

* arc_prune() is a synchronous implementation which depends on the
arc_prune_async() implementation but blocks until the outstanding
callbacks complete.  This is used in arc_kmem_reap_now() where it
is safe, and expected, that memory will be freed.

Note that currently the system taskq is used for this purpose because
it's convenient.  However, it may make more sense to put this on
another taskq or create a new one.

This patch additionally adds the zfs_arc_meta_strategy module option
which allows the meta reclaim strategy to be configured.  It defaults
to a balanced strategy which has been proven to work well under Linux,
but the illumos meta-only strategy can be enabled.

Signed-off-by: Brian Behlendorf <[email protected]>
@behlendorf
Contributor Author

Closing, replaced by #455.

@behlendorf behlendorf closed this Jun 4, 2015
@behlendorf behlendorf deleted the task_wait branch July 28, 2017 22:08