Make taskq_wait() block until the queue is empty #453
@chrisrd what do you think about this? We update the |
@behlendorf the code looks good. The Performance wise, what's the worst that could happen? I guess we could potentially not return for a very long time if tasks are constantly getting queued, e.g. due to memory pressure. If we have any queuing of tasks in the freeing memory path that could potentially end up as a real or apparent deadlock. That would be pretty bad for performance! |
@chrisrd agreed, I'm not too happy with As for performance I don't think we'll encounter anything too serious because that most likely would already have been exposed under illumos. But it's just something to keep in mind and I wanted to mention! It's probably more likely that we'll just get some inexplicable and unlikely bugs fixed. |
Under Illumos taskq_wait() returns when there are no more tasks in the queue. This behavior differs from ZoL and FreeBSD where taskq_wait() returns when all the tasks in the queue at the beginning of the taskq_wait() call are complete. New tasks added whilst taskq_wait() is running will be ignored. This difference in semantics makes it possible that new subtle issues could be introduced when porting changes from Illumos. To avoid that possibility the taskq_wait() function is being updated such that it blocks until the queue is empty. The previous behavior remains available through the taskq_wait_outstanding() interface. Note that this function was previously called taskq_wait_all() but has been renamed to avoid confusion. Signed-off-by: Chris Dunlop <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]>
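The difference between the two wait semantics described above can be sketched with a toy, single-threaded model. This is not the SPL code: the `toyq_*` names, the FIFO single-worker assumption, and the `respawns` counter (how many completing tasks queue a follow-up task) are all hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: tasks run in FIFO order on one imaginary worker. */
typedef struct toyq {
    unsigned next_id;   /* id the next dispatched task will receive */
    unsigned lowest_id; /* oldest id that has not completed yet */
    unsigned respawns;  /* how many more completing tasks queue a follow-up */
} toyq_t;

/* Queue a task and return its id. */
static unsigned toyq_dispatch(toyq_t *q)
{
    return q->next_id++;
}

/* Number of tasks dispatched but not yet completed. */
static size_t toyq_pending(const toyq_t *q)
{
    return (size_t)(q->next_id - q->lowest_id);
}

/* Complete the oldest task; it may queue one follow-up task. */
static void toyq_run_one(toyq_t *q)
{
    assert(toyq_pending(q) > 0);
    q->lowest_id++;
    if (q->respawns > 0) {
        q->respawns--;
        toyq_dispatch(q);
    }
}

/*
 * Old ZoL/FreeBSD-style semantics: wait only for the tasks that were
 * queued when the call started. Tasks added later are ignored.
 * Returns the number of tasks that ran while waiting.
 */
static size_t toyq_wait_outstanding(toyq_t *q)
{
    unsigned target = q->next_id; /* first id NOT covered by this wait */
    size_t steps = 0;

    while (q->lowest_id < target) {
        toyq_run_one(q);
        steps++;
    }
    return steps;
}

/*
 * Illumos-style semantics: wait until the queue is empty, including
 * tasks queued while waiting. With an unbounded respawn count this
 * loop would never finish, which is the concern raised in the thread.
 */
static size_t toyq_wait(toyq_t *q)
{
    size_t steps = 0;

    while (toyq_pending(q) > 0) {
        toyq_run_one(q);
        steps++;
    }
    return steps;
}
```

In this model, with three tasks queued and two of them queuing a follow-up, `toyq_wait_outstanding()` returns after the three original tasks and leaves two pending, while `toyq_wait()` runs all five and leaves the queue empty.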
@chrisrd refreshed. Locally I've run the updated SPL through the buildbot and everything passed. I'll test the refreshed patch again but assuming nothing comes up I think we can merge this. Can you look over the refreshed version, it just includes the function name change. |
Looks good to me! |
Interestingly, this patch appears to cause the occasional |
Adding a delay between the immediate import/export appears to avoid the spurious busy error. |
@behlendorf If we import the Illumos taskq code, we can avoid regressions like this by getting the exact Illumos behavior. I have a proof of concept in a local branch that I can push if you are interested. It depended on the kmem-rework, which has since been merged. |
@ryao I'm not sure I'd categorize this as a regression. The documentation was a bit ambiguous about this and FreeBSD implemented it the same way we did. Frankly, I'd prefer to keep the existing behavior, which is probably better for performance, but I don't like the risk of accidental bugs being introduced due to behavior differences. I'm open to the idea of moving to the Illumos implementation but I'd need to be convinced of a few things. Off the top of my head:
As described in the comment above arc_adapt_thread() it is critical that the arc_adapt_thread() function never sleep while holding a hash lock. This behavior was possible in the Linux implementation because the arc_prune() logic was implemented to be synchronous. Under illumos the analogous dnlc_reduce_cache() function is asynchronous.
To address this the arc_do_user_prune() function has been reworked into two new functions as follows:
* arc_prune_async() is an asynchronous implementation which dispatches the prune callback to be run by the system taskq. This makes it suitable to use in the context of the arc_adapt_thread().
* arc_prune() is a synchronous implementation which depends on the arc_prune_async() implementation but blocks until the outstanding callbacks complete. This is used in arc_kmem_reap_now() where it is safe, and expected, that memory will be freed.
Note that currently the system taskq is used for this purpose because it's convenient. However, it may make more sense to put this on another taskq or create a new one.
This patch additionally adds the zfs_arc_meta_strategy module option which allows the meta reclaim strategy to be configured. It defaults to a balanced strategy which has been proved to work well under Linux but the illumos meta-only strategy can be enabled.
Signed-off-by: Brian Behlendorf <[email protected]>
Closing, replaced by #455. |
Under Illumos taskq_wait() returns when there are no more tasks
in the queue. This behavior differs from ZoL and FreeBSD where
taskq_wait() returns when all the tasks in the queue at the
beginning of the taskq_wait() call are complete. New tasks
added whilst taskq_wait() is running will be ignored.
This difference in semantics makes it possible that new subtle
issues could be introduced when porting changes from Illumos.
To avoid that possibility the taskq_wait() function is being
updated such that it blocks until the queue is empty. The
previous behavior is still available via the taskq_wait_all()
interface.
Signed-off-by: Chris Dunlop [email protected]
Signed-off-by: Brian Behlendorf [email protected]