This repository has been archived by the owner on Feb 26, 2020. It is now read-only.

Make taskq_wait() block until the queue is empty #453

Closed

Conversation

behlendorf
Contributor

Under Illumos taskq_wait() returns when there are no more tasks
in the queue. This behavior differs from ZoL and FreeBSD where
taskq_wait() returns when all the tasks in the queue at the
beginning of the taskq_wait() call are complete. New tasks
added whilst taskq_wait() is running will be ignored.

This difference in semantics makes it possible that new subtle
issues could be introduced when porting changes from Illumos.
To avoid that possibility the taskq_wait() function is being
updated such that it blocks until the queue is empty. The
previous behavior is still available via the taskq_wait_all()
interface.

Signed-off-by: Chris Dunlop [email protected]
Signed-off-by: Brian Behlendorf [email protected]

@behlendorf
Contributor Author

@chrisrd what do you think about this? We update the taskq_wait() implementation to have the illumos semantics. My only slight concern is that it might end up having a performance impact. However, that could be fairly easily mitigated: we could also update critical areas to use taskq_wait_all() to get the previous semantics.

@chrisrd
Contributor

chrisrd commented May 14, 2015

@behlendorf the code looks good.

The taskq_wait_all() name seems at odds with its functionality: to my mind taskq_wait() (now) waits for "all" tasks to complete, whereas the renamed taskq_wait_all() waits only for the tasks outstanding when the function was called, so possibly taskq_wait_outstanding()?

Performance-wise, what's the worst that could happen? I guess we could potentially not return for a very long time if tasks are constantly being queued, e.g. due to memory pressure. If we have any queuing of tasks in the memory-freeing path, that could end up as a real or apparent deadlock. That would be pretty bad for performance!

@behlendorf
Contributor Author

@chrisrd agreed, I'm not too happy with taskq_wait_all() either; it was a poor choice of name. The good news is that nothing in ZFS uses it, since it was primarily added for testing purposes in SPLAT. So if we're going to change it, now is the time to do it. I like taskq_wait_outstanding(), let me refresh this accordingly.

As for performance, I don't think we'll encounter anything too serious, because that most likely would already have been exposed under illumos. But it's something to keep in mind and I wanted to mention it! It's probably more likely that we'll just get some rare, hard-to-explain bugs fixed.

Under Illumos taskq_wait() returns when there are no more tasks
in the queue.  This behavior differs from ZoL and FreeBSD where
taskq_wait() returns when all the tasks in the queue at the
beginning of the taskq_wait() call are complete.  New tasks
added whilst taskq_wait() is running will be ignored.

This difference in semantics makes it possible that new subtle
issues could be introduced when porting changes from Illumos.
To avoid that possibility the taskq_wait() function is being
updated such that it blocks until the queue is empty.

The previous behavior remains available through the
taskq_wait_outstanding() interface.  Note that this function
was previously called taskq_wait_all() but has been renamed
to avoid confusion.

Signed-off-by: Chris Dunlop <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
@behlendorf
Contributor Author

@chrisrd refreshed. Locally I've run the updated SPL through the buildbot and everything passed. I'll test the refreshed patch again, but assuming nothing comes up I think we can merge this. Can you look over the refreshed version? It only includes the function name change.

@chrisrd
Contributor

chrisrd commented May 16, 2015

Looks good to me!

@behlendorf
Contributor Author

Interestingly, this patch appears to cause the occasional cannot export 'tank': pool is busy error in the automated zimport.sh tests. I'm fairly sure retrying the command would succeed so I wouldn't categorize this as a regression. I suspect we're just blocking slightly longer now with a reference on the spa, which arguably is safer. I'll try adjusting the test script to account for this and verify it's a transient problem before merging this.

@behlendorf
Contributor Author

Adding a delay between the immediate import/export appears to avoid the spurious busy error.

@ryao
Contributor

ryao commented May 20, 2015

@behlendorf If we import the Illumos taskq code, we can avoid regressions like this by getting the exact Illumos behavior. I have a proof of concept in a local branch that I can push if you are interested. It depended on the kmem-rework, which has since been merged.

@behlendorf
Contributor Author

@ryao I'm not sure I'd categorize this as a regression. The documentation was a bit ambiguous about this, and FreeBSD implemented it the same way we did. Frankly, I'd prefer to keep the existing behavior, which is probably better for performance, but I don't like the risk of accidental bugs being introduced due to the behavior difference.

I'm open to the idea of moving to the Illumos implementation but I'd need to be convinced of a few things. Off the top of my head:

  • It performs as well, or better, than the existing implementation.
  • taskq_dispatch() can be safely called in an atomic context.

behlendorf referenced this pull request in behlendorf/zfs Jun 2, 2015
As described in the comment above arc_adapt_thread() it is critical
that the arc_adapt_thread() function never sleep while holding a hash
lock.  This behavior was possible in the Linux implementation because
the arc_prune() logic was implemented to be synchronous.  Under
illumos the analogous dnlc_reduce_cache() function is asynchronous.

To address this the arc_do_user_prune() function has been reworked
into two new functions as follows:

* arc_prune_async() is an asynchronous implementation which dispatches
the prune callback to be run by the system taskq.  This makes it
suitable to use in the context of the arc_adapt_thread().

* arc_prune() is a synchronous implementation which depends on the
arc_prune_async() implementation but blocks until the outstanding
callbacks complete.  This is used in arc_kmem_reap_now() where it
is safe, and expected, that memory will be freed.

Note that currently the system taskq is used for this purpose because
it's convenient.  However, it may make more sense to put this on
another taskq or create a new one.

This patch additionally adds the zfs_arc_meta_strategy module option
which allows the meta reclaim strategy to be configured.  It defaults
to a balanced strategy which has been proven to work well under Linux,
but the illumos meta-only strategy can be enabled.

Signed-off-by: Brian Behlendorf <[email protected]>
@behlendorf
Contributor Author

Closing, replaced by #455.

@behlendorf behlendorf closed this Jun 4, 2015
@behlendorf behlendorf deleted the task_wait branch July 28, 2017 22:08