[buildbot tests] [PR5062, PR5404, PR5449, PR5285, zvol: reduce linear list search] #5453

Closed

kernelOfTruth
Contributor

The following pull requests are tested on top of the latest master, which includes the ABD changes.

I'm pushing this through a battery of tests to make sure it's stable for my "primetime" (bleeding-edge production) testing.

#5062 Godfather's child I/Os that may can't done and lost chance to reexecute again
#5404 OpenZFS 7303 - dynamic metaslab selection
#5449 OpenZFS 6569 - large file delete can starve out write ops
#5285 Allow zfs to send replication streams when missing snapshots in the hierarchy
tuxoko/zfs@0a7c4b7 zvol: reduce linear list search

Since there's an issue with the "LRU for ABD" patch and I'll most likely update my kernel + zfs tomorrow,

I'm opening this branch/PR for a test run without it.

[closing https://github.com//pull/5451 for now]

Chunwei Chen and others added 5 commits December 4, 2016 21:02
Use kernel ida to generate minor number, and use hash table to find zvol with
name.

Signed-off-by: Chunwei Chen <[email protected]>
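
A minimal userspace sketch of the idea in the commit message above: hand out the lowest free minor number and find a zvol by name through a hash table instead of walking one long list. It does not use the kernel's ida or hash-list APIs. `minor_alloc`, `zvol_insert`, `zvol_find`, the table sizes, and the choice of FNV-1a as the hash are all hypothetical stand-ins, not the code from the commit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ZV_HASH_BUCKETS 64   /* hypothetical table size (power of two) */
#define ZV_MAX_MINORS   64   /* hypothetical minor number limit */

struct zvol {
	const char  *name;
	int          minor;
	struct zvol *hash_next;  /* chain within one bucket */
};

static struct zvol *zv_hash[ZV_HASH_BUCKETS];
static uint64_t minor_bitmap;  /* bit i set => minor i is in use */

/* Stand-in for the kernel ida: return the lowest free minor, or -1. */
int minor_alloc(void)
{
	for (int i = 0; i < ZV_MAX_MINORS; i++) {
		if (!(minor_bitmap & (UINT64_C(1) << i))) {
			minor_bitmap |= UINT64_C(1) << i;
			return i;
		}
	}
	return -1;
}

/* FNV-1a string hash, reduced to a bucket index. */
size_t zv_hash_name(const char *name)
{
	uint64_t h = UINT64_C(0xcbf29ce484222325);
	for (; *name != '\0'; name++) {
		h ^= (unsigned char)*name;
		h *= UINT64_C(0x100000001b3);
	}
	return (size_t)(h & (ZV_HASH_BUCKETS - 1));
}

/* Assign a minor and link the zvol into its bucket. Returns the minor or -1. */
int zvol_insert(struct zvol *zv, const char *name)
{
	int minor = minor_alloc();
	if (minor < 0)
		return -1;
	size_t b = zv_hash_name(name);
	zv->name = name;
	zv->minor = minor;
	zv->hash_next = zv_hash[b];
	zv_hash[b] = zv;
	return minor;
}

/* Look up by name: only one bucket's chain is walked, not every zvol. */
struct zvol *zvol_find(const char *name)
{
	for (struct zvol *zv = zv_hash[zv_hash_name(name)]; zv != NULL;
	    zv = zv->hash_next) {
		if (strcmp(zv->name, name) == 0)
			return zv;
	}
	return NULL;
}
```

With many zvols, the cost of a lookup in this sketch depends on the length of one bucket chain rather than on the total number of zvols.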
After a suspended godfather I/O is resumed, some child zios may have the ZIO_REEXECUTE_NOW bit set in their io_reexecute flag
while others have the ZIO_REEXECUTE_SUSPEND bit set, so the godfather I/O's io_reexecute flag inherits both the ZIO_REEXECUTE_NOW and ZIO_REEXECUTE_SUSPEND bits.

The child I/Os whose io_reexecute flag is ZIO_REEXECUTE_SUSPEND are removed from the current
godfather I/O's child list and added to the new spa_suspend_zio_root's child list. The other zios, whose io_reexecute
flag is ZIO_REEXECUTE_NOW, only notify the godfather zio to execute and pass their own io_reexecute value on to it.

Finally, the godfather I/O runs the code in question in zio_done(), completes, and is destroyed.
The child zios whose io_reexecute flag is ZIO_REEXECUTE_NOW then lose their monitoring zio and never get a chance to reexecute.

The fix: zio_done() should clear only the ZIO_REEXECUTE_SUSPEND bit in this case.
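
The flag handling described above can be sketched in a few lines. This is an illustration only: the flag values, the helper names, and the assumption that the old behaviour reset the whole flag are mine and are not taken from the patch.

```c
#include <stdint.h>

/* Flag values are assumptions for this sketch. */
#define ZIO_REEXECUTE_NOW      0x01
#define ZIO_REEXECUTE_SUSPEND  0x02

/* Hypothetical helper: a parent inherits the reexecute bits of a child. */
uint8_t inherit_reexecute(uint8_t parent, uint8_t child)
{
	return (uint8_t)(parent | child);
}

/* Assumed old behaviour: every bit is dropped, including NOW. */
uint8_t clear_all(uint8_t io_reexecute)
{
	(void)io_reexecute;
	return 0;
}

/* Behaviour described as the fix: only SUSPEND is cleared, NOW survives. */
uint8_t clear_suspend_only(uint8_t io_reexecute)
{
	return (uint8_t)(io_reexecute & ~ZIO_REEXECUTE_SUSPEND);
}
```

If the godfather has inherited both bits, `clear_suspend_only` leaves ZIO_REEXECUTE_NOW set, so the children waiting on that bit still have a parent that knows it must reexecute.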
This change introduces a new weighting algorithm to improve
metaslab selection. The new weighting algorithm relies on the
SPACEMAP_HISTOGRAM feature. As a result, the metaslab weight
now encodes the type of weighting algorithm used (size-based
vs segment-based).

Authored by: George Wilson <[email protected]>
Reviewed by: Alex Reece <[email protected]>
Reviewed by: Chris Siden <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Paul Dagnelie <[email protected]>
Reviewed by: Pavel Zakharov <[email protected]>
Reviewed by: Prakash Surya <[email protected]>
Reviewed by: Don Brady <[email protected]>
Approved by:
Ported-by: Don Brady <[email protected]>

OpenZFS-issue: https://www.illumos.org/issues/7303
OpenZFS-commit: openzfs/openzfs@8710fccea7

Porting Notes: The metaslab allocation tracing code is conditionally
               removed on linux (dependent on mdb debugger).
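
To illustrate what "the metaslab weight now encodes the type of weighting algorithm" can look like, here is a small sketch that packs a type bit into a 64-bit weight. The bit positions, macro names, and helper functions are hypothetical and chosen for readability; they are not claimed to match the layout used in the actual change.

```c
#include <stdint.h>

/* Hypothetical layout: bit 61 = type, bits 60..55 = index, bits 54..0 = count. */
#define WEIGHT_TYPE_SHIFT  61
#define WEIGHT_TYPE_SIZE   (UINT64_C(1) << WEIGHT_TYPE_SHIFT)
#define WEIGHT_IDX_SHIFT   55
#define WEIGHT_IDX_MASK    UINT64_C(0x3f)
#define WEIGHT_COUNT_MASK  ((UINT64_C(1) << WEIGHT_IDX_SHIFT) - 1)
#define WEIGHT_SPACE_MASK  ((UINT64_C(1) << WEIGHT_TYPE_SHIFT) - 1)

/* Size-based weight: type bit set, low bits carry the free space. */
uint64_t weight_size_based(uint64_t free_space)
{
	return WEIGHT_TYPE_SIZE | (free_space & WEIGHT_SPACE_MASK);
}

/* Segment-based weight: type bit clear, histogram index plus segment count. */
uint64_t weight_segment_based(unsigned idx, uint64_t count)
{
	return ((uint64_t)(idx & WEIGHT_IDX_MASK) << WEIGHT_IDX_SHIFT) |
	    (count & WEIGHT_COUNT_MASK);
}

int weight_is_size_based(uint64_t w)
{
	return (w & WEIGHT_TYPE_SIZE) != 0;
}

unsigned weight_idx(uint64_t w)
{
	return (unsigned)((w >> WEIGHT_IDX_SHIFT) & WEIGHT_IDX_MASK);
}

uint64_t weight_count(uint64_t w)
{
	return w & WEIGHT_COUNT_MASK;
}
```

In this sketch the index sits above the count, so a segment-based weight with a higher histogram index always compares greater than one with a lower index, regardless of the count.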
…ierarchy

With OpenZFS 6111 we lost the ability to create replication send streams
when even a single snapshot is missing anywhere in the hierarchy: this
commit restores this functionality.

Also add the related zfs_send_008_pos.ksh script to the ZFS test suite.

Signed-off-by: loli10K <[email protected]>
@mention-bot

@kernelOfTruth, thanks for your PR! By analyzing the history of the files in this pull request, we identified @grwilson, @behlendorf and @don-brady to be potential reviewers.

kernelOfTruth and others added 5 commits December 15, 2016 00:42
This reverts commit 4d1ae43.

[erroneously used the wrong commit]
Use it for spa_deadman, zpl_posix_acl_free, snapentry_expire.
This frees system_taskq from the above long-delay tasks and allows us to call
taskq_wait_outstanding on system_taskq without being blocked forever, making
system_taskq more generic and useful.

Signed-off-by: Chunwei Chen <[email protected]>
Use kernel ida to generate minor number, and use hash table to find zvol with
name.

Signed-off-by: Chunwei Chen <[email protected]>
Prefetch all zvol dnodes in parallel before actually creating each individual
zvol. This greatly reduces the import time when there are a lot of zvols and the
disk is slow.

Signed-off-by: Chunwei Chen <[email protected]>
On some kernel versions, blk_cleanup_queue and put_disk will wait for more than
10ms. So a pool with a lot of zvols will easily wait for more than 1 min if we
do zvol_free sequentially.

Signed-off-by: Chunwei Chen <[email protected]>
Requires-spl: refs/pull/588/head
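
The arithmetic behind the "more than 1 min" figure can be worked through with a small sketch. The 10 ms per zvol comes from the commit message above (as a lower bound); the zvol count and the number of concurrent workers used below are assumed figures, and the parallel model is idealised.

```c
#include <stdint.h>

/* Total wait if each teardown blocks for per_zvol_ms, one after another. */
uint64_t sequential_wait_ms(uint64_t nzvols, uint64_t per_zvol_ms)
{
	return nzvols * per_zvol_ms;
}

/* Idealised wait with `workers` teardowns in flight: ceil(n / workers) rounds. */
uint64_t parallel_wait_ms(uint64_t nzvols, uint64_t per_zvol_ms,
    uint64_t workers)
{
	if (workers == 0)
		return sequential_wait_ms(nzvols, per_zvol_ms);
	uint64_t rounds = (nzvols + workers - 1) / workers;
	return rounds * per_zvol_ms;
}
```

For example, `sequential_wait_ms(6000, 10)` gives 60000 ms, i.e. one minute for 6000 zvols, while `parallel_wait_ms(6000, 10, 8)` gives 7500 ms under the same assumptions.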
@kernelOfTruth
Contributor Author

Additionally pushing the changes from #5433; I want to make sure this works fine before I test it with an rt-kernel [https://github.com/openzfs/spl/pull/589] ...

The introduction of parallel zvol prefetch causes a deadlock when using
vdev_file.

spa_async->(spa_namespace_lock)->txg_wait_synced->(wait for txg_sync)
txg_sync->zio_wait->(wait for vdev_file_io_fsync on system_taskq)
zvol_prefetch_minors_impl (on system_taskq)->spa_open_common->(wait for spa_namespace_lock)

We fix this by using a dedicated taskq for vdev_file.  This same change
was originally made in commit bc25c93 but reverted in commit aa9af22
when dynamic taskqs were added.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes openzfs#5506 
Closes openzfs#5495
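
The three-line wait chain in the commit message above forms a cycle, which can be checked with a tiny wait-for graph. This is a model of the reasoning only, not ZFS code: the actor names, the `waits_for` arrays, and `has_deadlock` are hypothetical, and the model assumes the fsync task is queued behind the prefetch task on the shared system_taskq.

```c
#include <stddef.h>

enum actor {
	SPA_ASYNC,        /* holds spa_namespace_lock, waits for txg_sync */
	TXG_SYNC,         /* waits for the vdev_file fsync to complete */
	VDEV_FILE_FSYNC,  /* queued on a taskq */
	ZVOL_PREFETCH,    /* runs on system_taskq, wants spa_namespace_lock */
	NACTORS,
	NOBODY = -1
};

/* Shared system_taskq: the fsync is stuck behind the prefetch task. */
const int with_shared_taskq[NACTORS] = {
	[SPA_ASYNC]       = TXG_SYNC,
	[TXG_SYNC]        = VDEV_FILE_FSYNC,
	[VDEV_FILE_FSYNC] = ZVOL_PREFETCH,
	[ZVOL_PREFETCH]   = SPA_ASYNC,
};

/* Dedicated taskq for vdev_file: the fsync no longer waits on anyone. */
const int with_dedicated_taskq[NACTORS] = {
	[SPA_ASYNC]       = TXG_SYNC,
	[TXG_SYNC]        = VDEV_FILE_FSYNC,
	[VDEV_FILE_FSYNC] = NOBODY,
	[ZVOL_PREFETCH]   = SPA_ASYNC,
};

/* waits_for[a] is the actor that a is blocked on, or NOBODY. Returns 1 on a cycle. */
int has_deadlock(const int waits_for[NACTORS])
{
	for (int start = 0; start < NACTORS; start++) {
		int cur = start;
		for (int steps = 0; steps < NACTORS && cur != NOBODY; steps++) {
			cur = waits_for[cur];
			if (cur == start)
				return 1;
		}
	}
	return 0;
}
```

In this model `has_deadlock(with_shared_taskq)` reports a cycle, and removing the single edge from the fsync to the prefetch task, as a dedicated taskq does, breaks it.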
@kernelOfTruth
Contributor Author

adding the fix for deadlock [haven't run into it yet],

preparing usage & testing of 4.9-rt kernel

@behlendorf
Contributor

Closing to cut down on the number of open PRs. The test results will remain available as always.

@behlendorf behlendorf closed this Jan 14, 2017
8 participants