Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[buildbot tests] [PR5062, PR5404, PR5449, PR5285, LRU for ABD, zvol: reduce linear list search] #5451

Closed

Conversation

kernelOfTruth
Copy link
Contributor

@kernelOfTruth kernelOfTruth commented Dec 4, 2016

The following pull-requests are tested on top of latest master which includes ABD changes,

pushing this through a battery of tests to make sure it's stable for my "primetime" (bleeding edge production) testing

#5062 Godfather's child I/Os that may can't done and lost chance to reexecute again
#5404 OpenZFS 7303 - dynamic metaslab selection
#5449 OpenZFS 6569 - large file delete can starve out write ops
#5285 Allow zfs to send replication streams when missing snapshots in the hierarchy
https://github.com/jxiong/zfs/commits/abd_LRU [rebased]
tuxoko@0a7c4b7 zvol: reduce linear list search

Thanks

Chunwei Chen and others added 5 commits December 4, 2016 21:02
Use kernel ida to generate minor number, and use hash table to find zvol
with
name.

Signed-off-by: Chunwei Chen <[email protected]>
Maintain a per-cpu LRU cache for abd_t.

Two ZFS parameters have added to control the LRU:
1. zfs_abd_lru_max_size: control the maximum size of per-cpu abd size;
2. zfs_abd_lru_purge_count: number of abd should be purged from each
   LRU bucket during cache shrink.

Performance comparision after this patch is applied:
+------------+-----------------------+
| before     | 5871834.88 KB/sec     |
+------------+-----------------------+
| after      | 6470394.88 KB/sec     |
+------------+-----------------------+

The test was performed on a node with 6 NVMe SSD drives installed.

Signed-off-by: Jinshan Xiong <[email protected]>
After resume the suspend godfather I/O , if some child zio's io_reexecute flag assigned ZIO_REEXECUTE_NOW bit,
but others assigned ZIO_REEXECUTE_SUSPEND bit, so the godfather I/O's io_reexecute flag will inherit both ZIO_REEXECUTE_NOW and ZIO_REEXECUTE_SUSPEND bit.

However these child I/Os which io_reexecute flag assigned ZIO_REEXECUTE_SUSPEND will remove from current
godfater I/O's child list and add to new spa_suspend_zio_root's child list, but others zio which io_reexecute
flag assigned ZIO_REEXECUTE_NOW only notify godfater zio to execute and assigned self io_reexecute flag value to godfater zio.

At last, the godfater I/O execute above code in zio_done function, and then godfather I/O execute done and destroy,
and this lead to these child zio which io_reexecute flag assigned ZIO_REEXECUTE_NOW will lose monitor zio and have no chance to reexecute again.

So fix zio_done() should only be clearing the ZIO_REEXECUTE_SUSPEND bit in this case.
This change introduces a new weighting algorithm to improve
metaslab selection. The new weighting algorithm relies on the
SPACEMAP_HISTOGRAM feature. As a result, the metaslab weight
now encodes the type of weighting algorithm used (size-based
vs segment-based).

Authored by: George Wilson <[email protected]>
Reviewed by: Alex Reece <[email protected]>
Reviewed by: Chris Siden <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Paul Dagnelie <[email protected]>
Reviewed by: Pavel Zakharov [email protected]
Reviewed by: Prakash Surya <[email protected]>
Reviewed by: Don Brady <[email protected]>
Approved by:
Ported-by: Don Brady <[email protected]>

OpenZFS-issue: https://www.illumos.org/issues/7303
OpenZFS-commit:
openzfs/openzfs@8710fccea7

Porting Notes: The metaslab allocation tracing code is conditionally
               removed on linux (dependent on mdb debugger).
…ierarchy

With OpenZFS 6111 we lost the ability to create replication send streams
when we're missing even a single snapshot in the whole hierarchy: this
commit restore this functionality.

Add also relative zfs_send_008_pos.ksh script to ZFS test suite.

Signed-off-by: loli10K <[email protected]>
@mention-bot
Copy link

@kernelOfTruth, thanks for your PR! By analyzing the history of the files in this pull request, we identified @grwilson, @behlendorf and @don-brady to be potential reviewers.

@kernelOfTruth
Copy link
Contributor Author

kernelOfTruth commented Dec 4, 2016

amended latest commit with no changes == different hash,

hopefully the buildbots this time are fine

edit:

/tmp/zfs-build-buildbot-H3lFoW5P/BUILD/zfs-kmod-0.7.0/_kmod_build_4.4.30-32.54.amzn1.x86_64/../zfs-0.7.0/module/zfs/abd.c:122:26: fatal error: sys/spinlock.h: No such file or directory
 #include <sys/spinlock.h>
                          ^
compilation terminated.

alright, need to complete the patchset, seems I oversaw something ...

edit2:

introduced by

LRU for ABD

edit3:

debug builds are broken

../../module/zfs/abd.c: In function ‘abd_lru_put’:
../../module/zfs/abd.c:339:26: error: incompatible type for argument 1 of ‘refcount_is_zero’
  ASSERT(refcount_is_zero(abd->abd_children));
                          ^
../../lib/libspl/include/assert.h:48:13: note: in definition of macro ‘VERIFY’
  (void) ((!(cond)) &&      \
             ^
../../module/zfs/abd.c:339:2: note: in expansion of macro ‘ASSERT’
  ASSERT(refcount_is_zero(abd->abd_children));
  ^
In file included from ../../include/sys/abd.h:32:0,
                 from ../../module/zfs/abd.c:117:
../../include/sys/refcount.h:68:5: note: expected ‘refcount_t * {aka struct refcount *}’ but argument is of type ‘refcount_t {aka struct refcount}’
 int refcount_is_zero(refcount_t *rc);
     ^
In file included from ../../lib/libspl/include/sys/debug.h:30:0,
                 from ../../include/sys/abd.h:31,
                 from ../../module/zfs/abd.c:117:
../../module/zfs/abd.c: In function ‘abd_lru_cpu_purge’:
../../module/zfs/abd.c:397:11: error: implicit declaration of function ‘ABD_IS_SCATTER’ [-Werror=implicit-function-declaration]
    ASSERT(ABD_IS_SCATTER(obj));
           ^
../../lib/libspl/include/assert.h:48:13: note: in definition of macro ‘VERIFY’
  (void) ((!(cond)) &&      \
             ^
../../module/zfs/abd.c:397:4: note: in expansion of macro ‘ASSERT’
    ASSERT(ABD_IS_SCATTER(obj));
    ^
cc1: all warnings being treated as errors

@kernelOfTruth
Copy link
Contributor Author

I'm not certain whether that spinlock.h error is a limitation of the buildbots or a bogus error message,

LRU for ABD needs more testing and a PR anyway, it's very interesting 😄

CC: @jxiong

@kernelOfTruth
Copy link
Contributor Author

Closing for the moment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants