Update fork #1

Merged
merged 28 commits into from
Apr 10, 2015

Conversation

dasjoe
Owner

@dasjoe dasjoe commented Apr 10, 2015

No description provided.

Josef 'Jeff' Sipek and others added 28 commits March 12, 2015 15:40
5047 don't use atomic_*_nv if you discard the return value
Author: Josef 'Jeff' Sipek <[email protected]>
Reviewed by: Garrett D'Amore <[email protected]>
Reviewed by: Jason King <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Approved by: Robert Mustacchi <[email protected]>

References:
  https://www.illumos.org/issues/5047
  illumos/illumos-gate@640c167

Porting Notes:

Several hunks from the original patch were not specific to ZFS
and thus were dropped.

Ported-by: Chris Dunlop <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Issue #3172
5630 stale bonus buffer in recycled dnode_t leads to data corruption
Author: Justin T. Gibbs <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Will Andrews <[email protected]>
Approved by: Robert Mustacchi <[email protected]>

References:
  https://www.illumos.org/issues/5630
  illumos/illumos-gate@cd485b4

Ported-by: Chris Dunlop <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Issue #3172
Explicitly disable the unused variable warnings by setting
__attribute__((unused)) for bdi_setup_and_register().  This is
required because the function is defined with the __must_check
attribute.

Signed-off-by: Bill McGonigle <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #3141
As of automake 1.14.2, currently shipped with Ubuntu 14.04, automake
warns about AM_INIT_AUTOMAKE having more than one argument:

configure.ac:41: warning: AM_INIT_AUTOMAKE: two- and three-arguments forms are deprecated.  For more info, see:
configure.ac:41: http://www.gnu.org/software/automake/manual/automake.html#Modernize-AM_005fINIT_005fAUTOMAKE-invocation

This commit fixes the warnings by following above link's advice, so
AM_INIT gets called with the package's name and version. As both are
defined in the META file we're parsing it with `grep`, `cut` and `tr`.

NOTE: autoconf < 1.14 does not support m4_esyscmd_s, so m4_esyscmd was
used instead and `tr` was modified to strip newlines, too.

Signed-off-by: Hajo Möller <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #3174
Use MUTEX_FSTRANS on l2arc_buflist_mtx to prevent the following deadlock
scenario:
1. arc_release() -> hash_lock -> l2arc_buflist_mtx
2. l2arc_write_buffers() -> l2arc_buflist_mtx -> (direct reclaim) ->
   arc_buf_remove_ref() -> hash_lock

Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Issue #3160
The extra one was under the 'zfs receive' command (which isn't relevant).
Instead, it should have been further up (still in the 'zfs send' option).

Signed-off-by: Turbo Fredriksson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #3194
I noticed when reviewing documentation that it is possible for user
space to use fcntl(fd, F_SETPIPE_SZ, (unsigned long) size) to change
the kernel pipe buffer size on Linux to increase the pipe size up to
the value specified in /proc/sys/fs/pipe-max-size. There are users using
mbuffer to improve zfs recv performance when piping over the network, so
it seems advantageous to integrate such functionality directly into the
zfs recv tool. This avoids the addition of two buffers and two copies
(one for the buffer mbuffer adds and another for the additional pipe),
so it should be more efficient.

This could have been made configurable and/or this could have changed
the value back to the original after we were done with the file
descriptor, but I do not see a strong case for doing either, so I
went with a simple implementation.
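
The call described above can be sketched in user space as follows. This is
a minimal illustration, not the code added to zfs recv: the helper name
grow_pipe() is hypothetical, and the snippet is Linux-only because
F_SETPIPE_SZ is a Linux extension.

```c
#define _GNU_SOURCE /* exposes F_SETPIPE_SZ in <fcntl.h> on glibc */
#include <fcntl.h>
#include <unistd.h>

/* Fallback in case the feature macro above was defined too late. */
#if defined(__linux__) && !defined(F_SETPIPE_SZ)
#define F_SETPIPE_SZ 1031
#endif

/*
 * Hypothetical helper: ask the kernel to resize the pipe behind fd to
 * at least `size` bytes.  Returns the capacity the kernel actually set,
 * or -1 on failure (for example when fd is not a pipe, or when an
 * unprivileged caller asks for more than /proc/sys/fs/pipe-max-size).
 */
static long
grow_pipe(int fd, int size)
{
    return (fcntl(fd, F_SETPIPE_SZ, size));
}
```

A caller would typically invoke this once on the input descriptor before
reading the stream and carry on with the default pipe size if it fails.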

Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1161
The arc_meta_max value should be increased when space is consumed, not when
it is returned.  This ensures that arc_meta_max is always up to date.
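
The accounting rule can be illustrated with a small stand-alone sketch.
The struct and function names below are hypothetical stand-ins for
arc_meta_used and arc_meta_max, not the ARC code itself.

```c
#include <stdint.h>

/* Hypothetical stand-ins for arc_meta_used and arc_meta_max. */
struct meta_stats {
    uint64_t used; /* bytes of metadata currently accounted */
    uint64_t max;  /* high-water mark of `used` */
};

/* Consuming space is the only event that can raise the high-water mark. */
static void
meta_consume(struct meta_stats *s, uint64_t bytes)
{
    s->used += bytes;
    if (s->used > s->max)
        s->max = s->used;
}

/* Returning space only lowers `used`; `max` is left untouched. */
static void
meta_return(struct meta_stats *s, uint64_t bytes)
{
    s->used -= bytes;
}
```

Updating the mark on the consume path means a reader never sees a maximum
that is lower than the current usage.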

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pavel Snajdr <[email protected]>
Issue #3160
Originally when the ARC prune callback was introduced the idea was
to register a single callback for the ZPL.  The ARC could invoke this
callback if it needed the ZPL to drop dentries, inodes, or other
cache objects which might be pinning buffers in the ARC.  The ZPL
would iterate over all ZFS super blocks and perform the reclaim.

For the most part this design has worked well but due to limitations
in 2.6.35 and earlier kernels there were some problems.  This patch
is designed to address those issues.

1) iterate_supers_type() is not provided by all kernels which makes
it impossible to safely iterate over all zpl_fs_type filesystems in
a single callback.  The most straightforward and portable way to
resolve this is to register a callback per-filesystem during mount.
The arc_*_prune_callback() functions have always supported multiple
callbacks so this is functionally a very small change.

2) Commit 050d22b removed the non-portable shrink_dcache_memory()
and shrink_icache_memory() functions and didn't replace them with
equivalent functionality.  This meant that for Linux 3.1 and older
kernels the ARC had no mechanism to drop dentries and inodes from
the caches if needed.  This patch adds that missing functionality
by calling shrink_dcache_parent() to release dentries which may be
pinning inodes.  This will result in all unused cache entries being
dropped which is a bit heavy handed but it's the only interface
available for old kernels.

3) A zpl_drop_inode() callback is registered for kernels older than
2.6.35 which do not support the .evict_inode callback.  This ensures
that when the last reference on an inode is dropped it is immediately
removed from the cache.  If this isn't done then inodes can end up on
the global unused LRU with no mechanism available to ZFS to drop them.
Since the ARC buffers are not dropped the hottest inodes can still
be recreated without performing disk IO.

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pavel Snajdr <[email protected]>
Issue #3160
The goal of this function is to evict enough meta data buffers from the
ARC in order to enforce the arc_meta_limit.  Achieving this is slightly
more complicated than it appears because it is common for data buffers
to have holds on meta data buffers.  In addition, dnode meta data buffers
will be held by the dnodes in the block preventing them from being freed.
This means we can't simply traverse the ARC and expect to always find
enough unheld meta data buffers to release.

Therefore, this function has been updated to make alternating passes
over the ARC releasing data buffers and then newly unheld meta data
buffers.  This ensures forward progress is maintained and arc_meta_used
will decrease.  Normally this is sufficient, but if required the ARC
will call the registered prune callbacks causing dentries and inodes to
be dropped from the VFS cache.  This will make dnode meta data buffers
available for reclaim.  The total number of restarts is limited by
zfs_arc_meta_adjust_restarts to prevent spinning in the rare case
where all meta data is pinned.
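
The alternating-pass idea can be sketched with a toy model.  Everything
here is hypothetical: the struct, its fields, and the function names are
invented for illustration, each data buffer is assumed to hold exactly
one meta data buffer, and the prune callbacks are not modelled.

```c
#include <stddef.h>

/* Toy model, not the ARC: every field and name here is hypothetical. */
struct toy_arc {
    size_t meta_used;   /* meta data buffers currently cached */
    size_t meta_unheld; /* meta data buffers with no holds (evictable) */
    size_t data_bufs;   /* data buffers, each holding one meta buffer */
    size_t meta_limit;  /* target ceiling for meta_used */
};

/* Evict unheld meta data until the limit is met or none is left. */
static void
evict_unheld_meta(struct toy_arc *a)
{
    while (a->meta_used > a->meta_limit && a->meta_unheld > 0) {
        a->meta_used--;
        a->meta_unheld--;
    }
}

/* Evict up to n data buffers; each eviction releases one meta hold. */
static void
evict_data(struct toy_arc *a, size_t n)
{
    while (n-- > 0 && a->data_bufs > 0) {
        a->data_bufs--;
        a->meta_unheld++;
    }
}

/*
 * Alternate meta data and data passes, restarting at most max_restarts
 * times.  Returns 1 if meta_used ended at or below meta_limit, else 0.
 */
static int
toy_adjust_meta(struct toy_arc *a, int max_restarts)
{
    for (;;) {
        evict_unheld_meta(a);
        if (a->meta_used <= a->meta_limit)
            return (1);
        if (max_restarts-- <= 0 || a->data_bufs == 0)
            return (0);
        evict_data(a, a->meta_used - a->meta_limit);
    }
}
```

The restart cap plays the role the message assigns to
zfs_arc_meta_adjust_restarts: when everything is pinned the loop gives up
instead of spinning.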

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pavel Snajdr <[email protected]>
Issue #3160
zfs_sb_t has grown to the point where using kmem_zalloc() for allocations
is triggering the 32k warning threshold.

We can't safely convert this entire allocation to use vmem_alloc() instead
of kmem_alloc() because the backing_dev_info structure is embedded here.
It depends on the bit_waitqueue() function which won't behave properly
when given a virtual address.

Instead, use vmem_alloc() to allocate the z_hold_mtx array separately.

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chris Dunlop <[email protected]>
Closes #3178
zio_inject.c keeps zio_injection_enabled as a counter of
fault handlers, so it should not be exported to user space as
a module option.

Several EXPORT_SYMBOLs are moved from zio.c to zio_inject.c,
where the symbols are defined.

Signed-off-by: Isaac Huang <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #3199
Execute udevadm settle before trying to import pools.  Otherwise the
disk device nodes may not be ready before import time.  This is
analogous to the behavior of the init scripts and systemd units.

Signed-off-by: Gordan Bobic <[email protected]>
Signed-off-by: Pavel Snajdr <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #3213
ZoL had been setting max_sectors to UINT_MAX, but until Linux 3.19 the
kernel artificially capped it at 1024 (BLK_DEF_MAX_SECTORS).
This cap was removed in torvalds/linux@34b48db.  This patch changes
it to DMU_MAX_ACCESS (in sectors) and also changes the ASSERT in
dmu_tx_hold_write() to allow the maximum transfer size.
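
The "in sectors" conversion is a shift by the 512-byte sector size.  The
sketch below assumes an illustrative byte limit of 10 MiB; that number and
the TOY_ names are assumptions made for the example, so check
DMU_MAX_ACCESS in your own tree.

```c
#include <stdint.h>

#define TOY_SECTOR_SHIFT 9 /* 512-byte sectors */

/* Assumed, illustrative byte limit; not taken from the patch. */
#define TOY_DMU_MAX_ACCESS (10ULL << 20)

/* Convert a byte count into whole 512-byte sectors. */
static uint64_t
bytes_to_sectors(uint64_t bytes)
{
    return (bytes >> TOY_SECTOR_SHIFT);
}
```

With the assumed 10 MiB limit this gives 20480 sectors, well above the
1024-sector cap mentioned above.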

Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #3212
When called to free a spill block from a dnode, dbuf_free_range() has a
bug that results in all dbufs for the dnode getting freed.  A variety of
problems may result from this bug, but a common one was a zap lookup
tripping an ASSERT because the zap buffers had been zeroed out.  This
could happen on a dataset with xattr=sa set when extended attributes are
written and removed on a directory concurrently with I/O to files in
that directory.

Signed-off-by: Ned Bass <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Fixes #3195
Fixes #3204
Fixes #3222
When using 'zpool import' to scan for available pools prefer vdev names
which reference vdevs with more valid labels.  There should be two labels
at the start of the device and two labels at the end of the device.  If
labels are missing then the device has been damaged or is in some other
way incomplete.  Preferring names with fully intact labels helps weed out
bad paths and improves the likelihood of being able to import the pool.

This behavior only applies when scanning /dev/ for valid pools.  If a
cache file exists the pools described by the cache file will be used.
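
The selection rule can be sketched as a simple maximum over candidate
names.  The struct and function below are hypothetical and only show the
preference; they are not the libzfs import code.

```c
#include <stddef.h>

/* Hypothetical candidate: one path plus how many labels validated. */
struct toy_name {
    const char *path;
    int valid_labels; /* 0..4: two at the start, two at the end */
};

/*
 * Return the candidate with the most valid labels.  Ties keep the
 * earlier entry.  Returns NULL when there are no candidates.
 */
static const struct toy_name *
pick_best_name(const struct toy_name *names, size_t n)
{
    const struct toy_name *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (best == NULL ||
            names[i].valid_labels > best->valid_labels)
            best = &names[i];
    }
    return (best);
}
```

A path that exposes only the two leading labels, such as a whole-disk node
when the pool lives on a partition, loses to a path showing all four.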

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chris Dunlap <[email protected]>
Closes #3145
Closes #2844
Closes #3107
Originally it was thought that custom spec files might be required
for Fedora.  Happily that has turned out not to be the case.  Since
this directory just contains symlinks to the generic spec files it
can be removed.

Signed-off-by: Brian Behlendorf <[email protected]>
Provide a Redhat specific zfs-kmod.spec file which uses the old style
kmods (not kmods2) packaging.  By using the provided kmodtool script
packages can be built which support weak modules.  This allows for the
kernel to be updated without having to rebuild the ZFS kernel modules.

Packages for RHEL/Centos/SL/TOSS which use this spec file can be built
as follows:

$ ./configure --with-spec=redhat
$ make rpms

Signed-off-by: Brian Behlendorf <[email protected]>
The owner field could be NULL in some cases, so add a guard.  Shorten
__entry field names to fit assignment statements in 80 columns.

Signed-off-by: Ned Bass <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Fixes #3220
Make the 'zpool import' command honor the overlay property to allow
filesystems to be mounted on a non-empty directory. As it stands now
this property is only checked by the 'zfs mount' command.  Move the
check into 'zfs_mount()' in libzpool so the property is honored for all
callers.

Signed-off-by: Ned Bass <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #3227
5695 dmu_sync'ed holes do not retain birth time
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: Bayard Bell <[email protected]>
Approved by: Dan McDonald <[email protected]>

References:
  https://www.illumos.org/issues/5695
  illumos/illumos-gate@70163ac

Ported-by: Chris Dunlop <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #3229
Align code in traverse_visitbp() with that in Illumos in preparation for
applying Illumos-5694.

No functional change: use a temporary variable pd to replace multiple
occurrences of td->td_pfd.  This increases our stack use slightly more
than normal because the function is called recursively.

Signed-off-by: Chris Dunlop <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #3230
5694 traverse_prefetcher does not prefetch enough
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Alex Reece <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: Josef 'Jeff' Sipek <[email protected]>
Reviewed by: Bayard Bell <[email protected]>
Approved by: Garrett D'Amore <[email protected]>

References:
  https://www.illumos.org/issues/5694
  illumos/illumos-gate@34d7ce05

Ported-by: Chris Dunlop <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #3230
…and bp_override

5693 ztest fails in dbuf_verify: buf[i] == 0, due to dedup and bp_override
Reviewed by: George Wilson <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: Bayard Bell <[email protected]>
Approved by: Dan McDonald <[email protected]>

References:
  https://www.illumos.org/issues/5693
  illumos/illumos-gate@7f7ace3

Ported-by: Chris Dunlop <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #3231
Commit b738bc5 should have updated the default value of zfs_pd_bytes_max
in the zfs(8) man page.  The correct default value is 50*1024*1024.

Signed-off-by: Brian Behlendorf <[email protected]>
Prevent deadlocks by disabling direct reclaim during all ZPL and ioctl
calls as well as the l2arc and adapt ARC threads.

This obviates the need for MUTEX_FSTRANS so its previous uses and
definition have been eliminated.

Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #3225
The packed nvlist allocated in spa_config_write() may exceed the
warning threshold for large configurations.  Use the vmem interfaces
for this short lived allocation.

Signed-off-by: Brian Behlendorf <[email protected]>
Closes #3251
META file and release log updated.

Signed-off-by: Brian Behlendorf <[email protected]>
dasjoe added a commit that referenced this pull request Apr 10, 2015
@dasjoe dasjoe merged commit b4a9adc into dasjoe:master Apr 10, 2015
dasjoe pushed a commit that referenced this pull request May 15, 2015
The params to the functions are uint64_t, but the offsets to memcpy
/ bcopy are calculated using 32bit ints. This patch changes them to
also be uint64_t so there isn't an overflow. PaX's Size Overflow
caught this when formatting a zvol.
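
The class of bug can be shown with a stand-alone sketch.  The function
names are hypothetical, and the narrow variant uses uint32_t instead of
the original int so that the truncation is well defined in C.  The inputs
in the usage note are chosen to make the truncation visible and are not
the values from the report.

```c
#include <stdint.h>

static uint64_t
min_u64(uint64_t a, uint64_t b)
{
    return (a < b ? a : b);
}

/* Narrow variant: the copy length is squeezed through 32 bits. */
static uint32_t
copy_len_narrow(uint64_t db_size, uint64_t bufoff, uint64_t size)
{
    return ((uint32_t)min_u64(db_size - bufoff, size));
}

/* Wide variant: the copy length keeps the full 64-bit range. */
static uint64_t
copy_len_wide(uint64_t db_size, uint64_t bufoff, uint64_t size)
{
    return (min_u64(db_size - bufoff, size));
}
```

For a requested size of 4 GiB + 4096 bytes against a larger buffer, the
wide variant returns the full length while the narrow one silently wraps
to 4096.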

Gentoo bug: #546490

PAX: offset: 1ffffb000 db->db_offset: 1ffffa000 db->db_size: 2000 size: 5000
PAX: size overflow detected in function dmu_read /var/tmp/portage/sys-fs/zfs-kmod-0.6.3-r1/work/zfs-zfs-0.6.3/module/zfs/../../module/zfs/dmu.c:781 cicus.366_146 max, count: 15
CPU: 1 PID: 2236 Comm: zvol/10 Tainted: P           O   3.17.7-hardened-r1 #1
Call Trace:
 [<ffffffffa0382ee8>] ? dsl_dataset_get_holds+0x9d58/0x343ce [zfs]
 [<ffffffff81a59c88>] dump_stack+0x4e/0x7a
 [<ffffffffa0393c2a>] ? dsl_dataset_get_holds+0x1aa9a/0x343ce [zfs]
 [<ffffffff81206696>] report_size_overflow+0x36/0x40
 [<ffffffffa02dba2b>] dmu_read+0x52b/0x920 [zfs]
 [<ffffffffa0373ad1>] zrl_is_locked+0x7d1/0x1ce0 [zfs]
 [<ffffffffa0364cd2>] zil_clean+0x9d2/0xc00 [zfs]
 [<ffffffffa0364f21>] zil_commit+0x21/0x30 [zfs]
 [<ffffffffa0373fe1>] zrl_is_locked+0xce1/0x1ce0 [zfs]
 [<ffffffff81a5e2c7>] ? __schedule+0x547/0xbc0
 [<ffffffffa01582e6>] taskq_cancel_id+0x2a6/0x5b0 [spl]
 [<ffffffff81103eb0>] ? wake_up_state+0x20/0x20
 [<ffffffffa0158150>] ? taskq_cancel_id+0x110/0x5b0 [spl]
 [<ffffffff810f7ff4>] kthread+0xc4/0xe0
 [<ffffffff810f7f30>] ? kthread_create_on_node+0x170/0x170
 [<ffffffff81a62fa4>] ret_from_fork+0x74/0xa0
 [<ffffffff810f7f30>] ? kthread_create_on_node+0x170/0x170

Signed-off-by: Jason Zaman <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#3333
dasjoe pushed a commit that referenced this pull request Oct 25, 2016
Leaks reported by using AddressSanitizer, GCC 6.1.0

Direct leak of 4097 byte(s) in 1 object(s) allocated from:
    #1 0x414f73 in process_options cmd/ztest/ztest.c:721

Direct leak of 5440 byte(s) in 17 object(s) allocated from:
    #1 0x41bfd5 in umem_alloc ../../lib/libspl/include/umem.h:88
    #2 0x41bfd5 in ztest_zap_parallel cmd/ztest/ztest.c:4659
    #3 0x4163a8 in ztest_execute cmd/ztest/ztest.c:5907

Signed-off-by: Gvozden Neskovic <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#4896