Fix multiple zdb bugs #7099

tuxoko · 2018-01-30T22:19:04Z

Description

This series fixes some bugs in zdb that I found recently while trying to debug a corrupted pool (the corruption is likely not a ZFS issue).

Fix zdb -c traverse stop on damaged objset root
    If a corruption happens to be on a root block of an objset, zdb -c will
    not correctly report the error, and it will not traverse the datasets
    that come after.
Fix zle_decompress out of bound access
Fix racy assignment of zcb.zcb_haderrors
Fix zdb -R decompression
    Add ZDB_NO_ZLE env to make zdb skip ZLE.
    The random bytes appended to pabd and the pbuf2 stuff serve no purpose. Instead, we randomize lbuf2 to make sure we fill exactly to lsize.
    Fix wrong compression fail condition.
Fix zdb -E segfault
Fix zdb -ed on objset for exported pool
    zdb passes the objset name to spa_import; when dmu_objset_own later tries to look up the spa using the real pool name, it can't find one.

tuxoko · 2018-02-02T01:05:09Z

Repurposed this pull request to cover multiple zdb bugs.

behlendorf

Thanks for running these down!

behlendorf · 2018-02-03T00:03:28Z

module/zfs/zle.c

@@ -74,10 +74,14 @@ zle_decompress(void *s_start, void *d_start, size_t s_len, size_t d_len, int n)
 	while (src < s_end && dst < d_end) {
 		int len = 1 + *src++;
 		if (len <= n) {
+			if (src + len > s_end || dst + len > d_end)


Check my math on this, but shouldn't this be >=, and the same below?

Assuming src + len == s_end, after running *dst++ = *src++ len times, src becomes s_end. That is fine, because *src++ dereferences before incrementing.
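
The boundary case discussed above can be checked with a small standalone sketch. It is modeled on the hunk quoted above rather than copied from module/zfs/zle.c; the name zle_decompress_sketch, the argument order, and the 0 / -1 return convention are assumptions made for illustration. The checks are written as length comparisons, which is equivalent to src + len > s_end but avoids forming a pointer past the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch of a ZLE-style decoder with the bounds checks discussed above.
 * For a control byte c, len = c + 1. If len <= n, the next len bytes are
 * copied as literals; otherwise (len - n) zero bytes are emitted.
 * Returns 0 on success, -1 on a truncated or overlong stream
 * (assumed convention for this sketch).
 */
static int
zle_decompress_sketch(const unsigned char *src, size_t s_len,
    unsigned char *dst, size_t d_len, int n)
{
	const unsigned char *s_end = src + s_len;
	unsigned char *d_end = dst + d_len;

	assert(n >= 0 && n <= 256);

	while (src < s_end && dst < d_end) {
		int len = 1 + *src++;

		if (len <= n) {
			/*
			 * Strictly greater, not >=: when exactly len bytes
			 * remain, the copy loop consumes them and leaves
			 * src == s_end without ever dereferencing s_end.
			 */
			if (len > s_end - src || len > d_end - dst)
				return (-1);
			while (len-- != 0)
				*dst++ = *src++;
		} else {
			len -= n;
			if (len > d_end - dst)
				return (-1);
			while (len-- != 0)
				*dst++ = 0;
		}
	}
	return (dst == d_end ? 0 : -1);
}
```

With n = 64, the stream { 2, 'a', 'b', 'c' } decodes into a 3-byte buffer because the literal run ends exactly at s_end, while the truncated stream { 2, 'a', 'b' } is rejected.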

behlendorf · 2018-02-03T00:17:30Z

cmd/zdb/zdb.c

@@ -3446,6 +3446,9 @@ dump_block_stats(spa_t *spa)
 		}
 	}

+	/* This is safe after zio_wait */


Let's be even more explicit in this comment since the race here is subtle. How about something like this.

/* Done after zio_wait() since zcb_haderrors is modified in zdb_blkptr_done() */

behlendorf · 2018-02-03T00:27:43Z

cmd/zdb/zdb.c

@@ -4481,8 +4495,10 @@ main(int argc, char **argv)
 		args.path = searchdirs;
 		args.can_be_active = B_TRUE;

-		error = zpool_tryimport(g_zfs, target, &cfg, &args);
+		error = zpool_tryimport(g_zfs, target_pool, &cfg, &args);


It's a little concerning we didn't have any test coverage for this. What do you think about adding a very basic test case to tests/zfs-tests/tests/functional/cli_root/zdb/ which covers passing the pool/dataset for imported/exported pools?

If a corruption happens to be on a root block of an objset, zdb -c will not correctly report the error, and it will not traverse the datasets that come after. This is because traverse_visitbp, which does the callback and resets the error for TRAVERSE_HARD, is skipped when traversing the ZIL fails in traverse_impl.

Here's an example of what the 'zdb -eLcc' command looks like on a pool with a damaged objset root:

== before patch:

Traversing all blocks to verify checksums ...

Error counts:

	errno  count
block traversal size 379392 != alloc 33987072 (unreachable 33607680)

	bp count:             172
	ganged count:           0
	bp logical:       1678336      avg:   9757
	bp physical:       130560      avg:    759     compression:  12.85
	bp allocated:      379392      avg:   2205     compression:   4.42
	bp deduped:             0    ref>1:      0   deduplication:   1.00
	SPA allocated:   33987072     used:  0.80%

	additional, non-pointer bps of type 0:         71
	Dittoed blocks on same vdev: 101

== after patch:

Traversing all blocks to verify checksums ...

zdb_blkptr_cb: Got error 52 reading <54, 0, -1, 0> -- skipping

Error counts:

	errno  count
	   52      1
block traversal size 33963520 != alloc 33987072 (unreachable 23552)

	bp count:             447
	ganged count:           0
	bp logical:      36093440      avg:  80745
	bp physical:     33699840      avg:  75391     compression:   1.07
	bp allocated:    33963520      avg:  75981     compression:   1.06
	bp deduped:             0    ref>1:      0   deduplication:   1.00
	SPA allocated:   33987072     used:  0.80%

	additional, non-pointer bps of type 0:         76
	Dittoed blocks on same vdev: 115

==

Signed-off-by: Chunwei Chen <[email protected]>

Signed-off-by: Chunwei Chen <[email protected]>

zcb_haderrors will be modified in zdb_blkptr_done, which is asynchronous. So we must move this assignment after zio_wait.

Signed-off-by: Chunwei Chen <[email protected]>

tuxoko · 2018-02-05T23:11:08Z

Updated according to review. Also mentioned ZDB_NO_ZLE in the zdb manpage.

behlendorf

This looks like it should resolve #6464; can you verify that's the case? Otherwise this LGTM.

tuxoko · 2018-02-06T02:08:32Z

The test failed on some test bot, not sure why.

Test: /usr/share/zfs/zfs-tests/tests/functional/cli_root/zdb/zdb_006_pos (run as root) [00:04] [FAIL]
23:43:03.44 ASSERTION: Verify zdb -d
23:43:03.87 SUCCESS: zpool create -f testpool mirror loop0 loop1 loop2
23:43:03.91 SUCCESS: zfs create testpool/testfs
23:43:03.98 SUCCESS: zfs set mountpoint=/mnt/testdir testpool/testfs
23:43:05.18 Dataset mos [META], ID 0, cr_txg 4, 73.5K, 49 objects
23:43:05.18 Dataset testpool/testfs [ZPL], ID 76, cr_txg 8, 24K, 6 objects
23:43:05.18 Dataset testpool [ZPL], ID 54, cr_txg 1, 24K, 7 objects
23:43:05.18 Verified large_blocks feature refcount of 0 is correct
23:43:05.18 Verified large_dnode feature refcount of 0 is correct
23:43:05.18 Verified sha512 feature refcount of 0 is correct
23:43:05.18 Verified skein feature refcount of 0 is correct
23:43:05.18 Verified edonr feature refcount of 0 is correct
23:43:05.18 Verified userobj_accounting feature refcount of 2 is correct
23:43:05.18 Verified encryption feature refcount of 0 is correct
23:43:05.18 SUCCESS: zdb -d testpool
23:43:06.38 Dataset testpool [ZPL], ID 54, cr_txg 1, 24K, 7 objects
23:43:06.39 SUCCESS: zdb -d testpool/
23:43:07.58 Dataset testpool/testfs [ZPL], ID 76, cr_txg 8, 24K, 6 objects
23:43:07.58 SUCCESS: zdb -d testpool/testfs
23:43:07.76 SUCCESS: zpool export testpool
23:43:08.16 zdb: can't open 'testpool': Invalid argument
23:43:08.16 ERROR: zdb -ed testpool exited 1

loli10K · 2018-02-06T18:15:35Z

Does this also fix #4984?

tuxoko · 2018-02-06T20:34:29Z

Yes, this should fix #4984 and #6464

loli10K · 2018-02-06T21:48:48Z

I've tested this locally: the failure in "zdb_006_pos" is only observed when running the test group [clean_mirror] before [zdb]. It seems the [clean_mirror] cleanup script fails to wipe some information from the disks; the EINVAL comes from zpool_tryimport() when it finds more than one matching pool:

root@linux:~# zdb -ed testpool
zdb: can't open 'testpool': Invalid argument
root@linux:~# zdb -ed 9144353876379755339
Dataset mos [META], ID 0, cr_txg 4, 73.5K, 49 objects
Dataset 9144353876379755339/testfs [ZPL], ID 76, cr_txg 7, 24K, 6 objects
Dataset 9144353876379755339 [ZPL], ID 54, cr_txg 1, 24K, 6 objects
Verified large_blocks feature refcount of 0 is correct
Verified large_dnode feature refcount of 0 is correct
Verified sha512 feature refcount of 0 is correct
Verified skein feature refcount of 0 is correct
Verified edonr feature refcount of 0 is correct
Verified userobj_accounting feature refcount of 2 is correct
Verified encryption feature refcount of 0 is correct
root@linux:~#

(gdb) bt
#0  zpool_tryimport (hdl=0xd47720, target=0x7fffffffee46 "testpool", configp=0x7fffffffea90, args=0x7fffffffea60) at libzfs_import.c:2327
#1  0x0000000000414f01 in main (argc=1, argv=0x7fffffffec38) at zdb.c:4501
(gdb) list
2322	
2323		if (count > 1) {
2324			(void) zfs_error_aux(hdl, dgettext(TEXT_DOMAIN,
2325			    "%d pools found, use pool GUID\n"), count);
2326			free(targetdup);
2327			return (EINVAL);
2328		}
2329	
2330		*configp = match;
2331		free(targetdup);
(gdb) p count
$1 = 2
(gdb) call nvlist_print(stderr, pools)
nvlist version: 0
	testpool = (embedded nvlist)
	nvlist version: 0
		vdev_children = 0x1
		version = 0x1388
		pool_guid = 0x8fa67d827301af28
		name = testpool
		state = 0x2
		vdev_tree = (embedded nvlist)
		nvlist version: 0
			type = root
			id = 0x0
			guid = 0x8fa67d827301af28
			children = (array of embedded nvlists)
			(start children[0])
			nvlist version: 0
				type = mirror
				id = 0x0
				guid = 0x79d4b21a1cd648d8
				metaslab_array = 0x44
				metaslab_shift = 0x18
				ashift = 0x9
				asize = 0x5980000
				is_log = 0x0
				create_txg = 0x4
				children = (array of embedded nvlists)
				(start children[0])
				nvlist version: 0
					type = disk
					id = 0x0
					guid = 0x11fb916a1579cc69
					whole_disk = 0x0
					create_txg = 0x4
					path = /dev/loop0p1
				(end children[0])
				(start children[1])
				nvlist version: 0
					type = disk
					id = 0x1
					guid = 0x263e1966a7fe8897
					whole_disk = 0x0
					create_txg = 0x4
					path = /dev/loop2p1
				(end children[1])

			(end children[0])

		(end vdev_tree)

	(end testpool)

	testpool = (embedded nvlist)
	nvlist version: 0
		vdev_children = 0x1
		version = 0x1388
		pool_guid = 0x7ee74566d614e34b
		name = testpool
		state = 0x1
		vdev_tree = (embedded nvlist)
		nvlist version: 0
			type = root
			id = 0x0
			guid = 0x7ee74566d614e34b
			children = (array of embedded nvlists)
			(start children[0])
			nvlist version: 0
				type = mirror
				id = 0x0
				guid = 0xfb634efe45d1cf24
				metaslab_array = 0x45
				metaslab_shift = 0x19
				ashift = 0x9
				asize = 0xffb80000
				is_log = 0x0
				create_txg = 0x4
				children = (array of embedded nvlists)
				(start children[0])
				nvlist version: 0
					type = disk
					id = 0x0
					guid = 0xe57c02d922d16b30
					whole_disk = 0x0
					create_txg = 0x4
					path = /dev/loop0
				(end children[0])
				(start children[1])
				nvlist version: 0
					type = disk
					id = 0x1
					guid = 0xec01ec4b1f3d32f1
					whole_disk = 0x0
					create_txg = 0x4
					path = /dev/loop1
				(end children[1])
				(start children[2])
				nvlist version: 0
					type = disk
					id = 0x2
					guid = 0xf089169b4273913
					whole_disk = 0x0
					create_txg = 0x4
					path = /dev/loop2
				(end children[2])

			(end children[0])

		(end vdev_tree)

	(end testpool)

(gdb)
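
The ambiguity check visible in the gdb listing above can be modeled in isolation. The sketch below is a simplified stand-in, not the libzfs implementation: struct pool_candidate and pick_pool_by_name are hypothetical names, matching is done on the pool name only, and returning ENOENT when nothing matches is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one discovered pool config. */
struct pool_candidate {
	const char *name;
	unsigned long long guid;
};

/*
 * Select the candidate whose name equals target. Returns 0 and stores the
 * match in *matchp when exactly one candidate matches, ENOENT when none
 * does (assumption), and EINVAL when the name is ambiguous.
 */
static int
pick_pool_by_name(const struct pool_candidate *pools, size_t npools,
    const char *target, const struct pool_candidate **matchp)
{
	const struct pool_candidate *match = NULL;
	size_t count = 0;
	size_t i;

	assert(target != NULL && matchp != NULL);

	for (i = 0; i < npools; i++) {
		if (strcmp(pools[i].name, target) == 0) {
			match = &pools[i];
			count++;
		}
	}
	if (count == 0)
		return (ENOENT);
	if (count > 1)
		return (EINVAL);	/* ambiguous: the caller must use the GUID */
	*matchp = match;
	return (0);
}
```

Fed two candidates that are both named testpool, as in the nvlist dump above, the sketch returns EINVAL, which is the situation the stale labels left behind by [clean_mirror] produce.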

loli10K · 2018-02-06T21:55:13Z

tests/zfs-tests/tests/functional/cli_root/zdb/zdb_006_pos.ksh

+
+log_must zdb -d $TESTPOOL
+log_must zdb -d $TESTPOOL/
+log_must zdb -d $TESTPOOL/$TESTFS


Should we also test the "snapshot" case (log_must zdb -d $TESTPOOL/$TESTFS@some-snap-name)?

loli10K · 2018-02-06T21:55:55Z

tests/zfs-tests/tests/functional/cli_root/zdb/zdb_006_pos.ksh

+
+log_must zdb -ed -p $DEV_RDSKDIR $TESTPOOL
+log_must zdb -ed -p $DEV_RDSKDIR $TESTPOOL/
+log_must zdb -ed -p $DEV_RDSKDIR $TESTPOOL/$TESTFS


Same here (?)

loli10K · 2018-02-06T22:03:00Z

cmd/zdb/zdb.c

@@ -4477,8 +4498,10 @@ main(int argc, char **argv)
 		args.path = searchdirs;
 		args.can_be_active = B_TRUE;

-		error = zpool_tryimport(g_zfs, target, &cfg, &args);
+		error = zpool_tryimport(g_zfs, target_pool, &cfg, &args);


Is there any way we can use the error message set by zfs_error_aux() in zpool_tryimport() to provide a more descriptive message here?

	if (error)
		fatal("can't open '%s': %s", target, strerror(error));

Let's leave this for later.

tuxoko · 2018-02-07T00:13:37Z

Updated: fixed the test failure and added a snapshot case to the test.

loli10K · 2018-02-08T10:37:09Z

man/man8/zdb.8

+Decompress the block. Set environment variable
+.Nm ZBD_NO_ZLE
+to skip zle when
+guessing.


nit: superfluous newline here?

loli10K · 2018-02-08T10:47:38Z

tests/zfs-tests/tests/functional/cli_root/zdb/zdb_006_pos.ksh

+}
+
+for DISK in $DISKS; do
+	log_must partprobe $DEV_RDSKDIR/$DISK


Instead of forcefully reloading the partitions here (doesn't seem to work on my local CentOS7 builder) we could update clean_mirror/cleanup.ksh:

diff --git a/tests/zfs-tests/tests/functional/clean_mirror/cleanup.ksh b/tests/zfs-tests/tests/functional/clean_mirror/cleanup.ksh
index ac3bfbc..fb0db31 100755
--- a/tests/zfs-tests/tests/functional/clean_mirror/cleanup.ksh
+++ b/tests/zfs-tests/tests/functional/clean_mirror/cleanup.ksh
@@ -38,10 +38,10 @@ df -F zfs -h | grep "$TESTFS " >/dev/null
 [[ $? == 0 ]] && log_must zfs umount -f $TESTDIR
 destroy_pool $TESTPOOL
 
-if is_mpath_device $MIRROR_PRIMARY; then
+if ( is_mpath_device $MIRROR_PRIMARY || is_loop_device $MIRROR_SECONDARY); then
 	parted $DEV_DSKDIR/$MIRROR_PRIMARY -s rm 1
 fi
-if is_mpath_device $MIRROR_SECONDARY; then
+if ( is_mpath_device $MIRROR_SECONDARY || is_loop_device $MIRROR_SECONDARY); then
 	parted $DEV_DSKDIR/$MIRROR_SECONDARY -s rm 1
 fi
 # recreate and destroy a zpool over the disks to restore the partitions to

loli10K · 2018-02-08T10:51:01Z

tests/zfs-tests/tests/functional/cli_root/zdb/zdb_006_pos.ksh

+
+function cleanup
+{
+	datasetexists $TESTPOOL && destroy_pool $TESTPOOL


destroy_pool uses poolexists internally, so there is no need for datasetexists $TESTPOOL && here.

I just copied this from other zdb tests for consistency.

loli10K · 2018-02-08T10:52:28Z

tests/zfs-tests/tests/functional/cli_root/zdb/zdb_006_pos.ksh

+	for DISK in $DISKS; do
+		zpool labelclear -f $DEV_RDSKDIR/$DISK
+	done
+}


nit: need to destroy $TESTPOOL/$TESTFS@snap here.

Why? The pool is already destroyed.

There are some issues in the zdb -R decompression implementation. The first is that ZLE can easily decompress non-ZLE streams, so we add the ZDB_NO_ZLE env to make zdb skip ZLE. The second is the random bytes appended to pabd and the pbuf2 stuff. This serves no purpose at all; those bytes shouldn't be read during decompression anyway. Instead, we randomize lbuf2, so that we can make sure decompression fills exactly to lsize by comparing lbuf and lbuf2 with bcmp. The last is that the condition used to detect failure is wrong.

Signed-off-by: Chunwei Chen <[email protected]>
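
The two-buffer check described in this commit message can be illustrated with a standalone sketch. Everything here is hypothetical: decompress_fn, decompress_fills_exactly, and the two toy decompressors are made-up names, and fixed fill patterns replace the random bytes so the example stays deterministic. The idea is that any output byte the decompressor leaves unwritten keeps its differing initial value, so the final comparison fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical decompressor signature: returns 0 on success. */
typedef int (*decompress_fn)(const unsigned char *src, size_t psize,
    unsigned char *dst, size_t lsize);

/*
 * Return 1 only if fn reports success on both output buffers and both
 * hold identical bytes afterwards. The buffers start with different
 * contents, so a decompressor that underfills is caught by memcmp.
 */
static int
decompress_fills_exactly(decompress_fn fn, const unsigned char *src,
    size_t psize, size_t lsize)
{
	unsigned char *lbuf, *lbuf2;
	int ok = 0;

	assert(lsize > 0);

	lbuf = malloc(lsize);
	lbuf2 = malloc(lsize);
	if (lbuf != NULL && lbuf2 != NULL) {
		/* Fixed, differing patterns stand in for random bytes. */
		memset(lbuf, 0x00, lsize);
		memset(lbuf2, 0xA5, lsize);
		ok = fn(src, psize, lbuf, lsize) == 0 &&
		    fn(src, psize, lbuf2, lsize) == 0 &&
		    memcmp(lbuf, lbuf2, lsize) == 0;
	}
	free(lbuf);
	free(lbuf2);
	return (ok);
}

/* Toy decompressor that writes every output byte. */
static int
copy_all(const unsigned char *src, size_t psize, unsigned char *dst,
    size_t lsize)
{
	if (psize < lsize)
		return (-1);
	memcpy(dst, src, lsize);
	return (0);
}

/* Toy decompressor that underfills but still claims success. */
static int
copy_half(const unsigned char *src, size_t psize, unsigned char *dst,
    size_t lsize)
{
	if (psize < lsize / 2)
		return (-1);
	memcpy(dst, src, lsize / 2);
	return (0);
}
```

Here copy_all passes the check, while copy_half is rejected even though it returns 0, which is the kind of false positive the commit message attributes to guessing the wrong algorithm.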

SPA_MAXBLOCKSIZE is too large for stack.

Signed-off-by: Chunwei Chen <[email protected]>

zdb -ed on an objset of an exported pool would fail with:

failed to own dataset 'qq/fs0': No such file or directory

The reason is that zdb passes the objset name to spa_import, which uses that name to create a spa. Later, when dmu_objset_own tries to look up the spa using the real pool name, it can't find one. We fix this by making sure we pass the pool name rather than the objset name to spa_import.

Signed-off-by: Chunwei Chen <[email protected]>
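
Deriving the pool name from a dataset or snapshot target can be sketched as follows. This is an illustration under stated assumptions, not the code in cmd/zdb/zdb.c: pool_name_from_target is a hypothetical helper, and it assumes the target has the form pool[/dataset][@snapshot], so the pool name ends at the first '/' or '@'.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of the pool component of a
 * "pool[/dataset][@snapshot]" target, or NULL if allocation fails.
 * The caller frees the result. Hypothetical helper, not a zdb function.
 */
static char *
pool_name_from_target(const char *target)
{
	size_t len;
	char *pool;

	assert(target != NULL);

	/* The pool name ends at the first '/' or '@', or at end of string. */
	len = strcspn(target, "/@");
	pool = malloc(len + 1);
	if (pool == NULL)
		return (NULL);
	memcpy(pool, target, len);
	pool[len] = '\0';
	return (pool);
}
```

For the target from the error message above, 'qq/fs0', the helper yields "qq", which is the name an import would need so that a later lookup by pool name can succeed.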

loli10K

LGTM, thanks.

codecov · 2018-02-09T09:51:20Z

Codecov Report

Merging #7099 into master will decrease coverage by 0.78%.
The diff coverage is 65.51%.

@@            Coverage Diff             @@
##           master    #7099      +/-   ##
==========================================
- Coverage   76.14%   75.36%   -0.79%     
==========================================
  Files         324      296      -28     
  Lines      102933    95914    -7019     
==========================================
- Hits        78377    72282    -6095     
+ Misses      24556    23632     -924

Flag	Coverage Δ
#kernel	`74.96% <50%> (-1.1%)`	⬇️
#user	`67.33% <65.51%> (+2.13%)`	⬆️

If a corruption happens to be on a root block of an objset, zdb -c will not correctly report the error, and it will not traverse the datasets that come after. This is because traverse_visitbp, which does the callback and resets the error for TRAVERSE_HARD, is skipped when traversing the ZIL fails in traverse_impl.

Here's an example of what the 'zdb -eLcc' command looks like on a pool with a damaged objset root:

== before patch:

Traversing all blocks to verify checksums ...

Error counts:

	errno  count
block traversal size 379392 != alloc 33987072 (unreachable 33607680)

	bp count:             172
	ganged count:           0
	bp logical:       1678336      avg:   9757
	bp physical:       130560      avg:    759     compression:  12.85
	bp allocated:      379392      avg:   2205     compression:   4.42
	bp deduped:             0    ref>1:      0   deduplication:   1.00
	SPA allocated:   33987072     used:  0.80%

	additional, non-pointer bps of type 0:         71
	Dittoed blocks on same vdev: 101

== after patch:

Traversing all blocks to verify checksums ...

zdb_blkptr_cb: Got error 52 reading <54, 0, -1, 0> -- skipping

Error counts:

	errno  count
	   52      1
block traversal size 33963520 != alloc 33987072 (unreachable 23552)

	bp count:             447
	ganged count:           0
	bp logical:      36093440      avg:  80745
	bp physical:     33699840      avg:  75391     compression:   1.07
	bp allocated:    33963520      avg:  75981     compression:   1.06
	bp deduped:             0    ref>1:      0   deduplication:   1.00
	SPA allocated:   33987072     used:  0.80%

	additional, non-pointer bps of type 0:         76
	Dittoed blocks on same vdev: 115

==

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes openzfs#7099

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes openzfs#7099

zcb_haderrors will be modified in zdb_blkptr_done, which is asynchronous. So we must move this assignment after zio_wait.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes openzfs#7099

There are some issues in the zdb -R decompression implementation. The first is that ZLE can easily decompress non-ZLE streams, so we add the ZDB_NO_ZLE env to make zdb skip ZLE. The second is the random bytes appended to pabd and the pbuf2 stuff. This serves no purpose at all; those bytes shouldn't be read during decompression anyway. Instead, we randomize lbuf2, so that we can make sure decompression fills exactly to lsize by comparing lbuf and lbuf2 with bcmp. The last is that the condition used to detect failure is wrong.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes openzfs#7099
Closes openzfs#4984

SPA_MAXBLOCKSIZE is too large for stack.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes openzfs#7099

zdb -ed on an objset of an exported pool would fail with:

failed to own dataset 'qq/fs0': No such file or directory

The reason is that zdb passes the objset name to spa_import, which uses that name to create a spa. Later, when dmu_objset_own tries to look up the spa using the real pool name, it can't find one. We fix this by making sure we pass the pool name rather than the objset name to spa_import.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes openzfs#7099
Closes openzfs#6464

This is a squashed patchset for zfs-0.7.7. The individual commits are in the tonyhutter:zfs-0.7.7-hutter branch. I squashed the commits so that buildbot wouldn't have to run against each one, and because GitHub/buildbot seem to have a maximum limit of 30 commits they can test from a PR.

- Linux 4.16 compat: get_disk_and_module() openzfs#7264
- Change checksum & IO delay ratelimit values openzfs#7252
- Increment zil_itx_needcopy_bytes properly openzfs#6988 openzfs#7176
- Fix some typos openzfs#7237
- Fix zpool(8) list example to match actual format openzfs#7244
- Add SMART self-test results to zpool status -c openzfs#7178
- Add scrub after resilver zed script openzfs#4662 openzfs#7086
- Fix free memory calculation on v3.14+ openzfs#7170
- Report duration and error in mmp_history entries openzfs#7190
- Do not initiate MMP writes while pool is suspended openzfs#7182
- Linux 4.16 compat: use correct *_dec_and_test() openzfs#7179 openzfs#7211
- Allow modprobe to fail when called within systemd openzfs#7174
- Add SMART attributes for SSD and NVMe openzfs#7183 openzfs#7193
- Correct count_uberblocks in mmp.kshlib openzfs#7191
- Fix config issues: frame size and headers openzfs#7169
- Clarify zinject(8) explanation of -e openzfs#7172
- OpenZFS 8857 - zio_remove_child() panic due to already destroyed parent zio openzfs#7168
- 'zfs receive' fails with "dataset is busy" openzfs#7129 openzfs#7154
- contrib/initramfs: add missing conf.d/zfs openzfs#7158
- mmp should use a fixed tag for spa_config locks openzfs#6530 openzfs#7155
- Handle zap_add() failures in mixed case mode openzfs#7011 openzfs#7054
- Fix zdb -ed on objset for exported pool openzfs#7099 openzfs#6464
- Fix zdb -E segfault openzfs#7099
- Fix zdb -R decompression openzfs#7099 openzfs#4984
- Fix racy assignment of zcb.zcb_haderrors openzfs#7099
- Fix zle_decompress out of bound access openzfs#7099
- Fix zdb -c traverse stop on damaged objset root openzfs#7099
- Linux 4.11 compat: avoid refcount_t name conflict openzfs#7148
- Linux 4.16 compat: inode_set_iversion() openzfs#7148
- OpenZFS 8966 - Source file zfs_acl.c, function zfs_aclset_common contains a use after end of the lifetime of a local variable openzfs#7141
- Remove deprecated zfs_arc_p_aggressive_disable openzfs#7135
- Fix default libdir for Debian/Ubuntu openzfs#7083 openzfs#7101
- Bug fix in qat_compress.c for vmalloc addr check openzfs#7125
- Fix systemd_ RPM macros usage on Debian-based distributions openzfs#7074 openzfs#7100
- Emit an error message before MMP suspends pool openzfs#7048
- ZTS: Fix create-o_ashift test case openzfs#6924 openzfs#6977
- Fix --with-systemd on Debian-based distributions (openzfs#6963) openzfs#6591 openzfs#6963
- Remove vn_rename and vn_remove dependency openzfs/spl#648 openzfs#6753
- Add support for "--enable-code-coverage" option openzfs#6670
- Make "-fno-inline" compile option more accessible openzfs#6605
- Add configure option to enable gcov analysis openzfs#6642
- Implement --enable-debuginfo to force debuginfo openzfs#2734
- Make --enable-debug fail when given bogus args openzfs#2734

Signed-off-by: Tony Hutter <[email protected]>
Requires-spl: refs/pull/690/head

This is a squashed patchset for zfs-0.7.7. The individual commits are in the tonyhutter:zfs-0.7.7-hutter branch. I squashed the commits so that buildbot wouldn't have to run against each one, and because GitHub/buildbot seem to have a maximum limit of 30 commits they can test from a PR.

- Fix MMP write frequency for large pools openzfs#7205 openzfs#7289
- Handle zio_resume and mmp => off openzfs#7286
- Fix zfs-kmod builds when using rpm >= 4.14 openzfs#7284
- zdb and inuse tests don't pass with real disks openzfs#6939 openzfs#7261
- Take user namespaces into account in policy checks openzfs#6800 openzfs#7270
- Detect long config lock acquisition in mmp openzfs#7212
- Linux 4.16 compat: get_disk_and_module() openzfs#7264
- Change checksum & IO delay ratelimit values openzfs#7252
- Increment zil_itx_needcopy_bytes properly openzfs#6988 openzfs#7176
- Fix some typos openzfs#7237
- Fix zpool(8) list example to match actual format openzfs#7244
- Add SMART self-test results to zpool status -c openzfs#7178
- Add scrub after resilver zed script openzfs#4662 openzfs#7086
- Fix free memory calculation on v3.14+ openzfs#7170
- Report duration and error in mmp_history entries openzfs#7190
- Do not initiate MMP writes while pool is suspended openzfs#7182
- Linux 4.16 compat: use correct *_dec_and_test()
- Allow modprobe to fail when called within systemd openzfs#7174
- Add SMART attributes for SSD and NVMe openzfs#7183 openzfs#7193
- Correct count_uberblocks in mmp.kshlib openzfs#7191
- Fix config issues: frame size and headers openzfs#7169
- Clarify zinject(8) explanation of -e openzfs#7172
- OpenZFS 8857 - zio_remove_child() panic due to already destroyed parent zio openzfs#7168
- 'zfs receive' fails with "dataset is busy" openzfs#7129 openzfs#7154
- contrib/initramfs: add missing conf.d/zfs openzfs#7158
- mmp should use a fixed tag for spa_config locks openzfs#6530 openzfs#7155
- Handle zap_add() failures in mixed case mode openzfs#7011 openzfs#7054
- Fix zdb -ed on objset for exported pool openzfs#7099 openzfs#6464
- Fix zdb -E segfault openzfs#7099
- Fix zdb -R decompression openzfs#7099 openzfs#4984
- Fix racy assignment of zcb.zcb_haderrors openzfs#7099
- Fix zle_decompress out of bound access openzfs#7099
- Fix zdb -c traverse stop on damaged objset root openzfs#7099
- Linux 4.11 compat: avoid refcount_t name conflict openzfs#7148
- Linux 4.16 compat: inode_set_iversion() openzfs#7148
- OpenZFS 8966 - Source file zfs_acl.c, function zfs_aclset_common contains a use after end of the lifetime of a local variable openzfs#7141
- Remove deprecated zfs_arc_p_aggressive_disable openzfs#7135
- Fix default libdir for Debian/Ubuntu openzfs#7083 openzfs#7101
- Bug fix in qat_compress.c for vmalloc addr check openzfs#7125
- Fix systemd_ RPM macros usage on Debian-based distributions openzfs#7074 openzfs#7100
- Emit an error message before MMP suspends pool openzfs#7048
- ZTS: Fix create-o_ashift test case openzfs#6924 openzfs#6977
- Fix --with-systemd on Debian-based distributions (openzfs#6963) openzfs#6591 openzfs#6963
- Remove vn_rename and vn_remove dependency openzfs/spl#648 openzfs#6753
- Add support for "--enable-code-coverage" option openzfs#6670
- Make "-fno-inline" compile option more accessible openzfs#6605
- Add configure option to enable gcov analysis openzfs#6642
- Implement --enable-debuginfo to force debuginfo openzfs#2734
- Make --enable-debug fail when given bogus args openzfs#2734

Signed-off-by: Tony Hutter <[email protected]>
Requires-spl: refs/pull/690/head

Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes openzfs#7099

If a corruption happens to be on a root block of an objset, zdb -c will not correctly report the error, and it will not traverse the datasets that come after. This is because traverse_visitbp, which does the callback and reset error for TRAVERSE_HARD, is skipped when traversing zil is failed in traverse_impl. Here's example of what 'zdb -eLcc' command looks like on a pool with damaged objset root: == before patch: Traversing all blocks to verify checksums ... Error counts: errno count block traversal size 379392 != alloc 33987072 (unreachable 33607680) bp count: 172 ganged count: 0 bp logical: 1678336 avg: 9757 bp physical: 130560 avg: 759 compression: 12.85 bp allocated: 379392 avg: 2205 compression: 4.42 bp deduped: 0 ref>1: 0 deduplication: 1.00 SPA allocated: 33987072 used: 0.80% additional, non-pointer bps of type 0: 71 Dittoed blocks on same vdev: 101 == after patch: Traversing all blocks to verify checksums ... zdb_blkptr_cb: Got error 52 reading <54, 0, -1, 0> -- skipping Error counts: errno count 52 1 block traversal size 33963520 != alloc 33987072 (unreachable 23552) bp count: 447 ganged count: 0 bp logical: 36093440 avg: 80745 bp physical: 33699840 avg: 75391 compression: 1.07 bp allocated: 33963520 avg: 75981 compression: 1.06 bp deduped: 0 ref>1: 0 deduplication: 1.00 SPA allocated: 33987072 used: 0.80% additional, non-pointer bps of type 0: 76 Dittoed blocks on same vdev: 115 == Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes openzfs#7099

Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes openzfs#7099

zcb_haderrors will be modified in zdb_blkptr_done, which is asynchronous. So we must move this assignment after zio_wait. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes openzfs#7099

There are some issues in the zdb -R decompression implementation. The first is that ZLE can easily decompress non-ZLE streams. So we add ZDB_NO_ZLE env to make zdb skip ZLE. The second is the random bytes appended to pabd, pbuf2 stuff. This serve no purpose at all, those bytes shouldn't be read during decompression anyway. Instead, we randomize lbuf2, so that we can make sure decompression fill exactly to lsize by bcmp lbuf and lbuf2. The last one is the condition to detect fail is wrong. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes openzfs#7099 Closes openzfs#4984

SPA_MAXBLOCKSIZE is too large for the stack.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes openzfs#7099
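A buffer of this size is usually moved to the heap. The sketch below assumes a 16 MiB maximum block size (`TOY_MAXBLOCKSIZE` is a hypothetical constant mirroring SPA_MAXBLOCKSIZE) and uses plain `malloc`; `toy_decode_embedded` is an illustrative name and not the zdb function that was changed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: 16 MiB, mirroring SPA_MAXBLOCKSIZE. */
#define TOY_MAXBLOCKSIZE ((size_t)1 << 24)

/* Copies the payload into a max-size scratch buffer and returns byte 0. */
int
toy_decode_embedded(const unsigned char *payload, size_t len,
    unsigned char *first_byte)
{
	assert(payload != NULL && first_byte != NULL);

	if (len == 0 || len > TOY_MAXBLOCKSIZE)
		return (-1);

	/*
	 * "unsigned char buf[TOY_MAXBLOCKSIZE];" here would need 16 MiB of
	 * stack, which can exceed the process stack limit and crash.
	 */
	unsigned char *buf = malloc(TOY_MAXBLOCKSIZE);
	if (buf == NULL)
		return (-1);

	memcpy(buf, payload, len);
	*first_byte = buf[0];
	free(buf);
	return (0);
}
```

The heap allocation costs one extra failure path (the NULL check) but removes the dependency on the stack limit of whatever environment runs the tool.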

zdb -ed on an objset of an exported pool would fail with:

failed to own dataset 'qq/fs0': No such file or directory

The reason is that zdb passes the objset name to spa_import, which uses that name to create the spa. Later, when dmu_objset_own tries to look up the spa using the real pool name, it cannot find one. We fix this by making sure we pass the pool name rather than the objset name to spa_import.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes openzfs#7099
Closes openzfs#6464
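Deriving the pool name from a dataset or snapshot name amounts to cutting the string at the first '/' or '@'. The helper below is a minimal sketch of that step; `toy_pool_name` is a hypothetical function and not the code added to zdb.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns a newly allocated copy of the pool component of 'target'
 * ("qq/fs0" -> "qq"), or NULL on allocation failure. Caller frees.
 */
char *
toy_pool_name(const char *target)
{
	assert(target != NULL);

	/* Length of the prefix before the first '/' or '@', if any. */
	size_t len = strcspn(target, "/@");
	char *pool = malloc(len + 1);

	if (pool == NULL)
		return (NULL);
	memcpy(pool, target, len);
	pool[len] = '\0';
	return (pool);
}
```

For the failing case from the commit message, `toy_pool_name("qq/fs0")` yields `"qq"`, which is the name a later pool lookup would expect.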

This is a squashed patchset for zfs-0.7.7. The individual commits are in the tonyhutter:zfs-0.7.7-hutter branch. I squashed the commits so that buildbot wouldn't have to run against each one, and because github/buildbot seem to have a maximum limit of 30 commits they can test from a PR.

- Fix MMP write frequency for large pools openzfs#7205 openzfs#7289
- Handle zio_resume and mmp => off openzfs#7286
- Fix zfs-kmod builds when using rpm >= 4.14 openzfs#7284
- zdb and inuse tests don't pass with real disks openzfs#6939 openzfs#7261
- Take user namespaces into account in policy checks openzfs#6800 openzfs#7270
- Detect long config lock acquisition in mmp openzfs#7212
- Linux 4.16 compat: get_disk_and_module() openzfs#7264
- Change checksum & IO delay ratelimit values openzfs#7252
- Increment zil_itx_needcopy_bytes properly openzfs#6988 openzfs#7176
- Fix some typos openzfs#7237
- Fix zpool(8) list example to match actual format openzfs#7244
- Add SMART self-test results to zpool status -c openzfs#7178
- Add scrub after resilver zed script openzfs#4662 openzfs#7086
- Fix free memory calculation on v3.14+ openzfs#7170
- Report duration and error in mmp_history entries openzfs#7190
- Do not initiate MMP writes while pool is suspended openzfs#7182
- Linux 4.16 compat: use correct *_dec_and_test()
- Allow modprobe to fail when called within systemd openzfs#7174
- Add SMART attributes for SSD and NVMe openzfs#7183 openzfs#7193
- Correct count_uberblocks in mmp.kshlib openzfs#7191
- Fix config issues: frame size and headers openzfs#7169
- Clarify zinject(8) explanation of -e openzfs#7172
- OpenZFS 8857 - zio_remove_child() panic due to already destroyed parent zio openzfs#7168
- 'zfs receive' fails with "dataset is busy" openzfs#7129 openzfs#7154
- contrib/initramfs: add missing conf.d/zfs openzfs#7158
- mmp should use a fixed tag for spa_config locks openzfs#6530 openzfs#7155
- Handle zap_add() failures in mixed case mode openzfs#7011 openzfs#7054
- Fix zdb -ed on objset for exported pool openzfs#7099 openzfs#6464
- Fix zdb -E segfault openzfs#7099
- Fix zdb -R decompression openzfs#7099 openzfs#4984
- Fix racy assignment of zcb.zcb_haderrors openzfs#7099
- Fix zle_decompress out of bound access openzfs#7099
- Fix zdb -c traverse stop on damaged objset root openzfs#7099
- Linux 4.11 compat: avoid refcount_t name conflict openzfs#7148
- Linux 4.16 compat: inode_set_iversion() openzfs#7148
- OpenZFS 8966 - Source file zfs_acl.c, function zfs_aclset_common contains a use after end of the lifetime of a local variable openzfs#7141
- Remove deprecated zfs_arc_p_aggressive_disable openzfs#7135
- Fix default libdir for Debian/Ubuntu openzfs#7083 openzfs#7101
- Bug fix in qat_compress.c for vmalloc addr check openzfs#7125
- Fix systemd_ RPM macros usage on Debian-based distributions openzfs#7074 openzfs#7100
- Emit an error message before MMP suspends pool openzfs#7048
- ZTS: Fix create-o_ashift test case openzfs#6924 openzfs#6977
- Fix --with-systemd on Debian-based distributions (openzfs#6963) openzfs#6591 openzfs#6963
- Remove vn_rename and vn_remove dependency openzfs/spl#648 openzfs#6753
- Fix "--enable-code-coverage" debug build openzfs#6674
- Update codecov.yml openzfs#6669
- Add support for "--enable-code-coverage" option openzfs#6670
- Make "-fno-inline" compile option more accessible openzfs#6605
- Add configure option to enable gcov analysis openzfs#6642
- Implement --enable-debuginfo to force debuginfo openzfs#2734
- Make --enable-debug fail when given bogus args openzfs#2734

Signed-off-by: Tony Hutter <[email protected]>
Requires-spl: refs/pull/690/head


tuxoko force-pushed the zdb_c branch from 81185f0 to caaedd0 on February 2, 2018 00:53

tuxoko changed the title ~~Fix zdb -c traverse stop on damaged objset root~~ Fix multiple zdb bugs Feb 2, 2018

behlendorf requested changes Feb 3, 2018

davidchenntnx added 3 commits February 5, 2018 15:08

Fix zle_decompress out of bound access

5d95c95

Signed-off-by: Chunwei Chen <[email protected]>

Fix racy assignment of zcb.zcb_haderrors

9bafb76

zcb_haderrors will be modified in zdb_blkptr_done, which is asynchronous. So we must move this assignment after zio_wait. Signed-off-by: Chunwei Chen <[email protected]>

tuxoko force-pushed the zdb_c branch from caaedd0 to 3c27d6a on February 5, 2018 23:09

behlendorf approved these changes Feb 6, 2018

behlendorf requested a review from loli10K February 6, 2018 00:26

tuxoko force-pushed the zdb_c branch from 3c27d6a to e7c1e02 on February 6, 2018 20:33

tuxoko force-pushed the zdb_c branch from e7c1e02 to 557360a on February 6, 2018 21:44

loli10K reviewed Feb 6, 2018

tuxoko force-pushed the zdb_c branch 2 times, most recently from 08ff723 to d21db95 on February 7, 2018 00:11

loli10K suggested changes Feb 8, 2018

davidchenntnx added 3 commits February 8, 2018 19:22

Fix zdb -E segfault

14b919c

SPA_MAXBLOCKSIZE is too large for stack. Signed-off-by: Chunwei Chen <[email protected]>

tuxoko force-pushed the zdb_c branch from d21db95 to cdf8275 on February 9, 2018 03:23

loli10K approved these changes Feb 9, 2018

behlendorf approved these changes Feb 9, 2018

behlendorf closed this in d3190c5 Feb 9, 2018

tonyhutter mentioned this pull request Mar 7, 2018

zfs-0.7.7 patchset (squashed) #7278

Closed

13 tasks

aerusso pushed a commit to aerusso/zfs that referenced this pull request Mar 13, 2018

Fix zle_decompress out of bound access

bc380fd

Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes openzfs#7099

tonyhutter pushed a commit that referenced this pull request Mar 19, 2018

Fix zle_decompress out of bound access

5e566c5

Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #7099


