-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle zap_add() failures in "casesensitivity=mixed" mode. #7054
Conversation
e780a0b
to
4990f7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in zfs_rename(), do we correctly handle the error from zfs_link_create()? I see that it checks error == 0
, but in the existing code I don't think it can actually fail in the ZRENAMING case. It looks like if we are renaming "on top of" an existing file, that file will be removed, and then if the zfs_link_create() fails, we'll be left with the rename not completed, but the existing file will be deleted, which I don't think is supposed to happen.
module/zfs/zap_micro.c
Outdated
/* | ||
* Each mzap entry requires at max : 4 chunks | ||
* 3 chunks for names + 1 chunk for value. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment seems out of place; seems like it is relevant to MZAP_ENT_CHUNKS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. Will fix it.
module/zfs/zap_micro.c
Outdated
@@ -1202,6 +1239,10 @@ zap_add_impl(zap_t *zap, const char *key, | |||
if (mze != NULL) { | |||
err = SET_ERROR(EEXIST); | |||
} else { | |||
if (!mze_canfit_fzap_leaf(zn, zn->zn_hash)) { | |||
zap_unlockdir(zap, tag); | |||
return (ENOSPC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you consider upgrading to a fat zap in this case? That would mitigate the slightly-smaller-than-necessary limit on collisions in the microzap due to your overestimation of the name lengths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that makes sense. Will fix it.
module/zfs/zfs_dir.c
Outdated
* transaction which will rollback the SA updates done above. | ||
*/ | ||
mutex_exit(&zp->z_lock); | ||
return (error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to update the comment on line 745.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I shall update that.
module/zfs/zfs_vnops.c
Outdated
*/ | ||
zp->z_unlinked = 1; | ||
drop_nlink(ZTOI(zp)); | ||
zfs_unlinked_add(zp, tx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding this to the unlinked set seems like a roundabout way of removing it. Can we just call zfs_znode_delete()? I know zfs_inactive->zfs_zinactive->zfs_rmnode has some additional logic, but I don't think that it applies in this case. E.g. this znode can not have an xattrdir at this point (I think).
The same applies to symlink and mkdir.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're going to want to add your lovely test case from #7011 to this PR. Adding it to the functional/casenorm
directory seems like a reasonable spot.
It's not critical, but rebasing on master will clear up the remaining unrelated cstyle warnings.
module/zfs/zfs_vnops.c
Outdated
@@ -3685,6 +3709,9 @@ zfs_rename(struct inode *sdip, char *snm, struct inode *tdip, char *tnm, | |||
VERIFY3U(zfs_link_destroy(tdl, szp, tx, | |||
ZRENAMING, NULL), ==, 0); | |||
} | |||
} else { | |||
VERIFY3U(zfs_link_create(tdl, tzp, tx, ZRENAMING), | |||
==, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] you can use VERIFY0
for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops.. will fix it.
module/zfs/zfs_dir.c
Outdated
* out. | ||
*/ | ||
if (error != 0) { | ||
if (!(flag & ZRENAMING) && !(flag &ZNEW)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] add a space after the &
here. i.e. !(flag & ZNEW)
[nit] you can drop the { }
here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
module/zfs/zfs_dir.c
Outdated
/* | ||
* zap_add could fail to add the entry if it exceeds the capacity of the | ||
* leaf-block and zap_leaf_split() failed to help. In such a case bail | ||
* out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest dropping "In such a case bail out" from the comment (that's already quite clear), and then pulling up the comment from line 798 in to this block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
module/zfs/zfs_vnops.c
Outdated
iput(ZTOI(zp)); | ||
ZFS_EXIT(zfsvfs); | ||
return (error); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about exiting through the existing out
label instead like many of the other error paths. i.e.
error = zfs_link_create(dl, zp, tx, ZNEW);
if (error != 0) {
zfs_znode_delete(zp, tx);
zfs_acl_ids_free(&acl_ids);
dmu_tx_commit(tx);
goto out;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me give that shot.
module/zfs/zfs_vnops.c
Outdated
@@ -1429,6 +1429,7 @@ zfs_create(struct inode *dip, char *name, vattr_t *vap, int excl, | |||
dmu_tx_hold_write(tx, DMU_NEW_OBJECT, | |||
0, acl_ids.z_aclp->z_acl_bytes); | |||
} | |||
dmu_tx_hold_zap(tx, zfsvfs->z_unlinkedobj, FALSE, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be needed anymore after switching to zfs_znode_delete
. Here and elsewhere in this patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Will take care of it.
module/zfs/zfs_vnops.c
Outdated
dmu_tx_commit(tx); | ||
iput(ZTOI(zp)); | ||
ZFS_EXIT(zfsvfs); | ||
return (error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A similar thing could be done here with an out
label, it might make the code easier to read.
Thank you Brian for the review comments. I am working on them and the test case. |
@sanjeevbagewadi in or to make reviewing this easier could you please squash your commits together and rebase the entire PR against master. You'll need to force update the branch on Github which is fine. |
3892694
to
3a9525f
Compare
module/zfs/zfs_vnops.c
Outdated
error = zfs_link_create(dl, zp, tx, ZNEW); | ||
if (error != 0) { | ||
zfs_znode_delete(zp, tx); | ||
goto out; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need the goto
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops... makes sense.
@@ -0,0 +1,56 @@ | |||
#!/bin/ksh -p |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're going to need to add to add this test case to the casenorm
section of the tests/runfiles/linux.run
. This is why it wasn't run by the automated testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ! I missed that.
# when multiple files which the same name (differing only in case) created, the | ||
# number of files is limited to what can fit in a fatzap leaf-block. And beyond | ||
# that, it fails with ENOSPC. Ensure that the create fails gracefully and does | ||
# not hit and assert and panic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] extra leading whitespace
s/not hit and assert and panic./not hit an assert and panic./
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
# DESCRIPTION: | ||
# For the filesystem with casesensitivity=mixed, normalization=none, | ||
# when multiple files which the same name (differing only in case) created, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/which/with/
s/created/are created/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
names='{a,A}{b,B}{c,C}{d,D}{e,E}{f,F}{g,G}{h,H}{i,I}{j,J}{k,K}{l,L}' | ||
for name in $names; do | ||
create_file $name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be Github but it looks like there's an indent issue here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was an extra space. Deleted it.
create_file $name | ||
ret=$? | ||
log_note "Created file $name status $ret" | ||
if (($ret != 0)); then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if we could do this in a way where the test checks for ENOSPC and just any error. That way we know we're failing for the right reason, that said I don't think it's critical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will try to address this. The exit code does not indicate ENOSPC. Will need to rely on the error message.
module/zfs/zap_micro.c
Outdated
* Check if the current entry keeps the colliding entries under the fatzap leaf | ||
* size. | ||
* | ||
* This is important to ensure that the microzap is upgradable to fatzap. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] I'm not sure this needs to be its own paragraph, you could include it after the sentence above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deleting that line as it is no longer correct. As per Matt's suggestion, we now trigger an upgrade to fatzap when the number of chunks exceed the limit.
Without this change we were failing the zap_add() for a smaller number if it was a microzap.
With upgrade, we would first upgrade to fatzap and that would allow one to add more entries with colliding hash. And eventually zap_add() to fatzap would fail when the colliding entry can longer fit in the leaf-block.
module/zfs/zfs_vnops.c
Outdated
if (error != 0) { | ||
/* | ||
* Delete the node created above. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] I'd drop the comment here or update it to be clear why zp
from zfs_mknode
needs to be destroyed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have reworded the comment. If does not serve the purpose, I am fine with deleting it.
module/zfs/zfs_vnops.c
Outdated
@@ -3686,6 +3706,8 @@ zfs_rename(struct inode *sdip, char *snm, struct inode *tdip, char *tnm, | |||
VERIFY3U(zfs_link_destroy(tdl, szp, tx, | |||
ZRENAMING, NULL), ==, 0); | |||
} | |||
} else { | |||
VERIFY0(zfs_link_create(tdl, tzp, tx, 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this only needed now? This looks like it may address a potential existing issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, this seems like an existing issue. But, it is mostly a remote/unlikely case for the following reasons :
- We are deleting an entry on line:3677 and adding the same entry back on line:3680.
Hence, i don't think it would fail. Also, since this thread holds, the zfs_dirlock(), no other
thread can race us.
However, as a safety measure, decided to add this code here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sanjeevbagewadi
I'll copy me gerrit comment here:
You haven't add 'if (tzp)'
Also, the zfs_link_destroy above will set z_unlinked to true, which will cause this to fail. And it will call zfs_unlinked_add. You'll need to revert those changes.
I don't feel we need to solve this in this patch, since this path is probably not very likely, and it's an existing issue. Maybe we should just have a warning here, and solve it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I had overlooked the fact the zfs_link_create() will not solve the problem. For now I will add an ASSERT() here. And we can take that up separately.
module/zfs/zfs_vnops.c
Outdated
|
||
out: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or the out:
.
# STRATEGY: | ||
# 1. Create files with same name but varying in case. | ||
# E.g. 'abcdefghijklmnop', 'Abcdefghijklmnop', 'ABcdefghijklmnop' etc. | ||
# The create would fail with ENOSPC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this change touches several system calls we should extent the test case to verify they behave as expected too. Specifically let's check rename, symlinks, and mkdir.
3a9525f
to
51cf565
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for refreshing this. The updated patch functionally LGTM, I only have two minor remaining concerns.
-
The test case only covers the
zfs_create()
case. While not critical it would be best if we could extend the new test case to add coverage forzfs_mkdir()
,zfs_rename()
, andzfs_symlink()
. -
One of the bots unexpectedly terminated while running the ZTS and it's unclear why. This happened long after running the new test case so it's almost certainly unrelated, but to be certain I've resubmitted that build.
http://build.zfsonlinux.org/builders/CentOS%207%20x86_64%20%28TEST%29/builds/1706
@ahrens would you mind reviewing this again.
module/zfs/zfs_vnops.c
Outdated
@@ -3686,6 +3706,8 @@ zfs_rename(struct inode *sdip, char *snm, struct inode *tdip, char *tnm, | |||
VERIFY3U(zfs_link_destroy(tdl, szp, tx, | |||
ZRENAMING, NULL), ==, 0); | |||
} | |||
} else { | |||
VERIFY0(zfs_link_create(tdl, tzp, tx, 0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sanjeevbagewadi
I'll copy me gerrit comment here:
You haven't add 'if (tzp)'
Also, the zfs_link_destroy above will set z_unlinked to true, which will cause this to fail. And it will call zfs_unlinked_add. You'll need to revert those changes.
I don't feel we need to solve this in this patch, since this path is probably not very likely, and it's an existing issue. Maybe we should just have a warning here, and solve it later.
* Since, we failed to add the directory entry for it, | ||
* delete the newly created dnode. | ||
*/ | ||
zfs_znode_delete(zp, tx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sanjeevbagewadi
I'll copy me gerrit comment here:
iput_final won't evict the inode unless you drop the link count to zero or call remove_inode_hash to unhash it. But if you just drop the link count, you will still have a window that might have obj id reuse issue I mentioned above.
So the best way I think is to call remove_inode_hash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense. Will add remove_inode_hash().
51cf565
to
33ac60f
Compare
@behlendorf , I have updated the test case to cover the different object types (files, dir, symlink, hardlinks) and for each object type exercise the create and rename operation. Kindly take a look. @ahrens , for the corner case in zfs_rename() for now, I have added an assert() there. If you agree we could take that up as a separate case. @tuxoko , I have now added remove_inode_hash(). |
33ac60f
to
cc00049
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from the small test case nit this LGTM.
# failed to add in step 1 above. | ||
# This should fail as well. | ||
|
||
set -x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Let's drop the set -x
for the final version.
With "casesensitivity=mixed", zap_add() could fail when the number of files/directories with the same name (varying in case) exceed the capacity of the leaf node of a Fatzap. This results in a ASSERT() failure as zfs_link_create() does not expect zap_add() to fail. The fix is to handle these failures and rollback the transactions. Signed-off-by: Sanjeev Bagewadi <[email protected]>
cc00049
to
68985de
Compare
Codecov Report
@@ Coverage Diff @@
## master #7054 +/- ##
==========================================
- Coverage 76.4% 75.19% -1.22%
==========================================
Files 324 296 -28
Lines 102935 95758 -7177
==========================================
- Hits 78651 72004 -6647
+ Misses 24284 23754 -530
Continue to review full report at Codecov.
|
With "casesensitivity=mixed", zap_add() could fail when the number of files/directories with the same name (varying in case) exceed the capacity of the leaf node of a Fatzap. This results in a ASSERT() failure as zfs_link_create() does not expect zap_add() to fail. The fix is to handle these failures and rollback the transactions. Reviewed by: Matt Ahrens <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Sanjeev Bagewadi <[email protected]> Closes openzfs#7011 Closes openzfs#7054
With "casesensitivity=mixed", zap_add() could fail when the number of files/directories with the same name (varying in case) exceed the capacity of the leaf node of a Fatzap. This results in a ASSERT() failure as zfs_link_create() does not expect zap_add() to fail. The fix is to handle these failures and rollback the transactions. Reviewed by: Matt Ahrens <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Sanjeev Bagewadi <[email protected]> Closes openzfs#7011 Closes openzfs#7054
This is a squashed patchset for zfs-0.7.7. The individual commits are in the tonyhutter:zfs-0.7.7-hutter branch. I squashed the commits so that buildbot wouldn't have to run against each one, and because github/builbot seem to have a maximum limit of 30 commits they can test from a PR. - Linux 4.16 compat: get_disk_and_module() openzfs#7264 - Change checksum & IO delay ratelimit values openzfs#7252 - Increment zil_itx_needcopy_bytes properly openzfs#6988 openzfs#7176 - Fix some typos openzfs#7237 - Fix zpool(8) list example to match actual format openzfs#7244 - Add SMART self-test results to zpool status -c openzfs#7178 - Add scrub after resilver zed script openzfs#4662 openzfs#7086 - Fix free memory calculation on v3.14+ openzfs#7170 - Report duration and error in mmp_history entries openzfs#7190 - Do not initiate MMP writes while pool is suspended openzfs#7182 - Linux 4.16 compat: use correct *_dec_and_test() openzfs#7179 openzfs#7211 - Allow modprobe to fail when called within systemd openzfs#7174 - Add SMART attributes for SSD and NVMe openzfs#7183 openzfs#7193 - Correct count_uberblocks in mmp.kshlib openzfs#7191 - Fix config issues: frame size and headers openzfs#7169 - Clarify zinject(8) explanation of -e openzfs#7172 - OpenZFS 8857 - zio_remove_child() panic due to already destroyed parent zio openzfs#7168 - 'zfs receive' fails with "dataset is busy" openzfs#7129 openzfs#7154 - contrib/initramfs: add missing conf.d/zfs openzfs#7158 - mmp should use a fixed tag for spa_config locks openzfs#6530 openzfs#7155 - Handle zap_add() failures in mixed case mode openzfs#7011 openzfs#7054 - Fix zdb -ed on objset for exported pool openzfs#7099 openzfs#6464 - Fix zdb -E segfault openzfs#7099 - Fix zdb -R decompression openzfs#7099 openzfs#4984 - Fix racy assignment of zcb.zcb_haderrors openzfs#7099 - Fix zle_decompress out of bound access openzfs#7099 - Fix zdb -c traverse stop on damaged objset root openzfs#7099 - Linux 4.11 compat: avoid refcount_t name conflict openzfs#7148 - Linux 4.16 compat: inode_set_iversion() openzfs#7148 - OpenZFS 8966 - Source file zfs_acl.c, function zfs_aclset_common contains a use after end of the lifetime of a local variable openzfs#7141 - Remove deprecated zfs_arc_p_aggressive_disable openzfs#7135 - Fix default libdir for Debian/Ubuntu openzfs#7083 openzfs#7101 - Bug fix in qat_compress.c for vmalloc addr check openzfs#7125 - Fix systemd_ RPM macros usage on Debian-based distributions openzfs#7074 openzfs#7100 - Emit an error message before MMP suspends pool openzfs#7048 - ZTS: Fix create-o_ashift test case openzfs#6924 openzfs#6977 - Fix --with-systemd on Debian-based distributions (openzfs#6963) openzfs#6591 openzfs#6963 - Remove vn_rename and vn_remove dependency openzfs/spl#648 openzfs#6753 - Add support for "--enable-code-coverage" option openzfs#6670 - Make "-fno-inline" compile option more accessible openzfs#6605 - Add configure option to enable gcov analysis openzfs#6642 - Implement --enable-debuginfo to force debuginfo openzfs#2734 - Make --enable-debug fail when given bogus args openzfs#2734 Signed-off-by: Tony Hutter <[email protected]> Requires-spl: refs/pull/690/head
With "casesensitivity=mixed", zap_add() could fail when the number of files/directories with the same name (varying in case) exceed the capacity of the leaf node of a Fatzap. This results in a ASSERT() failure as zfs_link_create() does not expect zap_add() to fail. The fix is to handle these failures and rollback the transactions. Reviewed by: Matt Ahrens <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Sanjeev Bagewadi <[email protected]> Closes openzfs#7011 Closes openzfs#7054
This is a squashed patchset for zfs-0.7.7. The individual commits are in the tonyhutter:zfs-0.7.7-hutter branch. I squashed the commits so that buildbot wouldn't have to run against each one, and because github/builbot seem to have a maximum limit of 30 commits they can test from a PR. - Fix MMP write frequency for large pools openzfs#7205 openzfs#7289 - Handle zio_resume and mmp => off openzfs#7286 - Fix zfs-kmod builds when using rpm >= 4.14 openzfs#7284 - zdb and inuse tests don't pass with real disks openzfs#6939 openzfs#7261 - Take user namespaces into account in policy checks openzfs#6800 openzfs#7270 - Detect long config lock acquisition in mmp openzfs#7212 - Linux 4.16 compat: get_disk_and_module() openzfs#7264 - Change checksum & IO delay ratelimit values openzfs#7252 - Increment zil_itx_needcopy_bytes properly openzfs#6988 openzfs#7176 - Fix some typos openzfs#7237 - Fix zpool(8) list example to match actual format openzfs#7244 - Add SMART self-test results to zpool status -c openzfs#7178 - Add scrub after resilver zed script openzfs#4662 openzfs#7086 - Fix free memory calculation on v3.14+ openzfs#7170 - Report duration and error in mmp_history entries openzfs#7190 - Do not initiate MMP writes while pool is suspended openzfs#7182 - Linux 4.16 compat: use correct *_dec_and_test() - Allow modprobe to fail when called within systemd openzfs#7174 - Add SMART attributes for SSD and NVMe openzfs#7183 openzfs#7193 - Correct count_uberblocks in mmp.kshlib openzfs#7191 - Fix config issues: frame size and headers openzfs#7169 - Clarify zinject(8) explanation of -e openzfs#7172 - OpenZFS 8857 - zio_remove_child() panic due to already destroyed parent zio openzfs#7168 - 'zfs receive' fails with "dataset is busy" openzfs#7129 openzfs#7154 - contrib/initramfs: add missing conf.d/zfs openzfs#7158 - mmp should use a fixed tag for spa_config locks openzfs#6530 openzfs#7155 - Handle zap_add() failures in mixed case mode openzfs#7011 openzfs#7054 - Fix zdb -ed on objset for exported pool openzfs#7099 openzfs#6464 - Fix zdb -E segfault openzfs#7099 - Fix zdb -R decompression openzfs#7099 openzfs#4984 - Fix racy assignment of zcb.zcb_haderrors openzfs#7099 - Fix zle_decompress out of bound access openzfs#7099 - Fix zdb -c traverse stop on damaged objset root openzfs#7099 - Linux 4.11 compat: avoid refcount_t name conflict openzfs#7148 - Linux 4.16 compat: inode_set_iversion() openzfs#7148 - OpenZFS 8966 - Source file zfs_acl.c, function zfs_aclset_common contains a use after end of the lifetime of a local variable openzfs#7141 - Remove deprecated zfs_arc_p_aggressive_disable openzfs#7135 - Fix default libdir for Debian/Ubuntu openzfs#7083 openzfs#7101 - Bug fix in qat_compress.c for vmalloc addr check openzfs#7125 - Fix systemd_ RPM macros usage on Debian-based distributions openzfs#7074 openzfs#7100 - Emit an error message before MMP suspends pool openzfs#7048 - ZTS: Fix create-o_ashift test case openzfs#6924 openzfs#6977 - Fix --with-systemd on Debian-based distributions (openzfs#6963) openzfs#6591 openzfs#6963 - Remove vn_rename and vn_remove dependency openzfs/spl#648 openzfs#6753 - Add support for "--enable-code-coverage" option openzfs#6670 - Make "-fno-inline" compile option more accessible openzfs#6605 - Add configure option to enable gcov analysis openzfs#6642 - Implement --enable-debuginfo to force debuginfo openzfs#2734 - Make --enable-debug fail when given bogus args openzfs#2734 Signed-off-by: Tony Hutter <[email protected]> Requires-spl: refs/pull/690/head
This is a squashed patchset for zfs-0.7.7. The individual commits are in the tonyhutter:zfs-0.7.7-hutter branch. I squashed the commits so that buildbot wouldn't have to run against each one, and because github/builbot seem to have a maximum limit of 30 commits they can test from a PR. - Fix MMP write frequency for large pools openzfs#7205 openzfs#7289 - Handle zio_resume and mmp => off openzfs#7286 - Fix zfs-kmod builds when using rpm >= 4.14 openzfs#7284 - zdb and inuse tests don't pass with real disks openzfs#6939 openzfs#7261 - Take user namespaces into account in policy checks openzfs#6800 openzfs#7270 - Detect long config lock acquisition in mmp openzfs#7212 - Linux 4.16 compat: get_disk_and_module() openzfs#7264 - Change checksum & IO delay ratelimit values openzfs#7252 - Increment zil_itx_needcopy_bytes properly openzfs#6988 openzfs#7176 - Fix some typos openzfs#7237 - Fix zpool(8) list example to match actual format openzfs#7244 - Add SMART self-test results to zpool status -c openzfs#7178 - Add scrub after resilver zed script openzfs#4662 openzfs#7086 - Fix free memory calculation on v3.14+ openzfs#7170 - Report duration and error in mmp_history entries openzfs#7190 - Do not initiate MMP writes while pool is suspended openzfs#7182 - Linux 4.16 compat: use correct *_dec_and_test() - Allow modprobe to fail when called within systemd openzfs#7174 - Add SMART attributes for SSD and NVMe openzfs#7183 openzfs#7193 - Correct count_uberblocks in mmp.kshlib openzfs#7191 - Fix config issues: frame size and headers openzfs#7169 - Clarify zinject(8) explanation of -e openzfs#7172 - OpenZFS 8857 - zio_remove_child() panic due to already destroyed parent zio openzfs#7168 - 'zfs receive' fails with "dataset is busy" openzfs#7129 openzfs#7154 - contrib/initramfs: add missing conf.d/zfs openzfs#7158 - mmp should use a fixed tag for spa_config locks openzfs#6530 openzfs#7155 - Handle zap_add() failures in mixed case mode openzfs#7011 openzfs#7054 - Fix zdb -ed on objset for exported pool openzfs#7099 openzfs#6464 - Fix zdb -E segfault openzfs#7099 - Fix zdb -R decompression openzfs#7099 openzfs#4984 - Fix racy assignment of zcb.zcb_haderrors openzfs#7099 - Fix zle_decompress out of bound access openzfs#7099 - Fix zdb -c traverse stop on damaged objset root openzfs#7099 - Linux 4.11 compat: avoid refcount_t name conflict openzfs#7148 - Linux 4.16 compat: inode_set_iversion() openzfs#7148 - OpenZFS 8966 - Source file zfs_acl.c, function zfs_aclset_common contains a use after end of the lifetime of a local variable openzfs#7141 - Remove deprecated zfs_arc_p_aggressive_disable openzfs#7135 - Fix default libdir for Debian/Ubuntu openzfs#7083 openzfs#7101 - Bug fix in qat_compress.c for vmalloc addr check openzfs#7125 - Fix systemd_ RPM macros usage on Debian-based distributions openzfs#7074 openzfs#7100 - Emit an error message before MMP suspends pool openzfs#7048 - ZTS: Fix create-o_ashift test case openzfs#6924 openzfs#6977 - Fix --with-systemd on Debian-based distributions (openzfs#6963) openzfs#6591 openzfs#6963 - Remove vn_rename and vn_remove dependency openzfs/spl#648 openzfs#6753 - Add support for "--enable-code-coverage" option openzfs#6670 - Make "-fno-inline" compile option more accessible openzfs#6605 - Add configure option to enable gcov analysis openzfs#6642 - Implement --enable-debuginfo to force debuginfo openzfs#2734 - Make --enable-debug fail when given bogus args openzfs#2734 Signed-off-by: Tony Hutter <[email protected]> Requires-spl: refs/pull/690/head
With "casesensitivity=mixed", zap_add() could fail when the number of files/directories with the same name (varying in case) exceed the capacity of the leaf node of a Fatzap. This results in a ASSERT() failure as zfs_link_create() does not expect zap_add() to fail. The fix is to handle these failures and rollback the transactions. Reviewed by: Matt Ahrens <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Sanjeev Bagewadi <[email protected]> Closes openzfs#7011 Closes openzfs#7054
This is a squashed patchset for zfs-0.7.7. The individual commits are in the tonyhutter:zfs-0.7.7-hutter branch. I squashed the commits so that buildbot wouldn't have to run against each one, and because github/builbot seem to have a maximum limit of 30 commits they can test from a PR. - Fix MMP write frequency for large pools openzfs#7205 openzfs#7289 - Handle zio_resume and mmp => off openzfs#7286 - Fix zfs-kmod builds when using rpm >= 4.14 openzfs#7284 - zdb and inuse tests don't pass with real disks openzfs#6939 openzfs#7261 - Take user namespaces into account in policy checks openzfs#6800 openzfs#7270 - Detect long config lock acquisition in mmp openzfs#7212 - Linux 4.16 compat: get_disk_and_module() openzfs#7264 - Change checksum & IO delay ratelimit values openzfs#7252 - Increment zil_itx_needcopy_bytes properly openzfs#6988 openzfs#7176 - Fix some typos openzfs#7237 - Fix zpool(8) list example to match actual format openzfs#7244 - Add SMART self-test results to zpool status -c openzfs#7178 - Add scrub after resilver zed script openzfs#4662 openzfs#7086 - Fix free memory calculation on v3.14+ openzfs#7170 - Report duration and error in mmp_history entries openzfs#7190 - Do not initiate MMP writes while pool is suspended openzfs#7182 - Linux 4.16 compat: use correct *_dec_and_test() - Allow modprobe to fail when called within systemd openzfs#7174 - Add SMART attributes for SSD and NVMe openzfs#7183 openzfs#7193 - Correct count_uberblocks in mmp.kshlib openzfs#7191 - Fix config issues: frame size and headers openzfs#7169 - Clarify zinject(8) explanation of -e openzfs#7172 - OpenZFS 8857 - zio_remove_child() panic due to already destroyed parent zio openzfs#7168 - 'zfs receive' fails with "dataset is busy" openzfs#7129 openzfs#7154 - contrib/initramfs: add missing conf.d/zfs openzfs#7158 - mmp should use a fixed tag for spa_config locks openzfs#6530 openzfs#7155 - Handle zap_add() failures in mixed case mode openzfs#7011 openzfs#7054 - Fix zdb -ed on objset for exported pool openzfs#7099 openzfs#6464 - Fix zdb -E segfault openzfs#7099 - Fix zdb -R decompression openzfs#7099 openzfs#4984 - Fix racy assignment of zcb.zcb_haderrors openzfs#7099 - Fix zle_decompress out of bound access openzfs#7099 - Fix zdb -c traverse stop on damaged objset root openzfs#7099 - Linux 4.11 compat: avoid refcount_t name conflict openzfs#7148 - Linux 4.16 compat: inode_set_iversion() openzfs#7148 - OpenZFS 8966 - Source file zfs_acl.c, function zfs_aclset_common contains a use after end of the lifetime of a local variable openzfs#7141 - Remove deprecated zfs_arc_p_aggressive_disable openzfs#7135 - Fix default libdir for Debian/Ubuntu openzfs#7083 openzfs#7101 - Bug fix in qat_compress.c for vmalloc addr check openzfs#7125 - Fix systemd_ RPM macros usage on Debian-based distributions openzfs#7074 openzfs#7100 - Emit an error message before MMP suspends pool openzfs#7048 - ZTS: Fix create-o_ashift test case openzfs#6924 openzfs#6977 - Fix --with-systemd on Debian-based distributions (openzfs#6963) openzfs#6591 openzfs#6963 - Remove vn_rename and vn_remove dependency openzfs/spl#648 openzfs#6753 - Fix "--enable-code-coverage" debug build openzfs#6674 - Update codecov.yml openzfs#6669 - Add support for "--enable-code-coverage" option openzfs#6670 - Make "-fno-inline" compile option more accessible openzfs#6605 - Add configure option to enable gcov analysis openzfs#6642 - Implement --enable-debuginfo to force debuginfo openzfs#2734 - Make --enable-debug fail when given bogus args openzfs#2734 Signed-off-by: Tony Hutter <[email protected]> Requires-spl: refs/pull/690/head
With "casesensitivity=mixed", zap_add() could fail when the number of files/directories with the same name (varying in case) exceed the capacity of the leaf node of a Fatzap. This results in a ASSERT() failure as zfs_link_create() does not expect zap_add() to fail. The fix is to handle these failures and rollback the transactions. Reviewed by: Matt Ahrens <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Sanjeev Bagewadi <[email protected]> Closes #7011 Closes #7054
With "casesensitivity=mixed", zap_add() could fail when the number of
files/directories with the same name (varying in case) exceed the
capacity of the leaf node of a Fatzap. This results in a ASSERT()
failure as zfs_link_create() does not expect zap_add() to fail. The fix
is to handle these failures and rollback the transactions.
Signed-off-by: Sanjeev Bagewadi [email protected]
Description
For a dataset with "casesensitivity=mixed", when a large number of files/directories
with same name (varying only in case e.g: ABCD, ABCd, ABcD and so on) are created
zap_add() could fail. With mixed mode zap_add() normalises the names before the hash
is computed. And all the names would generate the same hash and land in the same leaf.
When the number of entries exceed the capacity of the leaf-block, zap_add() tries to split
the leaf-block which fails as well and zap_add() fails. This trips an ASSERT in zfs_link_create()
as it does not expect zap_add() to fail.
The fix does the following :
fzap_add_cd() : Handle the case when zap_expand_leaf() fails with ENOSPC and bailout
without calling zap_put_leaf_maybe_grow_ptrtbl().
zap_add_impl() : When adding to a micro-zap check if the total number of entries
with colliding/same hash value can fit into fatzap-leaf-block. This is important because, if/when
the microzap needs to be upgraded to fatzap, all the entries with the same hash would need to
fit into the same leaf-block (16K). If the number of such entries donot fit fail the zap_add().
The routine mze_canfit_fzap_leaf() today assumes the MZAP_NAME_LEN for every entry.
This is erring on the safer side but, ends up accommodating lesser number (127) of entries
with same hash value in microzap. We could find out the size of name of every mze and that
would be accurate. But, it is expensive to compute the length every time. Alternatively, we
could compute the length of each entry and cache it. I felt that the amount of code needed
for this is not worth the gain. I am open to changing it if necessary.
zfs_link_create() : Move the call to zap_add() to the beginning and in case of a failure
return. This ensures that we can bailout easily before making any other modifications to
the parent-zap or the child-dnode. Keeps the code simpler.
ZPL interfaces (zfs_create(), zfs_mkdir(), zfs_symlink()) : Handle the failure of zfs_link_create()
and rollback the operation.
With these changes a call to create a file could fail with ENOSPC. Not the best error value.
But, this is the closest I found. Any alternate suggestions are welcome.
Motivation and Context
With "casesensitivity=mixed" it is easy to panic the node with a simple test case
as described in #7011
How Has This Been Tested?
The following tests were run :
Types of changes
Checklist:
Signed-off-by
.