-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
When a ZFS File system is set to CASE INSENSITIVE we should be calling d_add_ci on the dentry #4136
Comments
@RichardSharpe nice analysis. It looks like the fix your proposing can be made entirely in the Linux specific It would be great if you could propose a patch for this and open a pull request. |
I will work something up over the next few days. |
Simple test case:
Hung and deadlocked at this point. |
Here is a first attempt at this. It is untested (although it compiles), and I have at least one reservation: diff --git a/module/zfs/zpl_inode.c b/module/zfs/zpl_inode.c
index e0e8ee3..ccbe68a 100644
--- a/module/zfs/zpl_inode.c
+++ b/module/zfs/zpl_inode.c
@@ -44,13 +44,24 @@ zpl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags)
struct inode *ip;
int error;
fstrans_cookie_t cookie;
+ pathname_t *ppn = NULL;
+ pathname_t pn;
+ zfs_sb_t *zsb = dentry->d_sb->s_fs_info;
+
+ /* What about mixed case? */
+ /* If we are a case insensitive fs, we need the real name */
+ if (zsb->z_case == ZFS_CASE_INSENSITIVE) {
+ pn.pn_bufsize = ZFS_MAXNAMELEN;
+ pn.pn_buf = kmem_zalloc(ZFS_MAXNAMELEN, KM_SLEEP);
+ ppn = &pn;
+ }
if (dlen(dentry) > ZFS_MAXNAMELEN)
return (ERR_PTR(-ENAMETOOLONG));
crhold(cr);
cookie = spl_fstrans_mark();
- error = -zfs_lookup(dir, dname(dentry), &ip, 0, cr, NULL, NULL);
+ error = -zfs_lookup(dir, dname(dentry), &ip, 0, cr, NULL, ppn);
spl_fstrans_unmark(cookie);
ASSERT3S(error, <=, 0);
crfree(cr);
@@ -69,7 +80,23 @@ zpl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags)
return (ERR_PTR(error));
}
- return (d_splice_alias(ip, dentry));
+ /*
+ * If we are case insensitive, call the correct function
+ * to install the name.
+ */
+ if (ppn) {
+ struct dentry *new_dentry;
+ struct qstr real_name;
+
+ real_name.name = pn.pn_buf;
+ real_name.len = strlen(pn.pn_buf);
+ real_name.hash = full_name_hash(real_name.name, real_name.len);
+ new_dentry = d_add_ci(dentry, ip, &real_name);
+ kmem_free(pn.pn_buf, ZFS_MAXNAMELEN);
+ return (new_dentry);
+ }
+ else
+ return (d_splice_alias(ip, dentry));
}
void My reservation is that we are adding a kmem_alloc and kmem_free in the lookup path. It would have been nicer if zfs_lookup could return a pointer to the actual name, since we are not going to touch the name. An additional reservation is that I have not thought about or handled caseensitivity=mixed. I will try to test this later today. |
With a small change (MAXNAMELEN rather than ZFS_MAXNAMELEN) seems to work. By work I mean:
More testing required. |
@ptx0: It might turn up with Roaming Profiles but I don't think it is specifically related to roaming profiles. We found it running FSCT. |
I have run fsstress many times against the changes and dozens of renames where the case differs in the from and to paths (not the last component) and no deadlocks seen. I created a pull request. |
@RichardSharpe thanks for tackling this and opening a PR. The heart of the fix looks good, I've added some review comments so we can get it polished for inclusion. Please go ahead and refresh it when you can. |
Re:
Hmmm, I can't remember a panic. However, I was printing out some names for debugging and was concerned that there might not be enough space for the trailing NULL. |
@RichardSharpe I must have misinterpreted your comment above. |
|
@RichardSharpe when you open a pull request it's recommended that you use a branch per change. This way you can easily pull master from the ZoL repository and rebase your branch on top of those changes. Github provides some nice documentation, I'd suggest closing out #4147 and opening a new PR using a branch if you need to refresh this again. |
OK, I will try to close out 4147. We hit that memory leak in QA a couple of times. It is quite impressive. 733MB leaked in the malloc-256 slab. |
I created a new pull request with what I think are the fixes but I noticed that two of the checks failed. However, that seems to be because wget failed ... what can I do about it? |
@RichardSharpe looks like an unrelated hiccup in the test framework. I'll resubmit them for the sake of completeness but the patch looks good, and tests well, for me. Thanks for pushing this over the finish line, I'll get it merged it master. |
When casesensitivity=insensitive is set for the file system, we can deadlock in a rename if the user uses different case for each path. For example rename("A/some-file.txt", "a/some-file.txt"). The simple test for this is: 1. mkdir some-dir in a ZFS file system 2. touch some-dir/some-file.txt 3. mv Some-dir/some-file.txt some-dir/some-other-file.txt This last request deadlocks trying to relock the i_mutex on the inode for the parent directory. The solution is to use d_add_ci in zpl_lookup if we are on a file system that has the casesensitivity=insensitive attribute set. This patch checks if we are working on a case insensitive file system and if so, allocates storage for the case insensitive name and passes it to zfs_lookup and then calls d_add_ci instead of d_splice_alias. The performance impact seems to be minimal even though we have introduced a kmalloc and kfree in the lookup path. The problem was found when running Microsoft's FSCT against Samba on top of ZFS On Linux. Signed-off-by: Richard Sharpe <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes openzfs#4136
When casesensitivity=insensitive is set for the file system, we can deadlock in a rename if the user uses different case for each path. For example rename("A/some-file.txt", "a/some-file.txt"). The simple test for this is: 1. mkdir some-dir in a ZFS file system 2. touch some-dir/some-file.txt 3. mv Some-dir/some-file.txt some-dir/some-other-file.txt This last request deadlocks trying to relock the i_mutex on the inode for the parent directory. The solution is to use d_add_ci in zpl_lookup if we are on a file system that has the casesensitivity=insensitive attribute set. This patch checks if we are working on a case insensitive file system and if so, allocates storage for the case insensitive name and passes it to zfs_lookup and then calls d_add_ci instead of d_splice_alias. The performance impact seems to be minimal even though we have introduced a kmalloc and kfree in the lookup path. The problem was found when running Microsoft's FSCT against Samba on top of ZFS On Linux. Signed-off-by: Richard Sharpe <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4136
Hi folks,
We just ran into an issue that caused a rename deadlock in Linux.
Kernel 3.10.0, but the same will happen, I believe with the latest kernel.
We use the casesensitivity=insensitive attribute.
Because ZoL does not call d_add_ci with the canonical case (at least, I cannot find any such calls) Linux ends up with two dentries pointing to the same inode if you present two paths with different case.
This happened to us and causes a self-deadlock in lock_rename because it is passed two different dentries that point to the same inode.
I think the fix has to be in zpl_inode.c:zpl_lookup where we check if the file system is case insensitive and if so call d_add_ci(dentry, ip, realpnp) and add a realpnp parameter to the zfs_lookup call we make, or something like that.
For now, we can work around the issue in Samba, I believe, however, it would be good to have a real solution to this problem.
The text was updated successfully, but these errors were encountered: