after rsync'in some files on target host inaccessible at any way with xattr=sa (0.6.2 - 0.6.3 - current git) #2717
See #2663, #2700 and one other that escapes my mind. I've been trying to get information regarding those and, so far, haven't found a "smoking gun", but it's clear there's a problem caused by mixing posix ACLs and xattr=sa. Could you please find the inode number of "test.file" (ls -di file) and then run Also, what's the ashift of each top-level vdev in this pool? Are they all ashift=9 or ashift=12, or is there a mix of the two?
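For readers following along, the inode lookup requested above and the meaning of the two ashift values can be sketched in Python. The helper names `inode_number` and `sector_size` are illustrative and not part of any ZFS tooling; the sketch relies only on `os.lstat` reporting the inode number and on ashift being the base-2 logarithm of the vdev sector size.

```python
import os


def inode_number(path):
    """Return the inode number of path (the number `ls -di` prints)."""
    # lstat: report the entry itself, not the target of a symlink
    return os.lstat(path).st_ino


def sector_size(ashift):
    """Sector size in bytes implied by an ashift value (2 ** ashift)."""
    return 1 << ashift


if __name__ == "__main__":
    print(inode_number("."))                 # inode of the current directory
    print(sector_size(9), sector_size(12))   # 512 4096
```

A pool whose top-level vdevs report different ashift values therefore mixes 512-byte and 4096-byte allocation units, which is what the question is trying to establish.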
another host/pool, and a similar situation: zfs - 52dd454
P.S. P.P.S
Could you please build the latest https://github.com/dweeezil/zfs/tree/zdb and run the zdb with 7 -d's? You'll need a current spl master, too, to build it. There are a bunch of users having problems related to Posix ACL SA xattrs but I've yet to see a good dump on one of them that's corrupted. I think it would go a long way to helping track down the problem.
@TioNisla Your case is interesting in that the corruption is not on a directory. Unfortunately your zdb output above is not from my "zdb" branch (dweeezil/zfs@ce58fc1).
zfs-ce58fc178bd5c6e8d462c21f1b8952685d2f852d
Another file on the same filesystem (all OK):
@TioNisla Thanks for the additional information. I'll continue to look into this issue shortly. I've been totally sidetracked for the past week with my company's office moving and also chasing down hangs caused by |
@TioNisla I'm finally able to get back to these issues this weekend. The problem with "a.test.file" is, as I suspected, a corrupted dnode. Given that the dumped spill blockptr looks OK and that the recorded bonus size of 196 is on the small side, the problem is that the layout is wrong which causes the whole thing to be parsed incorrectly. Could you please confirm that the spill block actually contains the SA xattr by running This is exactly the same problem as we're seeing in #2700. Can you reliably reproduce this problem? |
@dweeezil |
@TioNisla Thanks. The spill block you dumped had a perfectly valid SA xattr for a posix acl:
One other question comes to mind: When you were running the rsync which caused the corruption, was there any other concurrent activity on the target filesystem? If so, what kind (normal filesystem access, read, write, zfs operations such as snapshot or destroy, etc.)?
@TioNisla Does this system have ECC memory? You mentioned the problem is hard to reproduce but have you ever been able to reproduce it?
@TioNisla Sorry, but as I look into this further, more questions arise. Did the directory in which the corrupted "a.test.file" was created have a default ACL? If so, does the ACL I showed above precisely match the default ACL? The reason I ask is that the code path through which xattr is set is quite different between the two cases.
No activity at all (excluding the `rsync` from the remote host). This is a fresh installation and a newly created zpool.
This is a virtual machine on a VMware ESXi 5.1 hypervisor, running on HP ProLiant DL980 G7 hardware; of course all memory is ECC. There is another file and its parent dir -
The damage is mostly random and hits different files from run to run. I need to run a full rsync, and some files (~3-5%) end up with a corrupted SA.
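Since the damaged files differ from run to run, it can help to enumerate them after each rsync. The following is a minimal sketch, assuming that an affected file shows up as an `OSError` from `os.lstat`; the thread does not say which error the kernel returns, so the sketch just records whatever errno it sees. The function name `find_inaccessible` and the `probe` parameter are hypothetical.

```python
import os


def find_inaccessible(root, probe=os.lstat):
    """Walk root and return (path, errno) for every entry probe() fails on."""
    failures = []

    def record_walk_error(err):
        # os.walk reports directories it cannot list through this callback
        failures.append((err.filename, err.errno))

    for dirpath, dirnames, filenames in os.walk(root, onerror=record_walk_error):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                probe(path)
            except OSError as err:
                failures.append((path, err.errno))
    return sorted(failures, key=lambda item: item[0])


if __name__ == "__main__":
    import sys

    for path, err in find_inaccessible(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(err, path)
```

Passing a different `probe` (for example one that also lists xattrs) changes what counts as "inaccessible" without touching the walk itself.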
And now, after removal of some files, there is a new problem:
@TioNisla Since you seem to be able to reproduce it, could you build your spl and zfs with --enable-debug and retry the rsync? I'm beginning to think that #2718 might be causing this but I can't prove it. I've continued working on fixing that issue but unfortunately the original approach I outlined isn't going to work well. I'm hoping to get a working patch within the next week. In the meantime, could you also do a
@dweeezil spl-81857a34 |
If a spill block's dbuf hasn't yet been written when a spill block is freed, the unwritten version will still be written. This patch handles the case in which a spill block's dbuf is freed and undirties it to prevent it from being written.

The most common case in which this could happen is when xattr=sa is being used and a long xattr is immediately replaced by a short xattr as in:

    setfattr -n user.test -v very_very_very..._long_value <file>
    setfattr -n user.test -v short_value <file>

The first value must be sufficiently long that a spill block is generated and the second value must be short enough to not require a spill block. In practice, this would typically happen due to internal xattr operations as a result of setting acltype=posixacl.

Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#2663
Closes openzfs#2700
Closes openzfs#2701
Closes openzfs#2717
Closes openzfs#2863
Closes openzfs#2884

Conflicts:
    module/zfs/dbuf.c
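The ordering problem described in the commit message above can be illustrated with a toy model. This is not ZFS code: the class `ToySpillModel`, its fields, and the `bonus_capacity` threshold of 8 characters are all made up for illustration. It only mimics the sequence "long xattr dirties a spill block, short xattr frees it, sync still writes the stale dirty buffer" and what undirtying on free changes.

```python
class ToySpillModel:
    """Toy model of one file's spill block within a single sync interval."""

    def __init__(self, undirty_on_free):
        self.undirty_on_free = undirty_on_free
        self.dirty_spill = None     # pending spill contents, not yet written
        self.has_spill = False      # what the file's layout claims
        self.on_disk_spill = None   # what sync() actually wrote

    def set_xattr(self, value, bonus_capacity=8):
        if len(value) > bonus_capacity:
            # Long value: needs a spill block, which is dirtied in memory.
            self.has_spill = True
            self.dirty_spill = value
        else:
            # Short value fits without a spill block: the spill is freed.
            self.has_spill = False
            if self.undirty_on_free:
                self.dirty_spill = None  # the fix: drop the pending write

    def sync(self):
        # Write out whatever is still dirty.
        if self.dirty_spill is not None:
            self.on_disk_spill = self.dirty_spill
            self.dirty_spill = None

    def consistent(self):
        # The layout and the written state must agree about the spill block.
        return self.has_spill == (self.on_disk_spill is not None)


def replay(undirty_on_free):
    """Long xattr immediately replaced by a short one, then a sync."""
    model = ToySpillModel(undirty_on_free)
    model.set_xattr("very_very_very_long_value")
    model.set_xattr("short")
    model.sync()
    return model.consistent()


if __name__ == "__main__":
    print("without undirty:", replay(False))  # False: stale spill written
    print("with undirty:   ", replay(True))   # True: nothing stale written
```

In the toy, the inconsistent outcome corresponds to a file whose layout says "no spill block" while a spill block was nevertheless written, which is the kind of layout mismatch discussed earlier in the thread.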
Short description: after running "rsync -PSAXrltgoD" from a remote host, some files on the target cannot be accessed in any way. Same host/pool as https://gist.github.com/anonymous/39e252399acb6912a16e