xattr=sa acltype=posixacl bug #2700
Comments
@sopmot, could you please run the zdb tests on the "documents" directory I suggested in #2663. |
Hmm, it's weird: now I can see the CC directory when I check it from outside the container, so I see the same thing from inside and outside the container:
What command do you need exactly? BTW, I can run this on its filesystem:
But I don't know what the magic number at the end of the command does. |
@sopmot Sorry, I was on a mobile device when I sent the request and had to be brief. The "magical" number is the object number within the dataset. Last, please run the same command on the inode number of the "documents" directory. You should be able to get its inode number with:
Based on the output shown above, I have a very good idea of what we're going to see: the "documents" directory has a spill block and otherwise looks OK as far as zdb is concerned. I've got a debugging version of zdb which prints a bit more information, but it doesn't (yet) deal with the spill block. |
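A minimal sketch of the lookup described above, for readers following along. On ZFS the inode number of a file or directory is its object number within the dataset, so the number printed by `ls -di` can be passed to zdb as the trailing argument. The function names and the dataset name `tank/example` are placeholders, not names from this thread.

```shell
# Print only the inode number of a directory.
# "ls -di" lists the directory itself (-d) together with its inode number (-i).
inode_of() {
    ls -di -- "$1" | awk '{print $1}'
}

# Dump a single object with six levels of -d verbosity (the level used
# later in this thread). zdb normally has to be run as root.
dump_object() {
    dataset=$1   # placeholder, e.g. tank/example
    dir=$2       # path of the directory to inspect
    zdb -dddddd "$dataset" "$(inode_of "$dir")"
}
```

Usage would be along the lines of `dump_object tank/example /path/to/documents`, with both arguments adjusted to the real dataset and mount point.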
|
for the documents directory:
for the ccc directory:
|
@sopmot The first ls command should have been "ls -di" rather than "ls -li", but I see your second one got its inode number. The output confirms a spill block. Could you please build dweeezil/zfs@277a0f7 (my "zdb" branch in https://github.com/dweeezil/zfs/tree/zdb) and run the same zdb command directly from the build directory as "cmd/zdb/zdb -dddddd tank/...". This will show us the spill block, which I suspect is corrupted. This version of zdb will print a LOT more than we need; all I'm interested in is the section labeled "Spill blkptr". |
@dweeezil Just a note: I back up to another machine with send/recv, and it transferred the errors without problems. |
@dweeezil
It looks the same to me, doesn't it? |
@sopmot Doesn't look like you've got my latest zdb branch. Does "git show" show 277a0f7 as the last commit? Probably the easiest way to get the right code would be to do a fresh clone and checkout:
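A hedged sketch of what a fresh clone and checkout could look like, plus a check that the working tree really is at the expected commit. The clone URL is an assumption (the repository URL quoted earlier with `.git` appended), and the directory name `zfs-zdb` and the function names are made up for illustration.

```shell
# Clone the fork and switch to the "zdb" branch named in the thread.
# Assumed URL and target directory; adjust as needed.
fetch_zdb_branch() {
    git clone https://github.com/dweeezil/zfs.git zfs-zdb &&
    git -C zfs-zdb checkout zdb
}

# Succeed only if HEAD of the repository in $1 starts with hash prefix $2.
head_is() {
    case "$(git -C "$1" rev-parse HEAD)" in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

After `fetch_zdb_branch`, running `head_is zfs-zdb 277a0f7 && echo ok` gives the same answer as reading the first line of "git show" by eye.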
The output of this zdb will be very noisy. We're only interested in the "Spill blkptr:" section which should look something like this:
And, in fact, if you get that far (if the DVAs don't look totally bogus), the next thing I'd like you to do is:
where the last argument should be one of the blkptrs listed from the previous zdb command (for example "0:20c00:200" in the example above). |
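As a sketch of that last step: `zdb -R` takes a pool name followed by a block address of the form vdev:offset:size. The helper below only checks the shape of the address and prints the command for review rather than running it. The function name is hypothetical, and the pool name `tank` is inferred from the "tank/..." dataset mentioned earlier.

```shell
# Build the "zdb -R" command line for one DVA given as vdev:offset:size
# (hexadecimal fields, e.g. 0:20c00:200 from the example in the thread).
# Prints the command instead of running it, so it can be reviewed first.
zdb_read_cmd() {
    pool=$1
    dva=$2
    if printf '%s\n' "$dva" | grep -Eq '^[0-9a-fA-F]+:[0-9a-fA-F]+:[0-9a-fA-F]+$'; then
        printf 'zdb -R %s %s\n' "$pool" "$dva"
    else
        echo "not a vdev:offset:size triple: $dva" >&2
        return 1
    fi
}

zdb_read_cmd tank 0:20c00:200
```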
Argh, I forgot the checkout.
After ./configure --with-spl=/data/src/spl-linux-0.6.3:
Am I doing something wrong? |
@sopmot Ugh, you need the latest spl as well. All we really need is the very last small commit to zdb I added in the zdb branch. I'd recommend cleaning up your checkout and then just cherry picking the patch:
and then it ought to build just fine with your existing SPL. |
On 09/15/2014 06:44 PM, Tim Chase wrote:
@dweeezil, there is no zfs-0.6.3 branch. I applied that patch manually to the (Ubuntu) 0.6.3 branch, but the build failed with: make[3]: Leaving directory
I'll be travelling now, and I'm not sure I can reply to you soon. This is the patch, right? commit 277a0f7
diff --git a/cmd/zdb/zdb.c b/cmd/zdb/zdb.c
10x |
@sopmot Double ugh. Another post-0.6.3-ism. Change "snprintf_blkptr_compact" to "sprintf_blkptr_compact" (drop the "n") and remove the second argument "sizeof (blkbuf)". |
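The rename and argument removal described above can be sketched as a single sed substitution. The exact shape of the call in the patched zdb.c (a buffer, then "sizeof (blkbuf)", then further arguments) is an assumption here, and the function name is hypothetical; check the result against the real source before building.

```shell
# Rewrite calls of the post-0.6.3 form
#   snprintf_blkptr_compact(buf, sizeof (blkbuf), ...)
# into the 0.6.3-era form
#   sprintf_blkptr_compact(buf, ...)
# Reads C source on stdin and writes the result to stdout.
backport_blkptr_call() {
    sed 's/snprintf_blkptr_compact(\([^,]*\), *sizeof *(blkbuf),/sprintf_blkptr_compact(\1,/'
}

# Assumed example of a call site, for illustration only.
echo 'snprintf_blkptr_compact(blkbuf, sizeof (blkbuf), bp);' | backport_blkptr_call
```

To apply it to a file, redirect the file in and write to a temporary copy, for example `backport_blkptr_call < cmd/zdb/zdb.c > zdb.c.new`, and compare the two with diff.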
I hope, it helps:) |
@sopmot Getting close, you need to use "zdb -R" with the DVA arguments as follows:
Those DVAs look reasonable to me. The txgs are pretty high, however, so I'm guessing the pool has been around for a while. Was this entire pool written with 0.6.3, or was some of it written with 0.6.2 or some intermediate snapshot between 0.6.2 and 0.6.3? The reason I ask is that there were some issues fixed prior to 0.6.3 which were known to have caused corrupted dnodes (but so far we're seeing no evidence of it). |
@sopmot OK, that's great. I was able to reconstruct the block from that dump. At first glance, it looks OK. I'd really like to see a little more output from my "zdb" branch. Could you please try to cherry pick dweeezil/zfs@e396021 into your zdb and run it again with 6 -d's (same arguments that produced the "Spill blkptr" output) and post the full output to a gist. I'll manually decode the block regardless, but won't have time to do that until this evening. |
@sopmot Your spill block decodes just fine. There appears to be no file system corruption. I was basing this line of testing on the use of SA xattrs and the presence of a spill block in the dnode and your error condition:
when it seemed pretty clear there were no actual I/O errors occurring. I had assumed the issue was with the "documents" directory itself and, to be sure, the system does not seem to see the Posix ACL on it (no "+" in the "ls" output) even though, according to your debugging, it does have one (stored as an SA). In reviewing the commands you showed above, none of them gives me a very good idea of which system call is producing the I/O error. Could you please strace a command which produces the I/O error and post the output surrounding the system call(s) which return EIO? As I mentioned, I've got no explanation as to why the system isn't seeing the Posix ACL on your "documents" directory. The output of a 6-d zdb with dweeezil/zfs@e396021 still might shed some light on that. |
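One way to narrow a long strace log down to the failing calls, sketched under the assumption that the log uses strace's usual "= -1 ERRNO (description)" suffix for failed system calls. The function name and the path in the usage comment are placeholders.

```shell
# Keep only the strace lines whose system call failed with the given
# errno name (default EIO). Reads strace output on stdin.
failed_with() {
    grep -- "= -1 ${1:-EIO} "
}

# Typical use (the path is a placeholder):
#   strace -f -o trace.txt ls -la /path/to/documents
#   failed_with EIO < trace.txt
```

Passing a different errno name, for example `failed_with EFAULT < trace.txt`, filters for that error instead.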
OMG, I don't know how I could have overlooked it. I compiled spl from master and linked your branch against it, so everything should be the latest now. This is the output: For sure, this is the other one:
|
[we can speed it up via a skype/hangouts chat, if you prefer] strace snippet:
A healthy one:
This is an Ubuntu machine with no SELinux. And it seems you're right: a directory is accessible if it has ACLs, but if a directory inside it does not, it produces the same error. |
@sopmot Thanks, the output from above clearly shows the problem:
That's EFAULT. This should be enough information to help get started on this problem. I'll follow-up after I have time to analyze it further. |
@sopmot It's being decoded improperly because it thinks the size is 20 rather than 196 (the size of the SA in the spill block according to your separate |
Just let me know, whatever you need. |
@sopmot I've gotten a chance to look into this further and, unfortunately, it looks like the filesystem is corrupted. The dnode in question has a bad SA layout (is 3, should be 4). This will cause all sorts of havoc when trying to read it. Is this issue repeatable? If you create a new filesystem and rsync again, is the new filesystem corrupted in the same way? If so, could you please strace the rsync and grep out all the xattr-related syscalls? I've tried reproducing the problem in several obvious ways but have not yet been able to reproduce it. I've also skimmed over the code paths involved and am not seeing an immediate problem. One other idea, could you please show me the Posix ACLs on the source file system (maybe a |
@sopmot After further investigation, I don't think I need to see any strace output from the rsync since it's pretty clear what rsync does. I'll continue to look into this and will send another followup when I've got more information or need you to try some additional debugging. |
@dweeezil setfacl -R -m d:g:it:rwX Unfortunately, this is all working now. |
@dweeezil |
What do you mean by that? Do you need the parent directory or really the whole filesystem? BTW, the ACL isn't set on the whole fs. Anyway, I am afraid it would not be useful, because I ran setfacl on the parent directory to check whether it changes anything... that was a mistake. |
@sopmot Yes, sorry, I was mixing up all these various issues and simply assumed the corrupted filesystem was created by rsync, and that's why I was asking about the ACLs on the "source" file system. Your description of the events that led to the corruption is very helpful in any case. Do you know what set the setgid bit (g+s) on the corrupted "documents" directory shown above? Was it inherited because it was set on the parent directory? Did you set it manually (chmod g+s)? The reason I ask is that all of these related problem reports involve a directory with the setgid bit set. The other interesting clue is that the "pflags" ends with "044" rather than "144". This suggests an unusual order of permission-setting operations or is yet another manifestation of the problem. I understand you can't recreate the problem now, but if you've still got the original zip file, could you please unzip it again, apply your recursive setfacl commands and then run a |
Sorry to butt in; I just want to say my experience with this issue is similar to @sopmot 's. The corruption doesn't manifest right away. If you set up the directory "properly" (in my experience: setgid, with a default ACL), it will get corrupted after a while, but never immediately. |
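Since every related report involves a setgid directory, it can help to list which directories on a dataset carry that bit before looking at their ACLs. A small sketch using `find`; the function name and the path in the usage note are placeholders.

```shell
# List directories under $1 that have the setgid bit set.
# "-perm -2000" matches when the setgid bit is among the mode bits.
setgid_dirs() {
    find "$1" -type d -perm -2000
}
```

For example, `setgid_dirs /path/to/mountpoint` prints one directory per line; each of those can then be checked with getfacl to see whether it also has a default ACL.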
@akorn No problem, thanks for the updated information. Are either of you (or anyone experiencing this class of problem) running with |
@dweeezil, yes, those seem to be the defaults. I only set
OTOH, on a different computer I have |
Mine (default): /sys/module/spl/parameters/spl_debug_mask:188 |
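For anyone who wants to report their own settings in the same "path:value" form as the line above, a short sketch. It assumes a grep that supports `-H` (GNU and BSD grep both do); the function name is hypothetical.

```shell
# Print every readable, non-empty parameter file in a module's parameters
# directory as "path:value". Unreadable files are skipped silently.
show_params() {
    grep -H . "$1"/* 2>/dev/null
}

# e.g. show_params /sys/module/spl/parameters
```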
https://gist.github.com/sopmot/3ca667f777c7d184ce9f
Yes.
As far as I can remember I didn't do that. |
@sopmot Thanks for the followup information. Your case has almost identical characteristics to the others. Unfortunately, I've not been able to reproduce it yet nor have I had any time this week to look into it further. I'm trying to come up with a reproducer and think that memory pressure may be part of the equation. |
@dweeezil |
If a spill block's dbuf hasn't yet been written when a spill block is freed, the unwritten version will still be written. This patch handles the case in which a spill block's dbuf is freed and undirties it to prevent it from being written.

The most common case in which this could happen is when xattr=sa is being used and a long xattr is immediately replaced by a short xattr as in:

    setfattr -n user.test -v very_very_very..._long_value <file>
    setfattr -n user.test -v short_value <file>

The first value must be sufficiently long that a spill block is generated and the second value must be short enough to not require a spill block. In practice, this would typically happen due to internal xattr operations as a result of setting acltype=posixacl.

Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#2663
Closes openzfs#2700
Closes openzfs#2701
Closes openzfs#2717
Closes openzfs#2863
Closes openzfs#2884

Conflicts:
    module/zfs/dbuf.c
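The two-step sequence from the commit message can be sketched as a small script. The 1024-byte length is an arbitrary guess at "sufficiently long" (the commit message gives no figure), and the function names are hypothetical. It only makes sense on a dataset with xattr=sa, and according to the commit message the problem also depends on the spill block's dbuf not having been written yet, so a single run may well show nothing.

```shell
# Produce a string of N "x" characters to use as an xattr value.
long_value() {
    head -c "$1" /dev/zero | tr '\0' 'x'
}

# Set a long user xattr and immediately replace it with a short one.
# 1024 bytes is an assumption, not a value taken from the commit.
replace_long_with_short() {
    file=$1
    setfattr -n user.test -v "$(long_value 1024)" "$file" &&
    setfattr -n user.test -v short_value "$file"
}
```

A run would look like `replace_long_with_short /path/on/zfs/testfile`, on a scratch dataset rather than one holding real data.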
Hi all,
As you can see, this is a host machine for LXC containers.
From the host machine one of the directories isn't accessible:
However it is from the container:
In this directory some files are accessible, some not:
scrub finds no errors.
My best guess is that it's related to xattr=sa and acltype=posixacl. However, in this particular directory it wasn't used (it is now; I set it, but that did not change anything).