Ceph osd crush. #1437

Closed
nazarianin opened this issue Apr 30, 2013 · 5 comments

@nazarianin

When I run a Ceph OSD on the latest ZFS git, the OSD crashes. The log shows this message:

2013-04-30 13:58:08.379230 7fa60d015780 20 filestore(/ceph/osd/osd.0) check_replay_guard no xattr
2013-04-30 13:58:08.379248 7fa60d015780 15 filestore(/ceph/osd/osd.0) write 1.7_head/96f33707/2.00000000/head//1 0230
2013-04-30 13:58:08.380443 7fa60d015780 10 filestore(/ceph/osd/osd.0) write 1.7_head/96f33707/2.00000000/head//1 0230 = 230
2013-04-30 13:58:08.380562 7fa60d015780 20 filestore(/ceph/osd/osd.0) check_replay_guard no xattr
2013-04-30 13:58:08.380722 7fa60d015780 20 filestore(/ceph/osd/osd.0) fgetattrs 21 getting '
'
2013-04-30 13:58:08.380777 7fa60d015780 20 filestore(/ceph/osd/osd.0) fgetattrs 21 getting '
'
2013-04-30 13:58:08.380797 7fa60d015780 15 filestore(/ceph/osd/osd.0) setattrs 1.7_head/96f33707/2.00000000/head//1
2013-04-30 13:58:08.380804 7fa60d015780 30 filestore(/ceph/osd/osd.0) setattrs 1.7_head/96f33707/2.00000000/head//1 _:
0000 : 0b 08 cf 00 00 00 04 03 2b 00 00 00 00 00 00 00 : ........+.......
0010 : 0a 00 00 00 32 2e 30 30 30 30 30 30 30 30 fe ff : ....2.00000000..
0020 : ff ff ff ff ff ff 07 37 f3 96 00 00 00 00 00 01 : .......7........
0030 : 00 00 00 00 00 00 00 04 03 10 00 00 00 01 00 00 : ................
0040 : 00 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 : ................
0050 : 00 01 00 00 00 00 00 00 00 03 00 00 00 00 00 00 : ................
0060 : 00 00 00 00 00 00 00 00 00 02 02 15 00 00 00 02 : ................
0070 : 00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 : ................
0080 : 01 00 00 00 e6 00 00 00 00 00 00 00 51 bf 74 51 : ............Q.tQ
0090 : 90 f0 6e 10 02 02 15 00 00 00 00 00 00 00 00 00 : ..n.............
00a0 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 : ................
00b0 : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 : ................
00c0 : 00 00 00 00 01 00 00 00 00 00 00 00 03 00 00 00 : ................
00d0 : 00 00 00 00 00 : .....

2013-04-30 13:58:08.381096 7fa60d015780 -1 filestore(/ceph/osd/osd.0) _setattrs: _fgetattr returned (95) Operation not supported on read o
2013-04-30 13:58:08.382782 7fa60d015780 -1 os/FileStore.cc: In function 'int FileStore::_setattrs(coll_t, const hobject_t&, std::map<std::
os/FileStore.cc: 4036: FAILED assert(0)

Is this a problem in Ceph or in ZFS?

@behlendorf
Contributor

I believe this same issue was mentioned by @clusterfaq here, #1409 (comment). The last I heard we hadn't yet identified if it was a ZFS or Ceph issue, but I'm glad you opened an issue so we can track it.

If someone familiar with the Ceph code could trace through and see exactly where that EOPNOTSUPP (95) errno is being returned from, that would be helpful. Is it being returned from a library or a system call? If so, which one?

@nazarianin
Author

ZFS returns EOPNOTSUPP because the Ceph wip-debug-xattr branch has a bug: Ceph tries to read the '_' attribute instead of 'user.ceph._'. This patch fixes the problem.

    diff -rupN ceph_src/src/os/FileStore.cc ceph/src/os/FileStore.cc
    --- ceph_src/src/os/FileStore.cc        2013-05-06 09:49:17.000000000 +0600
    +++ ceph/src/os/FileStore.cc    2013-05-07 16:41:03.000000000 +0600
    @@ -4028,7 +4028,9 @@ int FileStore::_setattrs(coll_t cid, con
             i != inline_to_set.end();
             ++i) {
           bufferptr bp;
    -      r = _fgetattr(fd, i->first.c_str(), bp);
    +      char n[CHAIN_XATTR_MAX_NAME_LEN];
    +      get_attrname(i->first.c_str(), n, CHAIN_XATTR_MAX_NAME_LEN);
    +      r = _fgetattr(fd, n, bp);
           if (r < 0) {
            derr << __func__ << ": _fgetattr returned " << cpp_strerror(-r)
                 << " on read of attr " << i->first << " for object "
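
As a rough illustration of what the patch changes, the sketch below builds the prefixed on-disk name from the short logical name before it is handed to the xattr call. It assumes the prefix is `user.ceph.`, which is inferred from the attribute name used in this thread; `build_attrname` and `XATTR_PREFIX` are hypothetical names and this is not Ceph's actual `get_attrname()`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed prefix: the thread only shows the full name "user.ceph._". */
#define XATTR_PREFIX "user.ceph."

/*
 * Build the on-disk xattr name for a short logical attribute name.
 * Returns 0 on success, -1 if the result does not fit in buf.
 * Illustrative stand-in only, not Ceph's get_attrname().
 */
static int build_attrname(const char *name, char *buf, size_t len)
{
    assert(name != NULL && buf != NULL);
    int n = snprintf(buf, len, "%s%s", XATTR_PREFIX, name);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Passing the short name `_` straight to the xattr call skips this step, which matches the failure described above.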

However, this works only with xattr='dir'; it does not work with xattr='sa'. I wrote a small program to demonstrate the problem.

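    #include <stdio.h>
    #include <string.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/stat.h>
    #include <attr/xattr.h>

    #define XATTRN "user.ceph._"
    #define XATTRV "0sC"

    int main(int argc, char **argv) {
        char xlist[] = XATTRV;
        char val[1];    /* deliberately smaller than the stored value */
        ssize_t r;
        size_t size = strlen(xlist);
        mode_t mode = S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH;
        /* O_CREAT requires the mode argument */
        int fd = open("./zattr", O_CREAT | O_RDWR, mode);
        if (fd < 0) { perror("open"); return 1; }
        r = fsetxattr(fd, XATTRN, xlist, size, 0);
        if (r < 0) perror("fsetxattr");
        /* the buffer is too small, so ERANGE (34) is the expected errno */
        r = fgetxattr(fd, XATTRN, val, sizeof(val));
        if (r != (ssize_t)size) printf("fgetxattr  ret r: %zd errno: %d\n", r, errno);
        close(fd);
        return 0;
    }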

With xattr='sa', this program prints:

ct1 tmp # ./test
fgetxattr ret r: -1 errno: 61

With xattr='dir':

ct1 tmp # ./test
fgetxattr ret r: -1 errno: 34

With xattr='sa', ZFS returns ENOATTR (61) when it should return ERANGE (34).
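
One reason the distinction can matter to a caller: a common pattern (assumed here for illustration, not taken from the Ceph source) is to retry with a larger buffer on ERANGE and to treat any other errno, including "no such attribute", as final. A minimal sketch in C, where `read_xattr_grow` and `MAX_XATTR_BUF` are hypothetical names:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/xattr.h>

#define MAX_XATTR_BUF (1u << 20)  /* arbitrary safety limit for this sketch */

/*
 * Read an xattr into a heap buffer, doubling the buffer on ERANGE.
 * Returns the buffer (caller frees) and stores the length in *out_len,
 * or returns NULL with errno set.
 */
static char *read_xattr_grow(int fd, const char *name, size_t *out_len)
{
    assert(name != NULL && out_len != NULL);
    size_t cap = 1;  /* start tiny, like val[1] in the test program */

    while (cap <= MAX_XATTR_BUF) {
        char *buf = malloc(cap);
        if (buf == NULL)
            return NULL;

        ssize_t r = fgetxattr(fd, name, buf, cap);
        if (r >= 0) {
            *out_len = (size_t)r;
            return buf;
        }

        int saved = errno;
        free(buf);
        if (saved != ERANGE) {
            /* e.g. "no such attribute": retrying cannot help */
            errno = saved;
            return NULL;
        }
        cap *= 2;  /* ERANGE: the value exists but did not fit */
    }
    errno = ERANGE;
    return NULL;
}
```

With a caller like this, a wrong ENOATTR in place of ERANGE makes an existing attribute look missing instead of triggering a retry.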

Thanks

behlendorf added a commit to behlendorf/zfs that referenced this issue May 8, 2013
When SA xattrs are enabled only fall back to checking the directory
xattrs when the name is not found as a SA xattr.  Otherwise, the SA
error which should be returned to the caller is overwritten by the
directory xattr errors.  Positive return values indicating success
will also be immediately returned.

In the case of openzfs#1437 the ERANGE error was being correctly returned
by zpl_xattr_get_sa() only to be overridden with ENOENT which was
returned by the subsequent unnecessary call to zpl_xattr_get_dir().

Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#1437
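
The fallback rule in this commit message can be sketched in C roughly as follows. This is not the ZFS code: the lookup signature, `xattr_get`, and the stub functions are hypothetical, and the sketch assumes lookups return a byte count on success or a negative errno, with -ENOENT meaning "name not found".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical lookup: returns bytes read (>= 0) or a negative errno. */
typedef int (*xattr_lookup_fn)(const char *name, void *value, size_t size);

/* Try the SA lookup first; consult the directory lookup only on -ENOENT. */
static int xattr_get(xattr_lookup_fn get_sa, xattr_lookup_fn get_dir,
                     const char *name, void *value, size_t size)
{
    assert(get_sa != NULL && get_dir != NULL);
    int error = get_sa(name, value, size);
    if (error != -ENOENT)
        return error;  /* success, or a real SA error such as -ERANGE */
    return get_dir(name, value, size);
}

/* Stubs standing in for the two backends. */
static int sa_too_small(const char *n, void *v, size_t s)
{ (void)n; (void)v; (void)s; return -ERANGE; }

static int sa_missing(const char *n, void *v, size_t s)
{ (void)n; (void)v; (void)s; return -ENOENT; }

static int dir_missing(const char *n, void *v, size_t s)
{ (void)n; (void)v; (void)s; return -ENOENT; }

static int dir_found(const char *n, void *v, size_t s)
{ (void)n; (void)v; (void)s; return 3; }
```

With these stubs, `xattr_get(sa_too_small, dir_missing, ...)` keeps the -ERANGE from the SA lookup instead of letting the directory lookup's -ENOENT replace it.
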
@behlendorf
Contributor

@nazarianin Thank you, that's exactly what I needed to know. I've opened pull request #1451 with a small patch to ensure we return the correct error. As expected, it passes your test case, but it would be helpful if you and/or @clusterfaq could verify that it also resolves the Ceph crash in the wip-debug-xattr branch. Thank you for spending the time to help us run down these edge cases.

@nazarianin
Author

I have been running Ceph on ZFS with this patch for 5 days. Everything works fine under heavy load.
Thank you, Brian!

@behlendorf
Contributor

@nazarianin Thanks for the feedback. This was merged as 0377189, and the fix will be in the 0.6.2 tag.

unya pushed a commit to unya/zfs that referenced this issue Dec 13, 2013
Closes openzfs#1437