xattr=sa acltype=posixacl bug #2700

Closed
tomposmiko opened this issue Sep 14, 2014 · 36 comments

@tomposmiko

Hi all,

# zfs get all tank/lxc/files/shared/it
NAME                      PROPERTY              VALUE                                           SOURCE
tank/lxc/files/shared/it  type                  filesystem                                      -
tank/lxc/files/shared/it  creation              Tue Jul  8 13:37 2014                           -
tank/lxc/files/shared/it  used                  50.9G                                           -
tank/lxc/files/shared/it  available             596G                                            -
tank/lxc/files/shared/it  referenced            50.9G                                           -
tank/lxc/files/shared/it  compressratio         1.00x                                           -
tank/lxc/files/shared/it  mounted               yes                                             -
tank/lxc/files/shared/it  quota                 none                                            default
tank/lxc/files/shared/it  reservation           none                                            default
tank/lxc/files/shared/it  recordsize            128K                                            default
tank/lxc/files/shared/it  mountpoint            /tank/lxc/files/rootfs/data/shared/projects/it  received
tank/lxc/files/shared/it  sharenfs              off                                             default
tank/lxc/files/shared/it  checksum              on                                              default
tank/lxc/files/shared/it  compression           lz4                                             inherited from tank/lxc
tank/lxc/files/shared/it  atime                 off                                             inherited from tank
tank/lxc/files/shared/it  devices               on                                              default
tank/lxc/files/shared/it  exec                  on                                              default
tank/lxc/files/shared/it  setuid                on                                              default
tank/lxc/files/shared/it  readonly              off                                             default
tank/lxc/files/shared/it  zoned                 off                                             default
tank/lxc/files/shared/it  snapdir               hidden                                          default
tank/lxc/files/shared/it  aclinherit            restricted                                      default
tank/lxc/files/shared/it  canmount              on                                              default
tank/lxc/files/shared/it  xattr                 sa                                              inherited from tank/lxc/files/shared
tank/lxc/files/shared/it  copies                1                                               default
tank/lxc/files/shared/it  version               5                                               -
tank/lxc/files/shared/it  utf8only              off                                             -
tank/lxc/files/shared/it  normalization         none                                            -
tank/lxc/files/shared/it  casesensitivity       sensitive                                       -
tank/lxc/files/shared/it  vscan                 off                                             default
tank/lxc/files/shared/it  nbmand                off                                             default
tank/lxc/files/shared/it  sharesmb              off                                             default
tank/lxc/files/shared/it  refquota              none                                            default
tank/lxc/files/shared/it  refreservation        none                                            default
tank/lxc/files/shared/it  primarycache          all                                             default
tank/lxc/files/shared/it  secondarycache        all                                             default
tank/lxc/files/shared/it  usedbysnapshots       0                                               -
tank/lxc/files/shared/it  usedbydataset         50.9G                                           -
tank/lxc/files/shared/it  usedbychildren        0                                               -
tank/lxc/files/shared/it  usedbyrefreservation  0                                               -
tank/lxc/files/shared/it  logbias               latency                                         default
tank/lxc/files/shared/it  dedup                 off                                             default
tank/lxc/files/shared/it  mlslabel              none                                            default
tank/lxc/files/shared/it  sync                  standard                                        default
tank/lxc/files/shared/it  refcompressratio      1.00x                                           -
tank/lxc/files/shared/it  written               50.9G                                           -
tank/lxc/files/shared/it  logicalused           50.9G                                           -
tank/lxc/files/shared/it  logicalreferenced     50.9G                                           -
tank/lxc/files/shared/it  snapdev               hidden                                          default
tank/lxc/files/shared/it  acltype               posixacl                                        inherited from tank/lxc/files/shared
tank/lxc/files/shared/it  context               none                                            default
tank/lxc/files/shared/it  fscontext             none                                            default
tank/lxc/files/shared/it  defcontext            none                                            default
tank/lxc/files/shared/it  rootcontext           none                                            default
tank/lxc/files/shared/it  relatime              off                                             default

As you can see, this is a host machine for LXC containers.
From the host machine one of the directories isn't accessible:

$ ls /tank/lxc/files/rootfs/data/shared/projects/it/www/CC
ls: cannot access /tank/lxc/files/rootfs/data/shared/projects/it/www/CC: No such file or directory

However, it is accessible from inside the container:

$ ls -ld /data/shared/projects/it/www/CC
drwxrwsr-x 6 tpapp chemaxon 13 Sep 10 16:24 /data/shared/projects/it/www/CC/

In this directory some files are accessible, some not:

$ ls -l documents/
ls: documents/: Bad address
ls: cannot open directory documents/: Input/output error

A scrub finds no errors.

My best guess is that it's related to xattr=sa and acltype=posixacl. However, no ACL was in use on this particular directory (there is one now, because I set it, but that did not change anything).

@dweeezil
Contributor

@sopmot, could you please run the zdb tests I suggested in #2663 on the "documents" directory?

@tomposmiko
Author

Hmm, that's weird: now I can see the CC directory when I check it from outside the container.
Actually, a directory inside it is the problematic one.

So now I see the same thing from inside and outside the container:

# ls /tank/lxc/files/rootfs/data/shared/projects/it/www/CC/cc-setup_2014-08-04/documents/
ls: cannot open directory /tank/lxc/files/rootfs/data/shared/projects/it/www/CC/cc-setup_2014-08-04/documents/: Input/output error

What command do you need exactly?
I could not work out how to run it; it always returns 'no such file or directory'.
I am afraid I am not experienced enough to figure it out myself :)

BTW, I can run this on its filesystem:

# zdb -dddddd tank/lxc/files/shared/it 6
Dataset tank/lxc/files/shared/it [ZPL], ID 232, cr_txg 5569, 55.6G, 594 objects, rootbp DVA[0]=<1:14fcb4c7000:1000> DVA[1]=<0:143cbc8b000:1000> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=1086311L/1086311P fill=594 cksum=1ad5aac0e9:77e44a5f414:12de3295fc3c7:22c34dc5781868

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
         6    1    16K    16K    16K    32K  100.00  SA attr layouts (K=inherit) (Z=inherit)
    dnode flags: USED_BYTES USERUSED_ACCOUNTED 
    dnode maxblkid: 1
    Fat ZAP stats:
        Pointer table:
            1024 elements
            zt_blk: 0
            zt_numblks: 0
            zt_shift: 10
            zt_blks_copied: 0
            zt_nextblk: 0
        ZAP entries: 3
        Leaf blocks: 1
        Total blocks: 2
        zap_block_type: 0x8000000000000001
        zap_magic: 0x2f52ab2ab
        zap_salt: 0x1a4da0699d
        Leafs with 2^n pointers:
            9:      1 *
        Blocks with n*5 entries:
            0:      1 *
        Blocks n/10 full:
            1:      1 *
        Entries with n chunks:
            3:      1 *
            4:      2 **
        Buckets with n entries:
            0:    509 ****************************************
            1:      3 *

        3 = [ 5  6  4  12  13  7  11  0  1  2  3  8  16  19  20 ]
        4 = [ 20 ]
        2 = [ 5  6  4  12  13  7  11  0  1  2  3  8  16  19 ]
Indirect blocks:
               0 L0 1:21d1378000:1000 0:23701b8000:1000 4000L/400P F=1 B=5571/5571
            4000 L0 1:21d1379000:1000 0:23701b9000:1000 4000L/a00P F=1 B=5571/5571

        segment [0000000000000000, 0000000000008000) size   32K

But I don't know what the magic number at the end of the command does.
Does this help?

@dweeezil
Contributor

@sopmot Sorry, I was on a mobile device when I sent the request and had to be brief. The "magical" number is the object number within the dataset tank/lxc/files/shared/it. Object 6 shows the different SA layouts used throughout the dataset. Please run the command on object 5 as well. Note that 5 and 6 are not guaranteed to be the objects holding SA information, but in the current version of ZFS they almost always are.

Last, please run the same command on the inode number of the "documents" directory. You should be able to get its inode number with ls -di /tank/lxc/files/rootfs/data/shared/projects/it/www/CC/cc-setup_2014-08-04/documents but if that gives an error, try ls -i /tank/lxc/files/rootfs/data/shared/projects/it/www/CC/cc-setup_2014-08-04 (the parent) and it should show you the inode number of "documents". Use the inode number of "documents" as the zdb argument.
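
For readers who would rather script the lookup, the two-step procedure above (try the directory itself, then fall back to the parent's listing) can be sketched roughly as follows. This is only an illustration: `find_inode` is a hypothetical helper name, and it assumes, as the comment above does, that the inode number reported by the kernel is the object number zdb expects.

```python
import os

def find_inode(path):
    """Return the inode number of `path`, falling back to the parent's directory entries."""
    try:
        # Rough equivalent of `ls -di path`: stat the entry itself.
        return os.lstat(path).st_ino
    except OSError:
        # Rough equivalent of `ls -i parent`: read the inode from the directory entry,
        # which does not require a successful stat of the damaged object.
        parent, name = os.path.split(path.rstrip(os.sep))
        with os.scandir(parent or ".") as entries:
            for entry in entries:
                if entry.name == name:
                    return entry.inode()
        raise  # nothing found in the parent either; surface the original error
```

The returned number would then be passed as the last argument to `zdb -dddddd <dataset> <object>`.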

Based on the output shown above, I have a very good idea what we're going to see: we'll see the "documents" directory has a spill block and that it otherwise looks OK as far as ZDB is concerned. I've got a debugging version of zdb which prints a bit more information but it doesn't (yet) deal with the spill block.

@tomposmiko
Author

# zdb -dddddd tank/lxc/files/shared/it 5
Dataset tank/lxc/files/shared/it [ZPL], ID 232, cr_txg 5569, 55.6G, 594 objects, rootbp DVA[0]=<1:14fcb4c7000:1000> DVA[1]=<0:143cbc8b000:1000> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=1086311L/1086311P fill=594 cksum=1ad5aac0e9:77e44a5f414:12de3295fc3c7:22c34dc5781868

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
         5    1    16K  1.50K     8K  1.50K  100.00  SA attr registration (K=inherit) (Z=inherit)
    dnode flags: USED_BYTES USERUSED_ACCOUNTED 
    dnode maxblkid: 0
    microzap: 1536 bytes, 21 entries

        ZPL_GEN =  8000004 : [8:0:4]
        ZPL_LINKS =  8000008 : [8:0:8]
        ZPL_ATIME =  10000000 : [16:0:0]
        ZPL_DACL_ACES =  40013 : [0:4:19]
        ZPL_RDEV =  800000a : [8:0:10]
        ZPL_UID =  800000c : [8:0:12]
        ZPL_SYMLINK =  30011 : [0:3:17]
        ZPL_XATTR =  8000009 : [8:0:9]
        ZPL_ZNODE_ACL =  5803000f : [88:3:15]
        ZPL_PARENT =  8000007 : [8:0:7]
        ZPL_PAD =  2000000e : [32:0:14]
        ZPL_GID =  800000d : [8:0:13]
        ZPL_FLAGS =  800000b : [8:0:11]
        ZPL_DACL_COUNT =  8000010 : [8:0:16]
        ZPL_MODE =  8000005 : [8:0:5]
        ZPL_DXATTR =  30014 : [0:3:20]
        ZPL_SCANSTAMP =  20030012 : [32:3:18]
        ZPL_SIZE =  8000006 : [8:0:6]
        ZPL_CRTIME =  10000003 : [16:0:3]
        ZPL_CTIME =  10000002 : [16:0:2]
        ZPL_MTIME =  10000001 : [16:0:1]
Indirect blocks:
               0 L0 1:21d1375000:1000 0:23701b5000:1000 600L/200P F=1 B=5571/5571

        segment [0000000000000000, 0000000000000600) size 1.50K
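
A note on reading the registration table above: each hex value appears to pack the same three numbers that zdb prints in brackets, in the order [length:bswap:attribute number]. The bit positions used below are inferred from the values in this dump (they are an assumption, not taken from the thread), and `decode_sa_registration` is a hypothetical helper name. A length of 0 seems to mark variable-length attributes such as ZPL_DACL_ACES and ZPL_DXATTR.

```python
def decode_sa_registration(value):
    """Split a packed SA registration value into (length, bswap, attr_num)."""
    attr_num = value & 0xFFFF        # bits 0-15: attribute number
    bswap = (value >> 16) & 0xFF     # bits 16-23: byteswap function index
    length = (value >> 24) & 0xFFFF  # bits 24-39: attribute length in bytes
    return length, bswap, attr_num

# Values copied from the object 5 dump above.
for name, raw in [("ZPL_GEN", "8000004"),
                  ("ZPL_ZNODE_ACL", "5803000f"),
                  ("ZPL_DACL_ACES", "40013")]:
    print(name, decode_sa_registration(int(raw, 16)))
```

The decoded tuples match the bracketed triples in the dump, which is the only evidence offered here for the layout.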

@tomposmiko
Author

# ls -li /tank/lxc/files/rootfs/data/shared/projects/it/www/CC/cc-setup_2014-08-04/documents
ls: /tank/lxc/files/rootfs/data/shared/projects/it/www/CC/cc-setup_2014-08-04/documents: Bad address
ls: cannot open directory /tank/lxc/files/rootfs/data/shared/projects/it/www/CC/cc-setup_2014-08-04/documents: Input/output error
# ls -li /tank/lxc/files/rootfs/data/shared/projects/it/www/CC/cc-setup_2014-08-04 -d
329 drwxrwsr-x+ 6 10034 10000 6 Jun 11 12:03 /tank/lxc/files/rootfs/data/shared/projects/it/www/CC/cc-setup_2014-08-04/
# ls /tank/lxc/files/rootfs/data/shared/projects/it/www/CC/cc-setup_2014-08-04/ -l
ls: /tank/lxc/files/rootfs/data/shared/projects/it/www/CC/cc-setup_2014-08-04/ccc: Bad address
ls: /tank/lxc/files/rootfs/data/shared/projects/it/www/CC/cc-setup_2014-08-04/test data for structure and text files: Bad address
ls: /tank/lxc/files/rootfs/data/shared/projects/it/www/CC/cc-setup_2014-08-04/documents: Bad address
total 74
drwxrwsr-x  2 10034 10000 4 Jun 11 10:58 ccc/
drwxrwsr-x  2 10034 10000 5 Aug  4 11:03 documents/
drwxrwsr-x+ 3 10034 10000 9 Jun 11 12:03 setup files/
drwxrwsr-x  2 10034 10000 9 Jun 11 12:03 test data for structure and text files/
# zdb -dddddd tank/lxc/files/shared/it 329
Dataset tank/lxc/files/shared/it [ZPL], ID 232, cr_txg 5569, 55.6G, 594 objects, rootbp DVA[0]=<1:14fcb4c7000:1000> DVA[1]=<0:143cbc8b000:1000> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=1086311L/1086311P fill=594 cksum=1ad5aac0e9:77e44a5f414:12de3295fc3c7:22c34dc5781868

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
       329    1    16K    512     8K    512  100.00  ZFS directory (K=inherit) (Z=inherit)
                                        284   bonus  System attributes
    dnode flags: USED_BYTES USERUSED_ACCOUNTED 
    dnode maxblkid: 0
    path    /www/CC/cc-setup_2014-08-04
    uid     10034
    gid     10000
    atime   Wed Jun 11 12:03:30 2014
    mtime   Wed Jun 11 12:03:30 2014
    ctime   Mon Aug  4 12:47:14 2014
    crtime  Mon Aug  4 12:47:09 2014
    gen 400274
    mode    42775
    size    6
    parent  323
    links   6
    pflags  40800000044
    SA xattrs: 108 bytes, 1 entries

        system.posix_acl_default = \002\000\000\000\001\000\007\000\377\377\377\377\004\000\007\000\377\377\377\377\010\000\007\000c'\000\000\020\000\007\000\377\377\377\377 \000\005\000\377\377\377\377
    microzap: 512 bytes, 4 entries

        ccc = 345 (type: Directory)
        test data for structure and text files = 348 (type: Directory)
        documents = 341 (type: Directory)
        setup files = 330 (type: Directory)
Indirect blocks:
               0 L0 0:f15babb000:1000 1:ed430fb000:1000 200L/200P F=1 B=400280/400280

        segment [0000000000000000, 0000000000000200) size   512
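
The system.posix_acl_default value in the dump above is readable once the octal escapes are undone. The sketch below assumes the standard Linux POSIX ACL xattr layout (a 4-byte little-endian version header followed by 8-byte entries of tag, permissions and id); the helper names are hypothetical and the code is an illustration, not part of zdb.

```python
import re
import struct

# Tag values from the Linux POSIX ACL xattr format.
TAGS = {0x01: "user", 0x02: "user", 0x04: "group",
        0x08: "group", 0x10: "mask", 0x20: "other"}
NAMED = {0x02, 0x08}  # only these tags carry a meaningful uid/gid

def unescape_octal(text):
    """Turn zdb's \\ooo octal escapes back into raw bytes."""
    out = re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), text)
    return out.encode("latin-1")

def decode_posix_acl(raw):
    """Decode a POSIX ACL xattr blob into getfacl-style entries."""
    (version,) = struct.unpack_from("<I", raw, 0)
    if version != 2:
        raise ValueError(f"unexpected ACL xattr version {version}")
    entries = []
    for off in range(4, len(raw), 8):
        tag, perm, ident = struct.unpack_from("<HHI", raw, off)
        who = str(ident) if tag in NAMED else ""
        bits = "".join(c if perm & m else "-"
                       for c, m in (("r", 4), ("w", 2), ("x", 1)))
        entries.append(f"{TAGS[tag]}:{who}:{bits}")
    return entries

# Value copied from the object 329 dump above.
dumped = r"\002\000\000\000\001\000\007\000\377\377\377\377\004\000\007\000\377\377\377\377\010\000\007\000c'\000\000\020\000\007\000\377\377\377\377 \000\005\000\377\377\377\377"
print(decode_posix_acl(unescape_octal(dumped)))
# ['user::rwx', 'group::rwx', 'group:10083:rwx', 'mask::rwx', 'other::r-x']
```

Because this is the default ACL, getfacl would show each entry with a `default:` prefix. The single named entry grants rwx to gid 10083.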

For the documents directory:

# zdb -dddddd tank/lxc/files/shared/it 341
Dataset tank/lxc/files/shared/it [ZPL], ID 232, cr_txg 5569, 55.6G, 594 objects, rootbp DVA[0]=<1:14fcb4c7000:1000> DVA[1]=<0:143cbc8b000:1000> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=1086311L/1086311P fill=594 cksum=1ad5aac0e9:77e44a5f414:12de3295fc3c7:22c34dc5781868

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
       341    2    16K    16K    32K    32K  100.00  ZFS directory (K=inherit) (Z=inherit)
                                        196   bonus  System attributes
    dnode flags: USED_BYTES USERUSED_ACCOUNTED SPILL_BLKPTR
    dnode maxblkid: 1
    path    /www/CC/cc-setup_2014-08-04/documents
    uid     10034
    gid     10000
    atime   Wed Jun 11 12:51:30 2014
    mtime   Mon Aug  4 11:03:21 2014
    ctime   Mon Aug  4 12:47:14 2014
    crtime  Mon Aug  4 12:47:14 2014
    gen 400280
    mode    42775
    size    5
    parent  329
    links   2
    pflags  40800000044
    Fat ZAP stats:
        Pointer table:
            1024 elements
            zt_blk: 0
            zt_numblks: 0
            zt_shift: 10
            zt_blks_copied: 0
            zt_nextblk: 0
        ZAP entries: 3
        Leaf blocks: 1
        Total blocks: 2
        zap_block_type: 0x8000000000000001
        zap_magic: 0x2f52ab2ab
        zap_salt: 0x2a195f27b
        Leafs with 2^n pointers:
            9:      1 *
        Blocks with n*5 entries:
            0:      1 *
        Blocks n/10 full:
            1:      1 *
        Entries with n chunks:
            4:      1 *
            5:      2 **
        Buckets with n entries:
            0:    509 ****************************************
            1:      3 *

        Setting of Java for Compliance Checker MarvinApplet.docx = 342 (type: Regular File)
        Compliance Checker Web service and Client tool.docx = 344 (type: Regular File)
        Compliance_Checker_setup_guide.docx = 343 (type: Regular File)
Indirect blocks:
               0 L1  0:f15bac1000:1000 1:ed430fe000:1000 4000L/400P F=2 B=400280/400280
               0  L0 0:f15babc000:1000 1:ed430fc000:1000 4000L/400P F=1 B=400280/400280
            4000  L0 0:f15babd000:1000 1:ed430fd000:1000 4000L/a00P F=1 B=400280/400280

        segment [0000000000000000, 0000000000008000) size   32K

For the ccc directory:

# zdb -dddddd tank/lxc/files/shared/it 345
Dataset tank/lxc/files/shared/it [ZPL], ID 232, cr_txg 5569, 55.6G, 594 objects, rootbp DVA[0]=<1:14fcb4c7000:1000> DVA[1]=<0:143cbc8b000:1000> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=1086311L/1086311P fill=594 cksum=1ad5aac0e9:77e44a5f414:12de3295fc3c7:22c34dc5781868

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
       345    1    16K    512    16K    512  100.00  ZFS directory (K=inherit) (Z=inherit)
                                        196   bonus  System attributes
    dnode flags: USED_BYTES USERUSED_ACCOUNTED SPILL_BLKPTR
    dnode maxblkid: 0
    path    /www/CC/cc-setup_2014-08-04/ccc
    uid     10034
    gid     10000
    atime   Wed Jun 11 10:58:48 2014
    mtime   Wed Jun 11 10:58:48 2014
    ctime   Mon Aug  4 12:47:14 2014
    crtime  Mon Aug  4 12:47:14 2014
    gen 400280
    mode    42775
    size    4
    parent  329
    links   2
    pflags  40800000044
    microzap: 512 bytes, 2 entries

        takata_20140603.ccc = 347 (type: Regular File)
        takata_20140604.ccc = 346 (type: Regular File)
Indirect blocks:
               0 L0 0:f15bac2000:1000 1:ed430ff000:1000 200L/200P F=1 B=400280/400280

        segment [0000000000000000, 0000000000000200) size   512

@dweeezil
Contributor

@sopmot The first ls command should have been "ls -di" rather than "ls -li", but I see your second one got its inode number. The output confirms a spill block. Could you please build dweeezil/zfs@277a0f7 (my "zdb" branch in https://github.com/dweeezil/zfs/tree/zdb) and run the same zdb command directly from the build directory as "cmd/zdb/zdb -dddddd tank/..."? This will show us the spill block, which I suspect is corrupted. This version of zdb will print a LOT more than we need; all I'm interested in is the section labeled "Spill blkptr".

@tomposmiko
Author

@dweeezil
Sorry for the noise. I'm trying to gather as much information as possible, and most of it may be useless to you.

Just a note: I back up to another machine with send/recv, and the errors were transferred along with the data without any complaint.

@tomposmiko
Author

@dweeezil
With your stuff:

# ./cmd/zdb/zdb -dddddd tank/lxc/files/shared/it 341
Dataset tank/lxc/files/shared/it [ZPL], ID 232, cr_txg 5569, 55.6G, 594 objects, rootbp DVA[0]=<1:14fcb4c7000:1000> DVA[1]=<0:143cbc8b000:1000> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=1086311L/1086311P fill=594 cksum=1ad5aac0e9:77e44a5f414:12de3295fc3c7:22c34dc5781868

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
       341    2    16K    16K    32K    32K  100.00  ZFS directory (K=inherit) (Z=inherit)
                                        196   bonus  System attributes
    dnode flags: USED_BYTES USERUSED_ACCOUNTED SPILL_BLKPTR
    dnode maxblkid: 1
    path    /www/CC/cc-setup_2014-08-04/documents
    uid     10034
    gid     10000
    atime   Wed Jun 11 12:51:30 2014
    mtime   Mon Aug  4 11:03:21 2014
    ctime   Mon Aug  4 12:47:14 2014
    crtime  Mon Aug  4 12:47:14 2014
    gen 400280
    mode    42775
    size    5
    parent  329
    links   2
    pflags  40800000044
    Fat ZAP stats:
        Pointer table:
            1024 elements
            zt_blk: 0
            zt_numblks: 0
            zt_shift: 10
            zt_blks_copied: 0
            zt_nextblk: 0
        ZAP entries: 3
        Leaf blocks: 1
        Total blocks: 2
        zap_block_type: 0x8000000000000001
        zap_magic: 0x2f52ab2ab
        zap_salt: 0x2a195f27b
        Leafs with 2^n pointers:
            9:      1 *
        Blocks with n*5 entries:
            0:      1 *
        Blocks n/10 full:
            1:      1 *
        Entries with n chunks:
            4:      1 *
            5:      2 **
        Buckets with n entries:
            0:    509 ****************************************
            1:      3 *

        Setting of Java for Compliance Checker MarvinApplet.docx = 342 (type: Regular File)
        Compliance Checker Web service and Client tool.docx = 344 (type: Regular File)
        Compliance_Checker_setup_guide.docx = 343 (type: Regular File)
Indirect blocks:
               0 L1  0:f15bac1000:1000 1:ed430fe000:1000 4000L/400P F=2 B=400280/400280
               0  L0 0:f15babc000:1000 1:ed430fc000:1000 4000L/400P F=1 B=400280/400280
            4000  L0 0:f15babd000:1000 1:ed430fd000:1000 4000L/a00P F=1 B=400280/400280

        segment [0000000000000000, 0000000000008000) size   32K

It looks the same to me, doesn't it?

@dweeezil
Contributor

@sopmot Doesn't look like you've got my latest zdb branch. Does "git show" show 277a0f7 as the last commit? Probably the easiest way to get the right code would be to do a fresh clone and checkout:

% git clone https://github.com/dweeezil/zfs.git zfs-dweeezil
% cd zfs-dweeezil
% git checkout zdb
% ./autogen.sh
% ./configure
% make
% sudo cmd/zdb/zdb ...

The output of this zdb will be very noisy. We're only interested in the "Spill blkptr:" section which should look something like this:

    Spill blkptr:
        0:20c00:200 0:3c020600:200 200L/200P F=1 B=6151/6151

And, in fact, if you get that far (if the DVAs don't look totally bogus), the next thing I'd like you to do is:

% sudo cmd/zdb/zdb -R <pool>/<fs> <vdev>:<offset>:<size>

where the last argument should be one of the blkptrs listed from the previous zdb command (for example "0:20c00:200" in the example above).
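
To make the hand-off between the two zdb commands concrete, here is a small sketch that splits a compact blkptr line into its DVAs and rebuilds the `<vdev>:<offset>:<size>` argument. It assumes the vdev and the F=/B= fields are decimal while offsets and sizes are hexadecimal; that reading is an assumption about zdb's output format, and both function names are hypothetical.

```python
import re

DVA_RE = re.compile(r"(\d+):([0-9a-f]+):([0-9a-f]+)")

def parse_compact_blkptr(line):
    """Parse e.g. '0:20c00:200 0:3c020600:200 200L/200P F=1 B=6151/6151'."""
    dvas = []
    for field in line.split():
        m = DVA_RE.fullmatch(field)
        if m:
            dvas.append({"vdev": int(m.group(1)),
                         "offset": int(m.group(2), 16),
                         "size": int(m.group(3), 16)})
    sizes = re.search(r"([0-9a-f]+)L/([0-9a-f]+)P", line)  # logical/physical size
    fill = re.search(r"F=(\d+)", line)                      # fill count
    birth = re.search(r"B=(\d+)/(\d+)", line)               # birth txgs
    return {
        "dvas": dvas,
        "lsize": int(sizes.group(1), 16),
        "psize": int(sizes.group(2), 16),
        "fill": int(fill.group(1)),
        "birth": (int(birth.group(1)), int(birth.group(2))),
    }

def dva_argument(dva):
    """Format one parsed DVA back into the vdev:offset:size form used with zdb -R."""
    return f"{dva['vdev']}:{dva['offset']:x}:{dva['size']:x}"

bp = parse_compact_blkptr("0:20c00:200 0:3c020600:200 200L/200P F=1 B=6151/6151")
print(dva_argument(bp["dvas"][0]))  # 0:20c00:200
```

Each DVA in the line is an independent copy of the same block, so either one can be handed to `zdb -R`.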

@tomposmiko
Author

Argh, I forgot the checkout.

make[3]: Entering directory `/usr/src/linux-headers-3.13.0-35-generic'
  CC [M]  /data/src/zfs/module/avl/../../module/avl/avl.o
  LD [M]  /data/src/zfs/module/avl/zavl.o
  CC [M]  /data/src/zfs/module/nvpair/../../module/nvpair/nvpair.o
  CC [M]  /data/src/zfs/module/nvpair/../../module/nvpair/fnvpair.o
  CC [M]  /data/src/zfs/module/nvpair/../../module/nvpair/nvpair_alloc_spl.o
  CC [M]  /data/src/zfs/module/nvpair/../../module/nvpair/nvpair_alloc_fixed.o
  LD [M]  /data/src/zfs/module/nvpair/znvpair.o
  CC [M]  /data/src/zfs/module/unicode/../../module/unicode/u8_textprep.o
  CC [M]  /data/src/zfs/module/unicode/../../module/unicode/uconv.o
  LD [M]  /data/src/zfs/module/unicode/zunicode.o
  CC [M]  /data/src/zfs/module/zcommon/../../module/zcommon/zfs_deleg.o
  CC [M]  /data/src/zfs/module/zcommon/../../module/zcommon/zfs_prop.o
  CC [M]  /data/src/zfs/module/zcommon/../../module/zcommon/zprop_common.o
  CC [M]  /data/src/zfs/module/zcommon/../../module/zcommon/zfs_namecheck.o
  CC [M]  /data/src/zfs/module/zcommon/../../module/zcommon/zfs_comutil.o
  CC [M]  /data/src/zfs/module/zcommon/../../module/zcommon/zfs_fletcher.o
  CC [M]  /data/src/zfs/module/zcommon/../../module/zcommon/zfs_uio.o
  CC [M]  /data/src/zfs/module/zcommon/../../module/zcommon/zpool_prop.o
  LD [M]  /data/src/zfs/module/zcommon/zcommon.o
  CC [M]  /data/src/zfs/module/zfs/../../module/zfs/arc.o
  CC [M]  /data/src/zfs/module/zfs/../../module/zfs/blkptr.o
  CC [M]  /data/src/zfs/module/zfs/../../module/zfs/bplist.o
  CC [M]  /data/src/zfs/module/zfs/../../module/zfs/bpobj.o
  CC [M]  /data/src/zfs/module/zfs/../../module/zfs/dbuf.o
  CC [M]  /data/src/zfs/module/zfs/../../module/zfs/dbuf_stats.o
  CC [M]  /data/src/zfs/module/zfs/../../module/zfs/bptree.o
  CC [M]  /data/src/zfs/module/zfs/../../module/zfs/ddt.o
/data/src/zfs/module/zfs/../../module/zfs/ddt.c: In function ‘ddt_stat_update’:
/data/src/zfs/module/zfs/../../module/zfs/ddt.c:426:2: error: implicit declaration of function ‘highbit64’ [-Werror=implicit-function-declaration]
  bucket = highbit64(dds.dds_ref_blocks) - 1;
  ^
cc1: some warnings being treated as errors
make[5]: *** [/data/src/zfs/module/zfs/../../module/zfs/ddt.o] Error 1
make[4]: *** [/data/src/zfs/module/zfs] Error 2
make[3]: *** [_module_/data/src/zfs/module] Error 2
make[3]: Leaving directory `/usr/src/linux-headers-3.13.0-35-generic'
make[2]: *** [modules] Error 2
make[2]: Leaving directory `/data/src/zfs/module'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/data/src/zfs'
make: *** [all] Error 2

This is after running ./configure --with-spl=/data/src/spl-linux-0.6.3

Am I doing something wrong?

@dweeezil
Contributor

@sopmot Ugh, you need the latest spl as well. All we really need is the very last small commit to zdb I added in the zdb branch. I'd recommend cleaning up your checkout and then just cherry picking the patch:

% git clean -fxd
% git checkout zfs-0.6.3
% git cherry-pick f8a5bd0eef205c1166a89abfb9f2b3a1c0ef4ccb

and then it ought to build just fine with your existing SPL.

@tomposmiko
Author

On 09/15/2014 06:44 PM, Tim Chase wrote:

> @sopmot Ugh, you need the latest spl as well. All we really need is the
> very last small commit to zdb I added in the zdb branch. I'd recommend
> cleaning up your checkout and then just cherry picking the patch:
>
> % git clean -fxd
> % git checkout zfs-0.6.3
> % git cherry-pick f8a5bd0eef205c1166a89abfb9f2b3a1c0ef4ccb
>
> and then it ought to build just fine with your existing SPL.

@dweeezil, there is no zfs-0.6.3 branch.

I applied that patch manually to the (Ubuntu) 0.6.3 branch, but the build failed with this:

make[3]: Leaving directory `/data/src/zfs-linux-0.6.3/cmd/zpool'
Making all in zdb
make[3]: Entering directory `/data/src/zfs-linux-0.6.3/cmd/zdb'
  CC     zdb.o
../../cmd/zdb/zdb.c: In function ‘dump_object’:
../../cmd/zdb/zdb.c:1711:16: warning: implicit declaration of function ‘snprintf_blkptr_compact’ [-Wimplicit-function-declaration]
    snprintf_blkptr_compact(blkbuf, sizeof (blkbuf), &dn->dn_phys->dn_spill);
    ^
  CC     zdb_il.o
  CCLD   zdb
zdb.o: In function `dump_object':
/data/src/zfs-linux-0.6.3/cmd/zdb/../../cmd/zdb/zdb.c:1711: undefined reference to `snprintf_blkptr_compact'
collect2: error: ld returned 1 exit status
make[3]: *** [zdb] Error 1
make[3]: Leaving directory `/data/src/zfs-linux-0.6.3/cmd/zdb'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/data/src/zfs-linux-0.6.3/cmd'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/data/src/zfs-linux-0.6.3'
make: *** [all] Error 2

I'll be travelling now, and I am not sure I can reply to you soon.
My guess is that I just missed something here?

This is the patch, right?

commit 277a0f7
Author: Tim Chase [email protected]
Date: Mon Sep 15 07:38:22 2014 -0500

    Dump the spill blkptr in verbose >= 6.

diff --git a/cmd/zdb/zdb.c b/cmd/zdb/zdb.c
index 2600012..770aed2 100644
--- a/cmd/zdb/zdb.c
+++ b/cmd/zdb/zdb.c
@@ -1884,6 +1884,12 @@ dump_object(objset_t *os, uint64_t object, int verbosity, int *print_header)
 		*print_header = 1;
 	}
 
+	if (verbosity >= 6 && (dn->dn_phys->dn_flags & DNODE_FLAG_SPILL_BLKPTR)) {
+		char blkbuf[BP_SPRINTF_LEN];
+		snprintf_blkptr_compact(blkbuf, sizeof (blkbuf), &dn->dn_phys->dn_spill);
+		(void) printf("\n\tSpill blkptr:\n\t\t%s\n", blkbuf);
+	}
+
 	if (verbosity >= 5)
 		dump_indirect(dn);

Thanks,
Tamas

@dweeezil
Contributor

@sopmot Double ugh. Another post-0.6.3-ism. Change "snprintf_blkptr_compact" to "sprintf_blkptr_compact" (drop the "n") and remove the second argument "sizeof (blkbuf)".

@tomposmiko
Author

    Spill blkptr:
        0:f15baba000:1000 1:ed430fa000:1000 200L/200P F=1 B=400280/400280
# ./cmd/zdb/zdb -dddddd tank/lxc/files/shared/it 0:f15baba000:1000
Dataset tank/lxc/files/shared/it [ZPL], ID 232, cr_txg 5569, 55.6G, 594 objects, rootbp DVA[0]=<1:14fcb4c7000:1000> DVA[1]=<0:143cbc8b000:1000> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=1086311L/1086311P fill=594 cksum=1ad5aac0e9:77e44a5f414:12de3295fc3c7:22c34dc5781868

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
         0    7    16K    16K   264K   304K   97.70  DMU dnode (K=inherit) (Z=inherit)
    dnode flags: USED_BYTES 
    dnode maxblkid: 18
Indirect blocks:
               0 L6       1:14fcb4a3000:1000 0:143cbc69000:1000 4000L/400P F=594 B=1086311/1086311
               0  L5      1:14fcb49c000:1000 0:143cbc62000:1000 4000L/400P F=594 B=1086311/1086311
               0   L4     1:14fcb49a000:1000 0:143cbc60000:1000 4000L/400P F=594 B=1086311/1086311
               0    L3    1:14fcb497000:1000 0:143cbc5d000:1000 4000L/400P F=594 B=1086311/1086311
               0     L2   1:14fcb48f000:1000 0:143cbc58000:1000 4000L/400P F=594 B=1086311/1086311
               0      L1  1:14fcb48d000:1000 0:143cbc56000:1000 4000L/800P F=594 B=1086311/1086311
               0       L0 1:ed52c22000:1000 0:f16b623000:1000 4000L/1000P F=31 B=400314/400314
            4000       L0 1:21d3266000:2000 0:2372133000:2000 4000L/1200P F=32 B=5571/5571
            8000       L0 1:21d3265000:1000 0:2372132000:1000 4000L/e00P F=32 B=5571/5571
            c000       L0 1:21d3269000:1000 0:2372136000:1000 4000L/c00P F=32 B=5571/5571
           10000       L0 1:21d3285000:1000 0:2372137000:1000 4000L/e00P F=32 B=5571/5571
           14000       L0 1:6da0473000:2000 0:23741f9000:2000 4000L/1200P F=32 B=5572/5572
           18000       L0 1:6dae161000:2000 0:2390230000:2000 4000L/1600P F=32 B=5586/5586
           1c000       L0 0:2395a20000:2000 1:6db172c000:2000 4000L/1400P F=32 B=5589/5589
           20000       L0 1:21e412c000:2000 0:2396a1e000:2000 4000L/1200P F=32 B=5590/5590
           24000       L0 1:24f12e8000:2000 0:28edfae000:2000 4000L/1600P F=32 B=6234/6234
           28000       L0 1:14fcb485000:2000 0:143cbc4f000:2000 4000L/1600P F=32 B=1086311/1086311
           2c000       L0 1:14fcb483000:2000 0:143cbc4d000:2000 4000L/1200P F=28 B=1086311/1086311
           30000       L0 0:f314daa000:1000 1:ef0c95e000:1000 4000L/1000P F=32 B=418249/418249
           34000       L0 0:f314dbb000:1000 1:ef0c961000:1000 4000L/1000P F=32 B=418249/418249
           38000       L0 1:ef0d434000:1000 0:f315754000:1000 4000L/1000P F=32 B=418250/418250
           3c000       L0 1:ef0d436000:1000 0:f315756000:1000 4000L/1000P F=32 B=418250/418250
           40000       L0 0:f3161d4000:1000 1:ef0defb000:1000 4000L/1000P F=32 B=418251/418251
           44000       L0 0:f3161d3000:1000 1:ef0defa000:1000 4000L/1000P F=32 B=418251/418251
           48000       L0 1:14fcb481000:1000 0:143cbc4b000:1000 4000L/e00P F=23 B=1086311/1086311

        segment [0000000000000200, 000000000002cc00) size  179K
        segment [000000000002d400, 000000000004ae00) size  119K

I hope it helps :)

@dweeezil
Contributor

@sopmot Getting close, you need to use "zdb -R" with the DVA arguments as follows:

zdb -R tank/lxc/files/shared/it 0:f15baba000:1000 1:ed430fa000:1000

Those DVAs look reasonable to me. The txgs are pretty high, however, so I'm guessing the pool has been around for a while. Was this entire pool written with 0.6.3, or was some of it written with 0.6.2 or some intermediate snapshot between 0.6.2 and 0.6.3? The reason I ask is that there were some issues fixed prior to 0.6.3 which were known to have caused corrupted dnodes (but so far we're seeing no evidence of that).

@tomposmiko
Author

@dweeezil
Contributor

@sopmot OK, that's great. I was able to reconstruct the block from that dump. At first glance, it looks OK. I'd really like to see a little more output from my "zdb" branch. Could you please try to cherry-pick dweeezil/zfs@e396021 into your zdb, run it again with 6 -d's (same arguments that produced the "Spill blkptr" output), and post the full output to a gist? I'll manually decode the block regardless, but won't have time to do that until this evening.

@dweeezil
Contributor

@sopmot Your spill block decodes just fine. There appears to be no file system corruption. I was basing this line of testing on the use of SA xattrs and the presence of a spill block in the dnode and your error condition:

$ ls -l documents/
ls: documents/: Bad address
ls: cannot open directory documents/: Input/output error

when it seemed pretty clear there were no actual I/O errors occurring. I had assumed the issue was with the "documents" directory itself and, to be sure, the system is not seeing the Posix ACL on it (no "+" in the "ls" output) even though, according to your debugging, it does have one (stored as an SA).

In reviewing the commands you showed above, none of them give me a very good idea of which system call is producing the I/O error. Could you please strace a command which produces the I/O error and post the output surrounding the system call(s) which are returning EIO?

As I mentioned, I've got no explanation as to why the system isn't seeing the Posix ACL on your "documents" directory. The output of a 6-d zdb with dweeezil/zfs@e396021 still might shed some light on that.

@tomposmiko
Author

OMG, I don't know how I could have overlooked it.

I compiled spl from master and linked your branch against it, so everything should be the latest now.

This is the output:
https://gist.github.com/sopmot/a87aa84d621c88075681

To be sure, here is the other one:

# ./cmd/zdb/zdb -dddddd tank/lxc/files/shared/it 341
Dataset tank/lxc/files/shared/it [ZPL], ID 232, cr_txg 5569, 55.6G, 595 objects, rootbp DVA[0]=<1:1634d035000:1000> DVA[1]=<0:15b40415000:1000> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=1180913L/1180913P fill=595 cksum=1970f6ebdf:7af3a789f48:147cbdd6634e1:274b14fba1bce6

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
       341    2    16K    16K    32K    32K  100.00  ZFS directory (K=inherit) (Z=inherit)
                                        196   bonus  System attributes
    dnode flags: USED_BYTES USERUSED_ACCOUNTED SPILL_BLKPTR
    dnode maxblkid: 1

    Spill blkptr:
        0:f15baba000:1000 1:ed430fa000:1000 200L/200P F=1 B=400280/400280
    SA hdrsize 16
    SA layout 3
    path    /www/CC/cc-setup_2014-08-04/documents
    uid     10034
    gid     10000
    atime   Wed Jun 11 12:51:30 2014
    mtime   Mon Aug  4 11:03:21 2014
    ctime   Mon Aug  4 12:47:14 2014
    crtime  Mon Aug  4 12:47:14 2014
    gen 400280
    mode    42775
    size    5
    parent  329
    links   2
    pflags  40800000044
    ndacl   3
    dump_znode_sa_xattr: sa_xattr_size=20 sa_size error=0
    SA packed dump sa_xattr_size=20: \001\001\000\000\000\000\000\000\000\000\000\001\000\000\000\000\010\000\000\000
    SA xattr dump:
    dump_znode_sa_xattr: nvlist_unpack error=14
    Fat ZAP stats:
        Pointer table:
            1024 elements
            zt_blk: 0
            zt_numblks: 0
            zt_shift: 10
            zt_blks_copied: 0
            zt_nextblk: 0
        ZAP entries: 3
        Leaf blocks: 1
        Total blocks: 2
        zap_block_type: 0x8000000000000001
        zap_magic: 0x2f52ab2ab
        zap_salt: 0x2a195f27b
        Leafs with 2^n pointers:
              9:      1 *
        Blocks with n*5 entries:
              0:      1 *
        Blocks n/10 full:
              1:      1 *
        Entries with n chunks:
              4:      1 *
              5:      2 **
        Buckets with n entries:
              0:    509 ****************************************
              1:      3 *

        Setting of Java for Compliance Checker MarvinApplet.docx = 342 (type: Regular File)
        Compliance Checker Web service and Client tool.docx = 344 (type: Regular File)
        Compliance_Checker_setup_guide.docx = 343 (type: Regular File)
Indirect blocks:
               0 L1  0:f15bac1000:1000 1:ed430fe000:1000 4000L/400P F=2 B=400280/400280
               0  L0 0:f15babc000:1000 1:ed430fc000:1000 4000L/400P F=1 B=400280/400280
            4000  L0 0:f15babd000:1000 1:ed430fd000:1000 4000L/a00P F=1 B=400280/400280

        segment [0000000000000000, 0000000000008000) size   32K

@tomposmiko
Author

@dweeezil

[we can speed it up via a skype/hangouts chat, if you prefer]

strace snippet:

lstat("documents/", {st_mode=S_IFDIR|S_ISGID|0775, st_size=5, ...}) = 0
lgetxattr("documents/", "security.selinux", 0xc6bd40, 255) = -1 EFAULT (Bad address)
write(2, "ls: ", 4)                     = 4
write(2, "documents/", 10)              = 10
write(2, ": Bad address", 13)           = 13
write(2, "\n", 1)                       = 1

A healthy one:

9 drwxrwsr-x+ 6 10034 10000 7 Sep 17 18:52 ./
lstat(".", {st_mode=S_IFDIR|S_ISGID|0775, st_size=7, ...}) = 0
lgetxattr(".", "security.selinux", 0x14eed40, 255) = -1 ENODATA (No data available)
getxattr(".", "system.posix_acl_access", 0x0, 0) = -1 ENODATA (No data available)
getxattr(".", "system.posix_acl_default", 0x0, 0) = 44
socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3

This is an Ubuntu machine, with no SELinux.

And it seems you're right. A directory is accessible if it has ACLs, but if a subdirectory inside it does not, then that subdirectory produces the same error.

@dweeezil
Contributor

@sopmot Thanks, the output from above clearly shows the problem:

dump_znode_sa_xattr: nvlist_unpack error=14

That's EFAULT. This should be enough information to help get started on this problem. I'll follow up after I have time to analyze it further.
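
For anyone who wants to check the numeric error against its symbolic name, here is a minimal Python sketch. It assumes a Linux host, since errno numbering is platform-specific; the helper name `describe_errno` is made up for illustration and is not part of zdb.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME (message)' for a numeric errno on this platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


# 14 is the value printed by "nvlist_unpack error=14" in the zdb output.
print(describe_errno(14))
```

The same value shows up in the strace snippet, where lgetxattr() on the broken directory fails with EFAULT.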

@dweeezil
Contributor

@sopmot It's being decoded improperly because it thinks the size is 20 rather than 196 (the size of the SA in the spill block according to your separate zdb -R output, which, somewhat interestingly, happens to be exactly the same size as the bonus buffer). I'm going to try to look into this further today but may not get a chance until later this evening. Hopefully this is enough information, but I might have one more debugging zdb for you to try.
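
The size of 20 can be confirmed directly from the "SA packed dump" line of the zdb output. A small Python sketch, assuming the dump is a plain sequence of three-digit octal escapes (the helper name `decode_octal_escapes` is hypothetical):

```python
import re

# Copied verbatim from the "SA packed dump" line of the zdb output above.
PACKED_DUMP = r"\001\001\000\000\000\000\000\000\000\000\000\001\000\000\000\000\010\000\000\000"


def decode_octal_escapes(dump: str) -> bytes:
    """Turn a string of three-digit octal escapes into the bytes it represents."""
    return bytes(int(octal, 8) for octal in re.findall(r"\\([0-7]{3})", dump))


packed = decode_octal_escapes(PACKED_DUMP)
print(len(packed))  # 20, matching sa_xattr_size=20 rather than the expected 196
```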

@tomposmiko
Author

Just let me know whatever you need.
Thanks for the efforts!

@dweeezil
Contributor

@sopmot I've gotten a chance to look into this further and, unfortunately, it looks like the filesystem is corrupted. The dnode in question has a bad SA layout (it is 3 but should be 4). This will cause all sorts of havoc when trying to read it. Is this issue repeatable? If you create a new filesystem and rsync again, is the new filesystem corrupted in the same way? If so, could you please strace the rsync and grep out all the xattr-related syscalls? I've tried reproducing the problem in several obvious ways but have not yet been able to. I've also skimmed over the code paths involved and am not seeing an immediate problem.

One other idea: could you please show me the Posix ACLs on the source file system (maybe a getfacl -R <dir>)? With that information, especially if the problem is repeatable, I may be able to construct a source file system which exhibits the problem. In the meantime, I'm going to continue going over the code involved with the suspect operations.

@dweeezil
Contributor

@sopmot After further investigation, I don't think I need to see any strace output from the rsync since it's pretty clear what rsync does. I'll continue to look into this and will send another follow-up when I've got more information or need you to try some additional debugging.

@tomposmiko
Author

@dweeezil
I didn't rsync onto this filesystem; you might be confusing this issue with the other one (#2717 ?)
Anyway, I cannot reproduce it.
Basically this is what I did, as far as I recall:
I downloaded a zip file via FTP (it was created on a Windows machine), did some filesystem operations on that directory, and unzipped the file. Now those directories are corrupted. In the meantime, ACLs were set on the parent directory like this:

setfacl -R -m d:g:it:rwX
setfacl -R -m g:it:rwX

Unfortunately, this all works now.

@tomposmiko
Author

@dweeezil
I don't think it got corrupted at creation time.
I found another corrupted directory, and I am almost completely sure that it was placed there with a plain mv command and was fine right after that.
However, I see no other corrupted directories on that fs.
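
Hunting for further affected directories by hand is tedious, so here is a rough Python sketch of a scan. It assumes, based only on the strace output in this thread, that an affected directory fails an xattr lookup with EFAULT (or EIO) while a healthy one returns ENODATA or a value. The function names are hypothetical, and `os.getxattr` is available on Linux only.

```python
import errno
import os

# Errors seen on the broken "documents" directory in this thread.
SUSPECT_ERRNOS = {errno.EFAULT, errno.EIO}


def probe_xattrs(path):
    """Issue the same xattr lookup that ls made in the strace (Linux-only)."""
    os.getxattr(path, "security.selinux", follow_symlinks=False)


def find_suspect_dirs(root, probe=probe_xattrs):
    """Yield (path, errno name) for subdirectories whose probe hits EFAULT/EIO."""
    for dirpath, dirnames, _ in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            try:
                probe(path)
            except OSError as exc:
                # ENODATA and friends are normal; only report the suspect codes.
                if exc.errno in SUSPECT_ERRNOS:
                    yield path, errno.errorcode[exc.errno]
```

Calling `list(find_suspect_dirs("/some/mountpoint"))` would list candidates under that path (the root itself is not probed). A hit is only a hint; the zdb output is still needed to confirm the SA layout problem.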

@tomposmiko
Author

@dweeezil

One other idea, could you please show me the Posix ACLs on the source file system (maybe a getfacl -R <dir>?

What do you mean by that? Do you need the parent directory or really the whole filesystem? BTW, ACLs aren't set on the whole fs.

Anyway, I am afraid it would not be useful, because I ran setfacl on the parent directory to check whether it changes anything... it was a mistake.

@dweeezil
Contributor

@sopmot Yes, sorry, I was mixing up all these various issues and simply assumed the corrupted filesystem was created by rsync and that's why I was asking about the ACLs on the "source" file system.

Your description of the events that led to the corruption is very helpful in any case.

Do you know what set the setgid bit (g+s) on the corrupted "documents" directory shown above? Was it inherited because it was set on the parent directory? Did you set it manually (chmod g+s)? The reason I ask is that all of these related problem reports involve a directory with the setgid bit set. The other interesting clue is that the "pflags" value ends with "044" rather than "144". This suggests an unusual order of permission-setting operations or is yet another manifestation of the problem.

I understand you can't recreate the problem now, but if you've still got the original zip file, could you please unzip it again, apply your recursive setfacl commands and then run a zdb -dddd <pool>/<file_system> <inode_number> on the resulting "documents" directory so we can see the values when it's working properly. Thanks.
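
The setgid bit mentioned above can be read straight out of the `mode    42775` value in the earlier zdb output. A short Python sketch, assuming that value is octal as zdb prints it:

```python
import stat

MODE = 0o42775  # "mode 42775" from the zdb output for object 341

print(stat.filemode(MODE))        # drwxrwsr-x
print(bool(MODE & stat.S_ISGID))  # True: the setgid bit is set
print(oct(stat.S_IMODE(MODE)))    # 0o2775
```

This agrees with the `S_IFDIR|S_ISGID|0775` that lstat() reported in the strace output.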

@akorn
Contributor

akorn commented Sep 22, 2014

Sorry to butt in; I just want to say my experience with this issue is similar to @sopmot's. The corruption doesn't manifest right away. If you set up the directory "properly" (in my experience: setgid, with a default ACL), it will get corrupted after a while, but never immediately.

@dweeezil
Contributor

@akorn No problem, thanks for the updated information.

Are either of you (or anyone experiencing this class of problem) running with spl_kmem_cache_slab_limit set to a value > 0 or spl_kmem_cache_reclaim set to zero?
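
A quick way to answer that question on a given machine is to read the two module parameters from sysfs. A minimal Python sketch, assuming the parameters live under /sys/module/spl/parameters as in the listings in this thread; the function names are hypothetical:

```python
from pathlib import Path

SPL_PARAMS = Path("/sys/module/spl/parameters")


def read_param(name, base=SPL_PARAMS):
    """Return the integer value of one SPL module parameter."""
    return int((Path(base) / name).read_text().strip())


def matches_suspect_config(base=SPL_PARAMS):
    """True if slab_limit > 0 or reclaim == 0, the combination asked about."""
    slab_limit = read_param("spl_kmem_cache_slab_limit", base)
    reclaim = read_param("spl_kmem_cache_reclaim", base)
    return slab_limit > 0 or reclaim == 0
```

`matches_suspect_config()` with no arguments reads the live values; passing another directory as `base` makes it easy to try against saved copies of the parameter files.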

@akorn
Contributor

akorn commented Sep 22, 2014

@dweeezil, yes, those seem to be the defaults.

I only set spl_kmem_cache_expire=0 explicitly, and yet:

/sys/module/spl/parameters/spl_debug_mask:188
/sys/module/spl/parameters/spl_debug_mb:-1
/sys/module/spl/parameters/spl_debug_panic_on_bug:0
/sys/module/spl/parameters/spl_debug_printk:60
/sys/module/spl/parameters/spl_debug_subsys:18446744073709551615
/sys/module/spl/parameters/spl_hostid:0
/sys/module/spl/parameters/spl_hostid_path:/etc/hostid
/sys/module/spl/parameters/spl_kmem_cache_expire:0
/sys/module/spl/parameters/spl_kmem_cache_kmem_limit:1024
/sys/module/spl/parameters/spl_kmem_cache_max_size:32
/sys/module/spl/parameters/spl_kmem_cache_obj_per_slab:16
/sys/module/spl/parameters/spl_kmem_cache_obj_per_slab_min:8
/sys/module/spl/parameters/spl_kmem_cache_reclaim:0
/sys/module/spl/parameters/spl_kmem_cache_slab_limit:16384
/sys/module/spl/parameters/spl_taskq_thread_bind:0

OTOH, on a different computer I have spl_kmem_cache_reclaim:1 and spl_kmem_cache_slab_limit:0 without setting either of them explicitly, so I don't know what's going on.

@tomposmiko
Author

Mine (default):

/sys/module/spl/parameters/spl_debug_mask:188
/sys/module/spl/parameters/spl_debug_mb:-1
/sys/module/spl/parameters/spl_debug_panic_on_bug:0
/sys/module/spl/parameters/spl_debug_printk:60
/sys/module/spl/parameters/spl_debug_subsys:18446744073709551615
/sys/module/spl/parameters/spl_hostid:2148165376
/sys/module/spl/parameters/spl_hostid_path:/etc/hostid
/sys/module/spl/parameters/spl_kmem_cache_expire:2
/sys/module/spl/parameters/spl_kmem_cache_kmem_limit:1024
/sys/module/spl/parameters/spl_kmem_cache_max_size:32
/sys/module/spl/parameters/spl_kmem_cache_obj_per_slab:16
/sys/module/spl/parameters/spl_kmem_cache_obj_per_slab_min:8
/sys/module/spl/parameters/spl_kmem_cache_reclaim:1
/sys/module/spl/parameters/spl_kmem_cache_slab_limit:0
/sys/module/spl/parameters/spl_taskq_thread_bind:0

@tomposmiko
Author

we can see the values when it's working properly

https://gist.github.com/sopmot/3ca667f777c7d184ce9f

Do you know what set the setgid bit (g+s) on the corrupted "documents" directory shown above? Was it inherited because it was set on parent directory?

Yes.

Did you set it manually (chmod g+s)?

As far as I can remember, I didn't do that.
If everything worked as I wanted, it was inherited from the parent directory, so I'm 99% sure that I didn't do that :)
If something had gone wrong I might have done it, but that is very unlikely.

@dweeezil
Contributor

@sopmot Thanks for the follow-up information. Your case has almost identical characteristics to the others. Unfortunately, I've not been able to reproduce it yet, nor have I had any time this week to look into it further. I'm trying to come up with a reproducer and think that memory pressure may be part of the equation.

@tomposmiko
Author

@dweeezil
I have tried to find the root cause of the problem, but have not been able to reproduce it yet.

@behlendorf behlendorf added this to the 0.6.4 milestone Nov 17, 2014
ryao pushed a commit to ryao/zfs that referenced this issue Nov 29, 2014
If a spill block's dbuf hasn't yet been written when a spill block is
freed, the unwritten version will still be written.  This patch handles
the case in which a spill block's dbuf is freed and undirties it to
prevent it from being written.

The most common case in which this could happen is when xattr=sa is being
used and a long xattr is immediately replaced by a short xattr as in:

	setfattr -n user.test -v very_very_very..._long_value  <file>
	setfattr -n user.test -v short_value  <file>

The first value must be sufficiently long that a spill block is generated
and the second value must be short enough to not require a spill block.
In practice, this would typically happen due to internal xattr operations
as a result of setting acltype=posixacl.

Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#2663
Closes openzfs#2700
Closes openzfs#2701
Closes openzfs#2717
Closes openzfs#2863
Closes openzfs#2884

Conflicts:
	module/zfs/dbuf.c
behlendorf pushed a commit that referenced this issue Dec 23, 2014
If a spill block's dbuf hasn't yet been written when a spill block is
freed, the unwritten version will still be written.  This patch handles
the case in which a spill block's dbuf is freed and undirties it to
prevent it from being written.

The most common case in which this could happen is when xattr=sa is being
used and a long xattr is immediately replaced by a short xattr as in:

	setfattr -n user.test -v very_very_very..._long_value  <file>
	setfattr -n user.test -v short_value  <file>

The first value must be sufficiently long that a spill block is generated
and the second value must be short enough to not require a spill block.
In practice, this would typically happen due to internal xattr operations
as a result of setting acltype=posixacl.

Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2663
Closes #2700
Closes #2701
Closes #2717
Closes #2863
Closes #2884
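
The setfattr sequence from the commit message can also be scripted. The following Python sketch only issues the two xattr writes back to back. The 4096-byte length for the long value is an assumption (the commit message does not say how long is "sufficiently long"), the function name is hypothetical, and whether the sequence actually hits the bug on an unpatched system depends on timing that this sketch does not control.

```python
import os


def replace_long_with_short(path, setter=None, long_len=4096):
    """Set a long user.test xattr, then immediately overwrite it with a short one."""
    if setter is None:
        setter = os.setxattr  # Linux-only; resolved lazily so import works elsewhere
    # First write: assumed long enough to need a spill block with xattr=sa.
    setter(path, "user.test", b"x" * long_len)
    # Second write: short enough that the spill block is no longer required.
    setter(path, "user.test", b"short_value")
```

The `setter` parameter exists only so the call sequence can be exercised without touching a real filesystem; by default the function calls `os.setxattr` on the given path.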