SPLError: 11346:0:(zio.c:792:zio_write()) when trying to create a filesystem w/ DEBUG #1933

Closed
FransUrbo opened this issue Dec 6, 2013 · 16 comments

@FransUrbo
Contributor

Can't say for certain whether this is a ZoL issue or a ZFS-Crypto one (I can't test without crypto since I have the encryption feature enabled on my pool). I've created a ticket at zfsrogue as well, but this MIGHT be related to my deadlock problem (#1862).

Trying to create a filesystem with DEBUG enabled on both spl and zfs gives:

Message from syslogd@Celia at Dec  6 13:26:19 ...
 kernel:[  225.960618] SPLError: 11346:0:(zio.c:792:zio_write()) ASSERTION(zp->zp_checksum >= ZIO_CHECKSUM_OFF && zp->zp_checksum < ZIO_CHECKSUM_FUNCTIONS && zp->zp_compress >= ZIO_COMPRESS_OFF && zp->zp_compress < ZIO_COMPRESS_FUNCTIONS && DMU_OT_IS_VALID(zp->zp_type) && zp->zp_crypt >= ZIO_CRYPT_OFF && zp->zp_crypt < ZIO_CRYPT_FUNCTIONS && zp->zp_type < DMU_OT_NUMTYPES && zp->zp_level < 32 && zp->zp_copies > 0 && zp->zp_copies <= spa_max_replication(spa)) failed

Message from syslogd@Celia at Dec  6 13:26:19 ...
 kernel:[  225.960943] SPLError: 11346:0:(zio.c:792:zio_write()) SPL PANIC

After the SPLError, zfs hangs. Creating a filesystem without DEBUG enabled works (or did work a few days ago when I was running without it - I have cherry-picked some commits since).

Trying to figure out what test fails, I ended up with this:

zp->zp_type(196) < DMU_OT_NUMTYPES(55)

So somehow, somewhere, zp_type is set to an invalid number. I'm currently trying to figure out where/how zp_type got a value of 196...

See http://bayour.com/misc/SPLError.txt for a previous debug test. It seems that it's cycling through something and has correct/acceptable values, except for the last cycle just before the SPLError...

@FransUrbo
Contributor Author

Adding a printk() in dmu.c:dmu_write_policy():

[  444.705368] dmu.c: zp_type=196 (wp=0, dn_bonustype=0, type=196)

@behlendorf
Contributor

The ASSERT here looks correct; a type of 192 isn't possible. The type must be in the range 0-55 (DMU_OT_NUMTYPES) or 128-137. The lower range is for legacy compatibility with pre-feature-flag pools, and the upper range is to allow new generic types. If you could also print the dataset and object number, it would be interesting to see what zdb outputs about the object. Is the type 192 really stored on disk, or does it somehow get damaged in memory?

@FransUrbo
Contributor Author

@behlendorf Indeed, saw that too.

I wanted to print out the dataset, but I have no idea how to get to it from any of the zio functions. The level and number of structs is impossible to get a grip on... Any pointers?

@behlendorf
Contributor

@FransUrbo You should be able to do something like this (untested). The bookmark hung off the zio should provide the information you need. The only tricky bit is you're going to get the objset id not its name. You'll need to use zdb to dump all the datasets to get the id to name mapping.

diff --git a/module/zfs/zio.c b/module/zfs/zio.c
index 97f2549..75a7a7a 100644
--- a/module/zfs/zio.c
+++ b/module/zfs/zio.c
@@ -730,6 +730,14 @@ zio_write(zio_t *pio, spa_t *spa, uint64_t txg, blkptr_t *b
        zio->io_physdone = physdone;
        zio->io_prop = *zp;

+#ifdef _KERNEL
+       if (unlikely(!DMU_OT_IS_VALID(zp->zp_type))) {
+               printk("Invalid zp_type=%d dataset=%llu object=%llu\n",
+                   zp->zp_type, zio->io_bookmark.zb_objset,
+                   zio->io_bookmark.zb_object);
+       }
+#endif /* _KERNEL */
+
        return (zio);
 }
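The id-to-name mapping could then be pulled from zdb output along these lines (a sketch against made-up sample output, since the exact `zdb -d` line format can vary; on a live pool you would pipe `zdb -d <pool>` instead of the sample file):

```shell
# Hypothetical sample of `zdb -d <pool>` output. Each "Dataset" line
# carries "ID <objset-id>", the same value the debug patch prints as
# zb_objset.
cat <<'EOF' > /tmp/zdb_sample.txt
Dataset share/VirtualMachines [ZVOL], ID 277, cr_txg 123, 10.0G, 2 objects
Dataset share [ZPL], ID 21, cr_txg 1, 96K, 7 objects
EOF

# Build an objset-id -> dataset-name mapping.
awk '$4 == "ID" {gsub(",", "", $5); print $5, $2}' /tmp/zdb_sample.txt
```

The first column is the objset id (what zb_objset holds), the second the dataset name.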

@FransUrbo
Contributor Author

@behlendorf unfortunately, I still can't run zdb with any command option without getting an I/O error :(.

@lundman
Contributor

lundman commented Dec 20, 2013

We do create one new DMU_OT type, just like upstream, but it is placed just before DMU_OT_NUMTYPES:

        DMU_OT_BPOBJ_SUBOBJ,            /* UINT64 */

        // FIXME
        DMU_OT_NEW_TYPE,            /* ZAP */

        DMU_OT_NUMTYPES,

Which is quite shy of 192/196. I would like to get rid of that DMU_OT type as well, but the new system is rather confusing and, I dare say, not documented :)

> The level and number of structs is impossible to get a grip on... Any pointers?

They are all pointers mate, hah! mirite?!

@FransUrbo
Contributor Author

@lundman haha, very catchy :).

I'm currently building with the patch from @behlendorf and I'll see where that leads. But considering I can't run any zdb command, I'm unsure what good it will do...

@FransUrbo
Contributor Author

@behlendorf is it safe to disable the ASSERT()? The code you showed runs after the assert is triggered, so it's never executed...

But why do I only get it when compiling with debugging and not without it?

@behlendorf
Contributor

@FransUrbo All the ASSERTs are compiled out if debugging is not enabled.

@FransUrbo
Contributor Author

It seems that zp_type is only invalid the first time.

celia(SSD1):~# mount /dev/zvol/share/VirtualMachines/Debian/ZoLRepo/ZoLRepo_x32-part1 /mnt/disk
celia(SSD1):~# dmesg | tail
[...]
[  264.413667] zp->zp_type (196) < DMU_OT_NUMTYPES (55)
[  264.413671] Invalid zp_type=196 dataset=0 object=8328
[  265.463887] EXT4-fs (zd192p1): recovery complete
[  265.483462] EXT4-fs (zd192p1): mounted filesystem with ordered data mode. Opts: (null)
celia(SSD1):~# umount /mnt/disk
celia(SSD1):~# mount /dev/zvol/share/VirtualMachines/Debian/ZoLRepo/ZoLRepo_x32-part1 /mnt/disk
celia(SSD1):~# dmesg | tail
[...]
[  264.413667] zp->zp_type (196) < DMU_OT_NUMTYPES (55)
[  264.413671] Invalid zp_type=196 dataset=0 object=8328
[  265.463887] EXT4-fs (zd192p1): recovery complete
[  265.483462] EXT4-fs (zd192p1): mounted filesystem with ordered data mode. Opts: (null)
[  346.369850] EXT4-fs (zd192p1): mounted filesystem with ordered data mode. Opts: (null)
celia(SSD1):~# umount /mnt/disk 
celia(SSD1):~# dmesg | tail
[  264.413667] zp->zp_type (196) < DMU_OT_NUMTYPES (55)
[  264.413671] Invalid zp_type=196 dataset=0 object=8328
[  265.463887] EXT4-fs (zd192p1): recovery complete
[  265.483462] EXT4-fs (zd192p1): mounted filesystem with ordered data mode. Opts: (null)
[  346.369850] EXT4-fs (zd192p1): mounted filesystem with ordered data mode. Opts: (null)
[  565.067350] zp->zp_type (196) < DMU_OT_NUMTYPES (55)
[  565.067354] Invalid zp_type=196 dataset=0 object=8328
[  566.186518] EXT4-fs (zd640p1): 1 orphan inode deleted
[  566.186521] EXT4-fs (zd640p1): recovery complete
[  566.372928] EXT4-fs (zd640p1): mounted filesystem with ordered data mode. Opts: (null)
celia(SSD1):~# umount /mnt/disk   
celia(SSD1):~# mount /dev/zvol/share/VirtualMachines/Debian/Sid/Devel-part1 /mnt/disk/
celia(SSD1):~# dmesg | tail
[  264.413671] Invalid zp_type=196 dataset=0 object=8328
[  265.463887] EXT4-fs (zd192p1): recovery complete
[  265.483462] EXT4-fs (zd192p1): mounted filesystem with ordered data mode. Opts: (null)
[  346.369850] EXT4-fs (zd192p1): mounted filesystem with ordered data mode. Opts: (null)
[  565.067350] zp->zp_type (196) < DMU_OT_NUMTYPES (55)
[  565.067354] Invalid zp_type=196 dataset=0 object=8328
[  566.186518] EXT4-fs (zd640p1): 1 orphan inode deleted
[  566.186521] EXT4-fs (zd640p1): recovery complete
[  566.372928] EXT4-fs (zd640p1): mounted filesystem with ordered data mode. Opts: (null)
[  617.047970] EXT4-fs (zd640p1): mounted filesystem with ordered data mode. Opts: (null)

Trying to mount everything that is mountable shows:

celia(SSD1):~# find /dev/zvol/share/VirtualMachines/Debian -name '*part[0-9]' | while read part; do mount $part /mnt/disk && umount /mnt/disk; done
[...]
celia(SSD1):~# dmesg | grep object=
[  264.413671] Invalid zp_type=196 dataset=0 object=8328
[  565.067354] Invalid zp_type=196 dataset=0 object=8328
[  732.999998] Invalid zp_type=196 dataset=0 object=8328
[  735.176367] Invalid zp_type=196 dataset=0 object=8328
[  758.983273] Invalid zp_type=196 dataset=0 object=8328
[  793.876344] Invalid zp_type=196 dataset=0 object=8328
[  796.204576] Invalid zp_type=196 dataset=0 object=8328
[  821.316946] Invalid zp_type=196 dataset=0 object=8328
[  822.451291] Invalid zp_type=196 dataset=0 object=8328
[  864.443638] Invalid zp_type=196 dataset=0 object=8328
[  866.145956] Invalid zp_type=196 dataset=0 object=8328
[  880.496276] Invalid zp_type=196 dataset=0 object=8328
[  882.363648] Invalid zp_type=196 dataset=0 object=8328
[  901.853932] Invalid zp_type=196 dataset=0 object=8328

So all of the ZVOL accesses give the same dataset and object!

@behlendorf
Contributor

@FransUrbo That may make some sense, but I'd be skeptical. Dataset 0 is always the MOS for a given pool; it contains the metadata which describes the contents of the pool (the list of datasets, references to space maps, etc.). It wouldn't shock me if the crypto bits added a new object type linked off the MOS and we're trying to replay a log entry for the unknown type during import.

Have you tried importing the pool read-only? That should effectively prevent any replay and disable the zio_write() code path where you're hitting this bug. That may allow for recovery of the data.

zpool import -o readonly=on <pool>

@FransUrbo
Contributor Author

Yes, importing the pool read-only avoids most, if not all, of the problems. Unfortunately, that won't help me much at the moment. Spending quite a lot of money right now to build a storage backup server with enough disks isn't something I really can/want to do. I'm planning 'one killer of a storage server' to be built in the spring/summer (enough CPU, memory and disk to run dedup+compression+better redundancy+crypto on all of my data instead of just some selected bits).

But I have started to accept the fact that maybe I don't have a choice :(.

@lundman Any thoughts, especially on 'ZoL trying to replay a log entry which is part of the crypto bit'?

@FransUrbo
Contributor Author

@lundman do you have any idea where that 192 value came from? Is there anything in the old code (pre-0.6.3) that could have set it?

@lundman
Contributor

lundman commented Dec 25, 2013

192? 196?

I think I mentioned earlier that we do indeed add a new type, but it shouldn't be that high. At the moment the log replay will simply skip the unknown value and everything is fine. We should probably drill down to find out where it comes from, though. It would also be interesting to import a Solaris v30 pool with crypto to see if they have 196 as well.

The first crypto code I ran has an uptime of 400 days, so I'm not sure how "scary" it actually is :)

It would be nice to change it to the newer-style dmu_object_set_checksum/dmu_object_set_compress with a companion dmu_object_set_crypto, but I don't understand how these new-style dmus work.

@FransUrbo
Contributor Author

> The first crypto code I run has uptime of 400 days, so I'm not sure how "scary" it actually is :)

@lundman I didn't have any problems either, until I rebooted. See #1848.

@FransUrbo
Contributor Author

I'm closing this, because it's now quite obvious that it's the zfs-crypto patch that fucked everything up.
