ZVOL not showing up after reboot #599

Closed
lembregtse opened this issue Mar 13, 2012 · 73 comments
Labels
Component: ZVOL ZFS Volumes

Comments

@lembregtse

I've got a weird problem with some of my zvols.

My main pool is 'data'.

I created disks with 'zfs create data/disks'.

Then I created two zvols with 'zfs create -V 10G data/disks/router-root' and 'zfs create -V 1G data/disks/router-swap'.

These devices show up in /dev/zd... and /dev/zvol/data/disks/...

Now when I reboot my host machine, the /dev/zd... and /dev/zvol/ devices disappear. The datasets still show up in 'zfs list'.

At first I thought it might have had something to do with the sub dataset 'disks' I used. But then I created a zvol directly under data, and it still is not showing up after reboot.

Does anyone have any idea why it's not showing up?

I'm using Ubuntu Oneiric with 3.0.0-16-server amd64, and I've tried both the PPA daily and stable builds. If any more info is needed, don't hesitate to ask.

@behlendorf
Contributor

Check if the zd device appears under /sys/block; if it does, there's an issue with the zvol udev rules.
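
For example, a quick check along these lines (generic commands, not specific to this report):

ls /sys/block | grep '^zd'     # zvol block devices registered by the kernel module
ls -lR /dev/zvol 2>/dev/null   # symlinks that the zvol udev rules should create
ls -l /dev/zd* 2>/dev/null     # raw zvol device nodes

If the zd entries are present in /sys/block but the /dev nodes are missing, the problem is on the udev side; if /sys/block has no zd entries at all, the module never registered the minors.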

@lembregtse
Author

dm-0 dm-3 loop2 loop5 ram0 ram11 ram14 ram3 ram6 ram9 sdc sdf sdi
dm-1 loop0 loop3 loop6 ram1 ram12 ram15 ram4 ram7 sda sdd sdg sdj
dm-2 loop1 loop4 loop7 ram10 ram13 ram2 ram5 ram8 sdb sde sdh sdk

No zd devices there.

@lembregtse
Author

Oh yes, I don't know if this could be related but I used -o ashift=12 to create my data pool.

@behlendorf
Copy link
Contributor

Any failure messages in dmesg after the zfs modules load? At module load time we should be registering all the majors and minors for the various zvols.

@lembregtse
Author

[ 65.553078] SPL: Loaded module v0.6.0.53, using hostid 0x007f0101
[ 65.556448] zunicode: module license 'CDDL' taints kernel.
[ 65.559450] Disabling lock debugging due to kernel taint
[ 65.595909] ZFS: Loaded module v0.6.0.53, ZFS pool version 28, ZFS filesystem version 5
[ 65.630370] udevd[671]: starting version 173

No error messages or warnings are shown by dmesg on boot or on modprobe. It's really vague; I'll destroy the pool and recreate it with the normal ashift of 9 to see if the problem persists.

@lembregtse
Author

  1. zpool create data raidz devices
  2. zfs create -V 10G data/fish
  3. The following exists: /dev/zd0 /dev/zvol/data/fish
  4. reboot
  5. /dev/zvol/data/fish and /dev/zd0 do not exist -> no zd device in /sys/block

So it's the same for ashift 9.

zfs list shows
NAME USED AVAIL REFER MOUNTPOINT
data 10.3G 14.2T 48.0K /data
data/fish 10.3G 14.2T 28.4K -

so the volume is still there.

@lembregtse
Author

Ok I've debugged this a little further. I created the same setup in a virtual machine and could not reproduce the error.

Now there is one big difference between my VM and host machine: the disk IDs.

On my VM I use /dev/sd* as disk for the pool. On my host I use /dev/disk/by-id/...

When I use /dev/sd* on my host, the problem goes away and zvol pops up after reboot. I hope this will help you find the problem.

@behlendorf
Contributor

Thanks for the additional debugging on this; I'm sure it will help us get to the bottom of the issue, particularly if we're able to reproduce it locally.

@Phoenixxl

I have the same problem. I'm using the daily snapshot of Precise, however, due to driver needs.

Using /dev/sd* is impossible for me, because my motherboard controller will have removable drives connected.

Another slight difference is that I'm not using by-id but have configured a zdev.conf file.

Also, mounts aren't done upon reboot either.

Thank you for taking the time to look into this.

@Phoenixxl

Looking at past issues where someone reported the same thing, I tried renaming the zvol. That made the zvol appear in /dev/ again.

This isn't really a workaround, but it might help pinpoint the issue.
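
For anyone wanting to try the same thing, a minimal sketch of the rename-and-rename-back workaround (the dataset name is only an example; substitute your own zvol):

zfs rename data/disks/router-root data/disks/router-root.tmp
zfs rename data/disks/router-root.tmp data/disks/router-root
ls -l /dev/zvol/data/disks/    # the device node should reappear after the second rename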

@lembregtse
Author

What OS are you using? I use the ZFS-specific "mountall" package. It mounts all ZFS filesystems automatically (if specified).

@Phoenixxl

Fresh Precise Pangolin install from the daily snapshot: http://cdimage.ubuntu.com/ubuntu-server/daily/20120315/precise-server-amd64.iso

(I usually don't go for development releases and stay away from daily builds as well; I like stable and tested. Error messages from my controller seemed to be unimplemented in kernels before 3.2, and since this is going to be the LTS release and I'm planning to migrate all my running machines to it, I figured what the hell.)

The only installed packages are sshd and samba.

Right after the OS install, I installed ZFS from Darik Horn's ZFS PPA: ppa:zfs-native/daily

During the install, zfs-mountall seemed to be part of it and was configured, yet it's not doing its thing.

I haven't edited /etc/default/zfs to do mounting either.

Does renaming the zvol make it show up in /dev/ for you again?

@lembregtse
Author

I use Oneiric, as Precise had KVM errors on PCI passthrough. I think the AMD IOMMU was wrongly mapped by the kernel.

On Oneiric I use "mountall=2.31-zfs1" to mount ZFS for me. The regular method is not working, but someone offered a solution somewhere in the issues.

I also reverted from /dev/disk/by-id to /dev/sd* to solve the problem. You could try export / import -d /dev/disk/by-id or /dev to switch between those.
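
A minimal sketch of that export/import switch, assuming the pool from this report is named 'data':

zpool export data
zpool import -d /dev/disk/by-id data    # re-import using persistent by-id paths
# or, to fall back to the plain /dev/sd* names:
zpool import -d /dev data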

@Phoenixxl

I can't use /dev/sd*. My motherboard's controller has removables attached to it. Depending on what's inserted at boot time, the order changes; /dev/sdc to /dev/sdh are highly variable. sda and sdb are my cache.

I need to be able to use by-path. Making a zdev.conf is the sane option in that case.

I have mountall installed:

root@Pollux:/home/zebulon# dpkg --list mountall
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Description
+++-==========================================-=====================
ii mountall 2.35-zfs1 filesystem mounting tool
root@Pollux:/home/zebulon#

It just isn't mounting at boot. On Natty it did so fine, but not on Precise. I've looked for configuration options for mountall, but I don't see anything in /etc or /etc/default.

Zvols not showing up in /dev is the bigger issue for me, though. Making a script that renames my zvols at boot time and then renames them back seems like a plaster on a wooden leg, and at this point I see no other way of making them pop up.

Calling mountall a second time after boot (assuming it gets called at boot) does mount my storage, but that's just the same as "zfs mount -a". Maybe it's a priority thing, /shrug. For mounting, there are several things that can be done in a script that can pass for sane (mountall working at boot would be the expected fix, though).

@Phoenixxl

As suggested here: http://groups.google.com/a/zfsonlinux.org/group/zfs-discuss/browse_thread/thread/5b25e2a172cd2616

I checked if zfs was in /etc/initramfs-tools/modules, which wasn't the case.
I checked if zfs.ko was in the initrd; it wasn't.
(Does installing the zfs-ubuntu package do this on Oneiric?)

So I added zfs to modules, updated the initrd, and verified that it was in the initrd:

http://pastebin.com/qCa83Kyw

Rebooted... no change... my storage wasn't mounted (no zvols either).

Also, as suggested in that thread, I reinstalled gawk, spl-dkms, and zfs-dkms, and rebooted.

dkms status
uname -a
cat /proc/partitions

http://pastebin.com/Am8RexPe

Still no change... no zvols, no storage.

I just tried setting mount to 'y' in /etc/default/zfs. It doesn't mount on startup either... a manual zfs mount -a works... something must be really off.

@stephane-chazelas

Maybe not related, but the issue I have here is that upon booting, udev fires blkid and zvol_id concurrently for every /sys/block/zdXXX. As there are hundreds of them, and every ZFS operation takes a long time to complete even in normal conditions where only one command is run at a time, udev eventually times out and kills those commands, resulting in missing /dev/zvol files.

The workaround for me is to run

for f (/sys/block/zd* ) (udevadm trigger -v --sysname-match=$f:t; sleep 1)

(zsh syntax) manually after booting to restore the missing zvols, one at a time with a 1-second delay between each.
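
For anyone not running zsh, a rough POSIX sh equivalent of that loop (same idea: trigger each zvol individually with a short delay):

for f in /sys/block/zd*; do
    udevadm trigger -v --sysname-match="${f##*/}"
    sleep 1
done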

@rssalerno

I just tried setting mount to 'y' in /etc/default/zfs. It doesn't mount on startup either...
a manual zfs mount -a works... something must be really off.

I am having the same issue as Phoenixxl. It started after going from the PPA stable to the PPA daily. I suspect this is related to /etc/init.d/zfs being split into /etc/init.d/zfs-mount and zfs-share. I tried the following:

update-rc.d zfs-mount defaults

but it didn't work, plus this caused "zpool status -x" to show corruption. Once the rc(n).d links were removed, the "corruption" disappeared, but everything still had to be mounted manually after boot.

@dajhorn
Contributor

dajhorn commented Mar 17, 2012

There seem to be several potentially conflated problems reported in this ticket:

  1. You can get spurious "OFFLINE" or "FAULTED" errors if the /etc/zfs/zpool.cache file is stale. Try regenerating it (see the sketch after this list).
  2. Always use zpool create ... /dev/disk/by-id/... and zpool import -d /dev/disk/by-id. If you found a circumstance where using /dev/sd* is actually required, then please open a new ticket for it. KVM and VMware have bugs that can cause missing links in /dev/disk/by-id.
  3. ZoL is incompatible with lazy drive spin-up. Set the BIOS, HBA, or virtual machine host to always fully spin up all drives. KVM and/or VirtualBox sometimes issues a hotplug event on INT13 disks after POST, which never happens on a sane system and which breaks ZoL.
  4. Please, therefore, try to reproduce any bug involving KVM without KVM.
  5. Calling zfs mount -a in an init script is futile on a fast or big system that can race upstart. Manually running zfs mount -a afterwards while the system is quiescent is not diagnostic.
  6. Ubuntu Natty and Ubuntu Precise need the zfs-mountall package. If the /sbin/mountall that is patched for ZFS is failing -- while the init.d scripts are disabled -- then please open a separate ticket for it.
  7. udev trips over its shoelaces when more than a few hundred ZoL device nodes are instantiated.
  8. udev can call zpool_id on a storage node before the ZFS driver is ready for the request, which spews warnings onto the system console and can hang dependent services. The provisos about lazy drive spin-up and virtualized environments apply here too.

The last two points might be resolved by making the zfs.ko module load and plumb itself synchronously.
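
Regarding point 1, regenerating a stale cache file looks roughly like this (the pool name 'data' is taken from this report as an example):

zpool set cachefile=/etc/zfs/zpool.cache data
# or: export and re-import the pool, which also rewrites the cache file
zpool export data && zpool import -d /dev/disk/by-id data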

@Phoenixxl

Not wanting to make the faux pas of creating a duplicate ticket, and having an issue with the same symptoms (i.e. zvols not showing up at boot), I added to this one.

Having both zvols not showing up and mounting not happening, I didn't want to assume the two issues were unrelated. Hence I mentioned both here.

I, for one, am not using any form of virtualization.

As to point 5, it being very unrelated to diagnosing anything: I am in the very real situation of needing to rename zvols and doing a mount -a in a batch file, since it's the only way to get my machine started and doing its job. I am willing to try whatever you ask for diagnostic purposes; I'm already happy these things are being looked into.

As for point 6, I did a clean Precise install and installed ZFS using the PPA daily. I saw something scroll by mentioning zfs-mountall, so I'm assuming that was the patched mountall being installed. I will start a new ticket relating to that.

Before anything boot-specific gets started, my computer spends 30 seconds detecting my first 8 drives, where I see drive activity, then another minute spinning up a further 12 on another controller. I assume everything is spun up by the time anything ZFS related gets loaded.

In syslog the only thing I see with the keyword "zfs" is:
"ZFS: Loaded module v0.6.0.54, ZFS pool version 28, ZFS filesystem version 5"
Am I supposed to see more after that if things run correctly?

Also, and I think this is somewhat related (if it isn't I'll make a new ticket, unless it's normal behavior): I created a zvol, which showed up as zd0 in /dev at creation time. During the next 2 days, to be able to use it, I had to rename it on every reboot. Now for the weird bit: today I created a second zvol, which showed up as zd16. 16 sounds about right when it comes to the number of reboots combined with renames since zd0 was made. Is there some kind of counter that gets incremented with every rename?

Should I start 2 new tickets? One for zvols not showing up, the other for zfs-mountall not starting, both specific to Precise. Maybe it was wrong to think Mr. lembregtse's zvol issue and my own were the same.

Kind Regards
Phoenixxl.

@dajhorn
Contributor

dajhorn commented Mar 18, 2012

@Phoenixxl:

Not wanting to make the faux pas of creating a duplicate ticket

Don't worry about that. The regular ZoL contributors are friendly and duplicate tickets are easy to handle.

I am in the very real situation of needing to rename zvols and doing a mount -a in a batch file, since it's the only way to get my machine started and doing its job

If you are renaming volumes, then perhaps you have bug #408. @schakrava proposes a fix in pull request #523, but you would need to rebuild ZoL to try it.

In syslog the only thing I see with the keyword "zfs" is:
"ZFS: Loaded module v0.6.0.54, ZFS pool version 28, ZFS filesystem version 5"
Am I supposed to see more after that if things run correctly?

No, this is normal. Anything more than that usually indicates a problem.

Today I created a second zvol, which showed up as zd16. 16 sounds about right when it comes to the number of reboots combined with renames since zd0 was made. Is there some kind of counter that gets incremented with every rename?

This is also normal and is caused by an implementation detail.

The number on each bare zvol device node increments by sixteen. Notice how these numbers correspond to the minor device number that is allocated by the system. If you partition /dev/zdN, then each partition node will get a minor device number between N+1 and N+15.

For example, this system has four zvols, the first of which is partitioned:

# ls -l /dev/zd*
brw-rw---- 1 root disk 230,  0 2012-03-18 13:16 /dev/zd0
brw-rw---- 1 root disk 230,  1 2012-03-18 13:16 /dev/zd0p1
brw-rw---- 1 root disk 230,  2 2012-03-18 13:16 /dev/zd0p2
brw-rw---- 1 root disk 230,  3 2012-03-18 13:16 /dev/zd0p3
brw-rw---- 1 root disk 230, 16 2012-03-18 13:09 /dev/zd16
brw-rw---- 1 root disk 230, 32 2012-03-18 13:09 /dev/zd32
brw-rw---- 1 root disk 230, 48 2012-03-18 13:09 /dev/zd48

Also note that device nodes are created for snapshots and persist after a zfs rename. If the system is running something that automatically creates snapshots, or if you frequently rename zvols, then the /dev/zd* number can become large, which is normal.

Note that manually renaming any /dev/zd* device node, or using one that is stale, will confuse udev and break the system. Always use the /dev/zvol/ aliases instead. (Which are sometimes incorrect per bug #408.)

Should I start 2 new tickets ? One for zvol's not showing up , the other for zfs-mountall not starting , both specific to precise.

Yes, but please check whether you are affected by the problem described in bug #408.

@toadicus

I've been running into this issue: after creating a zvol, it appears as /dev/zdX (but not as /dev/pool/path), but after a reboot it doesn't appear as either /dev/zdX or /sys/block/zdX. I'm running 64-bit Gentoo on bare metal. Is there any information I can provide to help diagnose the issue?

@toadicus

toadicus commented Apr 2, 2012

After some further investigation, it looks like this might have something to do with the order of operations, perhaps in module loading? After a fresh reboot, the zvols did not appear in /dev or /sys/block, but if I manually zfs stop, rmmod zfs, zfs start, they appear. For now I've removed the zfs module entry from /etc/conf.d/modules, and I'll let the zfs script load the modules. I'll let you know if that solves it the next time I get a chance to reboot.
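
For reference, a sketch of the manual reload sequence described above, assuming the Gentoo init script lives at /etc/init.d/zfs (service names may differ per system):

/etc/init.d/zfs stop     # "zfs stop": stop the service and unmount datasets
rmmod zfs                # unload the kernel module
/etc/init.d/zfs start    # "zfs start": reload the module; the zvol minors reappear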

@craig-sanders
Contributor

FYI, for anyone else encountering this problem...

I ran into this again this morning on one system. I did the export-followed-by-import trick to get the /dev/zd* devices to show up, but still had no /dev/zvol/ or /dev/poolname/ symlinks.

This seems to be the easiest way to get the symlinks to show up again:

udevadm trigger -v --subsystem-match=block --sysname-match=zd* --action=change

The '-v' is optional, for verbose output. Also optional is '-n', for --dry-run.

@pyavdr pyavdr mentioned this issue Apr 10, 2012
@ryao
Contributor

ryao commented Apr 16, 2012

This appears to be a duplicate of issue #441.

@Phoenixxl

It isn't the same.

I don't need to import or export anything; renaming the zvol works.

This popped up when using Precise, after rc6 on Natty had worked just fine.

zpool status shows the zpools being OK; just nothing is mounted at boot.

@Phoenixxl

This is still happening. Today, after having resolved another unrelated issue, I am confronted with this again.

The last ZFS release tested is 0.6.0.62, on Ubuntu Server with kernel 3.2.0-24-generic.

I have been exporting and importing my pools all week due to another issue. Whatever ryao was suffering from, for me, unlike issue #441, this doesn't resolve when exporting/importing. Unless ryao means he has to export/import on every reboot, it might still be the same issue, since doing that, like renaming, would make them show up as well, I suppose.

Would uploading a dmesg output or any other log/diagnostic output help in finding the cause of this? I would gladly provide it.

Kind regards.

@bitloggerig

Tested on Arch Linux, as drukargin suggested:

the ZVOLs did not appear in /dev or /sys/block,
but if I manually zfs stop, rmmod zfs, zfs start, they appear

Symlinks in /dev/zvol appear as well.

@ryao
Contributor

ryao commented Dec 2, 2012

@Phoenixxl I can say with confidence that the package manager does install those files properly. For some reason, this particular individual's system had a damaged package that was missing those files.

With that said, do the zvols reappear when you use udevadm trigger -v --subsystem-match=block --sysname-match=zd* --action=change, as @craig-sanders had suggested? If not, are zd* devices missing from both /sys/devices/virtual/block/ and /sys/class/block/?

There is more than one way that this problem can occur and it sounds like your system suffers from a different one.

@Phoenixxl

@ryao It's great that you found an issue in the package manager. However, as you clearly state in that last line, it doesn't encompass everything this thread is about or why it was started. Hence, I think it would be appropriate if you started another thread that focuses on that particular problem. Closing this ticket upon fixing an install issue will do nothing for me.

To recap:

After a reboot, zvols aren't present.
A udev trigger like craig-sanders suggests doesn't do anything.
There is no sign of zd* in /sys/devices/virtual/block/ or /sys/class/block/, and there is no /dev/zvol (before or after running udevadm trigger).
Mount points are available.
"zpool status" shows all disks.
"zfs list -t all" lists zvols and mount points correctly.
Files on mount points are accessible as normal.
zpool.cache is fine.

To be able to use zvols, the only thing I can do is rename the zvol to something else, then give it back its original name. Once that's done, the zvol appears in /sys/devices/virtual/block/, /sys/class/block/, and /dev/zvol, and everything works fine.

@ryao
Contributor

ryao commented Dec 3, 2012

@Phoenixxl I am the Gentoo Linux ZFS maintainer. That issue was specific to this one person's system and had nothing to do with the package manager.

With that said, the issue being discussed here involves situations where udev fails to do its job. On your system, the kernel module fails to set up the zvol devices. That is issue #441, which is similar, but does not involve udev.

@Phoenixxl

@ryao Not really. Issue #441 implies something wrong with the zpool cache. Every post above craig-sanders's has nothing to do with udev. You are making this thread about something else. Nobody ever implied udev wasn't doing its job.
I believe at some point cjdelisle and stephane-chazelas thought they had the same issue as the rest of us, but theirs turned out to be udev related instead? Please take the time to read up from the top.

Having nothing in /sys makes a udev cause implausible.

Message 3 from lembregtse, quote: "No zd devices there."

Someone halfway down the thread having a different issue and hijacking it seems to be what's happening.

Also, in issue #441 you imply "udevadm trigger" fixes it as well. You are clearly having another issue.

Friendly regards.
Phoenixxl.

@ryao
Contributor

ryao commented Dec 20, 2012

The lack of zd* devices is the exact reason why issue #441 was opened. Anyway, the /dev/zd* devices are created by udev. I asked whether people were missing zd* devices from both /sys/devices/virtual/block/ and /sys/class/block/ to try to understand who is suffering from which issue. Quite often, you can have multiple issues that result in the same effect, and people tend to think that they are the same.

For all of the people posting here, it is not clear to me whether they are having the same issue or multiple issues that result in the same outcome. That is why I asked for data from their systems, to determine what is going wrong on each individual system.

@Phoenixxl

Lack of population in /sys/block, which isn't udev related.

Edit: I have unloaded the ZFS modules using the zfs.sh script. When executing a ZFS-related command to load them again, /dev/zvol is populated. I have also exported and imported my pools, after which /dev/zvol gets populated as well. This does not change anything permanently, though; after the next reboot they're gone again.
Reading everything in #441 again, I don't see any noticeable difference between this and it. If they were to be merged, I could see the logic in it. However, that doesn't make them udev related. Both speak about an unpopulated /sys.

It's nice to see someone spending all their time making ZFS work better with udev. I applaud you for the time and effort you put into this. However, neither this nor #441 is a thread about udev. Hence my asking: wouldn't it be better to create a udev-specific thread? Anything udev related that gets fixed or changed won't change this or #441. Fixing udev will not magically populate /sys/block. Fixing udev will not close either case either.

Many people will add to this thread asking if their issue is similar to what is being discussed here; that doesn't make this thread about their issue. Since the start, @lembregtse and I have had identical symptoms. He seems to have moved on a few months ago, but for me this is still happening on my main storage machine. It is the main issue I have with ZFS at the moment. I really do not want this thread to be hijacked and end up being closed while fixing absolutely nothing related to why it was created in the first place.

My main concern is that the topic at hand gets overshadowed by something unrelated, and at worst the issue gets closed because something unrelated gets fixed.

Regards.

@tyrotex

tyrotex commented Jan 20, 2013

I'm a n00b ZoL user, so I'm still trying to fathom how the whole system hangs together.

  • Debian 6/Proxmox 2.2 vmlinuz-2.6.32-17-pve
  • ZoL 0.6.0-rc12 built from source

I'm experiencing this same problem of /dev/zvol or /dev/zd* not being created on boot. Nothing in /sys/block, etc, etc.

What I've noticed on my system is that this only occurs when some of my pool vdevs are not available at boot.

I have a pool defined on a single external USB drive. If it is attached, all is good. If I don't attach that drive at boot, /dev/zvol goes AWOL.

If my external USB drive is not attached, I am able to restore /dev/zvol by exporting & importing an available pool.

Regarding boot timing, I haven't yet found where the zfs kernel module is normally loaded. Advice would be appreciated.

After installing spl & zfs, /etc/init.d/zfs was installed but was not being executed because nothing linked to it from the other /etc/rc?.d directories. Recently, I ran 'insserv zfs', which gave it an S01 prefix in rc2.d. Before discovering 'insserv', I used a 'zfs mount -a' command in /etc/rc.local (through /etc/rc2.d/S17rc.local) to get my pools mounted.

Regarding other suggestions made above:

  • Adding a 'rmmod zfs' command before 'zfs mount -a' in /etc/rc.local had no effect.
  • The 'udevadm trigger' command has no effect.
  • An interactive 'rmmod zfs' fails with a 'busy' message. If I 'force' its removal and then run /etc/init.d/zfs, the zfs module is reloaded but no /dev/zvol is created.

Regards,

@Phoenixxl

@tyrotex
Your issue seems to be that zvols don't appear if all the pools defined in /etc/zfs/zpool.cache aren't available at boot?

This seems like a serious issue that warrants a new thread as well. At first glance it doesn't seem to be the same issue being discussed here, but it's at least as important as this one, if not more so.

Have you tried experimenting with this more? Is it only USB, or does this happen with pools created on other devices besides your main controller as well?
I'm not involved with the ZoL project, so I'm merely asking this to satisfy my own curiosity.
This issue intrigues me somewhat, since it would mean the order in which the OS detects controllers could affect zvols showing up at boot...

@DanielSmedegaardBuus

Having the same issue here. Ubuntu, 3.0.0-24-server. Darik's PPA, 0.6.0.91 (same issue with my previously installed version).

If I don't have a zvol in my pool, everything loads and mounts perfectly on every startup.

Creating a zvol, it's sort of random what happens after a boot. Either,

  • The zvol is present under /dev/zvol/, but no zfs mounts
  • No zvol, but all zfs mounts
  • The zvol, and some zfs mounts
  • No zvol, but some zfs mounts
  • Neither a zvol nor any zfs mounts

Completely random. I can reboot and get a different combo. I have no warnings or errors in dmesg after the zfs module loads.

I have mountall from the repo installed. By itself it doesn't seem to do anything, so I've also edited /etc/default/zfs to automount and autounmount. If I use the vanilla config, I always have neither my zvol nor any zfs mounts.

zfs mount -a will fail on empty directories because "cannot be mounted, unable to open the dataset", and on non-empty ones because "directory is not empty". Emptying a directory gives the other error. Destroying the zvol fixes everything, but only after a subsequent reboot.

rmmod zfs && modprobe zfs will give me my zvol, but I cannot mount any zfs afterwards.

Exporting my pool which uses /dev/disk/by-partlabel/* paths for its members, and re-importing it using /dev/sd* instead will make everything work consistently, but only after a subsequent reboot. This also makes everything automount on boot without me having to enable it in /etc/default/zfs. It actually makes everything automount even though I've specifically said 'no' to this in that file.

Exporting and reimporting via /dev/disk/by-id/ will import but immediately fail mounts with the "dataset" error. A reboot left me with neither zvol, nor mounts. zfs mount -a again gave "dataset" errors.

I'm now running my zpool using /dev/sd* paths as this works, although I'll get problems once I happen to reboot with a USB device attached, or a disk fails at POST.

Here's hoping for a fix :)

@aarcane

aarcane commented Jan 22, 2013

Zpool.cache is the primary cause of /dev/sd*-based imports. I'd like to again request that it be deprecated wherever possible, or its format be otherwise simplified and its ability to impede functionality forever reduced or eliminated.

@toadicus

toadicus commented Feb 4, 2013

I'm having this problem on several boxes running CentOS 6.3. Specifically:

  • On boot, the ZFS tree is mounted at /, but no zvols exist in /dev/zvol, /dev/<pool>/, or /dev/zd*
  • udevadm -v trigger --sysname-match="zd*" has no effect and prints nothing
  • zpool export <pool>; zpool import -d /dev/disk/by-id brings zvol nodes back to /dev/zvol, /dev/<pool>/, and /dev/zd*
  • I do not appear to be having any problems with zpool.cache, which is in an LVM-backed ext4 volume
  • I do have zfs enabled in chkconfig

If there are any logs, etc., that I can offer, let me know.

@GregorKopka
Contributor

On Gentoo with zfs root, data pool on LUKS encrypted partitions

  • Linux version 3.7.10-gentoo (root@install) (gcc version 4.6.3 (Gentoo Hardened 4.6.3 p1.11, pie-0.5.2) ) #1 SMP Thu Mar 14 07:06:58 CET 2013
  • sys-fs/zfs-0.6.0_rc14-r1 USE="rootfs -custom-cflags (-kernel-builtin) -static-libs -test-suite"
  • setup along https://github.com/ryao/zfs-overlay/blob/master/zfs-install without using an overlay
  • no zfs udev rules in /etc/udev/rules.d; re-emerging zfs doesn't create any there
  • dmesg like this when data zpool import starts (4 disk raidz on LUKS encrypted partitions)
    Mar 14 19:11:14 install kernel: zd0: p1 p2 <== some seconds after zpool import started
    .
    Mar 14 19:13:31 install kernel: INFO: task udevd:6271 blocked for more than 120 seconds.
    Mar 14 19:13:31 install kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    Mar 14 19:13:31 install kernel: udevd D 0000000000000002 0 6271 4623 0x00000004
    Mar 14 19:13:31 install kernel: ffff8807d40a5a28 0000000000000086 ffff8807f96d8630 ffff8807d40a5fd8
    Mar 14 19:13:31 install kernel: ffff8807d40a5fd8 0000000000011440 ffff8807fb89c410 ffff8807f96d8630
    Mar 14 19:13:31 install kernel: ffff8807d40a5958 ffffffff810edc00 ffff8807d40a5a28 ffffffff810ef595
    Mar 14 19:13:31 install kernel: Call Trace:
    Mar 14 19:13:31 install kernel: [] ? zone_watermark_ok+0x1a/0x1c
    Mar 14 19:13:31 install kernel: [] ? get_page_from_freelist+0x3dc/0x46b
    Mar 14 19:13:31 install kernel: [] ? kmem_alloc_debug+0xc2/0x13b [spl]
    Mar 14 19:13:31 install kernel: [] schedule+0x5f/0x61
    Mar 14 19:13:31 install kernel: [] schedule_preempt_disabled+0x9/0xb
    Mar 14 19:13:31 install kernel: [] __mutex_lock_slowpath+0x14c/0x266
    Mar 14 19:13:31 install kernel: [] mutex_lock+0xf/0x20
    Mar 14 19:13:31 install kernel: [] zrl_is_locked+0xa0f/0x144a [zfs]
    Mar 14 19:13:31 install kernel: [] __blkdev_get+0xba/0x3ab
    Mar 14 19:13:31 install kernel: [] ? __mutex_init+0x29/0x2b
    Mar 14 19:13:31 install kernel: [] blkdev_get+0x1c7/0x2b6
    Mar 14 19:13:31 install kernel: [] ? bdget+0x118/0x123
    Mar 14 19:13:31 install kernel: [] ? blkdev_get+0x2b6/0x2b6
    Mar 14 19:13:31 install kernel: [] blkdev_open+0x62/0x67
    Mar 14 19:13:31 install kernel: [] do_dentry_open.isra.16+0x159/0x21d
    Mar 14 19:13:31 install kernel: [] finish_open+0x1d/0x28
    Mar 14 19:13:31 install kernel: [] do_last+0x942/0xb9e
    Mar 14 19:13:31 install kernel: [] ? link_path_walk+0x79/0x765
    Mar 14 19:13:31 install kernel: [] path_openat+0xb9/0x414
    Mar 14 19:13:31 install kernel: [] do_filp_open+0x33/0x81
    Mar 14 19:13:31 install kernel: [] ? __alloc_fd+0x61/0xf3
    Mar 14 19:13:31 install kernel: [] do_sys_open+0x10b/0x19d
    Mar 14 19:13:31 install kernel: [] sys_open+0x1c/0x1e
    Mar 14 19:13:31 install kernel: [] system_call_fastpath+0x16/0x1b
    3 instances of this repeating every 2 minutes while zpool import is running
    .
    Mar 14 19:20:29 install init: Switching to runlevel: 4 <== zpool import returned at this point after 7 minutes
  • 236 datasets (with 25457 snapshots) in the pool, freshly send/recv'ed over from several to-be-decommissioned servers (which are to be consolidated onto this new machine); one 80GB zvol created at this point to test an iSCSI target
  • after a reboot and reimporting the pool, /dev/zvol is sometimes missing; adding udevadm trigger to the script which unlocks the partitions for the pool and imports it makes it appear

Further testing with more zvols and snapshots on them will follow in the next few days.

@byteharmony

I've had this issue on many ZoL systems with over 100 snapshots on the system (all CentOS 6.3). The issue for us was simply that the creation of the devices (we use /dev/<pool>/<volume> paths) caused MASSIVE load. On systems with 2000 snapshots (6000 devices might be created; remember, devices are created for snapshots AND their partitions), boot load averages of 2500 were common, and many missing devices were the result. A recent feature added to master creates a new ZFS attribute, snapdev, which is set by default to hidden. This skips snapshot processing. This fixed the issue for me.
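
For reference, the snapdev property mentioned above is set with ordinary zfs commands; a minimal sketch, with the pool name 'data' used as an example:

zfs set snapdev=hidden data    # do not create device nodes for snapshots (the new default)
zfs get snapdev data           # verify the current value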

I've got this master code on about 10 machines now and it's worked flawlessly! Also in the latest master release are feature flags; you need to upgrade your pool for these. LOTS of great improvements come from this (see this video Brian suggested for details):

http://www.youtube.com/playlist?list=PLFC9970A828416AE5

Another thing I did was create an init script that sleeps until the load drops below 1. I put this init script in right after ZFS loads, and the system sits waiting until all the volumes are finished processing. This fixes issues with iSCSI services starting before zvol devices are available. It also helps with VMs that may boot directly off zvol devices.
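
A rough sketch of that kind of wait-for-load init snippet (the threshold and polling interval here are assumptions, not the author's actual script):

#!/bin/sh
# Block until the 1-minute load average drops below 1, so that services such as
# iSCSI targets only start once the zvol device nodes have finished being created.
while [ "$(awk '{print int($1)}' /proc/loadavg)" -ge 1 ]; do
    sleep 10
done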

The manual udev trigger is what we did before the release with snapdev hidden. Of course, that is a nightmare for iSCSI or VM systems, which must then be manually addressed. Thanks a ton to the team for the snapdev feature!

BK

@ryao
Contributor

ryao commented May 28, 2013

Would someone suffering from missing zvols test the patch in issue #1477? It should fix issue #441 and by extension, I would expect it to fix this issue. However, some people think that this is a different issue, in which case, it should have no effect.

@Dolpa70

Dolpa70 commented Jun 14, 2013

Hi ryao,

I tried your patch on my kernel module and it works well for me. My system is Arch Linux, kernel 3.9.5 and zfs 0.6.1. I now have /sys/block/zd*, /dev/zvol/*, and /dev/zd* after reboot. Thanks for your work. I hope it will be fixed officially in the next stable release.

behlendorf added a commit to behlendorf/zfs that referenced this issue Jul 2, 2013
One of the side effects of calling zvol_create_minors() in
zvol_init() is that all pools listed in the cache file will
be opened.  Depending on the state and contents of your pool
this operation can take a considerable length of time.

Doing this at load time is undesirable because the kernel
is holding a global module lock.  This prevents other modules
from loading and can serialize an otherwise parallel boot
process.  Doing this after module initialization also
reduces the chances of accidentally introducing a race
during module init.

To ensure that /dev/zvol/<pool>/<dataset> devices are
still automatically created after the module load completes
a udev rule has been added.  When udev notices that the
/dev/zfs device has been created the 'zpool list' command
will be run.  This then will cause all the pools listed
in the zpool.cache file to be opened.

Because this process is now driven asynchronously by udev
there is the risk of problems in downstream distributions.

Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#441
Issue openzfs#599
Issue openzfs#756
Issue openzfs#1020
ryao pushed a commit to ryao/zfs that referenced this issue Jul 3, 2013
behlendorf pushed a commit that referenced this issue Jul 3, 2013
There is an extremely odd bug that causes zvols to fail to appear on
some systems, but not others. Recently, I was able to consistently
reproduce this issue over a period of 1 month. The issue disappeared
after I applied this change from FreeBSD.

This is from FreeBSD's pool version 28 import, which occurred in
revision 219089.

Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #441
Issue #599
@behlendorf
Contributor

The zvol initialization code has been slightly reworked in the latest master source as of commit 91604b2. The zvols are now created when a pool is imported and not at module load time. This is expected to resolve the issue, but I'd appreciate it if someone suffering from this problem could verify the fix.
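
For anyone verifying, a quick hedged check on an affected machine after updating to that master build (the pool name 'data' is an example):

zpool export data && zpool import data
ls -lR /dev/zvol/data    # the zvol nodes should now be created at import time
# then reboot and repeat the ls; the nodes should also be present after boot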

@behlendorf
Contributor

@ryao verified the fix, so I'm closing this issue.

@Phoenixxl

I finally found a time slot to take the machine which was suffering from this issue off the network and update it.

I can confirm the zvol issue is fixed for me.

Friendly regards.

PS:
I did however have to switch from zdev.conf to vdev_id.conf and export-import my pools again.

unya pushed a commit to unya/zfs that referenced this issue Dec 13, 2013
behlendorf pushed a commit to behlendorf/zfs that referenced this issue May 21, 2018
Resolve a false positive in the kmemleak checker by shifting to the
kernel slab.  It shows up because vn_file_cache is using KMC_KMEM
which is directly allocated using __get_free_pages, which is not
automatically tracked by kmemleak.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes openzfs#599
pcd1193182 pushed a commit to pcd1193182/zfs that referenced this issue Sep 26, 2023