
Change default cachefile property to 'none'. #3526

Closed
wants to merge 1 commit

Conversation

FransUrbo
Contributor

This is in preparation for a later complete removal of the cache file.

A cache file is almost never needed, at least not on Linux, so the long-term plan is to remove it altogether. This is the first step.

I noticed that this affects the import as well as create (which was not my intention - I only wanted it for the create command). That is, if I create a pool with the old code, using defaults (-), export it and then import it with the new code, the cachefile property will be none.

That might not be what we want, although I'm for it...

@behlendorf
Contributor

It might be a really good idea to change this default as you're suggesting before the tag. This is the kind of change which would be nice to make sooner rather than later. This way the default behavior for zpool import will be to scan the contents of /dev/ when importing a pool. I'd expect this to resolve the various initramfs issues regarding having an up-to-date cache file. The only downside should be slightly longer import times for pools with hundreds of devices (maybe).

It would be great if we could get a few users running 0.6.4.1 to set the module option spa_config_path to "none" and verify everything works as expected.
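
For testing purposes, something along these lines should do it (illustrative only; /etc/modprobe.d/zfs.conf is just the conventional place for module options):

# set the option for the next module load, e.g. in /etc/modprobe.d/zfs.conf
options zfs spa_config_path=none

# or pass it once when loading the module by hand
modprobe zfs spa_config_path=none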

@FransUrbo
Contributor Author

It might be a really good idea to change this default as you're suggesting before the tag.

This is the kind of change which would be nice to make sooner rather than later.

I was kind'a thinking that it might be better to accept it as one of the very first commits after 0.6.5, to get it as much testing as possible and not 'risk' "normal" users' systems.

I can't really see any big problems, but you never know. For those who have huge pools with a lot of devices (which I gather are far fewer than those with 'a bunch' of vdevs), it is reasonably easy to "fix" any long import they might see the first time with this: set cachefile=/etc/zfs/zpool.cache, instead of NULL/'-' (the current default, 'use default path') or 'none' (this new default).
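
For illustration (pool name hypothetical), that remedy is just the standard property syntax:

# opt back in to the traditional cache file for a pool named 'tank'
zpool set cachefile=/etc/zfs/zpool.cache tank

# or explicitly opt out again
zpool set cachefile=none tank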

Because of this, I have no problem whatsoever with adding it before the tag, but if someone sees (or can think of) ANY problem with this, maybe we should wait. The next tag isn't that far off...

I'd expect this to resolve the various initramfs issues regarding having an up-to-date cache file.

Indeed!! We've talked about this for years, but no one has had the time or interest to deal with it (it being an extremely low-prio thing). I was kind'a bored today, and I figured it would only be a few lines of fixes. It turned out to be a couple more than I expected, but it's still small enough :)

Once we've run with this for a couple of months or so, we can discuss what the next step would be, but maybe this is enough? Maybe we should leave the cachefile property [defaulting to 'none'] as it is in this PR and not go any further, for those very few instances where IT IS needed (or at least better - as with hundreds/thousands of vdevs)?

@behlendorf behlendorf added this to the 0.7.0 milestone Jul 17, 2015
This is in preparation for a later complete removal of the cache file.
A cache file is almost never needed, at least not on Linux, so the
long-term plan is to remove it altogether. This is the first step.
@sempervictus
Contributor

There seems to be a bug here in that cachefiles are created during pool creation, in the current working directory, named 'none'. The pool doesn't have the file set in its cachefile property, but it is created nonetheless.

@FransUrbo
Contributor Author

Bummer! Didn't notice that :(.

I'm stumped, so if someone has any hints, I'd appreciate it.

@jameslikeslinux
Contributor

The solution to the stale cache file in the initramfs is to not include the cache file in the initramfs. The root pool can be found by scanning devices. Once you leave the initramfs, however, the cache file is important to maintain for all the reasons I outlined here. It keeps the record of which pools the user or system administrator wants to have reimported. The current behavior of the zfs-import service is incorrect with respect to the history and philosophy of ZFS.

@FransUrbo
Contributor Author

The solution to the stale cache file in the initramfs is to not include the cache file in the initramfs.

Then riddle me this: How do you import a pool that can't be imported without a cache file, if there is no cache file?!

You referenced #3526, but you've simply misunderstood what we talked about in that pull request.

We've talked about removing the cache file for several years, but in the PR (which isn't quite finished and which was intended to be the first stop-gap toward doing that) we came to the conclusion that in certain situations one HAS TO have a cache file!

HOWEVER, the PR is still valid (once it's completed), because in 98% of cases (a guesstimate) it's perfectly fine to do without a cache file, and in those cases it's more trouble than it's worth to have one. Hence the change in default value, not the complete removal of the option!

We're probably just not going to go beyond that. Changing the default is enough.

The current behavior of the zfs-import service is incorrect with respect to the history and philosophy of ZFS.

No it's not. It takes the 98% majority (still a guesstimate) into account, not the 2% special exceptions!

@jameslikeslinux
Contributor

Then riddle me this: How do you import a pool that can't be imported without a cache file, if there is no cache file?!

Using the -d option to zpool import, like zpool import -d /dev/weird/location weirdpool (which then updates the default cachefile so you don't have to worry about it again), or by specifying a cache location at pool creation time like:

zpool create -o cachefile=/etc/zfs/weirdpool.cache weirdpool
zpool export weirdpool
zpool import -c /etc/zfs/weirdpool.cache weirdpool

No it's not. It takes the 98% majority (still a guesstimate) into account, not the 2% special exceptions!

That's interesting. I've got 2000 servers at work running ZFS on Linux. I am a Solaris certified administrator and have been using ZFS since 2007. Am I a "special exception" for expecting that ZFS works as it always has? You had better have a very good reason to break what was working, and I think I have pointed out two very good examples of where your changes break existing functionality: SAN and USB. @ryao pointed out another: multipathing, where individual pools are visible under multiple redundant device nodes, and the zpool.cache informs ZFS which to use, as established by the system administrator when the pool was created or last imported with the '-d' option.

@FransUrbo
Contributor Author

Using the -d option to zpool import, like zpool import -d /dev/weird/location weirdpool

Which is the point we're making in the PR (and here). It won't work for at least two cases:

1. HUGE pools with a lot of vdevs
2. Multi-path systems.

You had better have a very good reason to break what was working

It's NOT working! For the 98% of cases which don't need a cache file. Keeping it in sync in the initrd has proven to be a real pain in the behind!

It works for Solaris because their boot loader is able to find the pool and deal with everything. We have Grub, which can't. We have to use an initrd to do the logic for us, and that needs to be maintained and updated. Which is a huge pain in the behind, but those are the cards we're dealt.

SAN and USB

Works (without a cache file) in both of these cases.

But I fail to see what you're saying… I've already told you that we don't intend to remove the cache file, just change the default.

@jameslikeslinux
Contributor

It won't work for at least two cases:

  1. HUGE pools with a lot of vdevs
  2. Multi-path systems.

I don't understand. These two issues are exactly what the cache file is designed to address. The need to use the '-d' option to 'zpool import' is an exceptional case, and you only need to do it once. After it's been imported, the device information is stored in the cache so you don't have to do it again until after you've exported and reimported the pool. And as I explained in #3777, the system should never export the pool, only the user/admin when they want to import it on another system.

By changing the default to none, you are effectively removing the cache file for the majority of users, when instead it is the zfs-import service that needs to change to import pools based on the cache file.

For that 98% of cases which don't need a cache file.

100% of users need the cache file for the system to determine which pools to import at boot time. That is the point of the cache file. Literally, it's in the man page:

All pools in this cache are automatically imported when the system boots.

The fact that you broke the zfs-import service so that it doesn't use the cache file does not negate the need for a cache file. It just means that you broke the zfs-import service.

Keeping it in sync in the initrd has proven to be a real pain in the behind!

You do not need to keep it in sync. The initrd does not need to have a cache file. The initrd only needs to be able to import the root pool, which it can do without a cache file. But once you leave the initrd, questions about other pools--which ones should be imported, which devices to use for multipath vdevs--become a problem, and that is where the cache is needed.

@jameslikeslinux
Contributor

It works for Solaris because their boot loader is able to find the pool and deal with everything. We have Grub, which can't.

It may interest you to know that Solaris x86 uses GRUB and has an equivalent to initrds called "boot archives." The boot archives do not need the zpool.cache file to be able to import the rpool. Neither should we need it for our initrds. But it is still absolutely required as a default for reimporting non-root pools after leaving the initrd, just as Solaris does after it leaves the boot archive.

@jameslikeslinux
Contributor

SAN and USB

Works (without a cache file) in both of these cases.

Of course I'd expect these to import without a cache file, just as I am saying that the root pool can be imported in the initrd without a cache file, but the point is that the zfs-import service, which should simply run zpool import -c /etc/zfs/zpool.cache -a, will use the cache file to determine which of these SAN and USB devices to try to import. I use SAN and USB as examples because they are highly dynamic, and sometimes even out of our control, and your solution of using a variable in /etc/default/zfs is not an appropriate answer to the question of how to control which pools get reimported at boot.
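
To spell that out, a minimal sketch of the import step being described (the -N flag to defer mounting is an assumption about how a given distro might split import and mount):

# import every pool recorded in the cache, without scanning devices
zpool import -c /etc/zfs/zpool.cache -aN

# then mount their datasets
zfs mount -a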

@ilovezfs
Contributor

When zdb is invoked with no arguments, it will dump the contents of the cache file in a human readable format. Is it the perspective of those who favor this pull request that this particular zdb functionality is A) useful but can and will be implemented in a roughly equivalent way without the cache file, B) useful but not sufficiently useful to warrant keeping the cache file or implementing in some other way, C) not useful, or D) something else?
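
For reference, that usage looks roughly like this (the pool name is just an example):

# dump the cached configuration of all pools in /etc/zfs/zpool.cache
zdb

# or show the cached configuration of a single pool
zdb -C tank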

@jameslikeslinux
Contributor

I think you are confused, @FransUrbo, because you export all of your pools before shutdown, which, of course, negates the purpose of the cache file. That's why you say, "a cache file is almost never needed, at least not on Linux." Stop exporting your pools at shutdown, and you will see that a cache file is every bit as useful as it is on Solaris.

@FransUrbo
Contributor Author

I don't understand. These two issues are exactly what the cache file is designed to address.

Yes. And that's why we're keeping it.

The need to use the '-d' option to 'zpool import' is an exceptional case

On the contrary. It works perfectly fine in 98% of the cases so there's no need to complicate matters with a cache file.

100% of users need the cache file

No they don't. I haven't used a cache file in years, and neither have many others.

Another huge downside of the cache file (and the primary reason why we want to remove it) is that on Linux, devices sometimes change names (especially if/when using "-d /dev" - which most users do).

When that happens, the pool becomes un-importable and it's a pain to get it back.

The fact that you broke the zfs-import service so that it doesn't use the cache file does not negate the need for a cache file. It just means that you broke the zfs-import service.

Yes it does, and no I didn't.

I'm starting to think you have an even bigger comprehension problem than I do, but there is, in most cases, no need for a cache file. I think I've said that enough times now, so I'm dropping out of the discussion.

The initrd only needs to be able to import the root pool

Which it can't do if the root pool is on multi-path. Or using huge numbers of vdevs (all perfectly legal on Linux).

jameslikeslinux added a commit to jameslikeslinux/zfs that referenced this pull request Sep 18, 2015
This change modifies the import service to use the default cache file
to reimport pools at boot.  This fixes code that exhaustively searched
the entire system and imported all visible pools.  Using the cache
file is in keeping with the way ZFS has always worked, and is how it is
written in the man page (zpool(1M,8)):

    All pools  in  this  cache  are  automatically imported when the
    system boots.

Importantly, the cache contains important information for importing
multipath devices, and helps control which pools get imported in more
dynamic environments like SANs, which may have thousands of visible
and constantly changing pools, which the ZFS_POOL_EXCEPTIONS variable
is not equipped to handle.

The change also stops the service from exporting pools at shutdown.
Exporting pools is only meant to be performed by the administrator
of the system.

Closes openzfs#3777
Closes openzfs#3526
@jameslikeslinux
Contributor

100% of users need the cache file

No they don't. I haven't used a cache file in years, and neither have many others.

Hold the presses! Internet user @FransUrbo hasn't used a cache file in years. He must know more than those stupid Solaris engineers who designed the thing. Better let him change the default options for an enterprise grade filesystem.

In all seriousness, you had better have a damn good case to change defaults, and I have pointed out many, many examples of where your change breaks things. Yet all you can claim is that "you haven't had to use a cache file" so that should be the default? The fact is you changed the behavior of the reimport script so that you didn't need the cache file...it is no wonder then that you didn't need one.

I have just submitted a pull request, #3800, that restores the proper operation of the import service, resolves #3777, and supersedes this pull request.

Another huge downside of the cache file (and the primary reason why we want to remove it) is that on Linux, devices sometimes change names (especially if/when using "-d /dev" - which most users do).

No users should be using devices directly in /dev. There are persistent paths for that. If users are using /dev straight up, that's a documentation issue, not something that should require rewriting the way pools get imported and getting rid of the cache file.
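
For example (device names made up), using persistent paths instead:

# import by stable /dev/disk/by-id links rather than bare sdX names
zpool import -d /dev/disk/by-id tank

# or create the pool against persistent paths from the start
zpool create tank mirror \
    /dev/disk/by-id/ata-EXAMPLE_DISK_A /dev/disk/by-id/ata-EXAMPLE_DISK_B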

Which it can't do if the root pool is on multi-path. Or using huge numbers of vdevs (all perfectly legal on Linux).

Any user smart enough to do rpools on multipath devs should be smart enough to update their initrd with the updated zpool cache when necessary.
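
Which, concretely, is just the usual initramfs refresh after the cache changes (the exact command depends on the distribution; these are common examples, not anything ZFS-specific):

# Debian/Ubuntu family
update-initramfs -u

# dracut-based distributions
dracut --force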

@sempervictus
Contributor

@MrStaticVoid: cachefiles shouldn't be removed in their entirety, but they shouldn't be mandated either, since ZFS is used on enterprise storage, servers, workstations, and laptops these days. The last one means you've got boot scenarios with different vdev layouts and configurations - a docked laptop may have more disks, or, as happens on my system, the optical bay is used for another SSD and that's occasionally removed. Another good example is running atop encrypted volumes which may not have been unlocked by the appropriately privileged user, or where the key becomes inaccessible (an NFS/SSHFS mount or something). Situations like that can induce unpleasant boot symptoms and aren't ideal. Finally, there's also the basic concept of running multiple workstations on the same streamed rootFS for consistency of experience, config, etc. That can also get nasty with cachefiles...

Thank you for submitting a fix to the original implementation. Do you think you might have a middle-ground approach which allows the legacy behavior but permits skipping/disabling cachefiles via a modprobe option, or as a feature flag which can be enabled/disabled per pool?

@jameslikeslinux
Contributor

Thanks for the constructive examples, @sempervictus. I would suggest that the behavior ZFS has always had, the zpool.cache, handles all of those cases well, but maybe many people don't realize it because the cache has been misused and misunderstood (for example, the fact that the zfs boot services export pools at shutdown).

To illustrate this, I will attempt to show how a typical kind of dynamic setup as you suggest behaves using the cachefile and my boot scripts from #3800:

Imagine I have a laptop that I use at work and at home. At work, I have a dock with a big data disk attached. I create a pool on that disk (in my case, I'm actually using a USB stick, but the behavior is the same):

> zpool create -f usbpool usb-Lexar_microSD_RDR_000000000001-0:0

That pool gets created and added to my zpool.cache file automatically:

> grep usbpool /etc/zfs/zpool.cache   
Binary file /etc/zfs/zpool.cache matches

I shut down my laptop (which, importantly, does not export the pool) and take it home, where the disk is not present, and I start the laptop. I do get an error message:

cannot import 'usbpool': no such pool or dataset

but critically, everything else in the zpool.cache file that is present gets reimported and everything is happy.

The next day I shut down my laptop and go back to work and dock it. The path to the vdevs hasn't changed because I'm a good ZFS user who read the documentation and uses persistent paths, whether that's by device name or by UUID. I start my laptop and, because the usbpool still exists in my cache, it gets reimported. All is well.

Now let's imagine I shut down and give the disk away. I no longer have it and will never use it again, but I forgot to export the pool. Unfortunately, the pool will still exist in the cache, so there will always be a message at boot saying:

cannot import 'usbpool': no such pool or dataset

This is not a problem, but it's kind of annoying. The solution is to get rid of the cache file and regenerate it from the presently imported pools:

> rm /etc/zfs/zpool.cache
> for pool in `zpool list -H -o name`; do zpool set cachefile='' $pool; done

Then all is well again.

So, in this example, the only problems I would identify are two user interface issues:

  1. At boot, maybe we shouldn't make the warning about being unable to import unavailable pools so scary.
  2. It should be easier to remove unwanted pools from the cache, or to regenerate it entirely, as suggested by @ryao in Provide command to generate zpool.cache on demand #711.

Let's fix the user interface, but let's not throw the baby (cachefile) out with the bath water (interface).

@jameslikeslinux
Contributor

Here's an example of why getting rid of the cache file by default is a bad idea (in addition to the 4 or 5 other examples I've given in this pull request and #3777):

I'm a bad guy. I make a USB pool with a virus on it. I sneak into a datacenter and discreetly plug it into a web server. The next time that web server reboots, because there is no cache file, the boot scripts try to import everything, including the USB pool, which then mounts itself up at the document root and spreads wildly. There are a million variations of this.

In response you might say (and @FransUrbo has already said) that the boot scripts could be configured not to import everything at boot, but then you're left configuring which ones you do want to import at boot. But why do that when we already have a means of capturing that information, the zpool.cache, without having to edit any files? ZFS was meant to make life easy--it's why we don't have to edit the fstab for every new filesystem we create and why, in theory, we don't have to edit the exports file for every filesystem we share. Why add new configuration files to the equation? The zpool.cache works.

@Bronek

Bronek commented Sep 19, 2015

@MrStaticVoid this is a good explanation and I like the idea of using the cachefile for the selection of pools to import at boot. However, the problem you have not addressed yet (and the one which prompted the search for alternative means to import pools at startup) is that the copy of the cachefile has to be maintained inside the initramfs, which means that every time we import a new pool, or change the path to an existing pool (which we wish to be auto-imported on boot), a new initramfs will have to be generated. In other words, it is very easy for a copy of the cachefile inside the initramfs to become obsolete, which may lead to difficulties starting up the system (if the affected filesystem is root or some other essential directory).

@sempervictus
Contributor

Maybe the solution is something like /etc/zfs/zpool.cache.d/ to store individual cachefiles per pool and permanently disable them via a pool property as needed on others. Most of our production deployments are over dm-crypt anyway, which is handled via separate keyfiles. We do perform remote unlocks of the OS via dropbear in the initramfs, so we could theoretically expose the volumes at boot, but the current workflow is a bit different. Despite these being storage systems, they don't necessarily have access to the block devs on which the vdevs are built, and waiting for the import of each pool to time out is annoying. Before this we would wipe the cache file manually on every initrd rebuild, which can also get annoying. Although this stack produces an unwanted cache file, the pool doesn't use it, and it does solve a bunch of these issues. Hence I suggest a middle ground which keeps expected behavior across OpenZFS and allows a permanent de-cachefiling of pools as desired.

@jameslikeslinux
Contributor

@MrStaticVoid this is a good explanation and I like the idea of using the cachefile for the selection of pools to import at boot. However, the problem you have not addressed yet (and the one which prompted the search for alternative means to import pools at startup) is that the copy of the cachefile has to be maintained inside the initramfs, which means that every time we import a new pool, or change the path to an existing pool (which we wish to be auto-imported on boot), a new initramfs will have to be generated.

@Bronek, the cachefile, as I've said a couple of times in this pull request, does not need to be in the initramfs at all. The only job of the initramfs is to import the rpool and mount /, which it can do without a cache. You seem to suggest that other pools get imported by the initrd and thus it needs to maintain an up-to-date copy in the initramfs, but that is not the case. The other pools, data pools if you want to call them that, are imported by a plain old init script, zfs-import (or, I guess, a systemd service), but at that point, and I want you to think about this very carefully, the root is already mounted and the system has access to /etc/zfs/zpool.cache.

@FransUrbo
Contributor Author

Maybe the solution is something like /etc/zfs/zpool.cache.d/ to store individual cachefiles per pool and permanently disable them via a pool property as needed on others.

Why? Just set the 'cachefile' property accordingly when creating the pool.

Don't overcomplicate things for the majority, to solve the problem for a minority!
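
For illustration, setting it at creation time looks like this (pool, path and device names are purely hypothetical; the zpool.cache.d directory would just be whatever location the admin picks):

# give the pool its own cache file
zpool create -o cachefile=/etc/zfs/zpool.cache.d/tank.cache tank /dev/disk/by-id/ata-EXAMPLE_DISK

# or disable the cache file for it entirely
zpool create -o cachefile=none tank /dev/disk/by-id/ata-EXAMPLE_DISK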

@MrStaticVoid You seem to be purposefully ignoring everything I said. Is there a reason why, or are you just a troll?

@jameslikeslinux
Contributor

You keep using that word, "majority," but so far all I've heard is your own incorrect version of the facts. I've been able to point to nine different examples showing how you're wrong and how your code and proposed changes will hurt not just the majority, but everybody.

I am not trolling, I am using facts and evidence to show how you're wrong. Sorry you don't like that.

@FransUrbo
Contributor Author

I am using facts and evidence to show how you're wrong. Sorry you don't like that.

Actually, you're not. You're using old, stale information from a system that only vaguely resembles the one ZoL is running on to justify your deluded opinions instead of learning the actual facts.

Linux isn't Solaris. It doesn't work the same. It vaguely looks the same, but looks are deceiving once you start scratching at the surface.

@sempervictus
Contributor

ZoLoft may be something to consider...

In all seriousness, hostility of this nature does not belong here; use IRC or any other valid method of stress release. Contributor or not, this is a community largely consisting of IT professionals, and you probably wouldn't communicate that way with peers in a room. It will deter potential adopters (whom we need) who review GH when they read it, and portray us as a squabbling mass, incorrectly.

Taking the approach of altering ZoL away from the OpenZFS implementation should be (IMO) reserved for extreme cases of systemic incompatibility - memory management is a good example of such required changes. The comment about not importing the rpool via the initrd makes sense; the issue there is maintenance of all the scriptage for every distro using the current approach. As far as setting the property to none goes, it doesn't seem to work consistently, as pools tend to re-acquire cache files after some number of imports or ZFS version changes. I'll build up a stack using the new PR to get acquainted with the behavior before trying to pass any judgement, but at present I think I still side with the idea of not having it enabled by default, as most users aren't running full-scale storage but using this for home NAS and workstations (consider how many geeks have the resources to build SAN/NAS and how many just tend to learn it by using it daily). Those of us who do need to build things with complex SAS or iSCSI topologies could enable the cache file and live happily with a working storage implementation...

@FransUrbo
Contributor Author

Taking the approach of altering ZoL away from OpenZFS implementation should be (IMO) reserved for extreme cases of systemic incompatibility - memory management is a good example of such required changes.

Contrary to belief, I actually agree with that! I'm usually the first to "storm the beaches" when there's a compatibility problem with (Open)ZFS. I have reported several issues to that effect, all still open.. :(

I spent almost two years on the list and quite a while on the IRC channel helping users, as well as seeing almost every issue here (some people seem to believe this is a support forum), so I see "the big picture". @MrStaticVoid does not. He only sees his own, limited system and thinks that THAT is the real world and THAT is how everyone should do it (extremely rude if you ask me).

It works for him, but not for the larger community. And letting the minority dictate the actions of the majority is "rude" at best! Insane at its worst...

There is a very simple way to configure the system for him while still allowing everyone else to enjoy the marvels of ZFS On Linux!

The comment about not importing rpool via initrd makes sense

Actually, it doesn't. We have no choice! Our boot loader, Grub, can't mount and import pools (yet/properly).

And if/when a root pool can't be imported without a cache file (such as when it consists of huge numbers of vdevs and/or multi-path devices), then things get really complicated, real fast!

This I've said several times, but @MrStaticVoid chooses to ignore all that. Whether due to ignorance, incompetence or simple stupidity is anyone's guess.

@sempervictus
Contributor

The point about complex rpools makes sense. We generally build out dedicated storage with the OS on flash, and only really use rpools on workstations (ZFS underpins the virtual storage for workloads or database-style implementations on metal as appropriate). Does removing the cachefile altogether address the complex multipath topologies? It seems we would need to check every block dev in the host for labels. Also, when you're talking about GRUB support, do you mean cases where /boot is ZFS?

@FransUrbo
Contributor Author

Does removing the cachefile altogether address the complex mp topologies?

I'm not entirely sure I understand what you're asking here… It has been brought to my attention that multi-path systems need a cache file. I don't quite understand or know why, but if they say so, I have nothing to add to that… :)

Also, when you're talking about grub support, do you mean cases where /boot is ZFS?

I'm unsure what the exact state of ZFS support in Grub is at the moment. About a year ago, I worked for several months to get a bootable ISO with native support for ZFS/ZoL, and a large part of the problem was with Grub. Regarding /boot, for sure! If it can't load the kernel and the initrd, then of course you can't boot off a ZFS system. But it also (as far as I know) can't mount any filesystem (it can "mount" them internally to get to the kernel/initrd, but not mount them so that they're available for init - or systemd).

That has ALWAYS been the work of the initrd in Linux.

jameslikeslinux added a commit to jameslikeslinux/zfs that referenced this pull request Sep 20, 2015
This change modifies the import service to use the default cache file
to perform a verbatim import of pools at boot.  This fixes code that
searches all devices and imported all visible pools.

Using the cache file is in keeping with the way ZFS has always worked,
how Solaris, Illumos, FreeBSD, and systemd performs imports, and is how
it is written in the man page (zpool(1M,8)):

    All pools  in  this  cache  are  automatically imported when the
    system boots.

Importantly, the cache contains important information for importing
multipath devices, and helps control which pools get imported in more
dynamic environments like SANs, which may have thousands of visible
and constantly changing pools, which the ZFS_POOL_EXCEPTIONS variable
is not equipped to handle.  Verbatim imports prevent rogue pools from
being automatically imported and mounted where they shouldn't be.

The change also stops the service from exporting pools at shutdown.
Exporting pools is only meant to be performed explicitly by the
administrator of the system.

The old behavior of searching and importing all visible pools is
preserved and can be switched on by heeding the warning and toggling
the ZPOOL_IMPORT_ALL_VISIBLE variable in /etc/default/zfs.

Closes openzfs#3777
Closes openzfs#3526
@timemaster67

Hello there,
just giving a summary of GRUB's current status, from the point of view of a simple user.
The current development version of GRUB (from git) can import and mount a ZFS pool. It can then access a Linux kernel image and initramfs on a ZFS pool and start the init from there, which will then officially import/reimport the pool using the zfsonlinux code and boot up the system. The init might or might not use the cache file, but this is only an implementation detail of the init. I have no opinion on this matter as things just work for me right now.
The current development version supports all feature flags of ZFS on Linux 0.6.5.2. Reference: https://wiki.archlinux.org/index.php/ZFS#GRUB-compatible_pool_creation
More information on how to boot a ZFS pool with GRUB here: https://wiki.archlinux.org/index.php/Installing_Arch_Linux_on_ZFS#Install_and_configure_the_bootloader.
Hope this helps.

@ryao
Contributor

ryao commented Oct 9, 2015

This is the first time I have ever NACKed a patch, but there is a first time for everything. This patch should not be merged. The method for getting rid of the cachefile that @FransUrbo uses on Debian opens VM hosts to exploitation from guests on reboot. I have sketched out the zero-day attack based on it and emailed the details to @behlendorf. Trying to change the default cachefile setting to none will just make it harder to fix the problem, whose solution is to use the cachefile.

This patch will break import on systems using the systemd unit files and basically every other system that is consistent with Illumos, which includes plenty of custom boot scripts. That also includes the entire Gentoo family of distributions, where the auto-import behavior was disabled by default to be consistent with Illumos, the systemd unit files and how Gentoo originally did things.

If this is merged, I will revert it in Gentoo and encourage the other distribution maintainers to revert it. This patch builds on a really bad idea and is a step in the wrong direction.

@FransUrbo
Contributor Author

The large majority isn't running virtual hosts, ZFS from SAS devices, etc., so for them not having a cache file IS (and has been for the last couple of years) the correct and easiest behavior. Something you seem to refuse to see, ryao.

And I'm (again) not removing the cache file, I'm changing the default!

I would appreciate it if you could send your attack to me as well; I'd really like to see what you have on that.

@dswartz
Contributor

dswartz commented Oct 9, 2015

I will observe that ZFS should be trying to get into the SAS enterprise space...


@FransUrbo
Contributor Author

On Oct 9, 2015, at 1:17 PM, dswartz wrote:

I will observe that ZFS should be trying to get into the SAS enterprise space…

Yes, absolutely! ZFS is the perfect FS for that and many other things. But the majority is still, and will be for the foreseeable future, smaller sites and use-cases.

I don't think it's fair to set defaults that go against the majority and only benefit a very small minority..

@Bronek

Bronek commented Oct 9, 2015

I don't see how having a cachefile goes against anyone, honestly. It would be great to see something more specific.

Judging by recent discussions, it would seem that it is rather straightforward to get the correct behaviour without the need to maintain a synchronized copy of the cachefile inside the initramfs; it only needs to be in the expected location on the root filesystem (not inside the initramfs, as previously assumed). Also, it is not difficult at all to trigger import-all in the absence of this file.
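
A rough sketch of that fallback logic, as a boot script might express it (purely illustrative, not the actual zfs-import code):

if [ -f /etc/zfs/zpool.cache ]; then
    # verbatim import of whatever the cache records
    zpool import -c /etc/zfs/zpool.cache -aN
else
    # no cache on the root filesystem: fall back to scanning for pools
    zpool import -d /dev/disk/by-id -aN
fi
zfs mount -a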

The installation instructions I went by when installing ZFS for the first time actually require that the cache file be set, and I was only surprised and confused later when I found out that it is actually not used.

@FransUrbo
Contributor Author

I don't see how having a cachefile goes against anyone, honestly.

Indeed. It doesn't! Those that want or need it can very simply enable it, but the large majority (that DON'T need it) don't have to do anything...

Judging by recent discussions it would seem that it is rather straightforward to get the correct behaviour without the need to maintain a synchronized copy of the cachefile inside the initramfs; and also that it is not difficult to trigger import-all in the absence of this file in /etc/zfs on the root filesystem.

I see several issues with that implementation, nor do I see any point in doing it that way. It will be most interesting to see the result of that once ryao starts pushing it onto his users… I just hope he has the decency not to let all those issues clutter the ZoL issue tracker!

nedbass pushed a commit that referenced this pull request Oct 14, 2015
This change modifies the import service to use the default cache file
to perform a verbatim import of pools at boot.  This fixes code that
searches all devices and imported all visible pools.

Using the cache file is in keeping with the way ZFS has always worked,
how Solaris, Illumos, FreeBSD, and systemd performs imports, and is how
it is written in the man page (zpool(1M,8)):

    All pools  in  this  cache  are  automatically imported when the
    system boots.

Importantly, the cache contains important information for importing
multipath devices, and helps control which pools get imported in more
dynamic environments like SANs, which may have thousands of visible
and constantly changing pools, which the ZFS_POOL_EXCEPTIONS variable
is not equipped to handle.  Verbatim imports prevent rogue pools from
being automatically imported and mounted where they shouldn't be.

The change also stops the service from exporting pools at shutdown.
Exporting pools is only meant to be performed explicitly by the
administrator of the system.

The old behavior of searching and importing all visible pools is
preserved and can be switched on by heeding the warning and toggling
the ZPOOL_IMPORT_ALL_VISIBLE variable in /etc/default/zfs.

Signed-off-by: James Lee <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #3777
Closes #3526