Change default cachefile property to 'none'. #3526
Conversation
It might be a really good idea to change this default as you're suggesting before the tag. This is the kind of change which would be nice to make sooner rather than later. This way the default behavior for … It would be great if we could get a few users running 0.6.4.1 to set the module option
I can't really see any big problems, but you never know. For those that have huge pools with a lot of devices (which I gather is a lot fewer than those with 'a bunch' of vdevs), it is reasonably easy to "fix" the long import (if they see any) they (might!) get the first time with this, to set … Because of this, I have no problem whatsoever adding it before the tag, but if someone sees (or can think of) ANY problem with this, maybe we should wait. Next tag isn't that far off...
Once we've run with this for a couple of months or so, we can discuss what the next step would be, but maybe this is enough? Maybe we should leave the cachefile property [set to default 'none'] from this PR and not go any further, for those very few instances where IT IS needed (or at least better - as with hundreds/thousands of vdevs)?
This is in preparation for a later complete removal of the cache file. A cache file is almost never needed, at least not on Linux, so the long-term plan is to remove it altogether. This is the first step.
There seems to be a bug here in that cachefiles are created during pool creation, in the current working directory, named 'none'. The pool doesn't have the file set in its cachefile property, but it is created nonetheless.
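(For anyone wanting to reproduce this, a minimal check along these lines should show the stray file; the pool name and device path are placeholders:)

```sh
# Under the patched code (cachefile defaulting to 'none'), create a throwaway
# pool from some working directory (placeholder pool name and vdev).
cd /tmp
zpool create testpool /dev/disk/by-id/usb-EXAMPLE-part1

# The property itself does not point at any such file...
zpool get cachefile testpool

# ...yet a literal file named 'none' appears in the current working directory.
ls -l /tmp/none
```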
Bummer! Didn't notice that :(. I'm stumped, so if someone has any hints, I'd appreciate it.
The solution to the stale cache file in the initramfs is to not include the cache file in the initramfs. The root pool can be found by scanning devices. Once you leave the initramfs, however, the cache file is important to maintain for all the reasons I outlined here. It keeps the record of which pools the user or system administrator wants to have reimported. The current behavior of the
Then riddle me this: How do you import a pool that can't be imported without a cache file, if there is no cache file?! You referenced #3526, but you've simply misunderstood what we talked about in that pull request. We've talked about removing the cache file for several years, but in the PR (which isn't quite finished and which was intended to be the first stop-gap toward doing that) we came to the conclusion that in certain situations one HAS TO have a cache file! HOWEVER, the PR is still valid (once it's completed), because in 98% of cases (guesstimate) it's perfectly fine to do without a cache file and in those cases, it's more trouble than it's worth to have one. Hence the change in default value, not the complete removal of the option! We're probably just not going to go beyond that. Changing the default is enough.
No it's not. It takes the 98% majority (still a guesstimate) into account, not the 2% special exceptions!
Using the
That's interesting. I've got 2000 servers at work running ZFS on Linux. I am a Solaris-certified administrator and have been using ZFS since 2007. Am I a "special exception" to expect that ZFS works as it always has? You had better have a very good reason to break what was working, and I think I have pointed out two very good examples of where your changes break existing functionality: SAN and USB, and @ryao pointed out another: multipathing, where individual pools are visible under multiple redundant device nodes, and the
Which is the point we're making in the PR (and here). It won't work for at least two cases: a root pool on multipath devices, and a pool with a huge number of vdevs.
It's NOT working! For that 98% of cases which don't need a cache file. Keeping it in sync in the initrd has proven to be a real pain in the behind! It works for Solaris because their boot loader is able to find the pool and deal with everything. We have Grub, which can't. We have to use an initrd to do the logic for us, and that needs to be maintained and updated. Which is a huge pain in the behind, but those are the cards we're dealt.
Works (without a cache file) in both of these cases. But I fail to see what you're saying… I've already told you that we don't intend to remove the cache file, just change the default.
I don't understand. These two issues are exactly what the cache file is designed to address. The need to use the '-d' option to 'zpool import' is an exceptional case, and you only need to do it once. After it's been imported, the device information is stored in the cache so you don't have to do it again until after you've exported and reimported the pool. And as I explained in #3777, the system should never export the pool, only the user/admin when they want to import it on another system. By changing the default to
100% of users need the cache file for the system to determine which pools to import at boot time. That is the point of the cache file. Literally, it's in the man page: "All pools in this cache are automatically imported when the system boots."
The fact that you broke the
You do not need to keep it in sync. The initrd does not need to have a cache file. The initrd only needs to be able to import the root pool, which it can do without a cache file. But once you leave the initrd, questions about other pools--which ones should be imported, which devices to use for multipath vdevs--become a problem, and that is where the cache is needed.
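(To make the division of labour concrete, here is a sketch of the initramfs side, assuming a root pool named 'rpool', stable /dev/disk/by-id paths, and a purely illustrative root dataset name:)

```sh
# Inside the initramfs: no zpool.cache needed.  Scan a persistent device
# directory and import the root pool without mounting any datasets.
zpool import -N -d /dev/disk/by-id rpool

# Mount the root filesystem and hand control over to the real init.
mount -o zfsutil -t zfs rpool/ROOT/debian /root
```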
It may interest you to know that Solaris x86 uses GRUB and has an equivalent to initrds called "boot archives." The boot archives do not need the zpool.cache file to be able to import the rpool. Neither should we need it for our initrds. But it is still absolutely required as a default for reimporting non-root pools after leaving the initrd, just as Solaris does after it leaves the boot archive.
Of course I'd expect these to import without a cache file, just as I am saying that the root pool can be imported in the initrd without a cache file, but the point is that the
When zdb is invoked with no arguments, it will dump the contents of the cache file in a human readable format. Is it the perspective of those who favor this pull request that this particular zdb functionality is A) useful but can and will be implemented in a roughly equivalent way without the cache file, B) useful but not sufficiently useful to warrant keeping the cache file or implementing in some other way, C) not useful, or D) something else?
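(For reference, the functionality in question; the alternate path in the second command is just an example:)

```sh
# With no arguments, zdb dumps the cached pool configurations, i.e. the
# contents of /etc/zfs/zpool.cache, in human-readable form.
zdb

# The same dump can be taken from an alternate cache file.
zdb -U /tmp/zpool.cache.backup
```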
I think you are confused, @FransUrbo, because you export all of your pools before shutdown, which, of course, negates the purpose of the cache file. That's why you say, "a cache file is almost never needed, at least not on Linux." Stop exporting your pools at shutdown, and you will see that a cache file is every bit as useful as it is on Solaris.
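(Concretely - and 'tank' is just a placeholder pool name - exporting is what drops a pool from the cache:)

```sh
# Exporting removes the pool's entry from /etc/zfs/zpool.cache, so nothing
# will reimport it at the next boot.
zpool export tank

# Verify: the exported pool no longer shows up in the cached configuration.
zdb | grep -c tank
```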
Yes. And that's why we're keeping it.
On the contrary. It works perfectly fine in 98% of the cases so there's no need to complicate matters with a cache file.
No they don't. I haven't used a cache file in years. As have many others. Another huge downside of the cache file (and the primary reason why we want to remove it) is that on Linux, devices sometimes change names (especially if/when using "-d /dev" - which most users do). When that happens, the pool becomes un-importable and it's a pain to get it back.
Yes it does, and no I didn't. I'm starting to think you have an even bigger understanding disability than I do, but there is, in most cases, no need for a cache file. I think I've said that enough times now, so I'm dropping out of the discussion.
Which it can't do if the root pool is on multi-path. Or using huge numbers of vdevs (all perfectly legal on Linux).
This change modifies the import service to use the default cache file to reimport pools at boot. This fixes code that exhaustively searched the entire system and imported all visible pools.

Using the cache file is in keeping with the way ZFS has always worked, and is how it is written in the man page (zpool(1M,8)): "All pools in this cache are automatically imported when the system boots." Importantly, the cache contains the information needed for importing multipath devices, and helps control which pools get imported in more dynamic environments like SANs, which may have thousands of visible and constantly changing pools, which the ZFS_POOL_EXCEPTIONS variable is not equipped to handle.

The change also stops the service from exporting pools at shutdown. Exporting pools is only meant to be performed by the administrator of the system.

Closes openzfs#3777
Closes openzfs#3526
Hold the presses! Internet user @FransUrbo hasn't used a cache file in years. He must know more than those stupid Solaris engineers who designed the thing. Better let him change the default options for an enterprise grade filesystem. In all seriousness, you had better have a damn good case to change defaults, and I have pointed out many, many examples of where your change breaks things. Yet all you can claim is that "you haven't had to use a cache file" so that should be the default? The fact is you changed the behavior of the reimport script so that you didn't need the cache file...it is no wonder then that you didn't need one. I have just submitted a pull request, #3800, that restores the proper operation of the import service, resolves #3777, and supersedes this pull request.
No users should be using devices directly in
Any user smart enough to do rpools on multipath devs should be smart enough to update their initrd with the updated zpool cache when necessary.
@MrStaticVoid: cachefiles shouldn't be removed in their entirety, they just shouldn't be mandated either since ZFS is used on enterprise storage, servers, workstations, and laptops these days. The last one means you've got boot scenarios with different vdev layouts and configurations - a docked laptop may have more disks, or as happens on my system, the optical bay is used for another SSD and that's occasionally removed. Another good example is running atop encrypted volumes which may not have been unlocked by the appropriately privileged user, or where the key becomes inaccessible (NFS/SSHFS mount or something). Situations like that can induce unpleasant boot symptoms and aren't ideal. Finally, there's also the basic concept of running multiple workstations on the same streamed rootFS for consistency of experience, config, etc. That can also get nasty with cachefiles... Thank you for submitting a fix to the original implementation. Think you might have a middle-ground approach which allows legacy behavior but permits skipping/disabling cachefiles in a modprobe option or as a feature flag which can be enabled/disabled for pools?
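(One knob that already exists in that direction, if I'm not mistaken, is the zfs_autoimport_disable module parameter, which skips the cache-file-driven import at module load; the file below is just the conventional modprobe location:)

```sh
# /etc/modprobe.d/zfs.conf
# Don't auto-import pools from /etc/zfs/zpool.cache when the zfs module loads;
# leave importing entirely to the boot scripts or the administrator.
options zfs zfs_autoimport_disable=1
```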
Thanks for the constructive examples, @sempervictus. I would suggest that the behavior ZFS has always had, the zpool.cache, will handle all of those cases well, but maybe many people don't realize it because the cache has been misused and misunderstood (for example, that the zfs boot services export pools at shutdown). To illustrate this, I will attempt to show how a typical kind of dynamic setup as you suggest behaves using the cachefile and my boot scripts from #3800: Imagine I have a laptop that I use at work and at home. At work, I have a dock with a big data disk attached. I create a pool on that disk (in my case, I'm actually using a USB stick, but the behavior is the same):
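(The command block didn't survive the formatting here; roughly, with a placeholder device path:)

```sh
# Create the pool on the docked/USB disk, using a persistent path for the vdev.
zpool create usbpool /dev/disk/by-id/usb-EXAMPLE_Flash_Disk-0:0-part1
```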
That pool gets created and added to my zpool.cache file automatically:
I shut down my laptop (which, importantly, does not export the pool) and take it home, where the disk is not present, and I start the laptop. I do get an error message:
but critically, everything else in the … The next day I shut down my laptop and go back to work and dock it. The path to the vdevs hasn't changed because I'm a good ZFS user who read the documentation and uses persistent paths, whether that's by device name or by UUID. I start my laptop and, because the usbpool still exists in my cache, it gets reimported. All is well. Now let's imagine I shut down and give the disk away. I no longer have it and will never use it again, but I forgot to export the pool. Unfortunately, the pool will still exist in the cache, so there will always be a message at boot saying:
This is not a problem, but it's kind of annoying. The solution is to get rid of the cache file and regenerate it from the presently imported pools:
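(The exact commands were also lost in formatting; the idea is something like the following, assuming the default cache location:)

```sh
# Throw away the stale cache...
rm /etc/zfs/zpool.cache

# ...and regenerate it from the pools that are currently imported
# (repeat for each imported pool; 'rpool' is a placeholder name).
zpool set cachefile=/etc/zfs/zpool.cache rpool
```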
Then all is well again. So, in this example, the only problems I would identify are two user interface issues:
Let's fix the user interface, but let's not throw the baby (cachefile) out with the bath water (interface).
Here's an example of why getting rid of the cache file by default is a bad idea (in addition to the 4 or 5 other examples I've given in this pull request and #3777): I'm a bad guy. I make a USB pool with a virus on it. I sneak into a datacenter and discreetly plug it into a web server. The next time that web server reboots, because there is no cache file, the boot scripts try to import everything, including the USB pool, which then mounts itself up at the document root and spreads wildly. There are a million variations of this. In response you might say (and @FransUrbo has already said) that the boot scripts could be configured not to import everything at boot, but then you're left configuring which ones you do want to import at boot. But why do that when we already have a means of capturing that information, the
@MrStaticVoid this is a good explanation and I like the idea of using the cachefile for the selection of pools to import at boot. However the problem you have not addressed yet (and the one which prompted the search for alternative means to import pools at startup) is that the copy of the cachefile has to be maintained inside the initramfs, which means that every time we import a new pool, or change the path to an existing pool (which we wish to be auto-imported on boot), a new initramfs will have to be generated. In other words, it is very easy for a copy of the cachefile inside the initramfs to become obsolete, which may lead to difficulties starting up the system (if the affected filesystem is root or another essential directory).
Maybe the solution is something like /etc/zfs/zpool.cache.d/ to store
@Bronek, the cachefile, as I've said a couple times in this pull request, does not need to be in the initramfs at all. The only job of the initramfs is to import the rpool and mount /, which it can do without a cache. You seem to suggest that other pools get imported by the initrd and thus it needs to maintain an up-to-date copy in the initramfs, but that is not the case. The other pools, data pools if you want to call them that, are imported by a plain old init script,
Why? Just set the 'cachefile' property accordingly when creating the pool. Don't overcomplicate things for the majority, to solve the problem for a minority! @MrStaticVoid You seem to be purposefully ignoring everything I said. Is there a reason why, or are you just a troll?
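(That is, for the pools that genuinely want a cache entry, something like this - pool name and vdevs are placeholders:)

```sh
# Opt a specific pool into the cache file at creation time...
zpool create -o cachefile=/etc/zfs/zpool.cache bigpool \
    mirror /dev/disk/by-id/ata-DISK_A /dev/disk/by-id/ata-DISK_B

# ...or flip it later on an existing pool.
zpool set cachefile=/etc/zfs/zpool.cache bigpool
```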
You keep using that word, "majority," but so far all I've heard is your own incorrect version of the facts. I've been able to point to nine different examples showing how you're wrong and how your code and proposed changes will hurt not just the majority, but everybody. I am not trolling, I am using facts and evidence to show how you're wrong. Sorry you don't like that.
Actually, you're not. You're using old, stale information from a system that only vaguely resembles the one ZoL is running on to justify your deluded opinions instead of learning the actual facts. Linux isn't Solaris. It doesn't work the same. It vaguely looks the same, but looks are deceiving when you start scratching at the surface.
ZoLoft may be something to consider... In all seriousness, hostility of this nature does not belong here; use IRC. Taking the approach of altering ZoL away from the OpenZFS implementation should
Contrary to belief, I actually agree with that! I'm usually the first to "storm the beaches" when there's a compatibility problem with (Open)ZFS. I have reported several issues to that effect, all still open.. :( I spent almost two years on the list and quite a while on the IRC channel helping users, as well as seeing almost every issue here (some people seem to believe this is a support forum), so I see "the big picture". @MrStaticVoid does not. He only sees his own, limited system and thinks that THAT is the real world and THAT is how everyone should do it (extremely rude if you ask me). It works for him, but not for the large community. And letting the minority dictate the actions of the majority is "rude" at best! Insane at its worst... There is a very simple way to configure the system for him, while still allowing everyone else to enjoy the marvels of ZFS On Linux!
Actually, it doesn't. We have no choice! Our boot loader, Grub, can't mount and import pools (yet/properly). And if/when a root pool can't be imported without a cache file (such as when it consists of a huge number of vdevs and/or multipath devices), then things get really complicated, real fast! This I've said several times, but @MrStaticVoid chooses to ignore all that. Whether due to ignorance, incompetence or simply stupidity is anyone's guess.
The point about complex rpools makes sense. We generally build out
I'm not entirely sure I understand what you're asking here… It has been brought to my attention that multi-path systems need a cache file. I don't quite understand or know why, but if they say so, I have nothing to add to that… :)
I'm unsure what the exact state of ZFS in Grub is at the moment. About a year ago, I worked several months to get a bootable ISO with native support for ZFS/ZoL, and a large part of the problem was with Grub. Regarding /boot, for sure! If it can't load the kernel and the initrd, then of course you can't boot off a ZFS system. But also, it (as far as I know) can't mount any filesystem (it can "mount" them internally to get to the kernel/initrd, but not mount them so that they're available for init - or systemd). That has ALWAYS been the work of the initrd in Linux.
This change modifies the import service to use the default cache file to perform a verbatim import of pools at boot. This fixes code that searched all devices and imported all visible pools.

Using the cache file is in keeping with the way ZFS has always worked, how Solaris, Illumos, FreeBSD, and systemd perform imports, and is how it is written in the man page (zpool(1M,8)): "All pools in this cache are automatically imported when the system boots." Importantly, the cache contains the information needed for importing multipath devices, and helps control which pools get imported in more dynamic environments like SANs, which may have thousands of visible and constantly changing pools, which the ZFS_POOL_EXCEPTIONS variable is not equipped to handle. Verbatim imports prevent rogue pools from being automatically imported and mounted where they shouldn't be.

The change also stops the service from exporting pools at shutdown. Exporting pools is only meant to be performed explicitly by the administrator of the system.

The old behavior of searching and importing all visible pools is preserved and can be switched on by heeding the warning and toggling the ZPOOL_IMPORT_ALL_VISIBLE variable in /etc/default/zfs.

Closes openzfs#3777
Closes openzfs#3526
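(The toggle referred to above lives in /etc/default/zfs; the exact value syntax is whatever #3800 ships, so treat this excerpt as illustrative:)

```sh
# /etc/default/zfs (excerpt)
# Default: perform a verbatim import of the pools listed in /etc/zfs/zpool.cache.
# Setting this to 'yes' restores the old "scan everything and import whatever
# is visible" behaviour, subject to ZFS_POOL_EXCEPTIONS.
ZPOOL_IMPORT_ALL_VISIBLE='no'
```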
Hello there,
This is the first time I have ever NACKed a patch, but there is a first time for everything. This patch should not be merged. The method for getting rid of the cachefile that @FransUrbo uses on Debian opens VM hosts to exploitation from guests on reboot. I have sketched out the zero-day attack based on it and emailed the details to @behlendorf. Trying to change the default cachefile setting to none will just make it harder to fix the problem, whose solution is to use the cachefile. This patch will break import on systems using the systemd unit files and basically every other system that is consistent with Illumos, which includes plenty of custom boot scripts. That also includes the entire Gentoo family of distributions, where the auto-import behavior was disabled by default to be consistent with Illumos, the systemd unit files and how Gentoo originally did things. If this is merged, I will revert it in Gentoo and encourage the other distribution maintainers to revert it. This patch builds on a really bad idea and is a step in the wrong direction.
The large majority isn't running virtual hosts, ZFS from SAS devices, etc., so for them, not having a cache file IS (and has been for the last couple of years) the correct and easiest behavior. Something you seem to refuse to see, ryao. And I'm (again) not removing the cache file, I'm changing the default! I would appreciate it if you could send your attack to me as well; I'd really like to see what you have on that.
I will observe that ZFS should be trying to get into the SAS enterprise space...
Yes, absolutely! ZFS is the perfect FS for that and many other things. But the majority is still, and will be for the foreseeable future, smaller-site use-cases. I don't think it's fair to set defaults that go against the majority and only benefit a very small minority.
I don't see how having a cachefile goes against anyone, honestly. It would be great to see something more specific. Judging by recent discussions, it would seem that it is rather straightforward to get the correct behaviour without the need to maintain a synchronized copy of the cachefile inside the initramfs; it only needs to be in the expected location on the root filesystem (not inside the initramfs, as previously assumed). Also, it is not difficult at all to trigger import-all in the absence of this file. The installation instructions I went by when installing ZFS for the first time actually require that the cache file be set, and I was only surprised and confused later when I found out that it is actually not used.
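(Roughly the behaviour being described, expressed as an init-script fragment; paths and flags follow the usual defaults, so treat it as a sketch:)

```sh
#!/bin/sh
# Prefer a verbatim import of the pools recorded in the cache file;
# fall back to scanning devices only if no cache file exists.
ZPOOL_CACHE=/etc/zfs/zpool.cache

if [ -f "$ZPOOL_CACHE" ]; then
    zpool import -c "$ZPOOL_CACHE" -aN
else
    zpool import -aN -d /dev/disk/by-id
fi

zfs mount -a
```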
This change modifies the import service to use the default cache file to perform a verbatim import of pools at boot. This fixes code that searched all devices and imported all visible pools.

Using the cache file is in keeping with the way ZFS has always worked, how Solaris, Illumos, FreeBSD, and systemd perform imports, and is how it is written in the man page (zpool(1M,8)): "All pools in this cache are automatically imported when the system boots." Importantly, the cache contains the information needed for importing multipath devices, and helps control which pools get imported in more dynamic environments like SANs, which may have thousands of visible and constantly changing pools, which the ZFS_POOL_EXCEPTIONS variable is not equipped to handle. Verbatim imports prevent rogue pools from being automatically imported and mounted where they shouldn't be.

The change also stops the service from exporting pools at shutdown. Exporting pools is only meant to be performed explicitly by the administrator of the system.

The old behavior of searching and importing all visible pools is preserved and can be switched on by heeding the warning and toggling the ZPOOL_IMPORT_ALL_VISIBLE variable in /etc/default/zfs.

Signed-off-by: James Lee <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #3777
Closes #3526
This is in preparation for a later complete removal of the cache file.
A cache file is almost never needed, at least not on Linux, so the long-time plan is to remove it altogether. This is the first step.
I noticed that this affects the `import` as well as `create` (which was not my intention - I only wanted it for the `create` command). That is, if I create a pool with the old code, using defaults (-…), export it and then import it with the new code, the `cachefile` property will be `none`. That might not be what we want, although I'm for it...