reading from mirror vdev limited by slowest device, II. #1742
Being closely related to #1461, I suggest a simpler solution: mdadm (Linux's own software RAID implementation) has the -W ("write-mostly") option when creating a RAID setup, which marks a disk to be used primarily for writes, i.e. directing all reads to the other disk when possible. It would be great if ZFS had such an option, too, for the mirror vdev, so that one drive could go to sleep, thereby reducing noise and power consumption in situations where there are mostly no writes at all (a mirror containing daily backups, etc.).
Comments
Nice idea, but there would still be the ZFS heartbeat in the form of uberblock updates on every txg commit. Also, the question comes to mind how to decide about "when possible": what would that mean to a computer?
My understanding is that md RAID1 devices marked […]
This issue was addressed by 556011d.
Not really the same as 556011d, but nice work nonetheless. This issue rather suggests "soft blocking" some redundant devices for read operations, so the administrator can set up arrays that are used only for reading seldom-used data (anonymous FTP servers and archival purposes come to mind). If accesses happen, only one disk (out of two or more) has to spin up, thereby saving energy.
@jjYBdx4IL Yes, I see what you're saying. If you'd like, we can leave this open as a feature request. Perhaps someone will find the time to work on it.
@behlendorf While 556011d lessens the problem, it will still commit reads to a slow device (e.g. the spinning-disk side of an SSD/HDD mirror), slowing I/O to the whole vdev. I think it would be nice if the admin had the ability to make informed decisions about how ZFS should manage the bare-metal devices (in case he knows better than the default algorithm). Please keep this open as a feature request, at least as long as the read scheduler doesn't know whether a device is spinning or not (the BSD port has some code to tune for such scenarios: http://svnweb.freebsd.org/base?view=revision&revision=256956)
@behlendorf @GregorKopka Has there been any development regarding this issue? I have been testing 2-way mirror read behaviour recently in SSD+HDD pools, and there seems to be no strong preference for the SSD when both vdevs are online.
Benchmark: reading 6 GB of random data in a 2 GB RAM guest using a 2-way mirror, ZFS on Linux 0.6.4.1.
SSD+HDD mirror, zpool online: 20 sec
Will do more precise benchmarking in the near future.
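For reference, a minimal sketch of such a benchmark, assuming a pool named tank; paths and sizes are illustrative, not taken from the comment above:

```sh
# Write 6 GB of incompressible data so compression cannot shortcut the reads,
# then evict the pool's cached data from ARC via export/import and time a
# sequential read.
dd if=/dev/urandom of=/tank/testfile bs=1M count=6144
zpool export tank && zpool import tank
time dd if=/tank/testfile of=/dev/null bs=1M
```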
The problem is still that ZFS knows nothing about whether an individual drive is rotating or not.
Under Linux at least it's relatively straightforward to tell if a vdev is rotational or non-rotational. It would be straightforward to use this information to always prefer the non-rotational disk. However, this is not always the right thing to do, since non-rotational is not 100% synonymous with fastest. There would need to be some kind of reasonable interface which could be used to control this behavior.
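For context, the kernel exposes that flag in sysfs; a minimal sketch (the device names are examples, substitute your mirror members):

```sh
# 1 = rotating (HDD), 0 = non-rotating (SSD/NVMe), as reported by the kernel.
for dev in sda nvme0n1; do
    echo "$dev: $(cat /sys/block/$dev/queue/rotational)"
done
```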
@behlendorf In case we could attach properties to vdevs and vdev members in the form of […]
Additional use cases for something like that which come to my mind: […]
Does this sound like a good idea?
@GregorKopka There was discussion of adding per-vdev property support by one of the illumos developers. There are quite a number of things it would be useful for. I've no idea how far that work got, but I liked the idea.
I just stumbled upon the need for the exact equivalent of mdadm's --write-mostly feature. Even if the title of the OP's issue is slightly different and solved by the mentioned commits, the body of the message shows the --write-mostly feature is still missing from ZFS. Can this be reopened as a feature request to be able to do exactly what mdadm does? For example:
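The original example did not survive extraction; a representative mdadm invocation of the kind presumably meant here (device names are placeholders):

```sh
# Create a RAID1 where /dev/sdb is marked write-mostly: reads are directed to
# /dev/sda whenever possible, while /dev/sdb mainly receives writes.
mdadm --create /dev/md0 --level=1 --raid-devices=2 \
      /dev/sda --write-mostly /dev/sdb
```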
Hello all. I'm considering migrating my volumes from mdadm to ZFS. Currently I'm doing a "feasibility study". For my […]
You can get something close to that, but the downside is that it is a system-wide tuning. The basic idea is to tune the cost for reads from rotational media to absurd levels, in the hope that ZFS won't queue reads to these devices. In case you want to try this: read up on https://openzfs.github.io/openzfs-docs/Performance%20and%20tuning/ZFS%20on%20Linux%20Module%20Parameters.html#zfs-vdev-mirror-rotating-inc and the following module parameters, then play with them.
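A sketch of that tuning, using the ZFS on Linux module parameters documented at the link above; the values are illustrative, not tested recommendations:

```sh
# Raise the read-selection cost of rotating mirror members so the scheduler
# strongly prefers the non-rotating ones while their queues have room.
echo 100 > /sys/module/zfs/parameters/zfs_vdev_mirror_rotating_inc
echo 100 > /sys/module/zfs/parameters/zfs_vdev_mirror_rotating_seek_inc

# To persist across reboots (the exact path may vary by distribution):
echo "options zfs zfs_vdev_mirror_rotating_inc=100 zfs_vdev_mirror_rotating_seek_inc=100" \
    > /etc/modprobe.d/zfs-mirror.conf
```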
This issue should still be open, @behlendorf. Despite all the commits mentioned, the real-world sequential read speed is 100 MB/s with the slow leg present vs 300 MB/s without it. Ubuntu 18.04 LTS, ZFS 0.8.3 (I doubt there is any change related to this from 0.8.3 to 0.8.4).
Let's read from this highly compressed file to ensure there is little or nothing to decompress at the ZFS level:
with slow device present: […]
with slow device absent: […]
Will this be reopened as a feature request? I'm in a situation where I want mirroring, but one NVMe is a lot slower than the other due to limited bandwidth on one M.2 interface. I figured this was the only potential feature or functionality to remedy that while keeping mirroring.
@behlendorf Brian, would you please review all the comments piled up since the closure of this? The commits mentioned as closing this issue don't solve the problem: the slowest device in the pool still bottlenecks reads.
Hi @behlendorf, any chance this issue can be revisited and reopened? A simple reproducer: […]
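The reproducer itself was lost; a hypothetical reconstruction in the same spirit, using dm-delay to slow one mirror leg artificially (device and pool names are placeholders):

```sh
# Build a delayed copy of /dev/sdb (50 ms added latency), mirror it with a
# fast device, then time a sequential read of incompressible data.
echo "0 $(blockdev --getsz /dev/sdb) delay /dev/sdb 0 50" | dmsetup create slowleg
zpool create testpool mirror /dev/sda /dev/mapper/slowleg
dd if=/dev/urandom of=/testpool/bigfile bs=1M count=4096
zpool export testpool && zpool import testpool   # evict cached pool data
time dd if=/testpool/bigfile of=/dev/null bs=1M  # throughput tracks the slow leg
```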
In case you can divide your devices into rotational (the slow ones) and non-rotational (the fast ones), you might be able to tune the rotational ones to a read cost high enough that they are never selected - at least as long as the fast ones' queue does not fill completely (then all bets are off). As a feature request there is already #3810.
Reopening so this can be investigated.
I don't see why we should not be able to have a mirror with a way to deprioritize its slowest device. In reality, most laptops don't have two 2280 NVMe slots, but a lot more have a smaller form-factor slot for a second NVMe (2230/2242), which would slow down the pool. Since ZFS is a lot less valuable when run on a single device, it would make sense to enable mirroring in situations where the mirror would otherwise slow it down. The typical suggestion is "just use zfs send if you're going to do that", which makes sense for now, but it is clearly much easier to keep the data up to date with a mirror.