-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set "none" scheduler if available (initramfs) #9042
Conversation
I vote for removing scheduler changes from ZFS entirely. It is a problem better solved by OS installer. |
I fully agree. A note in the installation/system recommendations documentation would probably suffice. It's still not really clear to me why the simple schedulers are preferred; if the ZFS code also uses explicit synchronisation mechanisms. |
Codecov Report
@@ Coverage Diff @@
## master #9042 +/- ##
==========================================
- Coverage 79.31% 78.99% -0.32%
==========================================
Files 400 400
Lines 121942 121653 -289
==========================================
- Hits 96716 96098 -618
- Misses 25226 25555 +329
Continue to review full report at Codecov.
|
I'm inclined to agree, this behavior was originally added solely as convenience since at the time the kernel provided an easy mechanism to set this. That hasn't been the case since the 4.12 kernel. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The comment above the changed code says "elevator=noop". That should probably be adjusted in some way. For example, "Set the elevator to noop/none on the root pool's vdevs' disks. ZFS already". You'll probably need to re-wrap it as it's getting longer.
- shellcheck is complaining about the
while read i
which it wants to bewhile read -r i
. The latter seems to work. Maybe we should fix that here? I realize that's not new (and is my fault, as I think I wrote this originally).
Before we get into the weeds of defensive shell programming; what are general thoughts about removing this stanza completely and not attempting to set the scheduler in initramfs? I'm more inclined to leave this up to individual systems administrators, or maybe print a warning here if the enclosing devices are using a scheduler we don't trust? |
If the proposal is for this to be entirely removed from ZFS (i.e. ZFS will not set |
Fair enough; that deserves a more thorough design review. In the meantime, implemented the changes you suggest; PTAL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. But would you mind rebasing this on the latest master and force updating the PR.
Rebase pushed. I think the commit message from 42c24d9 is sufficient for the whole merge. |
@colmbuckley it looks like this wasn't rebased. Would you mind doing it one more time and force updating the PR. Please also add your signed-off-by to the comment message ( |
Existing zfs initramfs script logic will attempt to set the 'noop' scheduler if it's available on the vdev block devices. Newer kernels have the similar 'none' scheduler on multiqueue devices; this change alters the initramfs script logic to also attempt to set this scheduler if it's available. Signed-off-by: Colm Buckley <[email protected]>
Sorry, my git-fu is weak. Done, I think, PTAL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Thanks.
Didn't make 0.8.2? Bah. |
Existing zfs initramfs script logic will attempt to set the 'noop' scheduler if it's available on the vdev block devices. Newer kernels have the similar 'none' scheduler on multiqueue devices; this change alters the initramfs script logic to also attempt to set this scheduler if it's available. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Garrett Fields <[email protected]> Reviewed-by: Richard Laager <[email protected]> Signed-off-by: Colm Buckley <[email protected]> Closes openzfs#9042
Existing zfs initramfs script logic will attempt to set the 'noop' scheduler if it's available on the vdev block devices. Newer kernels have the similar 'none' scheduler on multiqueue devices; this change alters the initramfs script logic to also attempt to set this scheduler if it's available. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Garrett Fields <[email protected]> Reviewed-by: Richard Laager <[email protected]> Signed-off-by: Colm Buckley <[email protected]> Closes openzfs#9042
Existing zfs initramfs script logic will attempt to set the 'noop' scheduler if it's available on the vdev block devices. Newer kernels have the similar 'none' scheduler on multiqueue devices; this change alters the initramfs script logic to also attempt to set this scheduler if it's available. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Garrett Fields <[email protected]> Reviewed-by: Richard Laager <[email protected]> Signed-off-by: Colm Buckley <[email protected]> Closes #9042
Existing zfs initramfs script logic will attempt to set the 'noop' scheduler if it's available on the vdev block devices. Newer kernels have the similar 'none' scheduler on multiqueue devices; this change alters the initramfs script logic to also attempt to set this scheduler if it's available. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Garrett Fields <[email protected]> Reviewed-by: Richard Laager <[email protected]> Signed-off-by: Colm Buckley <[email protected]> Closes openzfs#9042
This effectively reverts 4fc411f (part of openzfs#6807) and f6fbe25 (openzfs#9042) ‒ the code itself and latter PR cite symmetry with whole-disk-vdev behaviour (presumably because rootfs vdevs are rarely whole disks), but the code is broken for NVME devices (indeed, it'd strip the controller number instead of the (potential) partition number, turning "nvme0n1p1" into "nvmen1p1", which would then subsequently fail the sysfs existence check); it could be fixed to handle those (and any others) rather easily by dereferencing /sys/class/block/$devname, but this isn't the place for setting this ‒ as noted in the commit that removed setting the scheduler by default (9e17e6f) ‒ use an udev rule Signed-off-by: Ahelenia Ziemiańska <[email protected]>
This effectively reverts 4fc411f (part of #6807) and f6fbe25 (#9042) ‒ the code itself and latter PR cite symmetry with whole-disk-vdev behaviour (presumably because rootfs vdevs are rarely whole disks), but the code is broken for NVME devices (indeed, it'd strip the controller number instead of the (potential) partition number, turning "nvme0n1p1" into "nvmen1p1", which would then subsequently fail the sysfs existence check); it could be fixed to handle those (and any others) rather easily by dereferencing /sys/class/block/$devname, but this isn't the place for setting this ‒ as noted in the commit that removed setting the scheduler by default (9e17e6f) ‒ use an udev rule Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #11838
This effectively reverts 4fc411f (part of #6807) and f6fbe25 (#9042) ‒ the code itself and latter PR cite symmetry with whole-disk-vdev behaviour (presumably because rootfs vdevs are rarely whole disks), but the code is broken for NVME devices (indeed, it'd strip the controller number instead of the (potential) partition number, turning "nvme0n1p1" into "nvmen1p1", which would then subsequently fail the sysfs existence check); it could be fixed to handle those (and any others) rather easily by dereferencing /sys/class/block/$devname, but this isn't the place for setting this ‒ as noted in the commit that removed setting the scheduler by default (9e17e6f) ‒ use an udev rule Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #11838
This effectively reverts 4fc411f (part of openzfs#6807) and f6fbe25 (openzfs#9042) ‒ the code itself and latter PR cite symmetry with whole-disk-vdev behaviour (presumably because rootfs vdevs are rarely whole disks), but the code is broken for NVME devices (indeed, it'd strip the controller number instead of the (potential) partition number, turning "nvme0n1p1" into "nvmen1p1", which would then subsequently fail the sysfs existence check); it could be fixed to handle those (and any others) rather easily by dereferencing /sys/class/block/$devname, but this isn't the place for setting this ‒ as noted in the commit that removed setting the scheduler by default (9e17e6f) ‒ use an udev rule Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes openzfs#11838
This effectively reverts 4fc411f (part of openzfs#6807) and f6fbe25 (openzfs#9042) ‒ the code itself and latter PR cite symmetry with whole-disk-vdev behaviour (presumably because rootfs vdevs are rarely whole disks), but the code is broken for NVME devices (indeed, it'd strip the controller number instead of the (potential) partition number, turning "nvme0n1p1" into "nvmen1p1", which would then subsequently fail the sysfs existence check); it could be fixed to handle those (and any others) rather easily by dereferencing /sys/class/block/$devname, but this isn't the place for setting this ‒ as noted in the commit that removed setting the scheduler by default (9e17e6f) ‒ use an udev rule Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes openzfs#11838
This effectively reverts 4fc411f (part of openzfs#6807) and f6fbe25 (openzfs#9042) ‒ the code itself and latter PR cite symmetry with whole-disk-vdev behaviour (presumably because rootfs vdevs are rarely whole disks), but the code is broken for NVME devices (indeed, it'd strip the controller number instead of the (potential) partition number, turning "nvme0n1p1" into "nvmen1p1", which would then subsequently fail the sysfs existence check); it could be fixed to handle those (and any others) rather easily by dereferencing /sys/class/block/$devname, but this isn't the place for setting this ‒ as noted in the commit that removed setting the scheduler by default (9e17e6f) ‒ use an udev rule Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes openzfs#11838
This effectively reverts 4fc411f (part of openzfs#6807) and f6fbe25 (openzfs#9042) ‒ the code itself and latter PR cite symmetry with whole-disk-vdev behaviour (presumably because rootfs vdevs are rarely whole disks), but the code is broken for NVME devices (indeed, it'd strip the controller number instead of the (potential) partition number, turning "nvme0n1p1" into "nvmen1p1", which would then subsequently fail the sysfs existence check); it could be fixed to handle those (and any others) rather easily by dereferencing /sys/class/block/$devname, but this isn't the place for setting this ‒ as noted in the commit that removed setting the scheduler by default (9e17e6f) ‒ use an udev rule Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes openzfs#11838
Existing zfs initramfs script logic will attempt to set the 'noop' scheduler if it's available on the vdev block devices. Newer kernels have the similar 'none' scheduler on multiqueue devices; this change alters the initramfs script logic to also attempt to set this scheduler.
Motivation and Context
Description
How Has This Been Tested?
Types of changes
Checklist:
Signed-off-by
.