Write-through special device #15118
Comments
Making a special mirror mode that writes new data out redundantly, as a special case of copies=2-like behavior, might conceivably be feasible. Rewriting existing data retroactively, I'd put my money on "not unless someone spends a fortune, and probably not then either".
Special vdevs may be used to solve two problems: speed and space efficiency. From the speed perspective the proposed copies=2 logic may make sense, especially since a lot of metadata already uses copies=2, and I don't think ZFS ever reads the second copy if the first is OK. It should not be a huge deal to write a second copy to normal vdevs, but reconstruction of a lost special vdev is not supported at this time. From the space efficiency perspective, though, it makes no sense: if the main vdev is unable to store small objects, like dRAID, then it just can't, and a redundant special vdev is the only way to do it efficiently.
Reconstruction isn't really a priority anyway, I'll tweak the original post to make that more clear. Currently when you add a "Reloading/preloading" a
Basically this asks for #13460 (comment), maybe with the addition that all reads from eligible datasets that result in cache misses are automatically written to this special vdev.
While there's similarity I'm not sure it's really the same thing; the advantage of While "pinning" data in ARC is interesting, it has issues with potentially making the ARC less efficient (infrequently used pinned data would prevent the ARC being used for more frequently used data), and actually specifying the data to pin (and keeping it updated in a useful way) is complex. I've submitted a similar but distinct proposal as #15051 to allow cache behaviour to be set using a record size; this is much simpler and would make it possible to prevent large records from filling up the ARC so that more smaller records can be retained, but while this will help it doesn't provide the same guarantee that a correctly sized and configured In general a
I agree that special devices are superior to L2ARC in basically all aspects, except that they can't be removed and they can't accelerate access to data already existing on-disk. The part you might have overlooked in the comment:
Also:
I did not suggest anything like that.
Special failsafe is a feature that allows your special allocation class vdevs ('special' and 'dedup') to fail without losing any data. It works by automatically backing up all special data to the pool. This has the added benefit that you can safely create pools with non-matching alloc class redundancy (like a mirrored pool with a single special device).

This behavior is controlled via two properties:

1. feature@special_failsafe - This feature flag enables the special failsafe subsystem. It prevents the backed-up pool from being imported read/write on an older version of ZFS that does not support special failsafe.
2. special_failsafe - This pool property is the main on/off switch to control special failsafe.

If you want to use special failsafe, simply turn it on either at creation time or with `zpool set` prior to adding a special alloc class device. After special devices have been added, you can either leave the property on or turn it off, but once it's off you can't turn it back on again.

Note that special failsafe may create a performance penalty over pure alloc class writes due to the extra backup copy write to the pool. Alloc class reads should not be affected, as they always read from DVA 0 first (the copy of the data on the special device). It can also inflate disk usage on dRAID pools.

Closes: openzfs#15118
Signed-off-by: Tony Hutter <[email protected]>
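
The on/off rules for the special_failsafe property described in the commit message can be sketched as a tiny state model. This is only one reading of that text, not the actual OpenZFS implementation; `ToyPoolProps` and its method names are hypothetical and exist purely for illustration.

```python
class ToyPoolProps:
    """Toy model of the special_failsafe on/off rules (hypothetical names)."""

    def __init__(self, special_failsafe=False):
        # The property may be turned on at pool creation time.
        self.special_failsafe = special_failsafe
        self.has_alloc_class_vdev = False

    def add_alloc_class_vdev(self):
        # Models adding a 'special' or 'dedup' device to the pool.
        self.has_alloc_class_vdev = True

    def set_special_failsafe(self, on):
        # Assumption: enabling is only allowed before any alloc class
        # device exists, which also covers "once it's off you can't
        # turn it back on again" after a device has been added.
        if on and not self.special_failsafe and self.has_alloc_class_vdev:
            raise ValueError("cannot enable special_failsafe after an "
                             "alloc class device has been added")
        self.special_failsafe = on
```

In this model, turning the property off after a device was added is accepted, while a later attempt to turn it back on raises `ValueError`.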
Describe the feature you would like to see added to OpenZFS

The idea is to allow `special` devices to be configured for a pool in a "write-through" mode, such that any record assigned to a special vdev is also written to a non-special vdev. As a result the special device will not require the same redundancy guarantee as the rest of the pool (it could be a single disk and still be perfectly safe, in the same way as a `cache` device).

Initially this would simply be an option when adding the device; in this way the "write-through" `special` device is no different to a normal one, in that you will see no benefit until special records start being written.

At a later time (possibly as a separate feature) it would be good to also see the ability to change a `special` device's mode, i.e. changing an existing `special` device to "write-through" mode would cause all records currently stored within it to be copied to non-special vdevs, and once that completes the device can be lost without any data loss. Removing "write-through" mode would cause copying of special records to non-special vdevs to stop (but any existing copies would remain in place, same as for a change of `copies=N`).

How will this feature improve OpenZFS?

This will allow for the use of a `special` vdev with less (or no) redundancy compared to the rest of the pool, without the risk of data loss, while still accelerating reading of special records (metadata, small files etc.).

Additional context

This feature is an alternative to #15051 (record size limit for ARC/L2ARC). While #15051 would be easier to implement, this one is the more "correct", since it also accelerates reads of smaller records but in a more predictable way: it can (with a suitable `special` device) guarantee that all such records are accelerated, rather than just whichever ones happen to remain in ARC/L2ARC long enough to be used.

There is also a side-issue, #15226, for filling `special` vdevs, which is important but not critical for the write-through case; while loss of a write-through special device should be safe (no data lost), once it is replaced there will be no special records on the new device, meaning performance is lost instead, so filling is important to "reload/resilver" the new device.
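
To make the proposed write-through behaviour concrete, here is a minimal toy model in Python. It is a sketch of the idea only and has nothing to do with how OpenZFS allocates blocks internally; `ToyPool`, `Vdev`, `DataLoss`, `small_block_limit` and `demo` are all hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


class DataLoss(Exception):
    """Raised when no surviving vdev holds the requested record."""


@dataclass
class Vdev:
    name: str
    blocks: dict = field(default_factory=dict)
    failed: bool = False


class ToyPool:
    def __init__(self, write_through, small_block_limit=32 * 1024):
        self.normal = Vdev("normal")
        self.special = Vdev("special")
        self.write_through = write_through
        # Hypothetical cutoff: records at or below this size count as "special".
        self.small_block_limit = small_block_limit

    def _is_special(self, data, is_metadata):
        return is_metadata or len(data) <= self.small_block_limit

    def write(self, key, data, is_metadata=False):
        if self._is_special(data, is_metadata) and not self.special.failed:
            self.special.blocks[key] = data
            if self.write_through:
                # Write-through: also keep a copy on the non-special vdev.
                self.normal.blocks[key] = data
        else:
            self.normal.blocks[key] = data

    def read(self, key):
        # Prefer the fast special copy, fall back to the normal vdev.
        for vdev in (self.special, self.normal):
            if not vdev.failed and key in vdev.blocks:
                return vdev.blocks[key], vdev.name
        raise DataLoss(key)

    def replace_special(self):
        # A replacement device starts empty until something refills it.
        self.special = Vdev("special")


def demo(write_through):
    """Return where a metadata record is read from before/after special fails."""
    pool = ToyPool(write_through)
    pool.write("dnode", b"metadata", is_metadata=True)
    before = pool.read("dnode")[1]
    pool.special.failed = True
    try:
        after = pool.read("dnode")[1]
    except DataLoss:
        after = "lost"
    return before, after
```

In this model `demo(True)` reads from the special device first and from the normal vdev after the failure, while `demo(False)` loses the record. After `replace_special()` the new device holds nothing, so reads keep coming from the normal vdev until the device is refilled, which is the gap #15226 is about.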