Improve ashift handling #2024
I just verified this with the latest HEAD:
This disk has been exported with
Unfortunately, it seems that VirtualBox has the block size hard-coded to 512, because I can't find anywhere to change it (I created a number of volumes with volblocksize=4096 and shared them with blocksize=4096, but I still end up with a 512-byte block device in the host).
@FransUrbo I agree the current behavior isn't exactly what one would expect. It feels like when you're creating the pool you're setting a pool-wide property which will be inherited. Unfortunately, the code doesn't work that way: the optimal ashift value is vdev-specific. By default it will select whatever ashift size the drive claims is optimal, as long as it's compatible with the ashift sizes used by other vdevs. At a minimum we should document this behavior clearly. Even better would be to make the code slightly smarter. There is some related work from the FreeBSD guys in #1671, but it still needs careful review and testing.
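As a rough illustration of that per-vdev behavior (the output values shown are made up), zdb can be used to see what each vdev actually ended up with:

    # zdb with no arguments dumps the cached config of imported pools;
    # every top-level vdev carries its own ashift value.
    zdb | grep ashift
    #         ashift: 12
    #         ashift: 9    <- a later add/attach picked the drive's reported sector size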
Users need to be aware that when adding devices to an existing pool they may need to override the automatically detected ashift value. This will all depend on the exact hardware they are using.
Signed-off-by: Turbo Fredriksson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes: #2024
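As a minimal sketch of that override (pool and device names are hypothetical), the detected value can be forced on the command line when adding:

    # Force 4K alignment on the new vdev instead of trusting the drive's report
    zpool add -o ashift=12 tank mirror sdc sdd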
@behlendorf I see that you put this in for 0.6.3, but the real solution is to write code that takes care of this. Maybe I shouldn't have written a 'Closes' in the commit, but instead 'Closes (partly)' :). Could you tag this issue for 0.6.4 (and not close it if/when you accept the pull req), but keep #2363 for 0.6.3?
Sure, I've reopened the issue and bumped it to 0.6.4. But I am glad we're documenting the current behavior more clearly.
Users need to be aware that when replacing devices in an existing pool they may need to override the automatically detected ashift value. This will all depend on the exact hardware they are using.
Signed-off-by: Turbo Fredriksson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #2024
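The same override applies when replacing; a sketch with hypothetical names:

    # Without -o ashift, the replacement vdev would use the new drive's detected value
    zpool replace -o ashift=12 tank sda sdb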
This is a mild understatement.
Running
certainly leads to that expectation.
It is very confusing. If ZFS completely ignores the ashift property of the pool when it comes to managing vdevs (add, attach, replace), then what is its use, and why does it still exist as a pool property?
So zpool get ashift will report something like 9,12 when vdevs with different ashift exist?
I've just been bitten by the assumption that the value used at create time is inherited for subsequent attach operations. Specifically, the FAQ http://zfsonlinux.org/faq.html says:
I think this should be clarified ASAP.
It's clearly there… If you/people don't read the documentation, then it doesn't matter how or where we put it...
I did read the documentation, and it told me that the property can only be set when the pool is first created. It seems reasonable to assume that the documentation is consistent and correct, whether that is the man page or the FAQ. As it stands, I consider that the FAQ text contradicts the man page, and suggest it should be amended for clarity, or that it should simply refer to the man page if that is the authoritative source, rather than making a contradictory statement. The line in the man page was added in 022f7bf, which is pre-dated by https://web.archive.org/web/20140307194846/http://zfsonlinux.org/faq.html.
There is no contradiction! Both apply:
I perceive a contradiction because my interpretation of the FAQ text is that the ashift value cannot be changed once the zpool has been created. This is correct in the sense that for the original vdev it cannot, but it also implies that all vdevs which are subsequently attached will use the same value, which is not true and was clarified by the man page commit. It would be helpful for the FAQ to make the same clarification as the man page: that the parameter only applies to the initial vdev, and that future operations should use the ashift option again when necessary.
Correct.
And that's where the man page comes in:
I could agree on that. BUT,
If you want to create a FAQ update, feel free to create a pull request against https://github.com/zfsonlinux/zfsonlinux.github.com.
@JKDingwall I'm all for updating the FAQ if you could propose some clear language to avoid confusion.
Perhaps instead of the sentence
We can word it as
Sounds good to me! @Sachiru do you also just happen, perchance, to have a wording for the FAQ? :)
I've created a pull request with the updated FAQ here: |
@Sachiru thank you, I've merged your updated FAQ entry.
This really seems like a problem - I got burned by this as well, setting ashift=12 at creation time, not realizing that I would also need to set ashift=12 every time I added a device to the pool if I didn't want to mix them. If I cut a pull request that changed the zpool add behavior to default to the pool-wide ashift property if set to non-zero, and require -f if it's set to 9 and one or more of the devices report ashift=12, would it be likely to be taken?
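To make the trap concrete, a hedged walk-through with made-up device names (assuming the newly added drives report 512-byte sectors):

    zpool create -o ashift=12 tank mirror sda sdb   # pool created 4K-aligned
    zpool add tank mirror sdc sdd                   # new vdev silently comes in at ashift=9
    zdb | grep ashift                               # reveals the mix: 12 and 9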
@rincebrain Sounds likely to be accepted. IF you do the same for |
@rincebrain we'd have to see the proposed change of course, but yes that sounds reasonable. As long as you handle both the
This commit allows higher ashift values (up to 16) in 'zpool create'. The ashift value was previously limited to 13 (8K block) in b41c990 because the limited number of uberblocks we could fit in the statically sized (128K) vdev label ring buffer could prevent the ability to safely roll back a pool to recover it. Since b02fe35 the largest uberblock size we support is 8K: this allows us to store a minimum of 16 uberblocks in the vdev label, even with higher ashift values. Additionally, change the 'ashift' pool property behaviour: if set, it will be used as the default hint value in subsequent vdev operations ('zpool add', 'attach' and 'replace'). A custom ashift value can still be specified from the command line, if desired. Finally, fix a bug in add-o_ashift.ksh caused by a missing variable.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #2024
Closes #4205
Closes #4740
Closes #5763
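Going by the commit text above, the post-change behaviour should look roughly like this (pool and device names are illustrative only):

    zpool set ashift=12 tank                        # the property now acts as the default hint
    zpool add tank mirror sde sdf                   # picks up ashift=12 from the property
    zpool add -o ashift=13 tank mirror sdg sdh      # an explicit -o still overrides the hint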
I just got three additional 3TB disks to replace the disks in a vdev of 1.5TB disks that aren't under warranty any more.
The pool was originally created like this (all 1.5TB disks, five vdevs with three disks in each as a RAIDz):
The disks in the first vdev were replaced with 3TB disks some months ago:
This naturally worked just fine, so I didn't have any suspicion that the current replace would fail. But I got 'devices have different sector alignment'. Googling this, I found #1381.
So double checking with zdb I found:
So the zpool add didn't create the vdevs with ashift=12, but with the default ashift=9! This was not expected. I had expected the ashift to be inherited (all the documentation talks about 'use ashift=12 when creating the pool', which I did).
One could argue that this is a documentation issue, but I think that would be a (too) easy fix. Currently I see that '-o ashift=xx' is allowed for add, but I'm quite sure it wasn't when I created the pool a year ago. I don't remember the exact version I used then, but one of the 0.6.0 rc versions, I think. 0.6.0-rc11 was stable at the time, so I guess that's the one...

PS. I've now done the first replace using '-o ashift=9' (been resilvering for about 14h, 16h to go at ~200M/s). So this explains why my pool has been so darn slow! I thought it was the cpu/mem/controller that wasn't fast enough, so I never bothered to investigate further - it was fast enough for my use case...