
zpool remove fails with "Out of space" #11356

Closed
segdy opened this issue Dec 16, 2020 · 8 comments
Labels
Status: Stale (no recent activity for issue) · Status: Understood (the root cause of the issue is known) · Type: Defect (incorrect behavior, e.g. crash, hang)

Comments

segdy commented Dec 16, 2020

System information

Type Version/Name
Distribution Name Debian
Distribution Version buster with backports
Linux Kernel 4.19.0-13-amd64
Architecture amd64
ZFS Version zfs-0.8.5-2~bpo10+1
SPL Version 0.8.5-2~bpo10+1 (per modinfo spl | grep -iw version; but dpkg -l | grep spl reports 0.7.12-2+deb10u1)

Describe the problem you're observing

According to #6900, it should be possible to remove a top-level vdev (implemented in v0.8.5).

However, zpool remove fails with "out of space".

Describe how to reproduce the problem

# dd if=/dev/zero bs=1M count=64 of=/ZFS5
# dd if=/dev/zero bs=1M count=64 of=/ZFS6
# dd if=/dev/zero bs=1M count=64 of=/ZFS7
# dd if=/dev/zero bs=1M count=64 of=/ZFS8
# zpool create zptest2 mirror /ZFS5 /ZFS6 mirror /ZFS7 /ZFS8
# zpool get feature@device_removal zptest2
NAME     PROPERTY                VALUE                   SOURCE
zptest2  feature@device_removal  enabled                 local
# zpool status zptest2
  pool: zptest2
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zptest2     ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            /ZFS5   ONLINE       0     0     0
            /ZFS6   ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            /ZFS7   ONLINE       0     0     0
            /ZFS8   ONLINE       0     0     0

errors: No known data errors
# zpool remove zptest2 mirror-0
cannot remove mirror-0: out of space
# zpool remove zptest2 mirror-1
cannot remove mirror-1: out of space

Include any warning/errors/backtraces from the system logs

segdy added the Status: Triage Needed and Type: Defect labels Dec 16, 2020
behlendorf (Contributor) commented Dec 16, 2020

This is caused by the relatively small size of the disks, in this case 64M. In practice, for this test to pass the disks must be at least 300M in size. I agree the error here isn't very helpful, and we should look into whether this can be improved for very small vdevs.
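For reference, here is segdy's reproducer re-run with larger backing files (512M is an arbitrary size comfortably above the ~300M minimum mentioned above); with files this size the removal should start rather than fail:

# dd if=/dev/zero bs=1M count=512 of=/ZFS5
# dd if=/dev/zero bs=1M count=512 of=/ZFS6
# dd if=/dev/zero bs=1M count=512 of=/ZFS7
# dd if=/dev/zero bs=1M count=512 of=/ZFS8
# zpool create zptest2 mirror /ZFS5 /ZFS6 mirror /ZFS7 /ZFS8
# zpool remove zptest2 mirror-0
# zpool status zptest2

Once the evacuation completes, zpool status typically lists an indirect-0 placeholder where mirror-0 used to be.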

behlendorf added the Status: Understood label and removed the Status: Triage Needed and Type: Defect labels Dec 16, 2020
segdy (Author) commented Dec 16, 2020

Amazing, it works! Thanks.

Indeed, a different error message would be helpful.

WhittlesJr commented Dec 28, 2020

I'm having the same problem, but with much larger vdevs. I accidentally added an 8TB mirror to the wrong pool, and now I can't remove it. This is a rootfs pool (but not a boot pool)... could that be related?

> zpool list -v
NAME                                                      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
bpool                                                     960M  85.6M   874M        -         -     2%     8%  1.00x    ONLINE  -
  mirror                                                  960M  85.6M   874M        -         -     2%  8.92%      -  ONLINE  
    ata-PNY_CS1311_240GB_SSD_PNY09162169090207A9E-part3      -      -      -        -         -      -      -      -  ONLINE  
    ata-SATA_SSD_19082824003384-part3                        -      -      -        -         -      -      -      -  ONLINE  
rpool                                                    7.48T  51.1G  7.43T        -         -     0%     0%  1.00x    ONLINE  -
  mirror                                                  222G  46.9G   175G        -         -    21%  21.1%      -  ONLINE  
    ata-PNY_CS1311_240GB_SSD_PNY09162169090207A9E-part4      -      -      -        -         -      -      -      -  ONLINE  
    ata-SATA_SSD_19082824003384-part4                        -      -      -        -         -      -      -      -  ONLINE  
  mirror                                                 7.27T  4.23G  7.26T        -         -     0%  0.05%      -  ONLINE  
    ata-WDC_WD80EDAZ-11TA3A0_VGH4LNAG                        -      -      -        -         -      -      -      -  ONLINE  
    ata-WDC_WD80EDAZ-11TA3A0_VGH29VSG                        -      -      -        -         -      -      -      -  ONLINE  
tank                                                     1.81T  1.74T  75.0G        -         -    54%    95%  1.00x    ONLINE  -
  mirror                                                  928G   890G  37.9G        -         -    57%  95.9%      -  ONLINE  
    ata-TOSHIBA_HDWD110_36H5V8SNS                            -      -      -        -         -      -      -      -  ONLINE  
    ata-TOSHIBA_HDWD110_46OLZXNNS                            -      -      -        -         -      -      -      -  ONLINE  
  mirror                                                  928G   891G  37.1G        -         -    52%  96.0%      -  ONLINE  
    ata-TOSHIBA_HDWD110_4670AK5NS                            -      -      -        -         -      -      -      -  ONLINE  
    ata-WDC_WD10EZEX-08WN4A0_WD-WCC6Y5FX3J8R                 -      -      -        -         -      -      -      -  ONLINE  


> zpool remove rpool mirror-1
cannot remove mirror-1: out of space
> zfs version
zfs-2.0.0-1
zfs-kmod-2.0.0-1

behlendorf added a commit to behlendorf/zfs that referenced this issue Dec 29, 2020
The available space check in spa_vdev_remove_top_check() is intended
to verify there is enough available space on the other devices before
starting the removal process.  This is obviously a good idea but the
current check can significantly overestimate the space requirements
and often prevent removal when it clearly should be possible.

This change reworks the check to use the per-vdev vs_stats.  This
is sufficiently accurate and is convenient because it allows a direct
comparison to the allocated space.  If using dsl_dir_space_available()
then the available space of the device being removed is also included
in the return value and must somehow be accounted for.

Additionally, we reduce the slop space requirement to be in line with
the capacity of the pool after the device has been removed.
This way if a large device is accidentally added to a small pool the
vastly increased slop requirement of the larger pool won't prevent
removal.  For example, it was previously possible that even newly
added empty vdevs couldn't be removed due to the increased slop
requirement.  This was particularly unfortunate since one of the
main motivations for this feature was to allow such mistakes to be
corrected.

Lastly it's worth mentioning that by allowing the removal to start
with close to the minimum required free space it is more likely that
an active pool close to capacity may fail the removal process.  This
failure case has always been possible since we can't know what new
data will be written during the removal process.  It is correctly
handled and there's no danger to the pool.

Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#11356
behlendorf (Contributor) commented Dec 29, 2020

@WhittlesJr yes, it sounds like you've hit this same issue. I've opened PR #11409 with a proposed fix. If you're comfortable rolling your own build of ZFS, you could apply it, remove the vdev using the patched version, then switch back to your current version.
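For anyone who does want to try the patched build, here is a rough sketch of one way to do it (assuming the usual OpenZFS build dependencies and kernel headers are installed; on a root-on-ZFS system swapping the loaded module is non-trivial, so treat this as an outline rather than a recipe, and note that reverting means reinstalling the packaged version):

# Grab the OpenZFS source and the proposed fix from PR #11409
git clone https://github.com/openzfs/zfs.git
cd zfs
git fetch origin pull/11409/head:pr-11409
git checkout pr-11409

# Standard OpenZFS build
sh autogen.sh
./configure
make -s -j$(nproc)

# Install the patched user tools and module; on a root-on-ZFS system
# a reboot is usually needed before the new module is actually in use
sudo make install
sudo ldconfig
sudo depmod

# With the patched module loaded, retry the removal
sudo zpool remove rpool mirror-1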

behlendorf added the Type: Defect label Dec 29, 2020
WhittlesJr commented

@behlendorf, this pool is a bit too sensitive to try experimental builds, I'm afraid. I've already rebuilt and restored it, so I'm good now. But I'm grateful that you've proposed a fix for the issue. If I can help test it in other ways, let me know.

ikozhukhov (Contributor) commented

@behlendorf here is another updated test:

root@lenovo:/var/tmp/space# cat space.sh 
#!/bin/sh

# Size of files for a test pool (MB)
PSIZE=300

# Test file size (MB)
FSIZE=100

# vdevs
FILE1=/var/tmp/file01.data
FILE2=/var/tmp/file02.data
FILE3=/var/tmp/file03.data
FILE4=/var/tmp/file04.data

POOL=testpool

# Create two files for the test pool
truncate -s ${PSIZE}m ${FILE1} ${FILE2}

# Create the pool
zpool create -o ashift=12 -o cachefile=none -O compression=on -O atime=off \
        ${POOL} mirror ${FILE1} ${FILE2}

# Create a dataset
zfs create -o recordsize=1m -o compression=on ${POOL}/01

# Prepare incompressible content
dd if=/dev/urandom of=/${POOL}/01/f01.rnd bs=1M count=$((FSIZE)) status=progress

# Prepare compressible content
dd if=/dev/zero of=/${POOL}/01/f02.tmp bs=1M count=$((FSIZE)) status=progress

# Create VDEVs for another mirror
truncate -s ${PSIZE}m ${FILE3} ${FILE4}

# Add them to the test pool
zpool add ${POOL} mirror ${FILE3} ${FILE4}

# Show available space
zpool list ${POOL}
zfs list -r ${POOL}

# Try to remove first mirror
zpool remove ${POOL} mirror-0
root@lenovo:/var/tmp/space# sh -x space.sh 
+ PSIZE=300
+ FSIZE=100
+ FILE1=/var/tmp/file01.data
+ FILE2=/var/tmp/file02.data
+ FILE3=/var/tmp/file03.data
+ FILE4=/var/tmp/file04.data
+ POOL=testpool
+ truncate -s 300m /var/tmp/file01.data /var/tmp/file02.data
+ zpool create -o ashift=12 -o cachefile=none -O compression=on -O atime=off testpool mirror /var/tmp/file01.data /var/tmp/file02.data
+ zfs create -o recordsize=1m -o compression=on testpool/01
+ dd if=/dev/urandom of=/testpool/01/f01.rnd bs=1M count=100 status=progress
92274688 bytes (92 MB, 88 MiB) copied, 4 s, 22.8 MB/s
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 4.74417 s, 22.1 MB/s
+ dd if=/dev/zero of=/testpool/01/f02.tmp bs=1M count=100 status=progress
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.676281 s, 155 MB/s
+ truncate -s 300m /var/tmp/file03.data /var/tmp/file04.data
+ zpool add testpool mirror /var/tmp/file03.data /var/tmp/file04.data
+ zpool list testpool
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
testpool   576M   101M   475M        -         -     3%    17%  1.00x    ONLINE  -
+ zfs list -r testpool
NAME          USED  AVAIL     REFER  MOUNTPOINT
testpool      101M   347M       96K  /testpool
testpool/01   100M   347M      100M  /testpool/01
+ zpool remove testpool mirror-0
cannot remove mirror-0: out of space

behlendorf (Contributor) commented

@ikozhukhov it looks like this test failure is also occurring due to the minimum "slop" requirement. Even with the patch, there must be a minimum of "vdev-allocated-bytes + 2 * slop-bytes" of available space on the remaining vdevs. Since by default the minimum "slop" size is spa_min_slop=128M, this means at least 256M must be free on the remaining vdevs, plus additional capacity for the allocated space that needs to be copied.
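Plugging the numbers from the test above into that formula (a back-of-the-envelope sketch; the exact usable sizes reported by ZFS differ slightly):

# Rough removal-space check for the 300M test pool above
vdev_alloc_mb=101        # ~101M allocated, all of it on mirror-0 (from zpool list)
slop_mb=128              # default spa_min_slop
mirror1_free_mb=288      # usable size of one 300M file vdev (576M pool SIZE / 2)
required_mb=$((vdev_alloc_mb + 2 * slop_mb))
echo "required: ${required_mb}M, available on mirror-1: ~${mirror1_free_mb}M"
# required: 357M, available on mirror-1: ~288M  ->  cannot remove mirror-0: out of space

So even with the patch, 300M mirrors are simply too small once the 2 × 128M slop reserve is counted against them.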

behlendorf added a commit to behlendorf/zfs that referenced this issue Jan 4, 2021
The available space check in spa_vdev_remove_top_check() is intended
to verify there is enough available space on the other devices before
starting the removal process.  This is obviously a good idea but the
current check can significantly overestimate the space requirements
and often prevent removal when it clearly should be possible.  This
change reworks the check slightly.

First, we reduce the slop space requirement to be in line with
the capacity of the pool after the device has been removed.  This
way if a large device is accidentally added to a small pool the
vastly increased slop requirement of the larger pool won't prevent
removal.  For example, it was previously possible that even newly
added empty vdevs couldn't be removed due to the increased slop
requirement.  This was particularly unfortunate since one of the
main motivations for this feature was to allow such mistakes to be
corrected.

Second, we handle the case of very small pools where the minimum
slop size of 128M represents a large fraction (>25%) of the total
pool capacity.  In this case, it's reasonable to reduce the extra
slop requirement.

Lastly it's worth mentioning that by allowing the removal to start
with close to the minimum required free space it is more likely that
an active pool close to capacity may fail the removal process.  This
failure case has always been possible since we can't know what new
data will be written during the removal process.  It is correctly
handled and there's no danger to the pool.

Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#11356
behlendorf added a commit to behlendorf/zfs that referenced this issue Jan 9, 2021
The available space check in spa_vdev_remove_top_check() is intended
to verify there is enough available space on the other devices before
starting the removal process.  This is obviously a good idea but the
current check can significantly overestimate the space requirements
for small pools, preventing removal when it clearly should be possible.

This change reworks the check to drop the additional slop space from
the calculation.  This solves two problems:

1) If a large device is accidentally added to a small pool the slop
   requirement is dramatically increased.  This in turn can prevent
   removal of that larger device even if it has never been used.
   This was particularly unfortunate since a main motivation for
   this feature was to allow such mistakes to be corrected.

2) For very small pools with 256M vdevs the minimum slop size of
   128M represents a huge fraction of the total pool capacity.
   Requiring twice this amount of space to be available effectively
   prevents device removal from ever working with these tiny pools.

Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#11356
stale bot commented Dec 30, 2021

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

stale bot added the Status: Stale label Dec 30, 2021
stale bot closed this as completed Apr 2, 2022