zpool clear strangeness #12090
Comments
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
We are still experiencing the same issue with OpenZFS 2.1.4.
Thanks for retesting this with OpenZFS 2.1.4, I'll look into what exactly is going on here. Somewhat related to this, you might want to look at PR #13499, which slightly changes the ZED behavior. This fix was queued up for 2.1.5.
The same issue is seen when tested with PR #13499.
@akashb-22 would you mind verifying that the fix in #13555 resolves this issue?
Thanks for the quick turnaround. I see what's going on with cases 2 and 3 and will work on updating the PR to take a slightly different approach.
I've updated the PR. Would you mind testing the updated patch?
Case 1: Damage to a single vdev is working fine with the fix.
That third test case really is a pretty nasty, absolute worst-case scenario. I was able to reproduce it locally, but it'll take me a while to get to a root cause. In the meantime, let's move forward with the current PR, which at least sorts out the first two test cases.
@behlendorf Further testing shows this issue is always hit (100%) with the above test case on a draid2 pool with 2 dspares.
System information:
Distribution: CentOS 8.2
Kernel: 4.18.0-193.28.1.x5.0.22.x86_64
modinfo zfs | grep -iw version
version: 2.1.0-x.5.0_520_ge8c59ac5
modinfo spl | grep -iw version
version: 2.1.0-x.5.0_520_ge8c59ac5
Describe the problem you're observing
Intentional damage was done to a single vdev of a draid2 pool with 2 distributed spares by zeroing out the entire backing device. After running IO
to the file system mounted on this pool, the pool became degraded, as expected. A distributed spare was engaged and a resilver started. A subsequent scrub finished with zero bytes repaired. Issuing 'zpool clear' brings the DEGRADED pool and vdev back to the ONLINE state.
If IO or a scrub is run again, the pool becomes degraded again.
Describe how to reproduce the problem
echo 1 > /sys/module/zfs/parameters/zfs_prefetch_disable
cat /sys/module/zfs/parameters/zfs_prefetch_disable
1
truncate -s 1G file1 file2 file3 file4 file5 file6 file7 file8 file9 file10 file11 file12 file13 file14 file15
pwd
/root/test/files
zpool create -f -o cachefile=none -o failmode=panic -O canmount=off pool-oss0 draid2:11d:15c:2s /root/test/files/file1 /root/test/files/file2 /root/test/files/file3 /root/test/files/file4 /root/test/files/file5 /root/test/files/file6 /root/test/files/file7 /root/test/files/file8 /root/test/files/file9 /root/test/files/file10 /root/test/files/file11 /root/test/files/file12 /root/test/files/file13 /root/test/files/file14 /root/test/files/file15
zfs create -o mountpoint=/mnt/ost0 pool-oss0/ost0
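For convenience, the setup steps above can be collected into a small script. This is only a sketch based on the exact commands in this report; the paths, pool name (pool-oss0), and dataset name (ost0) are the ones used above and may need adjusting.
#!/bin/bash
# Sketch: automate the draid2 pool setup described above.
set -e
echo 1 > /sys/module/zfs/parameters/zfs_prefetch_disable
mkdir -p /root/test/files && cd /root/test/files
# 15 x 1G sparse files backing the draid2:11d:15c:2s pool
for i in $(seq 1 15); do truncate -s 1G "file$i"; done
zpool create -f -o cachefile=none -o failmode=panic -O canmount=off \
  pool-oss0 draid2:11d:15c:2s /root/test/files/file{1..15}
zfs create -o mountpoint=/mnt/ost0 pool-oss0/ost0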
[root@localhost files]# fio --name=test --ioengine=libaio --fallocate=none --iodepth=4 --rw=write --bs=1M --size=8G --numjobs=1 --allow_mounted_write=1 --do_verify=0 --verify=pattern --verify_pattern=0x0A0B0C0D --filename=/mnt/ost0/testfile --group_reporting
test: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=4
fio-3.7
Starting 1 process
test: Laying out IO file (1 file / 8192MiB)
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=66.6MiB/s][r=0,w=66 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=10718: Wed May 19 22:26:15 2021
write: IOPS=89, BW=89.0MiB/s (93.3MB/s)(8192MiB/92043msec)
slat (usec): min=206, max=2843.9k, avg=11218.74, stdev=128346.34
clat (usec): min=3, max=2848.3k, avg=33712.67, stdev=221020.14
lat (usec): min=326, max=2852.2k, avg=44932.45, stdev=254463.05
clat percentiles (usec):
| 1.00th=[ 701], 5.00th=[ 750], 10.00th=[ 791],
| 20.00th=[ 865], 30.00th=[ 955], 40.00th=[ 1029],
| 50.00th=[ 1106], 60.00th=[ 1221], 70.00th=[ 1418],
| 80.00th=[ 2212], 90.00th=[ 8979], 95.00th=[ 16909],
| 99.00th=[1434452], 99.50th=[1602225], 99.90th=[2499806],
| 99.95th=[2566915], 99.99th=[2835350]
bw ( KiB/s): min=14307, max=411212, per=100.00%, avg=230380.26, stdev=94722.67, samples=72
iops : min= 13, max= 401, avg=224.57, stdev=92.53, samples=72
lat (usec) : 4=0.01%, 500=0.01%, 750=5.02%, 1000=31.23%
lat (msec) : 2=42.64%, 4=4.21%, 10=8.28%, 20=4.36%, 50=1.94%
lat (msec) : 100=0.22%, 750=0.04%, 1000=0.29%
cpu : usr=0.07%, sys=3.70%, ctx=2300, majf=0, minf=13
IO depths : 1=0.1%, 2=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,8192,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=4
Run status group 0 (all jobs):
WRITE: bw=89.0MiB/s (93.3MB/s), 89.0MiB/s-89.0MiB/s (93.3MB/s-93.3MB/s), io=8192MiB (8590MB), run=92043-92043msec
[root@localhost files]# echo 3 > /proc/sys/vm/drop_caches
[root@localhost files]# sync
[root@localhost files]# fio --name=test --ioengine=libaio --fallocate=none --iodepth=4 --rw=read --bs=1M --size=8G --numjobs=1 --allow_mounted_write=1 --do_verify=1 --verify=pattern --verify_pattern=0x0A0B0C0D --filename=/mnt/ost0/testfile --group_reporting
test: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=4
fio-3.7
Starting 1 process
Jobs: 1 (f=1): [V(1)][100.0%][r=503MiB/s,w=0KiB/s][r=503,w=0 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=27966: Wed May 19 22:26:59 2021
read: IOPS=353, BW=354MiB/s (371MB/s)(8192MiB/23161msec)
slat (usec): min=650, max=502000, avg=2625.01, stdev=13294.92
clat (usec): min=121, max=533083, avg=8483.66, stdev=23148.47
lat (usec): min=1190, max=534382, avg=11111.37, stdev=26851.01
clat percentiles (msec):
| 1.00th=[ 4], 5.00th=[ 4], 10.00th=[ 4], 20.00th=[ 5],
| 30.00th=[ 5], 40.00th=[ 6], 50.00th=[ 6], 60.00th=[ 7],
| 70.00th=[ 8], 80.00th=[ 9], 90.00th=[ 12], 95.00th=[ 16],
| 99.00th=[ 44], 99.50th=[ 117], 99.90th=[ 502], 99.95th=[ 506],
| 99.99th=[ 535]
bw ( KiB/s): min=39844, max=610304, per=99.63%, avg=360838.15, stdev=151393.14, samples=46
iops : min= 38, max= 596, avg=352.26, stdev=147.95, samples=46
lat (usec) : 250=0.01%
lat (msec) : 2=0.01%, 4=19.31%, 10=66.26%, 20=11.69%, 50=1.78%
lat (msec) : 100=0.37%, 250=0.34%, 500=0.10%, 750=0.12%
cpu : usr=5.66%, sys=39.62%, ctx=33035, majf=0, minf=525
IO depths : 1=0.1%, 2=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=8192,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=4
Run status group 0 (all jobs):
READ: bw=354MiB/s (371MB/s), 354MiB/s-354MiB/s (371MB/s-371MB/s), io=8192MiB (8590MB), run=23161-23161msec
[root@localhost files]# echo 3 > /proc/sys/vm/drop_caches
[root@localhost files]# dd if=/dev/zero of=file1 bs=1M count=1024 oflag=direct
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.812825 s, 1.3 GB/s
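At this point the entire backing file for one draid child, including its ZFS vdev labels, has been overwritten with zeros. As a quick sanity check (a sketch; the file name is the one zeroed above), the labels can be dumped with zdb:
# Sketch: confirm the zeroed vdev no longer carries valid ZFS labels.
zdb -l /root/test/files/file1
# On a fully zeroed device this should report that all four labels failed to
# unpack, confirming the child no longer identifies itself as a pool member.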
[root@localhost files]# echo 3 > /proc/sys/vm/drop_caches
[root@localhost files]# echo 3 > /proc/sys/vm/drop_caches
[root@localhost files]# fio --name=test --ioengine=libaio --fallocate=none --iodepth=4 --rw=read --bs=1M --size=8G --numjobs=1 --allow_mounted_write=1 --do_verify=1 --verify=pattern --verify_pattern=0x0A0B0C0D --filename=/mnt/ost0/testfile --group_reporting
test: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=4
fio-3.7
Starting 1 process
Jobs: 1 (f=1): [V(1)][100.0%][r=501MiB/s,w=0KiB/s][r=501,w=0 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=72740: Wed May 19 22:28:32 2021
read: IOPS=184, BW=185MiB/s (194MB/s)(8192MiB/44367msec)
slat (usec): min=548, max=501887, avg=5117.17, stdev=16143.65
clat (usec): min=131, max=1124.1k, avg=16247.34, stdev=36912.74
lat (usec): min=1040, max=1404.9k, avg=21368.78, stdev=45991.99
clat percentiles (msec):
| 1.00th=[ 3], 5.00th=[ 4], 10.00th=[ 4], 20.00th=[ 5],
| 30.00th=[ 6], 40.00th=[ 7], 50.00th=[ 8], 60.00th=[ 11],
| 70.00th=[ 14], 80.00th=[ 19], 90.00th=[ 33], 95.00th=[ 49],
| 99.00th=[ 138], 99.50th=[ 218], 99.90th=[ 506], 99.95th=[ 726],
| 99.99th=[ 1116]
bw ( KiB/s): min= 2043, max=913848, per=99.76%, avg=188614.28, stdev=158911.91, samples=88
iops : min= 1, max= 892, avg=183.91, stdev=155.26, samples=88
lat (usec) : 250=0.01%
lat (msec) : 2=0.01%, 4=15.56%, 10=43.88%, 20=22.18%, 50=13.67%
lat (msec) : 100=3.20%, 250=1.15%, 500=0.18%, 750=0.10%, 1000=0.02%
cpu : usr=3.78%, sys=21.27%, ctx=30665, majf=0, minf=524
IO depths : 1=0.1%, 2=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=8192,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=4
Run status group 0 (all jobs):
READ: bw=185MiB/s (194MB/s), 185MiB/s-185MiB/s (194MB/s-194MB/s), io=8192MiB (8590MB), run=44367-44367msec
[root@localhost files]#
[root@localhost ~]# zpool status -v 2
pool: pool-oss0
state: ONLINE
config:
errors: No known data errors
When there is no IO, the same ONLINE status repeats because the pool does not recognize the failure.
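Because the failure is only noticed once IO actually touches the damaged regions, it can help to follow the ZFS event stream from a second terminal while the IO or scrub runs. A minimal sketch (the pool name is the one created above):
# Sketch: follow ZFS events live; checksum/IO error events for the damaged
# child should appear as soon as the pool reads the zeroed regions.
zpool events -f pool-oss0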
While running IO from another terminal, the following status was observed:
pool: pool-oss0
state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
config:
errors: No known data errors
..............................
pool: pool-oss0
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver (draid2:11d:15c:2s-0) in progress since Wed May 19 22:27:49 2021
9.75G scanned at 714M/s, 8.31G issued at 608M/s, 9.75G total
655M resilvered, 100.00% done, 00:00:00 to go
config:
errors: No known data errors
pool: pool-oss0
state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
scan: scrub in progress since Wed May 19 22:28:03 2021
9.75G scanned at 4.88G/s, 855M issued at 428M/s, 9.75G total
0B repaired, 8.56% done, 00:00:21 to go
scan: resilvered (draid2:11d:15c:2s-0) 655M in 00:00:14 with 0 errors on Wed May 19 22:28:03 2021
config:
errors: No known data errors
[root@localhost ~]# zpool clear pool-oss0
[root@localhost ~]# zpool status -v
pool: pool-oss0
state: ONLINE
scan: scrub repaired 0B in 00:00:20 with 0 errors on Wed May 19 22:28:23 2021
scan: resilvered (draid2:11d:15c:2s-0) 655M in 00:00:14 with 0 errors on Wed May 19 22:28:03 2021
config:
errors: No known data errors
When IO was run from another terminal, the following status was observed:
[root@localhost ~]# zpool status -v
pool: pool-oss0
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: scrub repaired 0B in 00:00:20 with 0 errors on Wed May 19 22:28:23 2021
scan: resilver (draid2:11d:15c:2s-0) in progress since Wed May 19 22:29:00 2021
2.21G scanned at 693M/s, 1.83G issued at 575M/s, 9.75G total
100M resilvered, 22.65% done, 00:00:11 to go
config:
errors: No known data errors
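Based on the behaviour described above, the clear-then-retrigger loop at the core of this report can be reduced to a few commands. This is only a sketch; a scrub is used here to re-trigger the errors, but re-reading the data (e.g. the fio read/verify job above) has the same effect per the description:
zpool clear pool-oss0
zpool status pool-oss0   # pool and all vdevs report ONLINE again
zpool scrub pool-oss0    # or re-run the fio read/verify job
zpool status pool-oss0   # pool returns to DEGRADED while the scrub/IO runs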