ZFS hanging, eventually leading to system lockup #11754

Closed
dodexahedron opened this issue Mar 15, 2021 · 14 comments
Labels
Status: Stale No recent activity for issue Status: Triage Needed New issue which needs to be triaged Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@dodexahedron
Contributor

dodexahedron commented Mar 15, 2021

System information

Type Version/Name
Distribution Name CentOS
Distribution Version 8.3
Linux Kernel 5.11.6
Architecture x86_64
ZFS Version 2.0.4-1
SPL Version 2.0.4-1

Describe the problem you're observing

After an unspecified amount of time, under seemingly random conditions (i.e. the system is not particularly loaded, and I don't know of any specific activity that prompts it), I start getting a ton of these in the system logs. Eventually it gets bad enough that iSCSI commands time out (this system is an iSCSI target) and, after a while longer, the system becomes completely unresponsive.

dmesg output from when it starts:

[  247.003035] INFO: task z_livelist_dest:9607 blocked for more than 122 seconds.
[  247.003041]       Tainted: P           O      5.11.6-1.el8.elrepo.x86_64 #1
[  247.003043] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  247.003044] task:z_livelist_dest state:D stack:    0 pid: 9607 ppid:     2 flags:0x00004000
[  247.003048] Call Trace:
[  247.003051]  __schedule+0x39b/0x820
[  247.003058]  ? ret_from_fork+0x22/0x30
[  247.003064]  schedule+0x3c/0xa0
[  247.003066]  spl_panic+0xf9/0xfb [spl]
[  247.003086]  ? _cond_resched+0x15/0x30
[  247.003088]  ? spl_kmem_alloc+0xb6/0x100 [spl]
[  247.003094]  ? __cv_timedwait_common+0xe9/0x160 [spl]
[  247.003100]  ? __kmalloc_node+0x15d/0x320
[  247.003103]  ? spl_kmem_alloc+0xb6/0x100 [spl]
[  247.003109]  avl_add+0x9a/0xb0 [zavl]
[  247.003115]  dsl_livelist_iterate+0x120/0x1d0 [zfs]
[  247.003205]  bpobj_iterate_blkptrs+0x194/0x390 [zfs]
[  247.003265]  ? dsl_deadlist_load_cache+0x210/0x210 [zfs]
[  247.003327]  bpobj_iterate_impl+0x331/0x4a0 [zfs]
[  247.003387]  ? dsl_deadlist_load_cache+0x210/0x210 [zfs]
[  247.003449]  dsl_process_sub_livelist+0x6e/0xa0 [zfs]
[  247.003511]  ? dsl_deadlist_cache_compare+0x20/0x20 [zfs]
[  247.003572]  spa_livelist_delete_cb+0x18e/0x300 [zfs]
[  247.003654]  ? update_load_avg+0x7e/0x5f0
[  247.003658]  zthr_procedure+0x13c/0x160 [zfs]
[  247.003737]  ? zrl_is_locked+0x20/0x20 [zfs]
[  247.003806]  ? __thread_exit+0x20/0x20 [spl]
[  247.003813]  thread_generic_wrapper+0x6f/0x80 [spl]
[  247.003819]  kthread+0x116/0x130
[  247.003823]  ? kthread_park+0x80/0x80
[  247.003825]  ret_from_fork+0x22/0x30

This occurs on every boot of the machine, eventually.
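As a side note, the same blocked-task stacks can be dumped on demand, without waiting for the 122-second watchdog, via sysrq (a sketch; assumes sysrq is permitted on this host):

# echo 1 > /proc/sys/kernel/sysrq      # enable all sysrq functions, if not already allowed
# echo w > /proc/sysrq-trigger         # dump stacks of all tasks stuck in uninterruptible (D) state to dmesg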

Describe how to reproduce the problem

I am unsure what the trigger is. I do not see it on an identical system attached to the same SAS backplane, using 10 identical drives.

@dodexahedron dodexahedron added Status: Triage Needed New issue which needs to be triaged Type: Defect Incorrect behavior (e.g. crash, hang) labels Mar 15, 2021
@devZer0

devZer0 commented Mar 15, 2021

What about tracking disk latency with the zpool iostat latency counters (e.g. -vlTd or -w) to see whether this is related to the disks, which may announce that they are failing this way? In my experience, hung-task/latency issues can also stem from disk problems or storage driver/attachment issues.
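For example, something along these lines (a sketch; substitute your own pool name and interval):

# zpool iostat -vlTd pool2 5      # per-vdev latency columns, with a timestamp printed each interval
# zpool iostat -w pool2 60        # request latency histograms, handy for spotting outlier I/Os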

@dodexahedron
Contributor Author

I had already been watching the aggregate counters for the pool for a little while, to see if anything was building up:

# zpool iostat -lq pool2 5
              capacity     operations     bandwidth    total_wait     disk_wait    syncq_wait    asyncq_wait  scrub   trim   syncq_read    syncq_write   asyncq_read  asyncq_write   scrubq_read   trimq_write
pool        alloc   free   read  write   read  write   read  write   read  write   read  write   read  write   wait   wait   pend  activ   pend  activ   pend  activ   pend  activ   pend  activ   pend  activ
----------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
pool2       2.82T  4.44T  11.7K  60.2K   100M   819M  252us  800us  242us  113us    6us    1us  155us  732us  452us    2ms      0      0      0      0      0      0      0      0      0      0      7      2
pool2       2.82T  4.44T  16.6K  77.6K   143M  1.44G  314us    3ms  292us  189us   14us    3us  133us    3ms      -    1ms      0      3      0      0      0      0      0      0      0      0      0      0
pool2       2.82T  4.44T  23.7K  46.1K   198M   529M  224us  351us  218us   92us    2us    1us   98us  282us      -    2ms      0      9      0      3      0      0      0      0      0      0     15      4
pool2       2.82T  4.44T  8.62K  59.6K  81.4M   619M  237us  121us  225us   78us    2us    1us   97us   47us      -    2ms      0      6      0      0      0      0      0      0      0      0      7      2
pool2       2.82T  4.44T  11.5K  65.6K  97.3M   695M  229us  119us  223us   77us    3us    1us   83us   46us      -    1ms      0      0      0      1      0      0      0      0      0      0      0      0
pool2       2.82T  4.44T  16.5K  82.7K   139M  1.33G  323us    2ms  304us  163us   14us    1us  147us    2ms      -    1ms      0      2      0      0      0      0  1.12K     31      0      0      0      0
pool2       2.82T  4.44T  20.1K  50.6K   170M   618M  221us  202us  215us   89us    3us    1us   78us  121us      -    2ms      0      4      0      0      0      0      0      6      0      0     15      4
pool2       2.82T  4.44T  7.16K  61.9K  63.5M   740M  258us  534us  249us   94us    4us  953ns   87us  471us      -    1ms      0      0      0      1      0      0      0      0      0      0      0      0
pool2       2.82T  4.44T  10.1K  65.9K  87.0M   662M  235us  123us  226us   76us    3us    1us  107us   51us      -    1ms      0      4      0      0      0      0      0      0      0      0      0      0
pool2       2.82T  4.44T  20.6K  68.7K   175M  1.16G  313us    4ms  301us  170us    8us    1us  152us    3ms      -    1ms      0      3      0      0      0      0      0      0      0      0      0      0
pool2       2.82T  4.44T  17.5K  83.4K   149M  1.44G  302us    4ms  279us  181us   13us    6us  229us    4ms      -    2us      0     17      0      0      0      0      0      0      0      0      0      0
pool2       2.82T  4.45T  8.05K  68.7K  73.8M  1.14G  331us    3ms  289us  160us   22us  984ns  205us    3ms      -    2ms      0      6      0      0      0      0      0      0      0      0      7      2
pool2       2.82T  4.45T  9.29K  35.8K  85.1M   414M  225us  135us  216us   87us    2us  994ns   95us   52us      -    3ms      0      0      0      0      0      0      4     10      0      0     15      4
pool2       2.82T  4.45T  16.1K  78.8K   135M  1.36G  305us    4ms  288us  179us   12us    1us  136us    3ms      -    2ms      0      8      0      0      0      0      0      0      0      0      0      0
pool2       2.82T  4.45T  19.0K  60.2K   161M   921M  267us    2ms  256us  134us    6us    1us  132us    1ms      -    1ms      0      8      0      0      0      0      0      0      0      0      7      2
pool2       2.82T  4.45T  8.48K  43.0K  90.9M   632M  254us  147us  243us   95us    2us    1us  108us   56us      -    3ms      0      1      0      0      0      0      0      4      0      0      0      0
pool2       2.82T  4.45T  11.3K  62.0K  99.0M   732M  252us  123us  242us   81us    3us    1us  106us   45us      -    1ms      0      0      0      0      0      0      3     22      0      0      0      0
pool2       2.82T  4.44T  15.4K  75.3K   130M  1.32G  302us    3ms  287us  176us    8us    1us  146us    3ms      -    1ms      0      6      0      0      1      5    676     24      0      0      0      0
pool2       2.82T  4.45T  19.8K  56.2K   182M   731M  235us  550us  225us  103us    5us    1us   79us  473us      -    2ms      0      3      0      0      0      0      0      0      0      0     15      8
pool2       2.82T  4.45T  8.64K  54.7K   112M   620M  252us  251us  242us   87us    3us    1us  146us  179us      -    1ms      0      0      0      0      0      0      9     16      0      0      0      0
pool2       2.82T  4.45T  9.29K  61.6K  83.5M   628M  254us  110us  236us   75us    3us    1us  166us   39us      -    1ms      0      0      0      1      0      0      0      0      0      0      8      4
pool2       2.82T  4.44T  15.8K  82.2K   133M  1.38G  297us    4ms  277us  168us   10us    1us  217us    3ms      -    1ms      0      5      0      0      0      0      0      0      0      0      0      0
pool2       2.82T  4.45T  20.8K  50.8K   173M   605M  227us  515us  218us   94us    3us    1us  204us  455us      -    2ms      0      0      0      1      0      0      0      0      0      0      0      0
pool2       2.82T  4.45T  4.35K  60.9K  38.8M   625M  232us  114us  223us   76us    3us    1us   79us   42us      -    1ms      0      0      0      0      0      0      0      0      0      0      7      2
pool2       2.82T  4.45T  11.6K  63.6K   104M   642M  239us  114us  232us   75us    3us  928ns   79us   43us      -    1ms      0      8      0      0      0      0      0      0      0      0      0      0
pool2       2.82T  4.44T  14.7K  81.2K   125M  1.36G  319us    4ms  298us  170us   12us  917ns  227us    4ms      -    1ms      0      0      0      0      0      0  4.24K     40      0      0      0      0
pool2       2.82T  4.45T  17.2K  52.0K   147M   811M  234us    1ms  226us  137us    4us    1us   91us    1ms      -    2ms      0      0      0      0      0      0     20     17      0      0      7      2
pool2       2.82T  4.45T  5.27K  54.6K  46.0M   567M  239us  115us  230us   76us    2us  972ns   79us   43us      -    2ms      0      0      0      0      0      0      0      3      0      0      0      0
pool2       2.82T  4.45T  9.65K  63.0K  82.7M   678M  227us  112us  220us   75us    3us  991ns   90us   40us      -    1ms      0      0      0      0      0      0      0      0      0      0     15      6
pool2       2.82T  4.45T  18.1K  57.8K   154M   933M  265us    3ms  257us  157us    6us  908ns   75us    3ms      -    1ms      0      8      0      0      0      0  2.09K     25      0      0      0      0
pool2       2.82T  4.45T  18.6K  75.7K   157M  1010M  237us    1ms  227us  113us    6us    1us   85us    1ms      -    1ms      0      1      0      0      0      0      0      0      0      0      0      0
pool2       2.82T  4.45T  5.07K  57.0K  47.1M   667M  247us  377us  236us   86us    2us    1us   81us  314us      -    1ms      0     17      0      0      0      0      4      4      0      0      4      2
pool2       2.82T  4.45T  9.28K  64.7K  79.9M   651M  236us  115us  229us   75us    3us  996ns   71us   44us      -    1ms      0      0      0      0      0      0      0      0      0      0      0      0
pool2       2.82T  4.44T  15.3K  82.5K   129M  1.35G  287us    3ms  276us  162us    9us    1us   66us    3ms      -    1ms      0      2      0      0      0      0      0      0      0      0      0      0
pool2       2.82T  4.45T  20.2K  46.8K   168M   542M  228us  151us  223us   86us    3us    1us   85us   70us      -    2ms      0      0      0      0      0      0      0      0      0      0      7      2
pool2       2.82T  4.45T  3.78K  60.8K  37.3M   654M  248us  117us  232us   76us    2us  781ns   95us   45us      -    1ms      0      0      0      0      0      0      0      0      0      0      2      2
pool2       2.82T  4.45T  11.8K  62.6K   105M   660M  247us  116us  236us   74us    3us  991ns   88us   45us      -    1ms      0      0      0      0      0      0      0      0      0      0     12      4
pool2       2.82T  4.44T  17.1K  91.8K   146M  1.30G  281us    3ms  268us  142us    9us    2us   97us    3ms      -    1ms      0     13      0      0      0      0  1.06K     41      0      0      0      0
pool2       2.82T  4.45T  22.1K  46.4K   183M   558M  224us  136us  219us   87us    3us    1us   81us   52us      -    1ms      0      0      0      0      0      0      0      0      0      0     15      4
pool2       2.82T  4.45T  1.60K  61.5K  18.3M   638M  283us  111us  247us   73us    2us  959ns  140us   41us      -    2ms      0      0      0      0      0      0      3     19      0      0      0      0
pool2       2.82T  4.45T  6.03K  65.4K  56.6M   691M  242us  116us  231us   75us    3us  890ns   87us   44us      -    1ms      0      0      0      0      0      0      5     14      0      0      6      2
pool2       2.82T  4.45T  19.2K  81.9K   163M  1.32G  299us    2ms  283us  153us   11us    1us  139us    2ms      -    1ms      0      0      0      0      0      0      0      0      0      0      9      6
pool2       2.82T  4.45T  21.3K  46.7K   181M   538M  230us  125us  223us   85us    3us    1us   80us   44us      -    2ms      0      4      0      1      0      0      0      0      0      0     31      8
pool2       2.82T  4.45T  8.36K  58.6K  76.6M   642M  246us  133us  228us   78us    2us    1us  151us   59us      -    2ms      0      0      0      0      0      0      4     22      0      0     12      4
pool2       2.82T  4.45T  11.6K  63.5K   101M   672M  229us  116us  219us   75us    3us  917ns  127us   45us      -    1ms      0      0      0      1      0      0      0      0      0      0      0      0
pool2       2.82T  4.45T  20.7K  94.2K   180M  1.44G  311us    3ms  290us  150us   11us  990ns  159us    3ms      -    1ms      0      2      0      0      0      0      0      0      0      0      0      0
pool2       2.82T  4.45T  14.6K  58.6K   121M   758M  240us    1ms  235us  107us    4us    1us   83us    1ms      -    2ms      0      9      0      0      0      0      0      0      0      0     15      4
pool2       2.82T  4.45T  18.8K  37.0K   156M   443M  244us  269us  240us   96us    3us    1us   90us  186us      -    2ms      0      0      0      0      0      0  1.08K     37      0      0      0      0
pool2       2.82T  4.45T      8  98.8K  3.86M  1.51G  623us    2ms  618us  137us    4us    2us      -    2ms      -    1ms      0      0      0      0      0      0    194     29      0      0      0      0
pool2       2.82T  4.45T  12.5K  82.7K   101M  1.33G  300us    3ms  297us  151us    2us  972ns      -    3ms      -    2ms      0     22      0      0      0      0      0      0      0      0      0      0
pool2       2.82T  4.45T  2.39K  82.1K  19.1M  1.54G  219us    3ms  217us  183us    2us    1us      -    3ms      -    3ms      0      0      0      1      0      0      0      0      0      0     31      8
pool2       2.82T  4.45T      6  45.2K  3.93M   513M  682us  126us  681us   83us    1us  975ns      -   48us      -    2ms      0      0      0      0      0      0      0      0      0      0      7      2
pool2       2.82T  4.45T      8  53.2K  67.2K   561M  210us  115us  210us   76us    1us  886ns      -   43us      -    2ms      0      0      0      0      0      0      0      0      0      0     22      6
pool2       2.82T  4.45T  6.42K  82.9K  52.9M  1.08G  286us    3ms  285us  122us    2us  924ns      -    3ms      -    1ms      0      0      0      0      0      0  2.83K     40      0      0      0      0
pool2       2.82T  4.45T  5.58K  98.1K  47.3M  1.71G  279us    3ms  278us  173us    2us    9us      -    3ms      -    3us      0      0      0      0      0      0     47     37      0      0      0      0
pool2       2.82T  4.45T    511  45.3K  4.41M   560M  297us  408us  295us   93us    2us  826ns      -  338us      -    2ms      0      0      0      0      0      0      0      0      0      0      0      0
pool2       2.82T  4.45T      4  56.9K  36.8K   583M  179us  113us  179us   73us    1us  867ns      -   43us      -    2ms      0      0      0      0      0      0      9     17      0      0      0      0
pool2       2.82T  4.45T  12.0K  86.8K  96.3M  1.31G  307us    3ms  305us  142us    2us  894ns      -    3ms      -    1ms      0      1      0      0      0      0      0      0      0      0      0      0
pool2       2.81T  4.45T  1.48K  78.1K  14.3M  1.39G  186us    3ms  185us  172us    1us    1us      -    2ms      -    2ms      0      0      0      0      0      0      0      0      0      0      7      2
pool2       2.81T  4.45T      8  45.4K   470K   539M  254us  118us  252us   82us    2us    1us      -   39us      -    2ms      0      0      0      0      0      0      0      0      0      0     22      6

When running with -vlTd, the individual drives do not deviate from those numbers.

# zpool iostat -vlTd pool2 5
Mon Mar 15 13:48:44 MST 2021
                capacity     operations     bandwidth    total_wait     disk_wait    syncq_wait    asyncq_wait  scrub   trim
pool          alloc   free   read  write   read  write   read  write   read  write   read  write   read  write   wait   wait
------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
pool2         2.76T  4.51T  9.56K  62.5K  82.0M   926M  269us    1ms  258us  131us    6us    1us  163us    1ms  452us    2ms
  mirror       564G   924G  1.91K  12.9K  16.4M   188M  302us    1ms  290us  126us    6us    1us  211us    1ms    1ms    2ms
    pool2-1       -      -    979  6.42K  8.23M  94.1M  302us    1ms  290us  127us    6us    1us  212us    1ms    2ms    2ms
    pool2-2       -      -    979  6.46K  8.20M  94.1M  301us    1ms  290us  126us    6us  988ns  210us    1ms  329us    2ms
  mirror       564G   924G  1.91K  12.9K  16.4M   188M  273us    1ms  262us  129us    6us    1us  168us    1ms  136us    2ms
    pool2-3       -      -    944  6.41K  7.94M  93.9M  301us    1ms  290us  127us    6us    1us  210us    1ms  143us    2ms
    pool2-4       -      -   1007  6.45K  8.42M  93.9M  246us    1ms  236us  131us    6us    1us  131us    1ms  130us    2ms
  mirror       564G   924G  1.92K  12.5K  16.4M   185M  250us    1ms  240us  132us    6us    1us  138us    1ms  207us    2ms
    pool2-5       -      -    980  6.25K  8.20M  92.3M  252us    1ms  241us  133us    6us    1us  139us    1ms  227us    2ms
    pool2-6       -      -    980  6.29K  8.20M  92.3M  249us    1ms  238us  132us    6us    1us  137us    1ms  186us    2ms
  mirror       564G   924G  1.92K  12.0K  16.4M   181M  249us    1ms  238us  135us    6us    1us  132us    1ms  248us    2ms
    pool2-7       -      -    979  6.00K  8.19M  90.7M  249us    1ms  238us  136us    6us    1us  137us    1ms  247us    2ms
    pool2-8       -      -    982  6.04K  8.20M  90.7M  248us    1ms  238us  134us    6us    1us  127us    1ms  248us    2ms
  mirror       564G   924G  1.91K  12.2K  16.4M   184M  273us    1ms  263us  132us    6us    1us  170us    1ms  218us    2ms
    pool2-9       -      -   1012  6.06K  8.45M  91.9M  246us    1ms  236us  135us    6us    1us  133us    1ms  120us    2ms
    pool2-10      -      -    948  6.10K  7.97M  91.9M  302us    1ms  291us  130us    6us  996ns  211us    1ms  327us    2ms
------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
Mon Mar 15 13:48:49 MST 2021
                capacity     operations     bandwidth    total_wait     disk_wait    syncq_wait    asyncq_wait  scrub   trim
pool          alloc   free   read  write   read  write   read  write   read  write   read  write   read  write   wait   wait
------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
pool2         2.76T  4.51T     35  56.3K   285K   651M  245us  193us  243us   84us    2us  773ns      -  117us      -    2ms
  mirror       564G   924G      6  11.3K  49.6K   131M  221us  173us  221us   78us    1us  769ns      -  101us      -    1ms
    pool2-1       -      -      3  5.62K  25.6K  65.3M  233us  171us  233us   78us    1us  795ns      -   98us      -    1ms
    pool2-2       -      -      2  5.64K  24.0K  65.3M  209us  175us  209us   78us    1us  744ns      -  104us      -    2ms
  mirror       564G   924G      7  12.7K  57.6K   142M  283us  209us  278us   82us    2us  769ns      -  136us      -    2ms
    pool2-3       -      -      2  6.37K  22.4K  70.9M  337us  178us  337us   77us    2us  773ns      -  109us      -    1ms
    pool2-4       -      -      4  6.38K  35.2K  70.9M  250us  240us  241us   87us    2us  764ns      -  162us      -    2ms
  mirror       564G   924G      7  12.4K  60.8K   139M  235us  231us  235us   86us    2us  783ns      -  153us      -    2ms
    pool2-5       -      -      2  6.21K  22.4K  69.5M  175us  248us  175us   87us    2us  813ns      -  169us      -    2ms
    pool2-6       -      -      4  6.23K  38.4K  69.5M  270us  214us  270us   86us    2us  752ns      -  137us      -    2ms
  mirror       564G   924G      5  10.9K  46.4K   133M  218us  151us  215us   87us    2us  775ns      -   69us      -    2ms
    pool2-7       -      -      2  5.41K  22.4K  66.7M  238us  152us  238us   88us    1us  796ns      -   69us      -    2ms
    pool2-8       -      -      2  5.45K  24.0K  66.7M  199us  151us  193us   86us    2us  755ns      -   69us      -    3ms
  mirror       564G   924G      8  9.01K  70.4K   107M  256us  195us  256us   86us    3us  770ns      -  121us      -    2ms
    pool2-9       -      -      3  4.47K  28.8K  53.4M  245us  208us  245us   93us    3us  800ns      -  127us      -    2ms
    pool2-10      -      -      5  4.53K  41.6K  53.4M  264us  182us  264us   79us    2us  741ns      -  114us      -    1ms
------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----

SMART status across all drives shows no errors, and they all look exactly like this:

# smartctl -a /dev/sdal
smartctl 7.1 2020-04-05 r5049 [x86_64-linux-5.11.6-1.el8.elrepo.x86_64] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               HGST
Product:              SDLL1DLR960GCCA1
Revision:             Y130
Compliance:           SPC-4
User Capacity:        960,197,124,096 bytes [960 GB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
LU is resource provisioned, LBPRZ=1
Rotation Rate:        Solid State Device
Form Factor:          2.5 inches
Logical Unit id:      0x50011731013e3bd4
Serial number:        A0354DF2
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Mon Mar 15 13:46:13 2021 MST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Percentage used endurance indicator: 1%
Grown defects during certification = 0
Total blocks reassigned during format = 0
Total new blocks reassigned = 0
Power on minutes since format = 1693595
Current Drive Temperature:     35 C
Drive Trip Temperature:        70 C

Manufactured in week 47 of year 2017
Specified cycle count over device lifetime:  0
Accumulated start-stop cycles:  520
Elements in grown defect list: 0

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0     186630.319           0
write:         0        0         0         0          0      85393.407           0
verify:        0        0         0         0          0          2.293           0

Non-medium error count:    0

No Self-tests have been logged
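For reference, a loop along these lines can repeat the quick health check across all drives (a sketch; the device glob is an assumption and may need adjusting for the multipath device names on this box):

# for d in /dev/sd[a-z] /dev/sd[a-z][a-z]; do echo "== $d =="; smartctl -H "$d"; done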

@dodexahedron
Contributor Author

dodexahedron commented Mar 15, 2021

The two main services running on this are NFS and SCST (for the iSCSI target).
It's serving up zvols as block devices via iSCSI for VMware, and exporting a few folders via NFS for security camera recording.

For what it's worth:
One thing that changed about a week ago was that we turned on zstd compression for the zvols.
The problem did not occur before last week.
The zvols were set to zstd-3.
Noting the change, I set them back to lz4, but the behavior kept happening. I'm sure there are still hundreds of GB of zstd-compressed blocks in there, though.

Other stuff that may or may not be useful to know:
Deduplication is ON, and yes, we have plenty of memory (128GB, and the system never touches its swap space).
We have also tweaked the following parameters on this system:

options zfs zfs_arc_max=85899345920
options zfs zfs_dedup_prefetch=1
options zfs zfs_vdev_scrub_max_active=6
options zfs zfs_vdev_async_write_min_active=4
options zfs zfs_max_recordsize=16777216
options zfs metaslab_lba_weighting_enabled=0
options zfs zfs_vdev_async_write_max_active=24
options zfs zfs_vdev_sync_read_max_active=16
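As a sketch of how to cross-check them, the values the loaded module is actually using are readable from sysfs:

# grep -H . /sys/module/zfs/parameters/zfs_arc_max \
            /sys/module/zfs/parameters/zfs_max_recordsize \
            /sys/module/zfs/parameters/zfs_dedup_prefetch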

@dodexahedron
Contributor Author

dodexahedron commented Mar 15, 2021

Another bit of odd behavior:

Though compression is on, the reported compressratio is 1.00x for all of the zvols on this system.
The data is a bunch of VMs and should be getting at least around 1.2-1.5x, based on similar VMs on another pool that is also set to lz4.

The other pool on the other system never had its zvols set to zstd, so this additional oddity makes me suspect something is up when zstd is used on a zvol.

Strike that. The ashift is 13 on this pool, so getting no compression makes sense with the default volblocksize of 8K: a block can never be compressed below a single 8K sector.
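For anyone checking the same thing on their own pool, the relevant properties are (a sketch; pool2/somezvol is a placeholder name):

# zpool get ashift pool2
# zfs get volblocksize,compression,compressratio pool2/somezvol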

@pgeorgi

pgeorgi commented Mar 17, 2021

[  247.003109]  avl_add+0x9a/0xb0 [zavl]
[  247.003115]  dsl_livelist_iterate+0x120/0x1d0 [zfs]

This has been reported at least twice by now; possibly a duplicate of issue #11480?

@PrivatePuffin
Contributor

Deduplication is ON, and yes, we have plenty of memory (128GB, and the system never touches its swap space).
We have also tweaked the following parameters on this system:

Worth noting: ZSTD is not advised for use with deduplication, because it was built with future compressor updates in mind.
While those updates are currently on hold, due to issues in other parts of ZFS that need to be solved first, there is no guarantee that blocks will keep matching (and thus deduplicating) after a ZSTD update lands.
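A quick way to see where the two features overlap within a pool (a sketch, reusing the pool name from above):

# zfs get -r -t filesystem,volume compression,dedup pool2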

@dodexahedron
Contributor Author

Deduplication is ON, and yes, we have plenty of memory (128GB, and the system never touches its swap space).
We have also tweaked the following parameters on this system:

Worth noting: ZSTD is not advised for use with deduplication, because it was built with future compressor updates in mind.
While those updates are currently on hold, due to issues in other parts of ZFS that need to be solved first, there is no guarantee that blocks will keep matching (and thus deduplicating) after a ZSTD update lands.

Good to know.
Thanks.
I'm rebuilding that pool with ashift=12 and all datasets set back to lz4 for now, since dedup is saving a significant amount of space and the compression gains from zstd (at zstd-4) were not particularly significant over lz4 anyway.

A question:
If dedup is enabled on some datasets in the pool but not others, is it still safe to dedup the non-zstd datasets, or should zstd and dedup not coexist in the same pool, period?

@dodexahedron
Contributor Author

dodexahedron commented Apr 1, 2021

Update:
On a completely new pool, where zstd has never been used, ZFS is still causing deadlocks with the same stack trace.
The pool layout is the same as before, with 10 SAS SSDs and the same mix of zvols and datasets.

Additional stuff to note that I hadn't mentioned before:
The drives are using multipathd, with two ports on the same controller going to the two redundant backplanes in the enclosure where the drives live, using the round-robin policy.
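The path topology and the selection policy in use can be confirmed from multipathd's view of the devices (a sketch; run as root):

# multipath -ll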

@dodexahedron
Contributor Author

And here's the latest stack trace:

INFO: task z_wr_iss:9171 blocked for more than 122 seconds.
Apr 01 14:46:38 hostname kernel: Tainted: P O 5.11.8-1.el8.elrepo.x86_64 #1
Apr 01 14:46:38 hostname kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 01 14:46:38 hostname kernel: task:z_wr_iss state:D stack: 0 pid: 9171 ppid: 2 flags:0x00004000
Apr 01 14:46:38 hostname kernel: Call Trace:
Apr 01 14:46:38 hostname kernel: __schedule+0x39b/0x820
Apr 01 14:46:38 hostname kernel: ? zio_vdev_io_done+0x49/0x1f0 [zfs]
Apr 01 14:46:38 hostname kernel: schedule+0x3c/0xa0
Apr 01 14:46:38 hostname kernel: rwsem_down_write_slowpath+0x2ff/0x5c0
Apr 01 14:46:38 hostname kernel: dmu_buf_lock_parent+0x6e/0xe0 [zfs]
Apr 01 14:46:38 hostname kernel: ? _cond_resched+0x15/0x30
Apr 01 14:46:38 hostname kernel: ? mutex_lock+0xe/0x30
Apr 01 14:46:38 hostname kernel: dbuf_write_ready+0x1dc/0x390 [zfs]
Apr 01 14:46:38 hostname kernel: arc_write_ready+0xac/0x590 [zfs]
Apr 01 14:46:38 hostname kernel: ? _cond_resched+0x15/0x30
Apr 01 14:46:38 hostname kernel: ? mutex_lock+0xe/0x30
Apr 01 14:46:38 hostname kernel: zio_ready+0x45/0x450 [zfs]
Apr 01 14:46:38 hostname kernel: zio_execute+0x90/0xf0 [zfs]
Apr 01 14:46:38 hostname kernel: taskq_thread+0x2d0/0x490 [spl]
Apr 01 14:46:38 hostname kernel: ? wake_up_q+0xa0/0xa0
Apr 01 14:46:38 hostname kernel: ? zio_taskq_member.isra.14.constprop.20+0x60/0x60 [zfs]
Apr 01 14:46:38 hostname kernel: ? task_done+0x90/0x90 [spl]
Apr 01 14:46:38 hostname kernel: kthread+0x116/0x130
Apr 01 14:46:38 hostname kernel: ? kthread_park+0x80/0x80
Apr 01 14:46:38 hostname kernel: ret_from_fork+0x22/0x30

@InsanePrawn
Contributor

One of our three dedicated servers is repeatedly experiencing the same symptoms, although I can't even get the trace due to a lack of physical access/IPMI/serial.

They're all Intel quadcores with 64GB of RAM, running Debian 10, Linux 4.19, ZFS 0.8.6-1 from buster-backports, no dedup, no ZSTD, no regular pool or SMART errors.
The interesting difference is that unlike the well-behaved hosts, host 3 has two mirrored SATA SSDs instead of two mirrored SATA HDDs.
It feels like I've seen this situation before (ZFS 0.8, host with HDDs works mostly fine, host with SSDs locks up regularly), but I no longer work for that place.

Backports has since moved to 2.0.3-1, which an apt upgrade just installed on the third host, but I can still roll back to 0.8.6-1 easily; I haven't even rebooted into the new modules yet. Either way, I suspect I'll experience another lockup within about 3 weeks.
When this happens, the machine usually still responds to pings, and the nginx proxy even keeps serving 502s, but SSH logins don't succeed, logging stops, etc.

@InsanePrawn
Contributor

Ha! Thanks to Prometheus federation I can definitely confirm it was something filesystem-related, though. Note the 100% iowait.
(Two Prometheus graph screenshots were attached here, showing iowait pinned at 100% during the lockup.)

@cinterloper

Is this addressed in 2.0.5?

@stale

stale bot commented Jun 27, 2022

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale No recent activity for issue label Jun 27, 2022
@behlendorf
Contributor

Closing. I believe this was resolved in the 2.1.0 release by a377bde. That said, if this has been observed in the 2.1.x releases, please let me know and I'll reopen this.
