Fix vdev_queue_aggregate() deadlock
This deadlock may manifest itself in slightly different ways, but
at its core it is caused by a memory allocation in the zio pipeline
blocking on filesystem reclaim.  This is normally impossible
because zio_execute() disables filesystem reclaim by setting
PF_FSTRANS on the thread.  However, kmem cache allocations may
still indirectly block on filesystem reclaim while holding the
critical vq->vq_lock, as shown below.

To resolve this issue zio_buf_alloc_flags() is introduced, which
allows allocation flags to be passed.  This can then be used in
vdev_queue_aggregate() with KM_NOSLEEP when allocating the
aggregate IO buffer.  Since aggregating the IO is purely a
performance optimization we want this to either succeed or fail
quickly.  Trying too hard to allocate this memory under the
vq->vq_lock can negatively impact performance and result in
this deadlock.
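
The fail-fast pattern described above can be sketched in userspace C.
This is only an illustration under stated assumptions: every
sketch_* name and the SKETCH_* flags are hypothetical stand-ins, not
the ZFS or SPL interfaces, and memory pressure is simulated with a
global switch because a userspace example cannot block on reclaim.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's KM_SLEEP / KM_NOSLEEP flags. */
enum { SKETCH_SLEEP = 0, SKETCH_NOSLEEP = 1 };

/* Simulated pressure: nonzero means memory is not immediately available. */
static int sketch_memory_pressure = 0;

/*
 * Flag-taking allocator in the spirit of zio_buf_alloc_flags().
 * With SKETCH_NOSLEEP it returns NULL at once under pressure.  A
 * sleeping allocation would wait here instead; this sketch cannot
 * model that wait, so SKETCH_SLEEP simply allocates.
 */
static void *
sketch_buf_alloc_flags(size_t size, int flags)
{
	assert(size > 0);
	if (sketch_memory_pressure && (flags & SKETCH_NOSLEEP))
		return (NULL);
	return (malloc(size));
}

/*
 * Try to obtain one aggregate buffer of `size` bytes.  The caller is
 * assumed to hold the queue lock, so this must never wait: NULL means
 * "skip aggregation", not "error".
 */
static void *
sketch_aggregate(size_t size)
{
	void *buf = sketch_buf_alloc_flags(size, SKETCH_NOSLEEP);

	if (buf == NULL)
		return (NULL);
	return (buf);
}

/* Returns 1 if the I/O went out aggregated, 0 if it went out on its own. */
static int
sketch_issue(size_t size)
{
	void *agg = sketch_aggregate(size);

	if (agg == NULL)
		return (0);	/* fall back: issue the single I/O as-is */
	free(agg);
	return (1);
}
```

Calling sketch_issue(4096) with sketch_memory_pressure cleared takes
the aggregated path; with it set, the call returns immediately and
the I/O is issued unaggregated, which costs some throughput but
never stalls the lock holder.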

* z_wr_iss
zio_vdev_io_start
  vdev_queue_io -> Takes vq->vq_lock
    vdev_queue_io_to_issue
      vdev_queue_aggregate
        zio_buf_alloc -> Waiting on spl_kmem_cache process

* z_wr_int
zio_vdev_io_done
  vdev_queue_io_done
    mutex_lock -> Waiting on vq->vq_lock held by z_wr_iss

* txg_sync
spa_sync
  dsl_pool_sync
    zio_wait -> Waiting on zio being handled by z_wr_int

* spl_kmem_cache
spl_cache_grow_work
  kv_alloc
    spl_vmalloc
      ...
      evict
        zpl_evict_inode
          zfs_inactive
            dmu_tx_wait
              txg_wait_open -> Waiting on txg_sync
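
The four traces above form a closed wait-for chain.  A minimal
sketch that encodes those edges and walks them is given here; the
table layout, the sketch_cycle_length() helper and the NOBODY
sentinel are assumptions made for illustration, and the "after"
table simply models the KM_NOSLEEP allocation as never waiting.

```c
#include <assert.h>
#include <stddef.h>

/* The four threads from the traces, plus a "waits on nobody" sentinel. */
enum { Z_WR_ISS, Z_WR_INT, TXG_SYNC, SPL_KMEM_CACHE, NTHREADS, NOBODY = -1 };

/* Wait-for edges before the fix, taken from the traces. */
static const int waits_before[NTHREADS] = {
	[Z_WR_ISS]       = SPL_KMEM_CACHE, /* zio_buf_alloc waits on cache growth */
	[Z_WR_INT]       = Z_WR_ISS,       /* mutex_lock waits on vq->vq_lock */
	[TXG_SYNC]       = Z_WR_INT,       /* zio_wait waits on the zio */
	[SPL_KMEM_CACHE] = TXG_SYNC,       /* txg_wait_open waits on txg_sync */
};

/* Assumed edges after the fix: the allocation fails fast, so no wait. */
static const int waits_after[NTHREADS] = {
	[Z_WR_ISS]       = NOBODY,
	[Z_WR_INT]       = Z_WR_ISS,
	[TXG_SYNC]       = Z_WR_INT,
	[SPL_KMEM_CACHE] = TXG_SYNC,
};

/*
 * Follow wait-for edges from `start`.  Returns the number of hops
 * needed to arrive back at `start`, or 0 if the chain ends first.
 */
static int
sketch_cycle_length(const int *waits_on, int start)
{
	int t = start;

	assert(start >= 0 && start < NTHREADS);
	for (int hops = 1; hops <= NTHREADS; hops++) {
		t = waits_on[t];
		if (t == NOBODY)
			return (0);
		if (t == start)
			return (hops);
	}
	return (0);
}
```

With waits_before every thread sits on a four-hop cycle, so none of
them can make progress.  With waits_after the chain ends at z_wr_iss
and no cycle is found from any starting thread.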

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Closes openzfs#3808
Closes openzfs#3867
behlendorf authored and goulven riou committed Feb 2, 2016
1 parent abe2650 commit b7588ac
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions module/zfs/vdev_queue.c
@@ -606,6 +606,10 @@ vdev_queue_aggregate(vdev_queue_t *vq, zio_t *zio)
	size = IO_SPAN(first, last);
	ASSERT3U(size, <=, limit);

	buf = zio_buf_alloc_flags(size, KM_NOSLEEP);
	if (buf == NULL)
		return (NULL);

	buf = zio_buf_alloc_flags(size, KM_NOSLEEP);
	if (buf == NULL)
		return (NULL);