Skip to content

Commit

Permalink
[RFC, discussion, feedback] account for ashift when choosing buffers …
Browse files Browse the repository at this point in the history
…to be written to l2arc device

If we don't account for that, then we might end up overwriting disk
area of buffers that have not been evicted yet, because l2arc_evict
operates in terms of disk addresses.

The discrepancy between the write size calculation and the actual increment
to l2ad_hand was introduced in
commit e14bb3258d05c1b1077e2db7cf77088924e56919

Also, consistently use asize / a_sz for the allocated size, psize / p_sz
for the physical size.  Where the latter accounts for possible size
reduction because of compression, whereas the former accounts for possible
size expansion because of alignment requirements.

The code still assumes that either underlying storage subsystems or
hardware is able to do read-modify-write when an L2ARC buffer size is
not a multiple of a disk's block size.  This is true for 4KB sector disks
that provide 512B sector emulation, but may not be true in general.
In other words, we currently do not have any code to make sure that
an L2ARC buffer, whether compressed or not, which is used for physical I/O
has a suitable size.

modified to account for the changes of

openzfs#2129 (ABD) ,

openzfs#3115 (Illumos - 5408 managing ZFS cache devices requires lots of RAM)

and

Illumos - 5701 zpool list reports incorrect "alloc" value for cache devices
  • Loading branch information
avg-I authored and kernelOfTruth committed Jun 8, 2015
1 parent 19b6408 commit a7683f5
Showing 1 changed file with 19 additions and 10 deletions.
29 changes: 19 additions & 10 deletions module/zfs/arc.c
Original file line number Diff line number Diff line change
Expand Up @@ -5693,6 +5693,7 @@ l2arc_write_buffers(spa_t *spa, l2arc_dev_t *dev, uint64_t target_sz,
for (; hdr; hdr = hdr_prev) {
kmutex_t *hash_lock;
uint64_t buf_sz;
uint64_t buf_a_sz;

if (arc_warm == B_FALSE)
hdr_prev = multilist_sublist_next(mls, hdr);
Expand Down Expand Up @@ -5721,7 +5722,15 @@ l2arc_write_buffers(spa_t *spa, l2arc_dev_t *dev, uint64_t target_sz,
continue;
}

if ((write_sz + hdr->b_size) > target_sz) {
/*
* Assume that the buffer is not going to be compressed
* and could take more space on disk because of a larger
* disk block size.
*/
buf_sz = hdr->b_size;
buf_a_sz = vdev_psize_to_asize(dev->l2ad_vdev, buf_sz);

if ((write_asize + buf_a_sz) > target_sz) {
full = B_TRUE;
mutex_exit(hash_lock);
break;
Expand Down Expand Up @@ -5787,7 +5796,6 @@ l2arc_write_buffers(spa_t *spa, l2arc_dev_t *dev, uint64_t target_sz,
*/
hdr->b_l2hdr.b_daddr = L2ARC_ADDR_UNSET;

buf_sz = hdr->b_size;
hdr->b_flags |= ARC_FLAG_HAS_L2HDR;

mutex_enter(&dev->l2ad_mtx);
Expand All @@ -5804,6 +5812,7 @@ l2arc_write_buffers(spa_t *spa, l2arc_dev_t *dev, uint64_t target_sz,
mutex_exit(hash_lock);

write_sz += buf_sz;
write_asize += buf_a_sz;
}

multilist_sublist_unlock(mls);
Expand All @@ -5827,6 +5836,7 @@ l2arc_write_buffers(spa_t *spa, l2arc_dev_t *dev, uint64_t target_sz,
* and work backwards, retracing the course of the buffer selector
* loop above.
*/
write_asize = 0;
for (hdr = list_prev(&dev->l2ad_buflist, head); hdr;
hdr = list_prev(&dev->l2ad_buflist, hdr)) {
uint64_t buf_sz;
Expand Down Expand Up @@ -5875,7 +5885,7 @@ l2arc_write_buffers(spa_t *spa, l2arc_dev_t *dev, uint64_t target_sz,

/* Compression may have squashed the buffer to zero length. */
if (buf_sz != 0) {
uint64_t buf_p_sz;
uint64_t buf_a_sz;

wzio = zio_write_phys(pio, dev->l2ad_vdev,
dev->l2ad_hand, buf_sz, buf_data, ZIO_CHECKSUM_OFF,
Expand All @@ -5886,14 +5896,13 @@ l2arc_write_buffers(spa_t *spa, l2arc_dev_t *dev, uint64_t target_sz,
zio_t *, wzio);
(void) zio_nowait(wzio);

write_asize += buf_sz;

write_psize += buf_sz;
/*
* Keep the clock hand suitably device-aligned.
*/
buf_p_sz = vdev_psize_to_asize(dev->l2ad_vdev, buf_sz);
write_psize += buf_p_sz;
dev->l2ad_hand += buf_p_sz;
buf_a_sz = vdev_psize_to_asize(dev->l2ad_vdev, buf_sz);
write_asize += buf_a_sz;
dev->l2ad_hand += buf_a_sz;
}
}

Expand All @@ -5903,8 +5912,8 @@ l2arc_write_buffers(spa_t *spa, l2arc_dev_t *dev, uint64_t target_sz,
ARCSTAT_BUMP(arcstat_l2_writes_sent);
ARCSTAT_INCR(arcstat_l2_write_bytes, write_asize);
ARCSTAT_INCR(arcstat_l2_size, write_sz);
ARCSTAT_INCR(arcstat_l2_asize, write_asize);
vdev_space_update(dev->l2ad_vdev, write_asize, 0, 0);
ARCSTAT_INCR(arcstat_l2_asize, write_psize);
vdev_space_update(dev->l2ad_vdev, write_psize, 0, 0);

/*
* Bump device hand to the device start if it is approaching the end.
Expand Down

0 comments on commit a7683f5

Please sign in to comment.