Skip to content

Commit

Permalink
btrfs: set start on clone before calling copy_extent_buffer_full
Browse files Browse the repository at this point in the history
Our subpage testing started hanging on generic/560 and I bisected it
down to 1cab137 ("btrfs: reuse cloned extent buffer during
fiemap to avoid re-allocations").  This is subtle because we use
eb->start to figure out where in the folio we're copying to when we're
subpage, as our ->start may refer to an area inside of the folio.

For example, assume a 16K page size machine with a 4K node size, and
assume that we already have a cloned extent buffer when we cloned the
previous search.

copy_extent_buffer_full() will do the following when copying the extent
buffer path->nodes[0] (src) into cloned (dest):

  src->start = 8k; // this is the new leaf we're cloning
  cloned->start = 4k; // this is left over from the previous clone

  src_addr = folio_address(src->folios[0]);
  dest_addr = folio_address(dest->folios[0]);

  memcpy(dest_addr + get_eb_offset_in_folio(dst, 0),
	 src_addr + get_eb_offset_in_folio(src, 0), src->len);

Now get_eb_offset_in_folio() is where the problems occur, because for
sub-pagesize blocksize we can have multiple eb's per folio, the code for
this is as follows

  size_t get_eb_offset_in_folio(eb, offset) {
	  return (eb->start + offset & (folio_size(eb->folio[0]) - 1));
  }

So in the above example we are copying into offset 4K inside the folio.
However once we update cloned->start to 8K to match the src the math for
get_eb_offset_in_folio() changes, and any subsequent reads (i.e.
btrfs_item_key_to_cpu()) will start reading from the offset 8K instead
of 4K where we copied to, giving us garbage.

Fix this by setting start before we co copy_extent_buffer_full() to make
sure that we're copying into the same offset inside of the folio that we
will read from later.

All other sites of copy_extent_buffer_full() are correct because we
either set ->start beforehand or we simply don't change it in the case
of the tree-log usage.

With this fix we now pass generic/560 on our subpage tests.

Fixes: 1cab137 ("btrfs: reuse cloned extent buffer during fiemap to avoid re-allocations")
Reviewed-by: Filipe Manana <[email protected]>
Reviewed-by: Qu Wenruo <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: David Sterba <[email protected]>
  • Loading branch information
josefbacik authored and kdave committed May 7, 2024
1 parent 99f2be1 commit 53e2415
Showing 1 changed file with 8 additions and 2 deletions.
10 changes: 8 additions & 2 deletions fs/btrfs/extent_io.c
Original file line number Diff line number Diff line change
Expand Up @@ -2803,13 +2803,19 @@ static int fiemap_next_leaf_item(struct btrfs_inode *inode, struct btrfs_path *p
goto out;
}

/* See the comment at fiemap_search_slot() about why we clone. */
copy_extent_buffer_full(clone, path->nodes[0]);
/*
* Important to preserve the start field, for the optimizations when
* checking if extents are shared (see extent_fiemap()).
*
* We must set ->start before calling copy_extent_buffer_full(). If we
* are on sub-pagesize blocksize, we use ->start to determine the offset
* into the folio where our eb exists, and if we update ->start after
* the fact then any subsequent reads of the eb may read from a
* different offset in the folio than where we originally copied into.
*/
clone->start = path->nodes[0]->start;
/* See the comment at fiemap_search_slot() about why we clone. */
copy_extent_buffer_full(clone, path->nodes[0]);

slot = path->slots[0];
btrfs_release_path(path);
Expand Down

0 comments on commit 53e2415

Please sign in to comment.