-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Combine OS-independent ABD code into common source file #10293
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that currently each OS has their own abd code (abd.c). This PR is actually renaming each os-specific abd.c to abd_os.c, and combining the OS-independent ABD code into a common abd.c. I've updated the PR summary to reflect this, could you also update the commit comment?
module/os/linux/zfs/abd_os.c
Outdated
@@ -1038,28 +749,14 @@ abd_release_ownership_of_buf(abd_t *abd) | |||
/* Disable this flag since we no longer own the data buffer */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that abd_release_ownership_of_buf()
could be moved to the common abd.c
if we made abd_is_linear_page()
always return FALSE on FreeBSD, which would make sense since FreeBSD doesn't have this type of ABD.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead a moved abd_release_owership_of_buf()
to the common abd.c and placed abd_is_linear_page()
in each of the OS specific ABD source files. In the case of FreeBSD I just always return B_FALSE. I also updated abd_free_linear_page()
in the FreeBSD ABD source code to be VERIFY(0)
as we should never get here.
52cd3e2
to
1cf0f69
Compare
I updated the comment in the commit message. |
The formatting of the commit message could use some help (line wrapping) - but @behlendorf may be able to do that as part of merging it. Also I'm not sure if you care about the git
|
I found the way git detects file renaming here a little confusing. Here is the diff from FreeBSD's old abd.c to its new abd_os.c (not the easiest thing to read either, but better than "here's a whole new file": 58,85c58,59
< * In addition to directly allocating a linear or scattered ABD, it is also
< * possible to create an ABD by requesting the "sub-ABD" starting at an offset
< * within an existing ABD. In linear buffers this is simple (set abd_buf of
< * the new ABD to the starting point within the original raw buffer), but
< * scattered ABDs are a little more complex. The new ABD makes a copy of the
< * relevant abd_chunks pointers (but not the underlying data). However, to
< * provide arbitrary rather than only chunk-aligned starting offsets, it also
< * tracks an abd_offset field which represents the starting point of the data
< * within the first chunk in abd_chunks. For both linear and scattered ABDs,
< * creating an offset ABD marks the original ABD as the offset's parent, and the
< * original ABD's abd_children refcount is incremented. This data allows us to
< * ensure the root ABD isn't deleted before its children.
< *
< * Most consumers should never need to know what type of ABD they're using --
< * the ABD public API ensures that it's possible to transparently switch from
< * using a linear ABD to a scattered one when doing so would be beneficial.
< *
< * If you need to use the data within an ABD directly, if you know it's linear
< * (because you allocated it) you can use abd_to_buf() to access the underlying
< * raw buffer. Otherwise, you should use one of the abd_borrow_buf* functions
< * which will allocate a raw buffer if necessary. Use the abd_return_buf*
< * functions to return any raw buffers that are no longer necessary when you're
< * done using them.
< *
< * There are a variety of ABD APIs that implement basic buffer operations:
< * compare, copy, read, write, and fill with zeroes. If you need a custom
< * function which progressively accesses the whole ABD, use the abd_iterate_*
< * functions.
---
> * It is possible to make all ABDs linear by setting zfs_abd_scatter_enabled to
> * B_FALSE.
88c62
< #include <sys/abd.h>
---
> #include <sys/abd_impl.h>
130,135d103
< #define ABDSTAT(stat) (abd_stats.stat.value.ui64)
< #define ABDSTAT_INCR(stat, val) \
< atomic_add_64(&abd_stats.stat.value.ui64, (val))
< #define ABDSTAT_BUMP(stat) ABDSTAT_INCR(stat, 1)
< #define ABDSTAT_BUMPDOWN(stat) ABDSTAT_INCR(stat, -1)
<
165,179d132
< extern inline boolean_t abd_is_linear(abd_t *abd);
< extern inline void abd_copy(abd_t *dabd, abd_t *sabd, size_t size);
< extern inline void abd_copy_from_buf(abd_t *abd, const void *buf, size_t size);
< extern inline void abd_copy_to_buf(void* buf, abd_t *abd, size_t size);
< extern inline int abd_cmp_buf(abd_t *abd, const void *buf, size_t size);
< extern inline void abd_zero(abd_t *abd, size_t size);
<
< static void *
< abd_alloc_chunk()
< {
< void *c = kmem_cache_alloc(abd_chunk_cache, KM_PUSHPAGE);
< ASSERT3P(c, !=, NULL);
< return (c);
< }
<
186,187c139,140
< void
< abd_init(void)
---
> static inline size_t
> abd_scatter_chunkcnt(abd_t *abd)
189,197c142,144
< abd_chunk_cache = kmem_cache_create("abd_chunk", zfs_abd_chunk_size, 0,
< NULL, NULL, NULL, NULL, 0, KMC_NOTOUCH | KMC_NODEBUG);
<
< abd_ksp = kstat_create("zfs", 0, "abdstats", "misc", KSTAT_TYPE_NAMED,
< sizeof (abd_stats) / sizeof (kstat_named_t), KSTAT_FLAG_VIRTUAL);
< if (abd_ksp != NULL) {
< abd_ksp->ks_data = &abd_stats;
< kstat_install(abd_ksp);
< }
---
> ASSERT(!abd_is_linear(abd));
> return (abd_chunkcnt_for_bytes(
> ABD_SCATTER(abd).abd_offset + abd->abd_size));
200,201c147,148
< void
< abd_fini(void)
---
> size_t
> abd_chunkcnt_for_bytes(size_t size)
203,209c150
< if (abd_ksp != NULL) {
< kstat_delete(abd_ksp);
< abd_ksp = NULL;
< }
<
< kmem_cache_destroy(abd_chunk_cache);
< abd_chunk_cache = NULL;
---
> return (P2ROUNDUP(size, zfs_abd_chunk_size) / zfs_abd_chunk_size);
212,213c153,154
< static inline size_t
< abd_chunkcnt_for_bytes(size_t size)
---
> boolean_t
> abd_size_alloc_linear(size_t size)
215c156
< return (P2ROUNDUP(size, zfs_abd_chunk_size) / zfs_abd_chunk_size);
---
> return (size <= zfs_abd_chunk_size ? B_TRUE : B_FALSE);
218,219c159,160
< static inline size_t
< abd_scatter_chunkcnt(abd_t *abd)
---
> boolean_t
> abd_is_linear_page(abd_t *abd)
221,223c162,166
< ASSERT(!abd_is_linear(abd));
< return (abd_chunkcnt_for_bytes(
< abd->abd_u.abd_scatter.abd_offset + abd->abd_size));
---
> /*
> * FreeBSD does not have scatter linear page ABDs so we just
> * return B_FALSE.
> */
> return (B_FALSE);
226,227c169,170
< static inline void
< abd_verify(abd_t *abd)
---
> void
> abd_update_scatter_stats(abd_t *abd, abd_stats_op_t op)
229,236c172,178
< ASSERT3U(abd->abd_size, >, 0);
< ASSERT3U(abd->abd_size, <=, SPA_MAXBLOCKSIZE);
< ASSERT3U(abd->abd_flags, ==, abd->abd_flags & (ABD_FLAG_LINEAR |
< ABD_FLAG_OWNER | ABD_FLAG_META));
< IMPLY(abd->abd_parent != NULL, !(abd->abd_flags & ABD_FLAG_OWNER));
< IMPLY(abd->abd_flags & ABD_FLAG_META, abd->abd_flags & ABD_FLAG_OWNER);
< if (abd_is_linear(abd)) {
< ASSERT3P(abd->abd_u.abd_linear.abd_buf, !=, NULL);
---
> size_t n = abd_scatter_chunkcnt(abd);
> ASSERT(op == ABDSTAT_INCR || op == ABDSTAT_DECR);
> if (op == ABDSTAT_INCR) {
> ABDSTAT_BUMP(abdstat_scatter_cnt);
> ABDSTAT_INCR(abdstat_scatter_data_size, abd->abd_size);
> ABDSTAT_INCR(abdstat_scatter_chunk_waste,
> n * zfs_abd_chunk_size - abd->abd_size);
238,244c180,183
< ASSERT3U(abd->abd_u.abd_scatter.abd_offset, <,
< zfs_abd_chunk_size);
< size_t n = abd_scatter_chunkcnt(abd);
< for (int i = 0; i < n; i++) {
< ASSERT3P(
< abd->abd_u.abd_scatter.abd_chunks[i], !=, NULL);
< }
---
> ABDSTAT_BUMPDOWN(abdstat_scatter_cnt);
> ABDSTAT_INCR(abdstat_scatter_data_size, -(int)abd->abd_size);
> ABDSTAT_INCR(abdstat_scatter_chunk_waste,
> abd->abd_size - n * zfs_abd_chunk_size);
248,249c187,188
< static inline abd_t *
< abd_alloc_struct(size_t chunkcnt)
---
> void
> abd_update_linear_stats(abd_t *abd, abd_stats_op_t op)
251,256c190,197
< size_t size = offsetof(abd_t, abd_u.abd_scatter.abd_chunks[chunkcnt]);
< abd_t *abd = kmem_alloc(size, KM_PUSHPAGE);
< ASSERT3P(abd, !=, NULL);
< ABDSTAT_INCR(abdstat_struct_size, size);
<
< return (abd);
---
> ASSERT(op == ABDSTAT_INCR || op == ABDSTAT_DECR);
> if (op == ABDSTAT_INCR) {
> ABDSTAT_BUMP(abdstat_linear_cnt);
> ABDSTAT_INCR(abdstat_linear_data_size, abd->abd_size);
> } else {
> ABDSTAT_BUMPDOWN(abdstat_linear_cnt);
> ABDSTAT_INCR(abdstat_linear_data_size, -(int)abd->abd_size);
> }
259,260c200,202
< static inline void
< abd_free_struct(abd_t *abd)
---
>
> void
> abd_verify_scatter(abd_t *abd)
262,265c204,210
< size_t chunkcnt = abd_is_linear(abd) ? 0 : abd_scatter_chunkcnt(abd);
< int size = offsetof(abd_t, abd_u.abd_scatter.abd_chunks[chunkcnt]);
< kmem_free(abd, size);
< ABDSTAT_INCR(abdstat_struct_size, -size);
---
> ASSERT3U(ABD_SCATTER(abd).abd_offset, <,
> zfs_abd_chunk_size);
> size_t n = abd_scatter_chunkcnt(abd);
> for (int i = 0; i < n; i++) {
> ASSERT3P(
> ABD_SCATTER(abd).abd_chunks[i], !=, NULL);
> }
268,273c213,214
< /*
< * Allocate an ABD, along with its own underlying data buffers. Use this if you
< * don't care whether the ABD is linear or not.
< */
< abd_t *
< abd_alloc(size_t size, boolean_t is_metadata)
---
> void
> abd_alloc_pages(abd_t *abd, size_t size)
275,279d215
< if (!zfs_abd_scatter_enabled || size <= zfs_abd_chunk_size)
< return (abd_alloc_linear(size, is_metadata));
<
< VERIFY3U(size, <=, SPA_MAXBLOCKSIZE);
<
281,293d216
< abd_t *abd = abd_alloc_struct(n);
<
< abd->abd_flags = ABD_FLAG_OWNER;
< if (is_metadata) {
< abd->abd_flags |= ABD_FLAG_META;
< }
< abd->abd_size = size;
< abd->abd_parent = NULL;
< zfs_refcount_create(&abd->abd_children);
<
< abd->abd_u.abd_scatter.abd_offset = 0;
< abd->abd_u.abd_scatter.abd_chunk_size = zfs_abd_chunk_size;
<
295c218
< void *c = abd_alloc_chunk();
---
> void *c = kmem_cache_alloc(abd_chunk_cache, KM_PUSHPAGE);
297c220
< abd->abd_u.abd_scatter.abd_chunks[i] = c;
---
> ABD_SCATTER(abd).abd_chunks[i] = c;
299,305c222
<
< ABDSTAT_BUMP(abdstat_scatter_cnt);
< ABDSTAT_INCR(abdstat_scatter_data_size, size);
< ABDSTAT_INCR(abdstat_scatter_chunk_waste,
< n * zfs_abd_chunk_size - size);
<
< return (abd);
---
> ABD_SCATTER(abd).abd_chunk_size = zfs_abd_chunk_size;
308,309c225,226
< static void
< abd_free_scatter(abd_t *abd)
---
> void
> abd_free_pages(abd_t *abd)
313c230
< abd_free_chunk(abd->abd_u.abd_scatter.abd_chunks[i]);
---
> abd_free_chunk(ABD_SCATTER(abd).abd_chunks[i]);
315,322d231
<
< zfs_refcount_destroy(&abd->abd_children);
< ABDSTAT_BUMPDOWN(abdstat_scatter_cnt);
< ABDSTAT_INCR(abdstat_scatter_data_size, -(int)abd->abd_size);
< ABDSTAT_INCR(abdstat_scatter_chunk_waste,
< abd->abd_size - n * zfs_abd_chunk_size);
<
< abd_free_struct(abd);
325,329d233
< /*
< * Allocate an ABD that must be linear, along with its own underlying data
< * buffer. Only use this when it would be very annoying to write your ABD
< * consumer with a scattered ABD.
< */
331c235
< abd_alloc_linear(size_t size, boolean_t is_metadata)
---
> abd_alloc_struct(size_t chunkcnt)
333,352c237,241
< abd_t *abd = abd_alloc_struct(0);
<
< VERIFY3U(size, <=, SPA_MAXBLOCKSIZE);
<
< abd->abd_flags = ABD_FLAG_LINEAR | ABD_FLAG_OWNER;
< if (is_metadata) {
< abd->abd_flags |= ABD_FLAG_META;
< }
< abd->abd_size = size;
< abd->abd_parent = NULL;
< zfs_refcount_create(&abd->abd_children);
<
< if (is_metadata) {
< abd->abd_u.abd_linear.abd_buf = zio_buf_alloc(size);
< } else {
< abd->abd_u.abd_linear.abd_buf = zio_data_buf_alloc(size);
< }
<
< ABDSTAT_BUMP(abdstat_linear_cnt);
< ABDSTAT_INCR(abdstat_linear_data_size, size);
---
> size_t abd_size = offsetof(abd_t,
> abd_u.abd_scatter.abd_chunks[chunkcnt]);
> abd_t *abd = kmem_alloc(abd_size, KM_PUSHPAGE);
> ASSERT3P(abd, !=, NULL);
> ABDSTAT_INCR(abdstat_struct_size, abd_size);
357,358c246,247
< static void
< abd_free_linear(abd_t *abd)
---
> void
> abd_free_struct(abd_t *abd)
360,364c249,253
< if (abd->abd_flags & ABD_FLAG_META) {
< zio_buf_free(abd->abd_u.abd_linear.abd_buf, abd->abd_size);
< } else {
< zio_data_buf_free(abd->abd_u.abd_linear.abd_buf, abd->abd_size);
< }
---
> size_t chunkcnt = abd_is_linear(abd) ? 0 : abd_scatter_chunkcnt(abd);
> int size = offsetof(abd_t, abd_u.abd_scatter.abd_chunks[chunkcnt]);
> kmem_free(abd, size);
> ABDSTAT_INCR(abdstat_struct_size, -size);
> }
366,368c255,259
< zfs_refcount_destroy(&abd->abd_children);
< ABDSTAT_BUMPDOWN(abdstat_linear_cnt);
< ABDSTAT_INCR(abdstat_linear_data_size, -(int)abd->abd_size);
---
> void
> abd_init(void)
> {
> abd_chunk_cache = kmem_cache_create("abd_chunk", zfs_abd_chunk_size, 0,
> NULL, NULL, NULL, NULL, 0, KMC_NOTOUCH | KMC_NODEBUG);
370c261,266
< abd_free_struct(abd);
---
> abd_ksp = kstat_create("zfs", 0, "abdstats", "misc", KSTAT_TYPE_NAMED,
> sizeof (abd_stats) / sizeof (kstat_named_t), KSTAT_FLAG_VIRTUAL);
> if (abd_ksp != NULL) {
> abd_ksp->ks_data = &abd_stats;
> kstat_install(abd_ksp);
> }
373,376d268
< /*
< * Free an ABD. Only use this on ABDs allocated with abd_alloc() or
< * abd_alloc_linear().
< */
378c270
< abd_free(abd_t *abd)
---
> abd_fini(void)
380,381c272,275
< if (abd == NULL)
< return;
---
> if (abd_ksp != NULL) {
> kstat_delete(abd_ksp);
> abd_ksp = NULL;
> }
383,389c277,278
< abd_verify(abd);
< ASSERT3P(abd->abd_parent, ==, NULL);
< ASSERT(abd->abd_flags & ABD_FLAG_OWNER);
< if (abd_is_linear(abd))
< abd_free_linear(abd);
< else
< abd_free_scatter(abd);
---
> kmem_cache_destroy(abd_chunk_cache);
> abd_chunk_cache = NULL;
392,397c281,282
< /*
< * Allocate an ABD of the same format (same metadata flag, same scatterize
< * setting) as another ABD.
< */
< abd_t *
< abd_alloc_sametype(abd_t *sabd, size_t size)
---
> void
> abd_free_linear_page(abd_t *abd)
399,404c284,288
< boolean_t is_metadata = (sabd->abd_flags & ABD_FLAG_META) != 0;
< if (abd_is_linear(sabd)) {
< return (abd_alloc_linear(size, is_metadata));
< } else {
< return (abd_alloc(size, is_metadata));
< }
---
> /*
> * FreeBSD does not have have scatter linear pages
> * so there is an error.
> */
> VERIFY(0);
423,430c307,308
< /*
< * Allocate a new ABD to point to offset off of sabd. It shares the underlying
< * buffer data with sabd. Use abd_put() to free. sabd must not be freed while
< * any derived ABDs exist.
< */
< /* ARGSUSED */
< static inline abd_t *
< abd_get_offset_impl(abd_t *sabd, size_t off, size_t size)
---
> abd_t *
> abd_get_offset_scatter(abd_t *sabd, size_t off)
432c310
< abd_t *abd;
---
> abd_t *abd = NULL;
437,487c315,317
< if (abd_is_linear(sabd)) {
< abd = abd_alloc_struct(0);
<
< /*
< * Even if this buf is filesystem metadata, we only track that
< * if we own the underlying data buffer, which is not true in
< * this case. Therefore, we don't ever use ABD_FLAG_META here.
< */
< abd->abd_flags = ABD_FLAG_LINEAR;
<
< abd->abd_u.abd_linear.abd_buf =
< (char *)sabd->abd_u.abd_linear.abd_buf + off;
< } else {
< size_t new_offset = sabd->abd_u.abd_scatter.abd_offset + off;
< size_t chunkcnt = abd_scatter_chunkcnt(sabd) -
< (new_offset / zfs_abd_chunk_size);
<
< abd = abd_alloc_struct(chunkcnt);
<
< /*
< * Even if this buf is filesystem metadata, we only track that
< * if we own the underlying data buffer, which is not true in
< * this case. Therefore, we don't ever use ABD_FLAG_META here.
< */
< abd->abd_flags = 0;
<
< abd->abd_u.abd_scatter.abd_offset =
< new_offset % zfs_abd_chunk_size;
< abd->abd_u.abd_scatter.abd_chunk_size = zfs_abd_chunk_size;
<
< /* Copy the scatterlist starting at the correct offset */
< (void) memcpy(&abd->abd_u.abd_scatter.abd_chunks,
< &sabd->abd_u.abd_scatter.abd_chunks[new_offset /
< zfs_abd_chunk_size],
< chunkcnt * sizeof (void *));
< }
<
< if (size == 0)
< abd->abd_size = sabd->abd_size - off;
< else
< abd->abd_size = size;
< abd->abd_parent = sabd;
< zfs_refcount_create(&abd->abd_children);
< (void) zfs_refcount_add_many(&sabd->abd_children, abd->abd_size, abd);
<
< return (abd);
< }
<
< abd_t *
< abd_get_offset(abd_t *sabd, size_t off)
< {
---
> size_t new_offset = ABD_SCATTER(sabd).abd_offset + off;
> size_t chunkcnt = abd_scatter_chunkcnt(sabd) -
> (new_offset / zfs_abd_chunk_size);
489,510c319
< return (abd_get_offset_impl(sabd, off, 0));
< }
<
< abd_t *
< abd_get_offset_size(abd_t *sabd, size_t off, size_t size)
< {
< ASSERT3U(off + size, <=, sabd->abd_size);
<
< return (abd_get_offset_impl(sabd, off, size));
< }
<
<
< /*
< * Allocate a linear ABD structure for buf. You must free this with abd_put()
< * since the resulting ABD doesn't own its own buffer.
< */
< abd_t *
< abd_get_from_buf(void *buf, size_t size)
< {
< abd_t *abd = abd_alloc_struct(0);
<
< VERIFY3U(size, <=, SPA_MAXBLOCKSIZE);
---
> abd = abd_alloc_struct(chunkcnt);
513,515c322,324
< * Even if this buf is filesystem metadata, we only track that if we
< * own the underlying data buffer, which is not true in this case.
< * Therefore, we don't ever use ABD_FLAG_META here.
---
> * Even if this buf is filesystem metadata, we only track that
> * if we own the underlying data buffer, which is not true in
> * this case. Therefore, we don't ever use ABD_FLAG_META here.
517,579c326
< abd->abd_flags = ABD_FLAG_LINEAR;
< abd->abd_size = size;
< abd->abd_parent = NULL;
< zfs_refcount_create(&abd->abd_children);
<
< abd->abd_u.abd_linear.abd_buf = buf;
<
< return (abd);
< }
<
< /*
< * Free an ABD allocated from abd_get_offset() or abd_get_from_buf(). Will not
< * free the underlying scatterlist or buffer.
< */
< void
< abd_put(abd_t *abd)
< {
< if (abd == NULL)
< return;
< abd_verify(abd);
< ASSERT(!(abd->abd_flags & ABD_FLAG_OWNER));
<
< if (abd->abd_parent != NULL) {
< (void) zfs_refcount_remove_many(&abd->abd_parent->abd_children,
< abd->abd_size, abd);
< }
<
< zfs_refcount_destroy(&abd->abd_children);
< abd_free_struct(abd);
< }
<
< /*
< * Get the raw buffer associated with a linear ABD.
< */
< void *
< abd_to_buf(abd_t *abd)
< {
< ASSERT(abd_is_linear(abd));
< abd_verify(abd);
< return (abd->abd_u.abd_linear.abd_buf);
< }
<
< /*
< * Borrow a raw buffer from an ABD without copying the contents of the ABD
< * into the buffer. If the ABD is scattered, this will allocate a raw buffer
< * whose contents are undefined. To copy over the existing data in the ABD, use
< * abd_borrow_buf_copy() instead.
< */
< void *
< abd_borrow_buf(abd_t *abd, size_t n)
< {
< void *buf;
< abd_verify(abd);
< ASSERT3U(abd->abd_size, >=, n);
< if (abd_is_linear(abd)) {
< buf = abd_to_buf(abd);
< } else {
< buf = zio_buf_alloc(n);
< }
< (void) zfs_refcount_add_many(&abd->abd_children, n, buf);
<
< return (buf);
< }
---
> abd->abd_flags = 0;
581,589c328,329
< void *
< abd_borrow_buf_copy(abd_t *abd, size_t n)
< {
< void *buf = abd_borrow_buf(abd, n);
< if (!abd_is_linear(abd)) {
< abd_copy_to_buf(buf, abd, n);
< }
< return (buf);
< }
---
> ABD_SCATTER(abd).abd_offset = new_offset % zfs_abd_chunk_size;
> ABD_SCATTER(abd).abd_chunk_size = zfs_abd_chunk_size;
591,647c331,335
< /*
< * Return a borrowed raw buffer to an ABD. If the ABD is scattered, this will
< * not change the contents of the ABD and will ASSERT that you didn't modify
< * the buffer since it was borrowed. If you want any changes you made to buf to
< * be copied back to abd, use abd_return_buf_copy() instead.
< */
< void
< abd_return_buf(abd_t *abd, void *buf, size_t n)
< {
< abd_verify(abd);
< ASSERT3U(abd->abd_size, >=, n);
< if (abd_is_linear(abd)) {
< ASSERT3P(buf, ==, abd_to_buf(abd));
< } else {
< ASSERT0(abd_cmp_buf(abd, buf, n));
< zio_buf_free(buf, n);
< }
< (void) zfs_refcount_remove_many(&abd->abd_children, n, buf);
< }
<
< void
< abd_return_buf_copy(abd_t *abd, void *buf, size_t n)
< {
< if (!abd_is_linear(abd)) {
< abd_copy_from_buf(abd, buf, n);
< }
< abd_return_buf(abd, buf, n);
< }
<
< /*
< * Give this ABD ownership of the buffer that it's storing. Can only be used on
< * linear ABDs which were allocated via abd_get_from_buf(), or ones allocated
< * with abd_alloc_linear() which subsequently released ownership of their buf
< * with abd_release_ownership_of_buf().
< */
< void
< abd_take_ownership_of_buf(abd_t *abd, boolean_t is_metadata)
< {
< ASSERT(abd_is_linear(abd));
< ASSERT(!(abd->abd_flags & ABD_FLAG_OWNER));
< abd_verify(abd);
<
< abd->abd_flags |= ABD_FLAG_OWNER;
< if (is_metadata) {
< abd->abd_flags |= ABD_FLAG_META;
< }
<
< ABDSTAT_BUMP(abdstat_linear_cnt);
< ABDSTAT_INCR(abdstat_linear_data_size, abd->abd_size);
< }
<
< void
< abd_release_ownership_of_buf(abd_t *abd)
< {
< ASSERT(abd_is_linear(abd));
< ASSERT(abd->abd_flags & ABD_FLAG_OWNER);
< abd_verify(abd);
---
> /* Copy the scatterlist starting at the correct offset */
> (void) memcpy(&ABD_SCATTER(abd).abd_chunks,
> &ABD_SCATTER(sabd).abd_chunks[new_offset /
> zfs_abd_chunk_size],
> chunkcnt * sizeof (void *));
649,654c337
< abd->abd_flags &= ~ABD_FLAG_OWNER;
< /* Disable this flag since we no longer own the data buffer */
< abd->abd_flags &= ~ABD_FLAG_META;
<
< ABDSTAT_BUMPDOWN(abdstat_linear_cnt);
< ABDSTAT_INCR(abdstat_linear_data_size, -(int)abd->abd_size);
---
> return (abd);
657,663d339
< struct abd_iter {
< abd_t *iter_abd; /* ABD being iterated through */
< size_t iter_pos; /* position (relative to abd_offset) */
< void *iter_mapaddr; /* addr corresponding to iter_pos */
< size_t iter_mapsize; /* length of data valid at mapaddr */
< };
<
668c344
< return ((aiter->iter_abd->abd_u.abd_scatter.abd_offset +
---
> return ((ABD_SCATTER(aiter->iter_abd).abd_offset +
676c352
< return ((aiter->iter_abd->abd_u.abd_scatter.abd_offset +
---
> return ((ABD_SCATTER(aiter->iter_abd).abd_offset +
683c359
< static void
---
> void
693a370,379
> * This is just a helper function to see if we have exhausted the
> * abd_iter and reached the end.
> */
> boolean_t
> abd_iter_at_end(struct abd_iter *aiter)
> {
> return (aiter->iter_pos == aiter->iter_abd->abd_size);
> }
>
> /*
698c384
< static void
---
> void
705c391
< if (aiter->iter_pos == aiter->iter_abd->abd_size)
---
> if (abd_iter_at_end(aiter))
715c401
< static void
---
> void
726c412
< aiter->iter_abd->abd_u.abd_scatter.abd_chunk_size);
---
> ABD_SCATTER(aiter->iter_abd).abd_chunk_size);
729c415
< if (aiter->iter_pos == aiter->iter_abd->abd_size)
---
> if (abd_iter_at_end(aiter))
735c421
< paddr = aiter->iter_abd->abd_u.abd_linear.abd_buf;
---
> paddr = ABD_LINEAR_BUF(aiter->iter_abd);
741c427
< paddr = aiter->iter_abd->abd_u.abd_scatter.abd_chunks[index];
---
> paddr = ABD_SCATTER(aiter->iter_abd).abd_chunks[index];
750c436
< static void
---
> void
754c440
< if (aiter->iter_pos == aiter->iter_abd->abd_size)
---
> if (abd_iter_at_end(aiter))
764,979d449
< int
< abd_iterate_func(abd_t *abd, size_t off, size_t size,
< abd_iter_func_t *func, void *private)
< {
< int ret = 0;
< struct abd_iter aiter;
<
< abd_verify(abd);
< ASSERT3U(off + size, <=, abd->abd_size);
<
< abd_iter_init(&aiter, abd);
< abd_iter_advance(&aiter, off);
<
< while (size > 0) {
< abd_iter_map(&aiter);
<
< size_t len = MIN(aiter.iter_mapsize, size);
< ASSERT3U(len, >, 0);
<
< ret = func(aiter.iter_mapaddr, len, private);
<
< abd_iter_unmap(&aiter);
<
< if (ret != 0)
< break;
<
< size -= len;
< abd_iter_advance(&aiter, len);
< }
<
< return (ret);
< }
<
< struct buf_arg {
< void *arg_buf;
< };
<
< static int
< abd_copy_to_buf_off_cb(void *buf, size_t size, void *private)
< {
< struct buf_arg *ba_ptr = private;
<
< (void) memcpy(ba_ptr->arg_buf, buf, size);
< ba_ptr->arg_buf = (char *)ba_ptr->arg_buf + size;
<
< return (0);
< }
<
< /*
< * Copy abd to buf. (off is the offset in abd.)
< */
< void
< abd_copy_to_buf_off(void *buf, abd_t *abd, size_t off, size_t size)
< {
< struct buf_arg ba_ptr = { buf };
<
< (void) abd_iterate_func(abd, off, size, abd_copy_to_buf_off_cb,
< &ba_ptr);
< }
<
< static int
< abd_cmp_buf_off_cb(void *buf, size_t size, void *private)
< {
< int ret;
< struct buf_arg *ba_ptr = private;
<
< ret = memcmp(buf, ba_ptr->arg_buf, size);
< ba_ptr->arg_buf = (char *)ba_ptr->arg_buf + size;
<
< return (ret);
< }
<
< /*
< * Compare the contents of abd to buf. (off is the offset in abd.)
< */
< int
< abd_cmp_buf_off(abd_t *abd, const void *buf, size_t off, size_t size)
< {
< struct buf_arg ba_ptr = { (void *) buf };
<
< return (abd_iterate_func(abd, off, size, abd_cmp_buf_off_cb, &ba_ptr));
< }
<
< static int
< abd_copy_from_buf_off_cb(void *buf, size_t size, void *private)
< {
< struct buf_arg *ba_ptr = private;
<
< (void) memcpy(buf, ba_ptr->arg_buf, size);
< ba_ptr->arg_buf = (char *)ba_ptr->arg_buf + size;
<
< return (0);
< }
<
< /*
< * Copy from buf to abd. (off is the offset in abd.)
< */
< void
< abd_copy_from_buf_off(abd_t *abd, const void *buf, size_t off, size_t size)
< {
< struct buf_arg ba_ptr = { (void *) buf };
<
< (void) abd_iterate_func(abd, off, size, abd_copy_from_buf_off_cb,
< &ba_ptr);
< }
<
< /*ARGSUSED*/
< static int
< abd_zero_off_cb(void *buf, size_t size, void *private)
< {
< (void) memset(buf, 0, size);
< return (0);
< }
<
< /*
< * Zero out the abd from a particular offset to the end.
< */
< void
< abd_zero_off(abd_t *abd, size_t off, size_t size)
< {
< (void) abd_iterate_func(abd, off, size, abd_zero_off_cb, NULL);
< }
<
< /*
< * Iterate over two ABDs and call func incrementally on the two ABDs' data in
< * equal-sized chunks (passed to func as raw buffers). func could be called many
< * times during this iteration.
< */
< int
< abd_iterate_func2(abd_t *dabd, abd_t *sabd, size_t doff, size_t soff,
< size_t size, abd_iter_func2_t *func, void *private)
< {
< int ret = 0;
< struct abd_iter daiter, saiter;
<
< abd_verify(dabd);
< abd_verify(sabd);
<
< ASSERT3U(doff + size, <=, dabd->abd_size);
< ASSERT3U(soff + size, <=, sabd->abd_size);
<
< abd_iter_init(&daiter, dabd);
< abd_iter_init(&saiter, sabd);
< abd_iter_advance(&daiter, doff);
< abd_iter_advance(&saiter, soff);
<
< while (size > 0) {
< abd_iter_map(&daiter);
< abd_iter_map(&saiter);
<
< size_t dlen = MIN(daiter.iter_mapsize, size);
< size_t slen = MIN(saiter.iter_mapsize, size);
< size_t len = MIN(dlen, slen);
< ASSERT(dlen > 0 || slen > 0);
<
< ret = func(daiter.iter_mapaddr, saiter.iter_mapaddr, len,
< private);
<
< abd_iter_unmap(&saiter);
< abd_iter_unmap(&daiter);
<
< if (ret != 0)
< break;
<
< size -= len;
< abd_iter_advance(&daiter, len);
< abd_iter_advance(&saiter, len);
< }
<
< return (ret);
< }
<
< /*ARGSUSED*/
< static int
< abd_copy_off_cb(void *dbuf, void *sbuf, size_t size, void *private)
< {
< (void) memcpy(dbuf, sbuf, size);
< return (0);
< }
<
< /*
< * Copy from sabd to dabd starting from soff and doff.
< */
< void
< abd_copy_off(abd_t *dabd, abd_t *sabd, size_t doff, size_t soff, size_t size)
< {
< (void) abd_iterate_func2(dabd, sabd, doff, soff, size,
< abd_copy_off_cb, NULL);
< }
<
< /*ARGSUSED*/
< static int
< abd_cmp_cb(void *bufa, void *bufb, size_t size, void *private)
< {
< return (memcmp(bufa, bufb, size));
< }
<
< /*
< * Compares the contents of two ABDs.
< */
< int
< abd_cmp(abd_t *dabd, abd_t *sabd)
< {
< ASSERT3U(dabd->abd_size, ==, sabd->abd_size);
< return (abd_iterate_func2(dabd, sabd, 0, 0, dabd->abd_size,
< abd_cmp_cb, NULL));
< }
<
< /*
< * Iterate over code ABDs and a data ABD and call @func_raidz_gen.
< *
< * @cabds parity ABDs, must have equal size
< * @dabd data ABD. Can be NULL (in this case @dsize = 0)
< * @func_raidz_gen should be implemented so that its behaviour
< * is the same when taking linear and when taking scatter
< */
981,983c451
< abd_raidz_gen_iterate(abd_t **cabds, abd_t *dabd,
< ssize_t csize, ssize_t dsize, const unsigned parity,
< void (*func_raidz_gen)(void **, const void *, size_t, size_t))
---
> abd_enter_critical(unsigned long flags)
985,1000d452
< int i;
< ssize_t len, dlen;
< struct abd_iter caiters[3];
< struct abd_iter daiter = {0};
< void *caddrs[3];
<
< ASSERT3U(parity, <=, 3);
<
< for (i = 0; i < parity; i++)
< abd_iter_init(&caiters[i], cabds[i]);
<
< if (dabd)
< abd_iter_init(&daiter, dabd);
<
< ASSERT3S(dsize, >=, 0);
<
1002,1058d453
< while (csize > 0) {
< len = csize;
<
< if (dabd && dsize > 0)
< abd_iter_map(&daiter);
<
< for (i = 0; i < parity; i++) {
< abd_iter_map(&caiters[i]);
< caddrs[i] = caiters[i].iter_mapaddr;
< }
<
< switch (parity) {
< case 3:
< len = MIN(caiters[2].iter_mapsize, len);
< case 2:
< len = MIN(caiters[1].iter_mapsize, len);
< case 1:
< len = MIN(caiters[0].iter_mapsize, len);
< }
<
< /* must be progressive */
< ASSERT3S(len, >, 0);
<
< if (dabd && dsize > 0) {
< /* this needs precise iter.length */
< len = MIN(daiter.iter_mapsize, len);
< dlen = len;
< } else
< dlen = 0;
<
< /* must be progressive */
< ASSERT3S(len, >, 0);
< /*
< * The iterated function likely will not do well if each
< * segment except the last one is not multiple of 512 (raidz).
< */
< ASSERT3U(((uint64_t)len & 511ULL), ==, 0);
<
< func_raidz_gen(caddrs, daiter.iter_mapaddr, len, dlen);
<
< for (i = parity-1; i >= 0; i--) {
< abd_iter_unmap(&caiters[i]);
< abd_iter_advance(&caiters[i], len);
< }
<
< if (dabd && dsize > 0) {
< abd_iter_unmap(&daiter);
< abd_iter_advance(&daiter, dlen);
< dsize -= dlen;
< }
<
< csize -= len;
<
< ASSERT3S(dsize, >=, 0);
< ASSERT3S(csize, >=, 0);
< }
< critical_exit();
1061,1070d455
< /*
< * Iterate over code ABDs and data reconstruction target ABDs and call
< * @func_raidz_rec. Function maps at most 6 pages atomically.
< *
< * @cabds parity ABDs, must have equal size
< * @tabds rec target ABDs, at most 3
< * @tsize size of data target columns
< * @func_raidz_rec expects syndrome data in target columns. Function
< * reconstructs data and overwrites target columns.
< */
1072,1132c457,458
< abd_raidz_rec_iterate(abd_t **cabds, abd_t **tabds,
< ssize_t tsize, const unsigned parity,
< void (*func_raidz_rec)(void **t, const size_t tsize, void **c,
< const unsigned *mul),
< const unsigned *mul)
< {
< int i;
< ssize_t len;
< struct abd_iter citers[3];
< struct abd_iter xiters[3];
< void *caddrs[3], *xaddrs[3];
<
< ASSERT3U(parity, <=, 3);
<
< for (i = 0; i < parity; i++) {
< abd_iter_init(&citers[i], cabds[i]);
< abd_iter_init(&xiters[i], tabds[i]);
< }
<
< critical_enter();
< while (tsize > 0) {
<
< for (i = 0; i < parity; i++) {
< abd_iter_map(&citers[i]);
< abd_iter_map(&xiters[i]);
< caddrs[i] = citers[i].iter_mapaddr;
< xaddrs[i] = xiters[i].iter_mapaddr;
< }
<
< len = tsize;
< switch (parity) {
< case 3:
< len = MIN(xiters[2].iter_mapsize, len);
< len = MIN(citers[2].iter_mapsize, len);
< case 2:
< len = MIN(xiters[1].iter_mapsize, len);
< len = MIN(citers[1].iter_mapsize, len);
< case 1:
< len = MIN(xiters[0].iter_mapsize, len);
< len = MIN(citers[0].iter_mapsize, len);
< }
< /* must be progressive */
< ASSERT3S(len, >, 0);
< /*
< * The iterated function likely will not do well if each
< * segment except the last one is not multiple of 512 (raidz).
< */
< ASSERT3U(((uint64_t)len & 511ULL), ==, 0);
<
< func_raidz_rec(xaddrs, len, caddrs, mul);
<
< for (i = parity-1; i >= 0; i--) {
< abd_iter_unmap(&xiters[i]);
< abd_iter_unmap(&citers[i]);
< abd_iter_advance(&xiters[i], len);
< abd_iter_advance(&citers[i], len);
< }
<
< tsize -= len;
< ASSERT3S(tsize, >=, 0);
< }
---
> abd_exit_critical(unsigned long flags)
> { |
I am not too worried about |
module/os/freebsd/zfs/abd_os.c
Outdated
|
||
/* | ||
* ARC buffer data (ABD). | ||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this diagram is the same in abd.c and both abd_os.c's. Let's remove it from the abd_os.c's, and perhaps add a comment to those files indicating that these are implementing the OS-specific parts of ABD, to see abd.c for an overview, and then talk about OS-specific considerations (e.g. the HIGHMEM stuff for linux).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I will update this.
static inline boolean_t | ||
abd_is_linear_page(abd_t *abd) | ||
{ | ||
return ((abd->abd_flags & ABD_FLAG_LINEAR_PAGE) != 0 ? | ||
B_TRUE : B_FALSE); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this code could be retained in the generic (not OS-specific) code (abd.c or abd.h). Since ABD_FLAG_LINEAR_PAGE is defined for FreeBSD (it's just never set). The FreeBSD abd_verify_scatter() could check that it is never set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I thought about that too. I will go ahead and move this back into the generic abd.c.
@@ -802,264 +701,45 @@ abd_alloc_for_io(size_t size, boolean_t is_metadata) | |||
return (abd_alloc(size, is_metadata)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you feel like it, we should rewrite the above comment to just talk about Linux, since clearly the code in os/linux is specific to linux, and if Illumos support is added to the common repo, it would get its own abd_os.c.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I will update this comment.
* Otherwise, ABDs are allocated scattered by default unless the caller uses | ||
* abd_alloc_linear(). | ||
*/ | ||
boolean_t zfs_abd_scatter_enabled = B_TRUE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like this could be in the common abd.c, since both OS's have it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are taking about zfs_abd_scatter_enabled
correct? I will go ahead and move this to common abd.c.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep. thanks!
module/os/freebsd/zfs/abd_os.c
Outdated
size_t | ||
abd_chunkcnt_for_bytes(size_t size) | ||
{ | ||
return (P2ROUNDUP(size, zfs_abd_chunk_size) / zfs_abd_chunk_size); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be better to make abd_chunkcnt_for_bytes
static (not used by the generic abd.c), and have abd_alloc_struct()
take the size in bytes, rather than the number of chunks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally had that, and I have updated it so abd_chunkcnt_for_bytes
is static again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we were leaking memory when I tried just passing sabd->abd_size - new_offset
to abd_alloc_struct()
in FreeBSD in the function abd_get_offset_scatter()
. I went ahead and just made a static function to allocate an ABD scatter struct using the chunkcnt calculated in abd_get_offset_scatter()
. This still allows for abd_alloc_struct()
to always take a size and abd_chunkcnt_for_bytes()
to just be a static function.
module/os/linux/zfs/abd_os.c
Outdated
@@ -284,7 +275,7 @@ abd_unmark_zfs_page(struct page *page) | |||
* reclaim or compaction. When necessary this function will degenerate to | |||
* allocating individual pages and allowing reclaim to satisfy allocations. | |||
*/ | |||
static void | |||
void | |||
abd_alloc_pages(abd_t *abd, size_t size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a better name for this would be abd_alloc_chunks
, since it's also used on FreeBSD where it isn't necessarily allocating pages. (also abd_free_pages->chunks)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I will update this.
I can update the commit message when merging, but since there were other request changes you might as well update it before update the PR again. |
6d4d9ea
to
f76c2c7
Compare
I went ahead and updated the commit message with my latest push. |
Codecov Report
@@ Coverage Diff @@
## master #10293 +/- ##
==========================================
- Coverage 79.56% 79.43% -0.14%
==========================================
Files 389 390 +1
Lines 123296 123336 +40
==========================================
- Hits 98105 97973 -132
- Misses 25191 25363 +172
Continue to review full report at Codecov.
|
Reorganizing ABD code base so OS-independent ABD code has been placed into a common abd.c file. OS-dependent ABD code has been left in each OS's ABD source files, and these source files have been renamed to abd_os. The OS-independent ABD code is now under: module/zfs/abd.c With the OS-dependent code in: module/os/linux/zfs/abd_os.c module/os/freebsd/zfs/abd_os.c Signed-off-by: Brian Atkinson <[email protected]>
f76c2c7
to
213db05
Compare
Reorganizing ABD code base so OS-independent ABD code has been placed into a common abd.c file. OS-dependent ABD code has been left in each OS's ABD source files, and these source files have been renamed to abd_os. The OS-independent ABD code is now under: module/zfs/abd.c With the OS-dependent code in: module/os/linux/zfs/abd_os.c module/os/freebsd/zfs/abd_os.c Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Brian Atkinson <[email protected]> Closes openzfs#10293 (cherry picked from commit fc551d7)
Reorganizing ABD code base so OS-independent ABD code has been placed into a common abd.c file. OS-dependent ABD code has been left in each OS's ABD source files, and these source files have been renamed to abd_os. The OS-independent ABD code is now under: module/zfs/abd.c With the OS-dependent code in: module/os/linux/zfs/abd_os.c module/os/freebsd/zfs/abd_os.c Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Brian Atkinson <[email protected]> Closes openzfs#10293
Reorganizing ABD code base so the OS
dependant ABD code has been split into
their own independent files.
The shared ABD code is now under:
module/zfs/abd.c
With the independent OS code in:
module/os/linux/zfs/abd_os.c
module/os/freebsd/zfs/abd_os.c
Signed-off-by: Brian Atkinson [email protected]
Authored-by: Brian Atkinson [email protected]
Refactoring ABD Code to Isolate OS Specific Code
Motivation and Context
By separating the OS specific ABD code I am removing duplicate code that is not tied to any OS and could be placed in a general abd.c file.
Description
I placed OS specific code in their own abd_os.c files and kept common code in abd.c.
How Has This Been Tested?
Ran zloop.sh for 30 mins in both Linux and FreeBSD.
Linux CentOS 7 with Kernel 4.18.5
FreeBSD 12.1-RELASE-p3 GENERIC amd64
Types of changes
Checklist:
Signed-off-by
.