[dedicated box testing] ABD2 + Illumos #4950 + Illumos #2605 (master January 16th, 2015) #4236


kernelOfTruth
Contributor

big blob patch with
Update arc_c under a mutex from #4197
ABD2 #3441 ABD: linear/scatter dual typed buffer for ARC (ver 2)
4950 #4207 Illumos #4950 files sometimes can't be removed from a full filesystem
2605 #4213 Illumos 2605 want to resume interrupted zfs send

edit:
ABD2 is of course below the blob and NOT included - but not changing the commit message, it's already TEST'ing 👍

tuxoko and others added 29 commits January 16, 2016 21:59
zfsonlinux currently uses vmalloc-backed slab for ARC buffers. There are some
major problems with this approach. One is that 32-bit systems have only a
small amount of vmalloc space. Another is that fragmentation in the slab can
easily deplete memory on a busy system.

With ABD, we use scatterlist to allocate data buffers. In this approach we can
allocate in HIGHMEM, which alleviates vmalloc space pressure on 32-bit. Also,
we don't have to rely on slab, so there's no fragmentation issue.

But for metadata buffers, we still use linear buffers from slab. The reason for
this is that a lot of *_phys pointers point directly into metadata buffers.
Thus, it would be much more complicated to use scatter buffers for metadata.

Currently, ABD is not enabled and its API will treat them as normal buffers.
We will enable it once all relevant code is modified to use the API.

Signed-off-by: Chunwei Chen <[email protected]>
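
To illustrate the linear/scatter idea from the commit message above, here is a minimal userspace sketch. It is not the ABD implementation: every name (sk_buf_t, sk_alloc, sk_copy, sk_free, SK_CHUNK) is hypothetical, malloc'd chunks stand in for pages, and the sketch simply aborts on allocation failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define	SK_CHUNK	4096u	/* stand-in for PAGE_SIZE (assumption) */

/* Hypothetical dual-typed buffer: one linear region or a list of chunks. */
typedef struct sk_buf {
	int	is_scatter;	/* 0 = linear, 1 = scatter */
	size_t	size;		/* logical size in bytes */
	size_t	nchunks;	/* number of chunks (scatter only) */
	char	*linear;	/* valid when !is_scatter */
	char	**chunks;	/* valid when is_scatter */
} sk_buf_t;

static sk_buf_t *
sk_alloc(size_t size, int scatter)
{
	sk_buf_t *b = calloc(1, sizeof (*b));

	assert(b != NULL && size > 0);	/* sketch: abort on failure */
	b->size = size;
	b->is_scatter = scatter;
	if (!scatter) {
		b->linear = calloc(1, size);
		assert(b->linear != NULL);
		return (b);
	}
	b->nchunks = (size + SK_CHUNK - 1) / SK_CHUNK;
	b->chunks = calloc(b->nchunks, sizeof (char *));
	assert(b->chunks != NULL);
	for (size_t i = 0; i < b->nchunks; i++) {
		b->chunks[i] = calloc(1, SK_CHUNK);
		assert(b->chunks[i] != NULL);
	}
	return (b);
}

/* Copy between a flat buffer and [off, off + len) of the sk_buf. */
static void
sk_copy(sk_buf_t *b, size_t off, void *flat, size_t len, int to_buf)
{
	char *f = flat;

	assert(off <= b->size && len <= b->size - off);
	while (len > 0) {
		/* A linear buffer is treated as one big chunk. */
		size_t in = b->is_scatter ? off % SK_CHUNK : off;
		size_t room = (b->is_scatter ? SK_CHUNK : b->size) - in;
		size_t n = len < room ? len : room;
		char *p = (b->is_scatter ?
		    b->chunks[off / SK_CHUNK] : b->linear) + in;

		if (to_buf)
			memcpy(p, f, n);
		else
			memcpy(f, p, n);
		f += n;
		off += n;
		len -= n;
	}
}

static void
sk_free(sk_buf_t *b)
{
	for (size_t i = 0; i < b->nchunks; i++)
		free(b->chunks[i]);
	free(b->chunks);	/* NULL for linear buffers, which is fine */
	free(b->linear);	/* NULL for scatter buffers, which is fine */
	free(b);
}
```

Callers only go through sk_copy, so they never need to know which layout backs the buffer. That is the property the commit relies on when it says the API can treat buffers as normal ones for now.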
Modify/add incremental fletcher function prototypes to match the
abd_iterate_rfunc callback type. Also, reduce duplicated code a bit in
zfs_fletcher.c.

Signed-off-by: Chunwei Chen <[email protected]>
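
A rough sketch of why an incremental checksum fits a per-chunk callback: the running state lives in a private pointer, so the same function can be called once per chunk. The callback shape below (buf, len, priv) is an assumption and may not match the real abd_iterate_rfunc type; sk_fletcher4_t, sk_iter_func_t, sk_fletcher4_incremental and sk_iterate are hypothetical names. The accumulator update is the usual fletcher-4 recurrence over 32-bit words in native byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Running fletcher-4 state: four 64-bit accumulators. */
typedef struct sk_fletcher4 {
	uint64_t a, b, c, d;
} sk_fletcher4_t;

/* Hypothetical per-chunk callback shape; nonzero return stops iteration. */
typedef int (*sk_iter_func_t)(const void *buf, size_t len, void *priv);

/* Fold one chunk into the running checksum held in priv. */
static int
sk_fletcher4_incremental(const void *buf, size_t len, void *priv)
{
	sk_fletcher4_t *ck = priv;
	const unsigned char *p = buf;

	assert(len % sizeof (uint32_t) == 0);
	for (size_t i = 0; i < len; i += sizeof (uint32_t)) {
		uint32_t w;

		memcpy(&w, p + i, sizeof (w));	/* avoids unaligned loads */
		ck->a += w;
		ck->b += ck->a;
		ck->c += ck->b;
		ck->d += ck->c;
	}
	return (0);
}

/* Stand-in for a buffer iterator: feed data to func in chunk-sized pieces. */
static int
sk_iterate(const void *data, size_t size, size_t chunk,
    sk_iter_func_t func, void *priv)
{
	const unsigned char *p = data;

	assert(chunk > 0);
	while (size > 0) {
		size_t n = size < chunk ? size : chunk;
		int err = func(p, n, priv);

		if (err != 0)
			return (err);
		p += n;
		size -= n;
	}
	return (0);
}
```

Because the state is carried across calls, checksumming a buffer in one piece or in several word-aligned pieces gives the same result.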
1. Use abd_t in arc_buf_t->b_data, dmu_buf_t->db_data, zio_t->io_data and
zio_transform_t->zt_orig_data
2. zio_* functions take abd_t for data

Signed-off-by: Chunwei Chen <[email protected]>
1. Add checksum function for abd_t
2. Use abd_t version checksum function in zio_checksum_table
3. Make zio_checksum_compute and zio_checksum_error handle abd_t

Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Use ABD API on related pointers and functions (b_data, db_data, zio_*(), etc.).

Suggested-by: DHE <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Currently, abd_uiomove repeatedly calls abd_copy_*_off. The problem is that
each call has to redo abd_miter_advance over the parts that were already
skipped.

We split out the miter part of abd_copy_*_off into abd_miter_copy_*. These
new functions take a miter directly and automatically advance it when they
finish. We initialize a miter in uiomove and use the iterator copy functions
to solve the stated problem.

Signed-off-by: Chunwei Chen <[email protected]>
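
The iterator-carrying copy described in the commit above can be sketched in userspace as follows. This is only an illustration: sk_seg_t, sk_miter_t and sk_miter_copy_out are hypothetical names, and plain memory segments stand in for scatterlist entries. The point is that the cursor remembers its position, so consecutive copies never re-walk segments that were already consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One variable-length segment of a scattered buffer (hypothetical). */
typedef struct sk_seg {
	const char	*base;
	size_t		len;
} sk_seg_t;

/* Cursor over the segments: current segment index plus offset inside it. */
typedef struct sk_miter {
	const sk_seg_t	*segs;
	size_t		nsegs;
	size_t		idx;
	size_t		off;
} sk_miter_t;

/*
 * Copy up to len bytes from the cursor position into dst and leave the
 * cursor just past the copied bytes. Returns the number of bytes copied,
 * which is less than len only when the segments run out.
 */
static size_t
sk_miter_copy_out(void *dst, sk_miter_t *m, size_t len)
{
	char *d = dst;
	size_t done = 0;

	assert(dst != NULL && m != NULL);
	while (done < len && m->idx < m->nsegs) {
		const sk_seg_t *s = &m->segs[m->idx];
		size_t n = s->len - m->off;

		if (n > len - done)
			n = len - done;
		memcpy(d + done, s->base + m->off, n);
		done += n;
		m->off += n;
		if (m->off == s->len) {	/* segment exhausted, move on */
			m->idx++;
			m->off = 0;
		}
	}
	return (done);
}
```

A uiomove-style caller would initialize one cursor and call the copy function once per iovec entry, instead of passing a growing offset that has to be skipped again on every call.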
The check is needed to make sure the user buffer is indeed in user space. Also
change copy_{to,from}_user to __copy_{to,from}_user so that we don't
repeatedly call access_ok.

Signed-off-by: Chunwei Chen <[email protected]>
When we aren't allocating in HIGHMEM, we can try to allocate contiguous pages.
We can also use sg_alloc_table_from_pages to merge adjacent pages for us. This
allows more efficient cache prefetch and also reduces sg iterator overhead.
This has been tested to greatly improve performance.

Signed-off-by: Jinshan Xiong <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
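
The benefit of merging adjacent pages can be seen in a small userspace sketch. It does not call or reimplement sg_alloc_table_from_pages; it only shows the coalescing idea on an array of page frame numbers. sk_run_t and sk_merge_adjacent are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* A run of physically consecutive pages (hypothetical). */
typedef struct sk_run {
	unsigned long	first_pfn;
	size_t		npages;
} sk_run_t;

/*
 * Coalesce consecutive page frame numbers into runs. out must have room
 * for n entries (the worst case, when no two pages are adjacent).
 * Returns the number of runs written.
 */
static size_t
sk_merge_adjacent(const unsigned long *pfns, size_t n, sk_run_t *out)
{
	size_t nruns = 0;

	assert(n == 0 || (pfns != NULL && out != NULL));
	for (size_t i = 0; i < n; i++) {
		if (nruns > 0 && out[nruns - 1].first_pfn +
		    out[nruns - 1].npages == pfns[i]) {
			out[nruns - 1].npages++;	/* extends last run */
		} else {
			out[nruns].first_pfn = pfns[i];
			out[nruns].npages = 1;
			nruns++;
		}
	}
	return (nruns);
}
```

With pages 10, 11, 12, 20, 21, 30 an iterator visits three runs instead of six single pages, which is where the reduced iterator overhead would come from.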
This patch adds non-highmem scatter ABD and also adds a new set of functions,
namely abd_buf_segment and friends, for accessing those ABDs.

This is a preparation for metadata scatter ABD.

Signed-off-by: Chunwei Chen <[email protected]>
Some metadata types are not required to use scatter ABD, or cannot easily do
so. So to allow ARC to accommodate both types of metadata, we add a new flag
ARC_FLAG_META_SCATTER to indicate which type the buffer belongs to.

We also introduce a new type arc_buf_alloc_t, which is basically
arc_buf_contents_t with a flag indicating scatter metadata. Users can pass it
to arc_buf_alloc to decide which buffer type to allocate.

Signed-off-by: Chunwei Chen <[email protected]>
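
One way to picture "a contents type plus a scatter-metadata flag" is a small bit-encoded value. The encoding below is an assumption made for illustration and is not taken from the patch; sk_contents_t, sk_alloc_t, SK_ALLOC_META_SCATTER and the helper functions are hypothetical names.

```c
#include <assert.h>

/* Stand-in for the buffer contents type (hypothetical). */
typedef enum sk_contents {
	SK_DATA = 0,
	SK_METADATA = 1
} sk_contents_t;

/* Contents type in the low bits, scatter-metadata flag above them. */
typedef unsigned int sk_alloc_t;
#define	SK_ALLOC_TYPE_MASK	0x0ffu
#define	SK_ALLOC_META_SCATTER	0x100u

static sk_alloc_t
sk_make_alloc_type(sk_contents_t type, int scatter_meta)
{
	assert(type == SK_DATA || type == SK_METADATA);
	/* The flag only has meaning for metadata in this sketch. */
	if (type == SK_METADATA && scatter_meta)
		return ((sk_alloc_t)type | SK_ALLOC_META_SCATTER);
	return ((sk_alloc_t)type);
}

static sk_contents_t
sk_contents(sk_alloc_t a)
{
	return ((sk_contents_t)(a & SK_ALLOC_TYPE_MASK));
}

/* Data is always scatter here; metadata only when the flag is set. */
static int
sk_wants_scatter(sk_alloc_t a)
{
	if (sk_contents(a) == SK_DATA)
		return (1);
	return ((a & SK_ALLOC_META_SCATTER) != 0);
}
```

An allocator taking such a value can recover both the accounting type and the buffer layout from a single argument.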
We add a new member ot_scatter to dmu_ot to determine the ABD type for each
DMU type. We temporarily set them all to FALSE, until some types are ready to
handle scatter ABD.

BP_GET_BUFA_TYPE and DBUF_GET_BUFA_TYPE will return with the scatter flag set
accordingly, so they can be passed to arc_buf_alloc, and the ARC subsystem
will return the correct ABD type.

Signed-off-by: Chunwei Chen <[email protected]>
Use non-highmem scatter ABD for indirect blocks, i.e. level > 0. abd_array
should be used to access the blkptr.

Signed-off-by: Chunwei Chen <[email protected]>
Use non-highmem scatter ABD for dnode array blocks. abd_array should be used
to access the dnodes.

Signed-off-by: Chunwei Chen <[email protected]>
abd_alloc_scatter always allocates nr*PAGE_SIZE memory. So in order to save
memory, we allow it to fall back to linear allocation when size is less than
PAGE_SIZE.

Note that originally, we required that zio->io_data remain the same type
before it reaches the lower levels, vdev_{queue,cache,disk}. After this patch,
however, a transformation like compression may change a scatter io_data to
linear. This is fine, because every scatter ABD aware API can also take linear
ABD, but not vice versa. So we relax the type check in push transform to check
linear only.

Signed-off-by: Chunwei Chen <[email protected]>
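
The memory argument and the relaxed type check from the commit above can be sketched like this. SK_PAGE_SIZE is assumed to be 4096, and sk_kind_t, sk_scatter_footprint, sk_choose_kind and sk_accepts are hypothetical helpers, not functions from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define	SK_PAGE_SIZE	4096u	/* assumed page size */

typedef enum sk_kind {
	SK_LINEAR,
	SK_SCATTER
} sk_kind_t;

/* Bytes a page-granular scatter allocation would consume for size bytes. */
static size_t
sk_scatter_footprint(size_t size)
{
	size_t nr = (size + SK_PAGE_SIZE - 1) / SK_PAGE_SIZE;

	return (nr * SK_PAGE_SIZE);
}

/* Fall back to linear when a whole page would be mostly wasted. */
static sk_kind_t
sk_choose_kind(size_t size)
{
	assert(size > 0);
	return (size < SK_PAGE_SIZE ? SK_LINEAR : SK_SCATTER);
}

/*
 * A scatter-aware consumer accepts either kind; a linear-only consumer
 * accepts only linear buffers. This mirrors "but not vice versa".
 */
static int
sk_accepts(int scatter_aware, sk_kind_t kind)
{
	return (scatter_aware || kind == SK_LINEAR);
}
```

For a 512-byte buffer the scatter footprint would be 4096 bytes, so the linear fallback saves most of a page, and the result is still acceptable to any scatter-aware consumer.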
In dsl_scan_scrub_cb and spa_load_verify_cb, we originally always allocated
linear ABD. Now we try to allocate scatter ABD according to the BUFA type of
the blkptr to reduce unnecessary spl slab allocation.

Also in zio_ddt_read_start, we match the parent zio->io_data ABD type for the
same reason.

Signed-off-by: Chunwei Chen <[email protected]>
SPA history is accessed purely through dmu_read/dmu_write. It doesn't require
any modification except for setting ot_scatter.

Signed-off-by: Chunwei Chen <[email protected]>
Add ABD versions of the byteswap functions, named "abd_<old bswap func name>".

Note that abd_byteswap_uint*_array and abd_dnode_buf_byteswap can handle
scatter buffers, so now we don't need an extra borrow/copy.

Signed-off-by: Chunwei Chen <[email protected]>
Update arc_c under a mutex from openzfs#4197
ABD2 openzfs#3441 ABD: linear/scatter dual typed buffer for ARC (ver 2)
4950 openzfs#4207 Illumos openzfs#4950 files sometimes can't be removed from a full filesystem
2605 openzfs#4213 Illumos 2605 want to resume interrupted zfs send
@kernelOfTruth kernelOfTruth changed the title [eval,blob][buildbot] ABD2 + Illumos #4950 + Illumos #2605 (master January 16th, 2015) [dedicated box testing] ABD2 + Illumos #4950 + Illumos #2605 (master January 16th, 2015) Jan 17, 2016
@behlendorf
Contributor

Closing as stale.

@behlendorf behlendorf closed this Mar 24, 2016