DAOS-16896 common: Spill Over Evictable Buckets Implementation #15646

Merged
merged 1 commit into from
Jan 7, 2025

sherintg
Collaborator

The DAV_v2 allocator now includes support for Spill Over Evictable Buckets (SOEMB). All global allocations will continue to utilize the standard non-evictable memory buckets, while spillover allocations from evictable memory buckets will be directed to SOEMB. In the current implementation, SOEMB remains locked in the memory cache, similar to the behavior of non-evictable memory buckets.

Before requesting gatekeeper:

  • Two review approvals and any prior change requests have been resolved.
  • Testing is complete and all tests passed or there is a reason documented in the PR why it should be force landed and forced-landing tag is set.
  • Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
  • Commit messages follows the guidelines outlined here.
  • Any tests skipped by the ticket being addressed have been run and passed in the PR.

Gatekeeper:

  • You are the appropriate gatekeeper to be landing the patch.
  • The PR has 2 reviews by people familiar with the code, including appropriate owners.
  • Githooks were used. If not, request that user install them and check copyright dates.
  • Checkpatch issues are resolved. Pay particular attention to ones that will show up on future PRs.
  • All builds have passed. Check non-required builds for any new compiler warnings.
  • Sufficient testing is done. Check feature pragmas and test tags and that tests skipped for the ticket are run and now pass with the changes.
  • If applicable, the PR has addressed any potential version compatibility issues.
  • Check the target branch. If it is master branch, should the PR go to a feature branch? If it is a release branch, does it have merge approval in the JIRA ticket.
  • Extra checks if forced landing is requested
    • Review comments are sufficiently resolved, particularly by prior reviewers that requested changes.
    • No new NLT or valgrind warnings. Check the classic view.
    • Quick-build or Quick-functional is not used.
  • Fix the commit message upon landing. Check the standard here. Edit it to create a single commit. If necessary, ask submitter for a new summary.

Ticket title is 'Spill Over Evictable Buckets (SOEMB) Implementation'
Status is 'Open'
Labels: 'md_on_ssd2'
https://daosio.atlassian.net/browse/DAOS-16896

@sherintg sherintg force-pushed the sherintg/DAOS-16896 branch from c8ee33f to 7ffe011 Compare December 19, 2024 12:35
The DAV_v2 allocator now includes support for Spill Over Evictable
Buckets (SOEMB). All global allocations will continue to utilize the
standard non-evictable memory buckets, while spillover allocations
from evictable memory buckets will be directed to SOEMB. In the
current implementation, SOEMB remains locked in the memory cache,
similar to the behavior of non-evictable memory buckets.

Signed-off-by: Sherin T George <[email protected]>
@sherintg sherintg force-pushed the sherintg/DAOS-16896 branch from 7ffe011 to cb10687 Compare December 19, 2024 13:38
@sherintg sherintg marked this pull request as ready for review December 19, 2024 14:12
@sherintg sherintg requested review from a team as code owners December 19, 2024 14:12
@@ -559,6 +559,7 @@ dav_tx_begin_v2(dav_obj_t *pop, jmp_buf env, ...)
sizeof(struct tx_range_def));
tx->first_snapshot = 1;
tx->pop = pop;
heap_soemb_reserve(pop->do_heap);
Contributor

This might need to be done before the umem_cache_reserve() call in the future?

When we support turning an SOE bucket into an evictable one (once it no longer qualifies as an SOE bucket), the allocator might need to pass the SOE set to umem_cache_reserve(), so that we can ensure all SOE buckets are loaded in umem_cache_reserve().

Collaborator Author

Yes, this will be moved when SOEMB starts using the evictable MB pool. For this release, SOEMB will use the cache pages from umem_cache_reserve() to create a new non-evictable SOEMB. Hence this code was moved to the end of tx_begin(), so that the page can be initialized in the same way as a non-evictable memory bucket.


smbrt->svec[SOEMB_ACTIVE_CNT - 1] = NULL;
smbrt->fur_idx = 0;
}
Contributor

I don't quite follow the logic here (or the SOE selection in heap_soemb_active_get()). Are we trying to use the buckets in the SOE set in a round-robin manner?

Collaborator Author

The allocator maintains SOEMB_ACTIVE_CNT == 3 active SOEMBs. It first attempts to spill over to svec[0]; if that fails, it tries svec[1] and finally svec[2]. If the allocation still fails, it spills over to the global non-evictable memory buckets.
fur_idx is the furthest index in the active SOEMB list that was used for spill-over within a TX. If its value is greater than 1, then in the next tx_begin(), heap_soemb_reserve() marks svec[0] as passive and left-shifts the other active SOEMBs.
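
The cascade described above can be sketched roughly as follows. This is a toy model, not the DAV_v2 code: `toy_mb`, `toy_soemb_rt`, `toy_mb_alloc()` and `toy_spill_over()` are hypothetical names, and a bucket is reduced to a free-byte counter. Only `SOEMB_ACTIVE_CNT`, `svec` and `fur_idx` are taken from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SOEMB_ACTIVE_CNT 3

/* Hypothetical stand-in for a memory bucket: it only tracks free bytes. */
struct toy_mb {
	size_t free_bytes;
};

/* Hypothetical runtime state, loosely modelled on the svec/fur_idx fields. */
struct toy_soemb_rt {
	struct toy_mb *svec[SOEMB_ACTIVE_CNT]; /* active SOEMBs, tried in order */
	int            fur_idx;                /* furthest index used in this TX */
};

/* Try to carve `size` bytes out of one bucket; an empty slot never succeeds. */
static bool
toy_mb_alloc(struct toy_mb *mb, size_t size)
{
	if (mb == NULL || mb->free_bytes < size)
		return false;
	mb->free_bytes -= size;
	return true;
}

/*
 * Try svec[0], then svec[1], then svec[2]. Returns the index of the active
 * SOEMB that satisfied the request, or -1 when the caller would have to fall
 * back to the global non-evictable buckets.
 */
static int
toy_spill_over(struct toy_soemb_rt *rt, size_t size)
{
	assert(rt != NULL);
	for (int i = 0; i < SOEMB_ACTIVE_CNT; i++) {
		if (toy_mb_alloc(rt->svec[i], size)) {
			if (i > rt->fur_idx)
				rt->fur_idx = i; /* remember how far we had to go */
			return i;
		}
	}
	return -1;
}
```

In this model, a request that only svec[2] can satisfy leaves `fur_idx` at 2, which is the condition the next tx_begin() would check before retiring svec[0].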

break;
}
break;
}
Contributor

Is the loop above meant to ensure there is always 1 available SOE bucket, or SOEMB_ACTIVE_CNT buckets? (The loop breaks as soon as any bucket is set up successfully.)

I think the goal here is to ensure that enough free space is pinned in memory for potential spill-over in the next transaction, right? So we'd replace any unqualified bucket (in 'svec') with a qualified one to satisfy the space requirement. It looks to me like it's too late to remove the unqualified bucket from 'svec' in heap_recycle_soembs().

Collaborator Author

Outside of the early boot phase, this condition primarily occurs when fur_idx is greater than 1, which triggers a left shift of the active SOEMBs by one position. This implies that only svec[2] has to be populated.

The left shift happens when at least one allocation within a TX failed to spill over to svec[0] and svec[1], and the allocator ended up using svec[2]. Hence svec[2] will be very sparsely populated. After the left shift, svec[0] will be almost full, svec[1] sparsely populated, and svec[2] unpopulated (or the least populated, if obtained from the passive list). The assumption here is that the free space in svec[1] and svec[2] is sufficient to satisfy the next TX.
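
A minimal sketch of that left shift, under stated assumptions: `demo_mb`, `demo_rt` and `demo_reserve()` are hypothetical names, the `passive` flag stands in for moving a bucket to the passive list, and the `fresh` argument stands in for whatever bucket (newly created or taken from the passive list) would fill the vacated slot. Resetting `fur_idx` only when a shift occurs is also an assumption, based on the hunk shown earlier in this review.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ACTIVE_CNT 3 /* mirrors SOEMB_ACTIVE_CNT == 3 from the discussion */

/* Hypothetical bucket: free space plus a flag set once it is retired. */
struct demo_mb {
	size_t free_bytes;
	int    passive;
};

/* Hypothetical runtime state modelled on svec/fur_idx. */
struct demo_rt {
	struct demo_mb *svec[DEMO_ACTIVE_CNT];
	int             fur_idx;
};

/*
 * Reserve step as it might run at tx_begin(): if the previous TX had to go
 * past svec[1], retire svec[0], shift the remaining buckets left, and fill
 * the first empty slot with `fresh` (which may be NULL).
 */
static void
demo_reserve(struct demo_rt *rt, struct demo_mb *fresh)
{
	assert(rt != NULL);

	if (rt->fur_idx > 1) {
		if (rt->svec[0] != NULL)
			rt->svec[0]->passive = 1; /* svec[0] is nearly full: retire it */
		for (int i = 0; i < DEMO_ACTIVE_CNT - 1; i++)
			rt->svec[i] = rt->svec[i + 1];
		rt->svec[DEMO_ACTIVE_CNT - 1] = NULL;
		rt->fur_idx = 0;
	}

	/* Populate only the first empty slot, then stop. */
	for (int i = 0; i < DEMO_ACTIVE_CNT; i++) {
		if (rt->svec[i] == NULL) {
			rt->svec[i] = fresh;
			break;
		}
	}
}
```

After a shift in this model, the old svec[1] and svec[2] occupy slots 0 and 1, and only the last slot needs a new bucket, which matches the observation that only svec[2] has to be populated.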

@sherintg sherintg requested a review from NiuYawei January 3, 2025 08:53
@sherintg sherintg requested a review from a team January 7, 2025 04:26
@NiuYawei NiuYawei merged commit da773f5 into master Jan 7, 2025
56 of 57 checks passed
@NiuYawei NiuYawei deleted the sherintg/DAOS-16896 branch January 7, 2025 05:43