DAOS-15739 engine: Add multi-socket support #14234

Merged: 22 commits, May 2, 2024

Conversation

Contributor

@jolivier23 jolivier23 commented Apr 23, 2024

Add a simple multi-socket mode for use cases where a single engine must be used. This avoids having all helper xstreams automatically assigned to a single NUMA node, which makes synchronization between I/O and helper xstreams more efficient.

Multi-socket mode is the default behavior if all of the following are true:

  1. Neither pinned_numa_node nor first_core is set.
  2. No oversubscription is requested.
  3. Every NUMA node has the same number of cores.
  4. Targets and helpers divide evenly among the NUMA nodes.
  5. There is more than one NUMA node.
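
The five conditions above can be read as a single predicate. The following is a minimal Python sketch, not the engine's C implementation. The function name, the parameter names, and the representation of the NUMA topology as a list of per-node core counts are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def multisocket_eligible(
    cores_per_numa: Sequence[int],
    targets: int,
    helpers: int,
    pinned_numa_node: Optional[int] = None,
    first_core: Optional[int] = None,
    oversubscribe: bool = False,
) -> bool:
    """Return True when all five conditions from the PR description hold.

    All names here are hypothetical; the real check lives in the engine.
    """
    numa_nr = len(cores_per_numa)
    # 1. Neither pinned_numa_node nor first_core is set.
    if pinned_numa_node is not None or first_core is not None:
        return False
    # 2. No oversubscription is requested.
    if oversubscribe:
        return False
    # 5. There is more than one NUMA node.
    if numa_nr < 2:
        return False
    # 3. Every NUMA node has the same number of cores.
    if len(set(cores_per_numa)) != 1:
        return False
    # 4. Targets and helpers divide evenly among the NUMA nodes.
    return targets % numa_nr == 0 and helpers % numa_nr == 0
```

Note that `first_core=0` counts as "set" in this sketch, so an explicit zero disables the mode just like any other value would.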

Update server config logic to ensure first_core is passed on to engine if it's set while keeping existing behavior
when both first_core: 0 and pinned_numa_node are set.

Required-githooks: true

Before requesting gatekeeper:

  • Two review approvals and any prior change requests have been resolved.
  • Testing is complete and all tests passed or there is a reason documented in the PR why it should be force landed and forced-landing tag is set.
  • Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
  • Commit messages follows the guidelines outlined here.
  • Any tests skipped by the ticket being addressed have been run and passed in the PR.

Gatekeeper:

  • You are the appropriate gatekeeper to be landing the patch.
  • The PR has 2 reviews by people familiar with the code, including appropriate owners.
  • Githooks were used. If not, request that user install them and check copyright dates.
  • Checkpatch issues are resolved. Pay particular attention to ones that will show up on future PRs.
  • All builds have passed. Check non-required builds for any new compiler warnings.
  • Sufficient testing is done. Check feature pragmas and test tags and that tests skipped for the ticket are run and now pass with the changes.
  • If applicable, the PR has addressed any potential version compatibility issues.
  • Check the target branch. If it is master branch, should the PR go to a feature branch? If it is a release branch, does it have merge approval in the JIRA ticket.
  • Extra checks if forced landing is requested
    • Review comments are sufficiently resolved, particularly by prior reviewers that requested changes.
    • No new NLT or valgrind warnings. Check the classic view.
    • Quick-build or Quick-functional is not used.
  • Fix the commit message upon landing. Check the standard here. Edit it to create a single commit. If necessary, ask submitter for a new summary.

Add a simple multi-socket mode for use cases where a single engine must
be used. This avoids having all helper xstreams automatically assigned
to a single NUMA node, which makes synchronization between I/O and
helper xstreams more efficient.

Usage: set DAOS_MULTISOCKET=1 in the server yaml to enable this mode.
Limitations:
1. The number of I/O xstreams and the number of helper xstreams must
each be evenly divisible by the number of NUMA nodes.
2. DAOS_OVERSUBSCRIBE is not allowed.
3. Each NUMA node must have the same number of cores.

If DAOS_MULTISOCKET is not set, the old behavior is maintained.

Required-githooks: true

Signed-off-by: Jeff Olivier <[email protected]>

github-actions bot commented Apr 23, 2024

Ticket title is 'support a single engine, multi-socket configuration'
Status is 'In Review'
https://daosio.atlassian.net/browse/DAOS-15739

Required-githooks: true

Signed-off-by: Jeff Olivier <[email protected]>
@jolivier23
Contributor Author

Note to reviewers, I don't think this actually changes anything with respect to existing algorithms (outside of pulling the numa information for the whole node at startup). Everything else should work as before. This just gives me something to play with.

Make multi-socket the default behavior.
Keep old IOFW behavior of scheduling on another IO core

Required-githooks: true

Signed-off-by: Jeff Olivier <[email protected]>
@jolivier23
Contributor Author

OK, never mind. At Johann's behest, I changed it to be the default behavior where possible.

@jolivier23 jolivier23 marked this pull request as ready for review April 24, 2024 15:54
Option to bypass the forward to another xstream

Required-githooks: true

Signed-off-by: Jeff Olivier <[email protected]>
@jolivier23 jolivier23 requested a review from a team as a code owner April 24, 2024 16:08
@daosbuild1
Collaborator

Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14234/6/execution/node/333/log

Required-githooks: true

Signed-off-by: Jeff Olivier <[email protected]>
NiuYawei
NiuYawei previously approved these changes Apr 25, 2024
Contributor

@NiuYawei NiuYawei left a comment

LGTM, though I foresee a hard merge the next time the multiprovider branch is updated. I hope the multiprovider branch can land soon. ;)

@@ -52,6 +52,7 @@ Environment variables in this section only apply to the server side.
|DAOS\_DTX\_AGG\_THD\_AGE|DTX aggregation age threshold in seconds. The valid range is [210, 1830]. The default value is 630.|
|DAOS\_DTX\_RPC\_HELPER\_THD|DTX RPC helper threshold. The valid range is [18, unlimited). The default value is 513.|
|DAOS\_DTX\_BATCHED\_ULT\_MAX|The max count of DTX batched commit ULTs. The valid range is [0, unlimited). 0 means to commit DTX synchronously. The default value is 32.|
|DAOS\_FORWARD\_SELF|Set to disable I/O forwarding on neighbor xstream in the absence of helper threads.|
Contributor

I'm wondering if we should make it the default. I don't quite see the advantage of forwarding through a neighbor VOS xstream (assuming the workload is balanced across server targets).

Contributor Author

Possibly. For now I just wanted to keep it consistent with the old behavior.

Contributor Author

I changed the default in the new patch and fixed an issue I hit where the first engine on a node got different behavior because it used first_core: 0.

src/engine/ult.c Outdated
uint32_t target;

if (dss_tgt_offload_xs_nr == 0) {
if (xs_type == DSS_XS_IOFW && !dss_forward_self) {
Contributor

Why not apply the same to DSS_XS_OFFLOAD?

Contributor Author

I had similar questions. Not sure why.
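
To make the question concrete, here is a rough Python sketch of the branch under review: with no offload (helper) xstreams, an I/O-forward ULT either stays on the current xstream or is pushed to a neighboring target's xstream. Only `dss_tgt_offload_xs_nr`, `dss_forward_self`, `DSS_XS_IOFW`, and `DSS_XS_OFFLOAD` come from the thread. The neighbor rule `(tgt_id + 1) % tgt_nr`, the function name, and the `None`-means-self return convention are assumptions, not the engine's actual code.

```python
from typing import Optional

DSS_XS_IOFW = "iofw"        # stand-ins for the C enum values
DSS_XS_OFFLOAD = "offload"


def pick_target_without_helpers(
    xs_type: str, tgt_id: int, tgt_nr: int, forward_self: bool
) -> Optional[int]:
    """Return the target whose xstream should run the ULT, or None for "self".

    Models only the dss_tgt_offload_xs_nr == 0 branch.
    """
    if xs_type == DSS_XS_IOFW and not forward_self:
        # Assumed neighbor rule: the next target, wrapping around.
        return (tgt_id + 1) % tgt_nr
    # Everything else, including DSS_XS_OFFLOAD, stays on the caller's xstream.
    return None
```

In this model only `DSS_XS_IOFW` is ever forwarded to a neighbor, which is the asymmetry the reviewer is asking about.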

Address some review comments

Required-githooks: true

Signed-off-by: Jeff Olivier <[email protected]>
@daosbuild1
Collaborator

Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14234/8/execution/node/1406/log

NiuYawei
NiuYawei previously approved these changes Apr 28, 2024
johannlombardi
johannlombardi previously approved these changes Apr 30, 2024
@daosbuild1
Collaborator

Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-14234/9/execution/node/1465/log

@jolivier23 jolivier23 changed the title DAOS-15739 engine: Add multi-socket support DAOS-15739 engine: Add multi-socket support DO NOT LAND YET May 1, 2024
jolivier23 added 2 commits May 1, 2024 10:22
doesn't omit telling the engine it was explicitly set.

Required-githooks: true

Signed-off-by: Jeff Olivier <[email protected]>
Required-githooks: true

Signed-off-by: Jeff Olivier <[email protected]>
@jolivier23 jolivier23 requested review from a team as code owners May 1, 2024 16:38
@jolivier23 jolivier23 changed the title DAOS-15739 engine: Add multi-socket support DO NOT LAND YET DAOS-15739 engine: Add multi-socket support May 1, 2024
@daosbuild1
Collaborator

Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14234/10/testReport/

@daosbuild1
Collaborator

Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14234/11/testReport/

when pinned_numa_node is set

Features: control

Required-githooks: true

Signed-off-by: Jeff Olivier <[email protected]>
@jolivier23 jolivier23 requested review from a team as code owners May 1, 2024 17:46
@@ -480,7 +480,7 @@ def __init__(self, base_namespace, index, provider=None, max_storage_tiers=MAX_S
# log_file: map to D_LOG_FILE env
# env_vars: influences DAOS I/O Engine behavior
self.targets = BasicParameter(None, 8)
-self.first_core = BasicParameter(None, 0)
+self.first_core = BasicParameter(None)
Contributor

This means all tests will no longer have first_core in the config. Running pr + Features: control ObjectMetadata is probably good coverage of this, but are there any areas we should be concerned with?

Contributor Author

I'm going to change this to allow setting first_core: 0.

Contributor

This isn't a pr or control test so you'll want to include ObjectMetadata in testing
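
The change discussed here hinges on telling "first_core was never set" apart from "first_core was explicitly set to 0". Below is a small Python sketch of that distinction, using `None` as the unset marker. `EngineParams` and `to_yaml_dict` are hypothetical names and are not part of the DAOS test harness.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class EngineParams:
    """Hypothetical stand-in for the per-engine test parameters."""

    targets: int = 8
    first_core: Optional[int] = None  # None means "not set"; 0 is a real value

    def to_yaml_dict(self) -> Dict[str, int]:
        """Emit only the keys that were explicitly set."""
        out = {"targets": self.targets}
        if self.first_core is not None:  # `if self.first_core:` would drop 0
            out["first_core"] = self.first_core
        return out
```

With this shape, `EngineParams()` produces a config without `first_core`, while `EngineParams(first_core=0)` keeps the explicit zero.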

@daosbuild1
Collaborator

Test stage Unit Test on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-14234/12/testReport/

jolivier23 added 7 commits May 1, 2024 12:31
This is the simplest path forward for now. I mimic the old
behavior when both are set.

Required-githooks: true

Signed-off-by: Jeff Olivier <[email protected]>
Required-githooks: true

Signed-off-by: Jeff Olivier <[email protected]>
Required-githooks: true

Signed-off-by: Jeff Olivier <[email protected]>
Features: control

Required-githooks: true

Signed-off-by: Jeff Olivier <[email protected]>
Features: control
Required-githooks: true

Signed-off-by: Jeff Olivier <[email protected]>
@daosbuild1
Collaborator

Test stage NLT on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-14234/15/display/redirect

Allow-unstable-test: true

Required-githooks: true

Signed-off-by: Jeff Olivier <[email protected]>
Contributor

@tanabarr tanabarr left a comment

LGTM

@@ -612,7 +612,7 @@ func (c *Config) WithHelperStreamCount(count int) *Config {

// WithServiceThreadCore sets the core index to be used for running DAOS service threads.
func (c *Config) WithServiceThreadCore(idx int) *Config {
-c.ServiceThreadCore = idx
+c.ServiceThreadCore = &idx
Contributor

While we are changing things, it might make sense to change ServiceThreadCore to *uint.

return DSS_XS_SELF;
}

socket = tgt_id / dss_numa_nr;
Contributor

nit: this assignment could be specified just once at the beginning of the function rather than in two places
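
For readers following the `socket` index, the idea of keeping a target and its helpers on the same NUMA node can be sketched as below. This is an illustration only: it assumes targets and helpers are laid out in contiguous per-node blocks, which may not match the arithmetic in the engine snippet above, and `helpers_for_target` and its parameters are hypothetical names.

```python
from typing import List


def helpers_for_target(
    tgt_id: int, tgt_nr: int, helper_nr: int, numa_nr: int
) -> List[int]:
    """Return the helper indices that share a NUMA node with target tgt_id.

    Assumes a contiguous-block layout: node n owns targets
    [n * tgt_per_numa, (n + 1) * tgt_per_numa) and the matching helper block.
    """
    if tgt_nr % numa_nr or helper_nr % numa_nr:
        raise ValueError("targets and helpers must divide evenly among NUMA nodes")
    tgt_per_numa = tgt_nr // numa_nr
    helper_per_numa = helper_nr // numa_nr
    node = tgt_id // tgt_per_numa
    first = node * helper_per_numa
    return list(range(first, first + helper_per_numa))
```

With 8 targets, 4 helpers, and 2 NUMA nodes, targets 0 through 3 map to helpers 0 and 1, and targets 4 through 7 map to helpers 2 and 3.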

@jolivier23 jolivier23 merged commit b1e0be0 into master May 2, 2024
52 checks passed
@jolivier23 jolivier23 deleted the jvolivie/add_multisocket branch May 2, 2024 19:43
jolivier23 pushed a commit that referenced this pull request May 3, 2024
Backport for the following patches
DAOS-13380 engine: refine tgt_nr check
DAOS-15739 engine: Add multi-socket support (#14234)

* DAOS-13380 engine: refine tgt_nr check

1. For the non-DAOS_TARGET_OVERSUBSCRIBE case,
   fail to start the engine if there are not enough cores.
2. For the DAOS_TARGET_OVERSUBSCRIBE case,
   allow the engine to be force-started.
The #nr_xs_helpers may be reduced in either case.
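
The two cases can be sketched as follows. This is a hypothetical Python model, not the engine code: the function name, the choice to trim helpers before checking targets, and the assumption that each target and each helper wants its own core are all illustrative.

```python
from typing import Tuple


def check_tgt_nr(
    cores: int, targets: int, helpers: int, oversubscribe: bool
) -> Tuple[int, int]:
    """Return the (targets, helpers) the engine would start with.

    Hypothetical model: one core per target and per helper.
    """
    # Helpers may be reduced in either case so that targets fit first.
    helpers = min(helpers, max(cores - targets, 0))
    if targets > cores and not oversubscribe:
        # Non-oversubscribe case: not enough cores is a startup failure.
        raise RuntimeError(f"need {targets} cores for targets, have {cores}")
    # Oversubscribe case: start anyway, sharing cores among targets.
    return targets, helpers
```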

* DAOS-15739 engine: Add multi-socket support (#14234)

Add a simple multi-socket mode for use cases where a single
engine must be used. Avoids the issue of having all helper
xstreams automatically assigned to a single NUMA node thus
increasing efficiency of synchronizations between I/O and
helper xstreams.

It is the default behavior if all of the following are true:

1. Neither pinned_numa_node nor first_core is used.
2. No oversubscription is requested.
3. NUMA nodes have a uniform number of cores.
4. Targets and helpers divide evenly among NUMA nodes.
5. There is more than one NUMA node.

Update server config logic to ensure first_core is passed
on to engine if it's set while keeping existing behavior
when both first_core: 0 and pinned_numa_node are set.

Signed-off-by: Jeff Olivier <[email protected]>
Signed-off-by: Xuezhao Liu <[email protected]>
Signed-off-by: Tom Nabarro <[email protected]>
jolivier23 added a commit that referenced this pull request May 8, 2024
Backport for the following patches
DAOS-13380 engine: refine tgt_nr check (#12405)
DAOS-15739 engine: Add multi-socket support (#14234)
DAOS-623 engine: Fix a typo (#14329)

* DAOS-13380 engine: refine tgt_nr check

1. For the non-DAOS_TARGET_OVERSUBSCRIBE case,
   fail to start the engine if there are not enough cores.
2. For the DAOS_TARGET_OVERSUBSCRIBE case,
   allow the engine to be force-started.
The #nr_xs_helpers may be reduced in either case.

* DAOS-15739 engine: Add multi-socket support (#14234)

Add a simple multi-socket mode for use cases where a single
engine must be used. Avoids the issue of having all helper
xstreams automatically assigned to a single NUMA node thus
increasing efficiency of synchronizations between I/O and
helper xstreams.

It is the default behavior if all of the following are true:

1. Neither pinned_numa_node nor first_core is used.
2. No oversubscription is requested.
3. NUMA nodes have a uniform number of cores.
4. Targets and helpers divide evenly among NUMA nodes.
5. There is more than one NUMA node.

Update server config logic to ensure first_core is passed
on to engine if it's set while keeping existing behavior
when both first_core: 0 and pinned_numa_node are set.

Signed-off-by: Jeff Olivier <[email protected]>
Signed-off-by: Xuezhao Liu <[email protected]>
Signed-off-by: Tom Nabarro <[email protected]>