DAOS-14408 common: ensure NDCTL not used for storage class ram #15203

Merged
merged 7 commits into from
Oct 16, 2024

@grom72 grom72 commented Sep 26, 2024

This PR prepares DAOS to be used with NDCTL enabled in PMDK, which means:

  • NDCTL must not be used with non-DCPM (simulated PMem) storage, i.e. when storage class: "ram" is used:
    The PMEMOBJ_CONF=sds.at_create=0 environment variable disables NDCTL features in PMDK.
    This change affects all tests run on simulated PMem (e.g. inside VMs).
    Some DAOS utility applications may also require PMEMOBJ_CONF=sds.at_create=0 to be set.

  • The default ULT stack size must be at least 20 KiB, to avoid stack overuse by PMDK with NDCTL enabled, and must be aligned with the Linux page size.
    The ABT_THREAD_STACKSIZE=20480 environment variable is used to increase the default ULT stack size.
    This environment variable is set by the control/server module just before the engine is started.
    A much bigger stack is used for pmempool open/create-related tasks, e.g. tgt_vos_create_one, to avoid stack overuse.
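
The stack-size requirement above (at least 20 KiB and a multiple of the page size) can be illustrated with a small sketch. The helper name `aligned_stack_size` is hypothetical and is not part of DAOS, Argobots, or PMDK; only the 20 KiB minimum and the page-alignment rule come from this description.

```python
import mmap

# Minimum default ULT stack size stated in this PR: 20 KiB.
MIN_ULT_STACK = 20 * 1024


def aligned_stack_size(requested: int, page_size: int = mmap.PAGESIZE) -> int:
    """Hypothetical helper: enforce the 20 KiB floor, then round up
    to the next multiple of the page size."""
    size = max(requested, MIN_ULT_STACK)
    # Ceiling division, then scale back up to a whole number of pages.
    return -(-size // page_size) * page_size


if __name__ == "__main__":
    # With 4 KiB pages a 16 KiB request is raised to 20480 bytes.
    print(aligned_stack_size(16 * 1024, 4096))
```

With 4 KiB pages, 20480 bytes is exactly five pages, so the value used for ABT_THREAD_STACKSIZE satisfies both conditions at once.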

This modification shall not affect md-on-ssd mode as long as storage class: "ram" is used for the first tier in the storage configuration.
This change does not require any configuration changes to existing systems.
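
As a rough sketch of how the two variables relate to the first storage tier, the snippet below builds an environment for an engine process. The function `engine_env` and the choice to leave values already present in the environment untouched are assumptions made for illustration; the actual logic in the control/server module is not shown in this description.

```python
import os


def engine_env(first_tier_class: str, base_env=None) -> dict:
    """Hypothetical helper: return a copy of the environment with the
    variables named in this PR added when they are not already set."""
    env = dict(os.environ if base_env is None else base_env)
    # Default ULT stack size of 20 KiB, as stated in the PR description.
    env.setdefault("ABT_THREAD_STACKSIZE", "20480")
    # Assumption: NDCTL features are disabled only for simulated PMem,
    # i.e. when the first tier uses storage class "ram".
    if first_tier_class == "ram":
        env.setdefault("PMEMOBJ_CONF", "sds.at_create=0")
    return env


if __name__ == "__main__":
    print(engine_env("ram", {}))
    print(engine_env("dcpm", {}))
```

A utility run against simulated PMem could be started with the same dictionary, for example by passing it as the `env` argument of `subprocess.run`.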

The new PMDK package with NDCTL enabled (daos-stack/pmdk#38) will land as soon as this PR is merged.

Based on: #14371

Before requesting gatekeeper:

  • Two review approvals and any prior change requests have been resolved.
  • Testing is complete and all tests passed or there is a reason documented in the PR why it should be force landed and forced-landing tag is set.
  • Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
  • Commit messages follow the guidelines outlined here.
  • Any tests skipped by the ticket being addressed have been run and passed in the PR.

Gatekeeper:

  • You are the appropriate gatekeeper to be landing the patch.
  • The PR has 2 reviews by people familiar with the code, including appropriate owners.
  • Githooks were used. If not, request that user install them and check copyright dates.
  • Checkpatch issues are resolved. Pay particular attention to ones that will show up on future PRs.
  • All builds have passed. Check non-required builds for any new compiler warnings.
  • Sufficient testing is done. Check feature pragmas and test tags and that tests skipped for the ticket are run and now pass with the changes.
  • If applicable, the PR has addressed any potential version compatibility issues.
  • Check the target branch. If it is master branch, should the PR go to a feature branch? If it is a release branch, does it have merge approval in the JIRA ticket.
  • Extra checks if forced landing is requested
    • Review comments are sufficiently resolved, particularly by prior reviewers that requested changes.
    • No new NLT or valgrind warnings. Check the classic view.
    • Quick-build or Quick-functional is not used.
  • Fix the commit message upon landing. Check the standard here. Edit it to create a single commit. If necessary, ask submitter for a new summary.

github-actions bot commented Sep 26, 2024

Ticket title is 'NDCTL must be enabled to provide support for RAS functionality in PMDK'
Status is 'Awaiting backport'
Labels: 'scrubbed_2.8,triaged'
Job should run at elevated priority (1)
https://daosio.atlassian.net/browse/DAOS-14408

@grom72 grom72 added the release-2.6.2 Targeted for release 2.6.2 label Sep 26, 2024
@daosbuild1

Test stage NLT on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-15203/1/testReport/

Allow-unstable-test: true
Priority: 2

Required-githooks: true

Signed-off-by: Tomasz Gromadzki <[email protected]>
@grom72 grom72 force-pushed the grom72/DAOS-14408-2.6.x branch from 09cab12 to 70d7a95 Compare October 10, 2024 12:35
PR-repos: pmdk@PR-38:14

Skip-list: test_dfuse_daos_build_wt_pil4dfs:DAOS-16556
Priority: 2

Cancel-prev-build: false
Force tests on various OSes
Skip-func-test-leap15: false
Skip-func-test-el9: false
Skip-test-leap-15.4-rpms: false
Skip-test-el9-rpms: false
Allow-unstable-test: true

Required-githooks: true

Signed-off-by: Tomasz Gromadzki <[email protected]>
@grom72 grom72 force-pushed the grom72/DAOS-14408-2.6.x branch from 70d7a95 to 2efa66c Compare October 10, 2024 12:42
grom72 and others added 3 commits October 10, 2024 18:04
PR-repos: pmdk@PR-38:14

Skip-list: test_dfuse_daos_build_wt_pil4dfs:DAOS-16556
Priority: 2

Cancel-prev-build: false
Force tests on various OSes
Skip-func-test-leap15: false
Skip-func-test-el9: false
Skip-test-leap-15.4-rpms: false
Skip-test-el9-rpms: false
Allow-unstable-test: true

Required-githooks: true

Signed-off-by: Tomasz Gromadzki <[email protected]>
PR-repos: pmdk@PR-38:14

Skip-list: test_dfuse_daos_build_wt_pil4dfs:DAOS-16556
Priority: 2

Cancel-prev-build: false
Skip-nlt: true
Force tests on various OSes
Skip-func-test-leap15: false
Skip-func-test-el9: false
Skip-test-leap-15.4-rpms: false
Skip-test-el9-rpms: false
Allow-unstable-test: true

Required-githooks: true

Signed-off-by: Tomasz Gromadzki <[email protected]>
PR-repos: pmdk@PR-38:14

Skip-list: test_dfuse_daos_build_wt_pil4dfs:DAOS-16556
Priority: 2

Cancel-prev-build: false
Skip-nlt: true
Force tests on various OSes
Skip-func-test-leap15: false
Skip-func-test-el9: false
Skip-test-leap-15.4-rpms: false
Skip-test-el9-rpms: false
Allow-unstable-test: true

Required-githooks: true

Signed-off-by: Jan Michalski <[email protected]>
@grom72 grom72 marked this pull request as ready for review October 14, 2024 17:28
@grom72 grom72 requested review from a team as code owners October 14, 2024 17:28
@grom72 grom72 added the clean-cherry-pick Cherry-pick from another branch that did not require additional edits label Oct 14, 2024
@grom72 grom72 requested review from tanabarr and kjacque October 15, 2024 05:54
@grom72 grom72 added the go Pull requests that update Go code label Oct 15, 2024
@github-actions github-actions bot added the priority Ticket has high priority (automatically managed) label Oct 15, 2024
PR-repos: pmdk@PR-38:14

Skip-list: test_dfuse_daos_build_wt_pil4dfs:DAOS-16556
Priority: 2

Cancel-prev-build: false
Skip-nlt: false
Skip-unit-test-memcheck: true
Skip-unit-test: true
Skip-func-test: true

Required-githooks: true

Signed-off-by: Tomasz Gromadzki <[email protected]>
@grom72

grom72 commented Oct 15, 2024

https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-15203/8/ has NLT failures which I don't think are related to the PR
All other tests passed in https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-15203/7/

@grom72 grom72 requested a review from janekmi October 15, 2024 11:06
Review thread on utils/rpms/daos.spec (outdated, resolved)
Doc-only: true

Required-githooks: true

Signed-off-by: Tomasz Gromadzki <[email protected]>
@grom72 grom72 requested a review from janekmi October 15, 2024 15:50
@grom72 grom72 requested a review from a team October 15, 2024 21:16
@daltonbohning daltonbohning merged commit d9f16a1 into release/2.6 Oct 16, 2024
53 of 56 checks passed
@daltonbohning daltonbohning deleted the grom72/DAOS-14408-2.6.x branch October 16, 2024 00:48
@daltonbohning daltonbohning added the forced-landing The PR has known failures or has intentionally reduced testing, but should still be landed. label Oct 16, 2024
jolivier23 pushed a commit that referenced this pull request Nov 1, 2024
)

* DAOS-14408 common: enable NDCTL for DCPM

Required-githooks: true

Change-Id: If4c3f7d88a97e4e4f5526da71f4b374a2844057b
Signed-off-by: Jan Michalski <[email protected]>
jolivier23 pushed a commit that referenced this pull request Nov 1, 2024
)

* DAOS-14408 common: enable NDCTL for DCPM

Change-Id: If4c3f7d88a97e4e4f5526da71f4b374a2844057b
Signed-off-by: Jan Michalski <[email protected]>
Labels

  • clean-cherry-pick: Cherry-pick from another branch that did not require additional edits
  • forced-landing: The PR has known failures or has intentionally reduced testing, but should still be landed.
  • go: Pull requests that update Go code
  • priority: Ticket has high priority (automatically managed)
  • release-2.6.2: Targeted for release 2.6.2

6 participants