Merge branch 'master' into feature/vos_on_blob_p2
Required-githooks: true
NiuYawei committed Jul 10, 2024
2 parents 9f3b928 + 406f35b commit 38024fb
Showing 253 changed files with 6,713 additions and 3,819 deletions.
3 changes: 3 additions & 0 deletions SConstruct
@@ -373,6 +373,9 @@ def scons():

deps_env = Environment()

# Silence deprecation warning so it doesn't fail the build
SetOption('warn', ['no-python-version'])

add_command_line_options()

# Scons strips out the environment, however that is not always desirable so add back in
3 changes: 3 additions & 0 deletions ci/codespell.ignores
@@ -36,3 +36,6 @@ laf
cacl
chk
falloc
rin
assertIn
checkin
4 changes: 3 additions & 1 deletion ci/jira_query.py
@@ -39,7 +39,7 @@
FIELDS = 'summary,status,labels,customfield_10044,customfield_10045'

# Labels in GitHub which this script will set/clear based on the logic below.
MANAGED_LABELS = ('release-2.2', 'release-2.4', 'priority')
MANAGED_LABELS = ('release-2.2', 'release-2.4', 'release-2.6', 'priority')


def set_output(key, value):
@@ -175,6 +175,8 @@ def main():
gh_label.add('release-2.2')
if str(version) in ('2.4 Community Release'):
gh_label.add('release-2.4')
if str(version) in ('2.6 Community Release'):
gh_label.add('release-2.6')

# If a PR does not otherwise have priority then use custom values from above.
if priority is None and not pr_data['base']['ref'].startswith('release'):
2 changes: 1 addition & 1 deletion ci/provisioning/post_provision_config.sh
@@ -20,7 +20,7 @@ source ci/provisioning/post_provision_config_common_functions.sh
source ci/junit.sh


: "${MLNX_VER_NUM:=latest-5.8}"
: "${MLNX_VER_NUM:=24.04-0.6.6.0}"

: "${DISTRO:=EL_7}"
DSL_REPO_var="DAOS_STACK_${DISTRO}_LOCAL_REPO"
48 changes: 23 additions & 25 deletions docs/admin/administration.md
@@ -308,7 +308,6 @@ Usage:
...

Available commands:
device-health Query the device health
list-devices List storage devices on the server
list-pools List pools on the server
usage Show SCM & NVMe storage space utilization per storage server
@@ -372,6 +371,8 @@ Usage:
...

[list-devices command options]
-l, --host-list= A comma separated list of addresses <ipv4addr/hostname> to
connect to
-r, --rank= Constrain operation to the specified server rank
-b, --health Include device health in results
-u, --uuid= Device UUID (all devices if blank)
@@ -389,18 +390,6 @@ Usage:
-u, --uuid= Pool UUID (all pools if blank)
-v, --verbose Show more detail about pools
```
```bash
$ dmg storage scan --nvme-meta --help
Usage:
dmg [OPTIONS] storage scan [scan-OPTIONS]

...

[scan command options]
-v, --verbose List SCM & NVMe device details
-n, --nvme-health Display NVMe device health statistics
-m, --nvme-meta Display server meta data held on NVMe storage
```
The NVMe storage query list-devices and list-pools commands query the persistently
stored SMD device and pool tables, respectively. The device table maps the internal
@@ -464,14 +453,19 @@ boro-11
- Query Storage Device Health Data:
```bash
$ dmg storage query device-health --help
$ dmg storage query list-devices --health --help
Usage:
dmg [OPTIONS] storage query device-health [device-health-OPTIONS]
dmg [OPTIONS] storage query list-devices [list-devices-OPTIONS]

...

[device-health command options]
-u, --uuid= Device UUID. All devices queried if arg not set
[list-devices command options]
-l, --host-list= A comma separated list of addresses <ipv4addr/hostname> to
connect to
-r, --rank= Constrain operation to the specified server rank
-b, --health Include device health in results
-u, --uuid= Device UUID (all devices if blank)
-e, --show-evicted Show only evicted faulty devices
```
```bash
$ dmg storage scan --nvme-health --help
@@ -481,27 +475,31 @@ Usage:
...

[scan command options]
-l, --host-list= A comma separated list of addresses <ipv4addr/hostname>
to connect to
-v, --verbose List SCM & NVMe device details
-n, --nvme-health Display NVMe device health statistics
-m, --nvme-meta Display server meta data held on NVMe storage
```
The NVMe storage query device-health command queries the device health data, including
NVMe SSD health stats and in-memory I/O error and checksum error counters.
The server rank and device state are also listed.
Additionally, vendor-specific SMART stats are displayed, currently for Intel devices only.
The 'dmg storage scan --nvme-health' command queries the device health data, including
NVMe SSD health stats and in-memory I/O error and checksum error counters, and prefixes the stat
list with NVMe controller details.
The 'dmg storage query list-devices --health' command displays the same health data along with the
SMD UUID, bdev roles, server rank and device state.
Vendor-specific SMART stats are displayed, currently for Intel devices only.
Note: a reasonable timed workload of more than 60 minutes must be run for the SMART stats to
register (raw values remain 65535 until then).
The media wear percentage can be calculated by dividing the raw value by 1024 to find the
percentage of the maximum rated cycles (for example, a raw value of 2048 corresponds to 2%).
```bash
$ dmg -l boro-11 storage query device-health --uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19
$ dmg -l boro-11 storage query list-devices --health --uuid=d5ec1227-6f39-40db-a1f6-70245aa079f1
-------
boro-11
-------
Devices
UUID:5bd91603-d3c7-4fb7-9a71-76bc25690c19 [TrAddr:0000:8a:00.0]
Targets:[0 1 2 3] Rank:0 State:NORMAL
UUID:d5ec1227-6f39-40db-a1f6-70245aa079f1 [TrAddr:d70505:03:00.0 NSID:1]
Roles:NA Targets:[3 7] Rank:0 State:NORMAL LED:OFF
Health Stats:
Timestamp:2021-09-13T11:12:34.000+00:00
Temperature:289K(15C)
153 changes: 153 additions & 0 deletions docs/admin/deployment.md
@@ -100,11 +100,164 @@ Refer to the example configuration file
[`daos_server.yml`](https://github.com/daos-stack/daos/blob/master/utils/config/daos_server.yml)
for latest information and examples.

#### MD-on-SSD Configuration

To enable MD-on-SSD, the Control-Plane-Metadata ('control_metadata') global section of the
configuration file
[`daos_server.yml`](https://github.com/daos-stack/daos/blob/master/utils/config/daos_server.yml)
needs to specify a persistent location for control-plane-specific metadata (which would be stored
on PMem in non-MD-on-SSD mode). Either set 'control_metadata:path' to an existing (mounted) local
filesystem path, or set 'control_metadata:device' to a storage partition that can be mounted and
formatted by the control plane during storage format. In the latter case, when a device is
specified, the value of the path parameter is used as the mountpoint.
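
A minimal sketch of such a 'control_metadata' section is shown below; the mountpoint path and
device name are illustrative placeholders rather than values taken from this commit:

```yaml
# Global section of daos_server.yml: persistent location for control-plane metadata.
control_metadata:
  path: /var/daos/control_meta   # existing filesystem path, or mountpoint when 'device' is set
  device: /dev/nvme0n1p1         # optional: partition formatted and mounted at 'path' during storage format
```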

The MD-on-SSD code path will only be used if it is explicitly enabled by specifying the new
'bdev_roles' property for the NVMe storage tier(s) in the 'daos_server.yml' file. There are three
role types: wal, meta and data. Each role must be assigned to exactly one NVMe tier. Depending on
the number of NVMe SSDs per DAOS engine, there may be one, two or three NVMe tiers with different
'bdev_roles' assignments.

For a complete server configuration file example enabling MD-on-SSD, see
[`daos_server_mdonssd.yml`](https://github.com/daos-stack/daos/blob/master/utils/config/daos_server_mdonssd.yml).

Below are four different 'daos_server.yml' storage configuration snippets that represent scenarios
for a DAOS engine with four NVMe SSDs and MD-on-SSD enabled.


1. One NVMe tier where four SSDs are each assigned wal, meta and data roles:

```yaml
storage:
  -
    class: ram
    scm_mount: /mnt/dram1
  -
    class: nvme
    bdev_roles:
      - wal
      - meta
      - data
    bdev_list:
      - "0000:e3:00.0"
      - "0000:e4:00.0"
      - "0000:e5:00.0"
      - "0000:e6:00.0"
```

This example shows the typical use case for a DAOS server with a small number of NVMe SSDs. With
only four or five NVMe SSDs per engine, it is natural to assign all three roles to all NVMe SSDs
configured as a single NVMe tier.


2. Two NVMe tiers, one SSD assigned wal role (tier-1) and three SSDs assigned both meta and data
roles (tier-2):

```yaml
storage:
  -
    class: ram
    scm_mount: /mnt/dram1
  -
    class: nvme
    bdev_roles:
      - wal
    bdev_list:
      - "0000:e3:00.0"
  -
    class: nvme
    bdev_roles:
      - meta
      - data
    bdev_list:
      - "0000:e4:00.0"
      - "0000:e5:00.0"
      - "0000:e6:00.0"
```

This example shows a configuration where one NVMe SSD is dedicated to the wal, while the remaining
three NVMe SSDs are assigned to hold the VOS checkpoints (meta) and the user data. Using two NVMe
tiers makes it possible to use a higher-endurance, higher-performance SSD for the wal tier. It
should be noted that the performance of a single high-performance SSD may still be lower than the
aggregate performance of multiple lower-performance SSDs in the previous scenario.


3. Two NVMe tiers, one SSD assigned both wal and meta roles (tier-1) and three SSDs assigned the
data role (tier-2):

```yaml
storage:
  -
    class: ram
    scm_mount: /mnt/dram1
  -
    class: nvme
    bdev_roles:
      - wal
      - meta
    bdev_list:
      - "0000:e3:00.0"
  -
    class: nvme
    bdev_roles:
      - data
    bdev_list:
      - "0000:e4:00.0"
      - "0000:e5:00.0"
      - "0000:e6:00.0"
```

This example uses two NVMe tiers but co-locates the wal and meta blobs on the same tier. This may be
a better choice than co-locating meta and data if the endurance of the data NVMe SSDs is too low for
the relatively frequent VOS checkpointing. It should be noted that performance may be impacted by
reducing the number of NVMe SSDs available for VOS checkpointing from three to one.

The other option with two NVMe tiers would be to co-locate wal and data and dedicate the other tier
to meta, but this configuration is invalid: sharing the wal and data roles on the same tier is not
allowed (the disallowed layout is sketched below for reference).
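
A sketch of that disallowed layout, reusing the illustrative PCI addresses from the scenarios
above; this is not a working configuration and is shown only to make the restriction concrete:

```yaml
# INVALID example: the wal and data roles may not share a tier.
storage:
  -
    class: ram
    scm_mount: /mnt/dram1
  -
    class: nvme
    bdev_roles:
      - wal
      - data          # wal and data co-located on one tier -> rejected
    bdev_list:
      - "0000:e3:00.0"
  -
    class: nvme
    bdev_roles:
      - meta
    bdev_list:
      - "0000:e4:00.0"
      - "0000:e5:00.0"
      - "0000:e6:00.0"
```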


4. Three NVMe tiers, one for each role (each device is assigned a single role): one SSD assigned
the wal role (tier-1), one SSD assigned the meta role (tier-2) and two SSDs assigned the data role
(tier-3):

```yaml
storage:
  -
    class: ram
    scm_mount: /mnt/dram1
  -
    class: nvme
    bdev_roles:
      - wal
    bdev_list:
      - "0000:e3:00.0"
  -
    class: nvme
    bdev_roles:
      - meta
    bdev_list:
      - "0000:e4:00.0"
  -
    class: nvme
    bdev_roles:
      - data
    bdev_list:
      - "0000:e5:00.0"
      - "0000:e6:00.0"
```

Using three NVMe tiers (one per role) is not reasonable when there are only four NVMe SSDs per
engine, but it may be practical with a larger number of SSDs and is illustrated here for
completeness.

#### Auto Generate Configuration File

DAOS can attempt to produce a server configuration file that makes optimal use of hardware on a
given set of hosts either through the `dmg` or `daos_server` tools.

To generate an MD-on-SSD configuration, set both the '--control-metadata-path' and
'--use-tmpfs-scm' options as detailed below. Note that due to the number of variables considered
when generating a configuration automatically, the result may not be optimal in all situations.
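
As a rough sketch (not taken from this commit or its documentation), the two options might be
combined in a single invocation; the metadata path and output file below are placeholders, and it
is assumed here that the generated YAML is written to stdout:

```bash
# Hypothetical invocation: generate an MD-on-SSD server configuration using a
# RAM-disk SCM tier and control-plane metadata under an illustrative path,
# saving the result for review before installing it as daos_server.yml.
daos_server config generate \
    --use-tmpfs-scm \
    --control-metadata-path /var/daos/control_meta \
    > /tmp/daos_server_generated.yml
```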

##### Generating Configuration File Using daos_server Tool

To generate a configuration file for a single storage server, run the `daos_server config generate`
1 change: 0 additions & 1 deletion docs/admin/pool_operations.md
@@ -133,7 +133,6 @@ $ dmg pool create --help
[create command options]
-g, --group= DAOS pool to be owned by given group, format name@domain
-u, --user= DAOS pool to be owned by given user, format name@domain
-p, --label= Unique label for pool (deprecated, use positional argument)
-P, --properties= Pool properties to be set
-a, --acl-file= Access Control List file path for DAOS pool
-z, --size= Total size of DAOS pool (auto)