Commit c2bb1b1

Merge branch 'master' into aaoganez/DAOS-16020

Required-githooks: true

frostedcmos committed Jun 14, 2024
2 parents cde9d00 + f825add

Showing 14 changed files with 224 additions and 66 deletions.
48 changes: 23 additions & 25 deletions docs/admin/administration.md
@@ -308,7 +308,6 @@ Usage:
...

Available commands:
device-health Query the device health
list-devices List storage devices on the server
list-pools List pools on the server
usage Show SCM & NVMe storage space utilization per storage server
@@ -372,6 +371,8 @@ Usage:
...

[list-devices command options]
-l, --host-list= A comma separated list of addresses <ipv4addr/hostname> to
connect to
-r, --rank= Constrain operation to the specified server rank
-b, --health Include device health in results
-u, --uuid= Device UUID (all devices if blank)
@@ -389,18 +390,6 @@ Usage:
-u, --uuid= Pool UUID (all pools if blank)
-v, --verbose Show more detail about pools
```
```bash
$ dmg storage scan --nvme-meta --help
Usage:
dmg [OPTIONS] storage scan [scan-OPTIONS]

...

[scan command options]
-v, --verbose List SCM & NVMe device details
-n, --nvme-health Display NVMe device health statistics
-m, --nvme-meta Display server meta data held on NVMe storage
```
The NVMe storage query list-devices and list-pools commands query the persistently
stored SMD device and pool tables, respectively. The device table maps the internal
@@ -464,14 +453,19 @@ boro-11
- Query Storage Device Health Data:
```bash
$ dmg storage query device-health --help
$ dmg storage query list-devices --health --help
Usage:
dmg [OPTIONS] storage query device-health [device-health-OPTIONS]
dmg [OPTIONS] storage query list-devices [list-devices-OPTIONS]

...

[device-health command options]
-u, --uuid= Device UUID. All devices queried if arg not set
[list-devices command options]
-l, --host-list= A comma separated list of addresses <ipv4addr/hostname> to
connect to
-r, --rank= Constrain operation to the specified server rank
-b, --health Include device health in results
-u, --uuid= Device UUID (all devices if blank)
-e, --show-evicted Show only evicted faulty devices
```
```bash
$ dmg storage scan --nvme-health --help
@@ -481,27 +475,31 @@ Usage:
...

[scan command options]
-l, --host-list= A comma separated list of addresses <ipv4addr/hostname>
to connect to
-v, --verbose List SCM & NVMe device details
-n, --nvme-health Display NVMe device health statistics
-m, --nvme-meta Display server meta data held on NVMe storage
```
The NVMe storage query device-health command queries the device health data, including
NVMe SSD health stats and in-memory I/O error and checksum error counters.
The server rank and device state are also listed.
Additionally, vendor-specific SMART stats are displayed, currently for Intel devices only.
The 'dmg storage scan --nvme-health' command queries the device health data, including
NVMe SSD health stats and in-memory I/O error and checksum error counters and prefixes the stat
list with NVMe controller details.
The 'dmg storage query list-devices --health' command displays the same health data and SMD UUID,
bdev roles, server rank and device state.
Vendor-specific SMART stats are displayed, currently for Intel devices only.
Note: A reasonable timed workload > 60 min must be run for the SMART stats to register
(Raw values are 65535).
Media wear percentage can be calculated by dividing the raw value by 1024, which yields the
percentage of the maximum rated cycles consumed.
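As an illustrative sketch of that arithmetic (the helper name is ours, not part of the dmg tool):

```python
def media_wear_percent(raw_value: int) -> float:
    """Convert the raw vendor media-wear counter to a percentage of the
    drive's maximum rated cycles, per the divide-by-1024 rule above.

    Illustrative helper only; not part of dmg itself.
    """
    return raw_value / 1024
```

For example, a raw counter of 10240 corresponds to 10% of the drive's rated cycles.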
```bash
$ dmg -l boro-11 storage query device-health --uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19
$ dmg -l boro-11 storage query list-devices --health --uuid=d5ec1227-6f39-40db-a1f6-70245aa079f1
-------
boro-11
-------
Devices
UUID:5bd91603-d3c7-4fb7-9a71-76bc25690c19 [TrAddr:0000:8a:00.0]
Targets:[0 1 2 3] Rank:0 State:NORMAL
UUID:d5ec1227-6f39-40db-a1f6-70245aa079f1 [TrAddr:d70505:03:00.0 NSID:1]
Roles:NA Targets:[3 7] Rank:0 State:NORMAL LED:OFF
Health Stats:
Timestamp:2021-09-13T11:12:34.000+00:00
Temperature:289K(15C)
153 changes: 153 additions & 0 deletions docs/admin/deployment.md
@@ -100,11 +100,164 @@ Refer to the example configuration file
[`daos_server.yml`](https://github.com/daos-stack/daos/blob/master/utils/config/daos_server.yml)
for latest information and examples.

#### MD-on-SSD Configuration

To enable MD-on-SSD, the Control-Plane-Metadata ('control_metadata') global section of the
configuration file
[`daos_server.yml`](https://github.com/daos-stack/daos/blob/master/utils/config/daos_server.yml)
needs to specify a persistent location to store control-plane specific metadata (which would be
stored on PMem in non MD-on-SSD mode). Either set 'control_metadata:path' to an existing (mounted)
local filesystem path or set 'control_metadata:device' to a storage partition which can be mounted
and formatted by the control plane during storage format. In the latter case, when a device is
specified, the 'path' parameter value is used as the mountpoint.
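As a sketch (paths and device names below are placeholders, not recommendations), the two variants
might look like:

```bash
# Variant A: an existing mounted filesystem path
control_metadata:
  path: /var/daos/config

# Variant B: a partition the control plane will mount and format;
# 'path' is then used as the mountpoint
control_metadata:
  path: /var/daos/config
  device: /dev/sdb1
```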

The MD-on-SSD code path will only be used if it is explicitly enabled by specifying the new
'bdev_role' property for the NVMe storage tier(s) in the 'daos_server.yml' file. There are three
types of 'bdev_role': wal, meta, and data. Each role must be assigned to exactly one NVMe tier.
Depending on the number of NVMe SSDs per DAOS engine there may be one, two or three NVMe tiers with
different 'bdev_role' assignments.

For a complete server configuration file example enabling MD-on-SSD, see
[`daos_server_mdonssd.yml`](https://github.com/daos-stack/daos/blob/master/utils/config/daos_server_mdonssd.yml).

Below are four different 'daos_server.yml' storage configuration snippets that represent scenarios
for a DAOS engine with four NVMe SSDs and MD-on-SSD enabled.


1. One NVMe tier where four SSDs are each assigned wal, meta and data roles:

```bash
storage:
-
class: ram
scm_mount: /mnt/dram1
-
class: nvme
bdev_roles:
- wal
- meta
- data
bdev_list:
- "0000:e3:00.0"
- "0000:e4:00.0"
- "0000:e5:00.0"
- "0000:e6:00.0"
```

This example shows the typical use case for a DAOS server with a small number of NVMe SSDs. With
only four or five NVMe SSDs per engine, it is natural to assign all three roles to all NVMe SSDs
configured as a single NVMe tier.


2. Two NVMe tiers, one SSD assigned wal role (tier-1) and three SSDs assigned both meta and data
roles (tier-2):

```bash
storage:
-
class: ram
scm_mount: /mnt/dram1
-
class: nvme
bdev_roles:
- wal
bdev_list:
- "0000:e3:00.0"
-
class: nvme
bdev_roles:
- meta
- data
bdev_list:
- "0000:e4:00.0"
- "0000:e5:00.0"
- "0000:e6:00.0"
```

This example shows where one NVMe SSD is dedicated for the wal, while the remaining three NVMe SSDs
are assigned to hold the VOS checkpoints (meta) and the user data. Using two NVMe tiers makes it
possible to use a higher endurance and higher performance SSD for the wal tier. It should be noted
that the performance of a single high-performance SSD may still be lower than the aggregate
performance of multiple lower-performance SSDs in the previous scenario.


3. Two NVMe tiers, one SSD assigned both wal and meta roles (tier-1) and three SSDs assigned both
meta and data roles (tier-2):

```bash
storage:
-
class: ram
scm_mount: /mnt/dram1
-
class: nvme
bdev_roles:
- wal
- meta
bdev_list:
- "0000:e3:00.0"
-
class: nvme
bdev_roles:
- data
bdev_list:
- "0000:e4:00.0"
- "0000:e5:00.0"
- "0000:e6:00.0"
```

This example uses two NVMe tiers but co-locates the wal and meta blobs on the same tier. This may be
a better choice than co-locating meta and data if the endurance of the data NVMe SSDs is too low for
the relatively frequent VOS checkpointing. It should be noted that performance may be impacted by
reducing the number of NVMe SSDs available for VOS checkpointing from three to one.

The other option with two NVMe tiers would be to co-locate wal and data and dedicate the other
tier to meta, but this configuration is invalid: sharing the wal and data roles on a tier that does
not also hold the meta role is not allowed.
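The tier-role constraints discussed above (each role assigned to exactly one tier; wal and data
sharing a tier only when meta is also present, as in scenario 1) can be sketched as a small
validator. This is an illustrative reading of the rules as described here, not the control plane's
actual validation code:

```python
VALID_ROLES = {"wal", "meta", "data"}

def validate_bdev_roles(tiers):
    """Return a list of error strings for a proposed NVMe tier layout,
    given as a list of role lists, e.g. [["wal"], ["meta", "data"]].
    An empty list means the layout satisfies the rules described above."""
    errors = []
    assigned = {}
    for i, tier in enumerate(tiers):
        roles = set(tier)
        for role in roles - VALID_ROLES:
            errors.append(f"tier {i}: unknown role '{role}'")
        # wal and data may not share a tier unless meta is also present
        # (scenario 1 above puts all three roles on a single tier)
        if {"wal", "data"} <= roles and "meta" not in roles:
            errors.append(f"tier {i}: wal and data may not share a tier")
        for role in roles & VALID_ROLES:
            if role in assigned:
                errors.append(f"role '{role}' assigned to more than one tier")
            else:
                assigned[role] = i
    for role in sorted(VALID_ROLES - assigned.keys()):
        errors.append(f"role '{role}' is not assigned to any tier")
    return errors
```

The four scenarios above all validate cleanly, while a wal+data tier paired with a meta-only tier
is reported as invalid.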


4. Three NVMe tiers, one for each role (each device will have a single distinct role), one SSD
assigned wal role (tier-1), one SSD assigned meta role (tier-2) and two SSDs assigned data role
(tier-3):

```bash
storage:
-
class: ram
scm_mount: /mnt/dram1
-
class: nvme
bdev_roles:
- wal
bdev_list:
- "0000:e3:00.0"
-
class: nvme
bdev_roles:
- meta
bdev_list:
- "0000:e4:00.0"
-
class: nvme
bdev_roles:
- data
bdev_list:
- "0000:e5:00.0"
- "0000:e6:00.0"
```

Using three NVMe tiers (one per role) is not reasonable when there are only four NVMe SSDs per
engine, but it may be practical with a larger number of SSDs, and is illustrated here for
completeness.

#### Auto Generate Configuration File

DAOS can attempt to produce a server configuration file that makes optimal use of hardware on a
given set of hosts either through the `dmg` or `daos_server` tools.

To generate an MD-on-SSD configuration, set both the '--control-metadata-path' and
'--use-tmpfs-scm' options as detailed below. Note that due to the number of variables considered
when generating a configuration automatically, the result may not be optimal in all situations.
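As a sketch (the metadata path below is a placeholder for your system), such a configuration might
be generated with:

```bash
# Generate a config using tmpfs SCM and MD-on-SSD control metadata;
# /var/daos/control_meta is a placeholder path
$ daos_server config generate --use-tmpfs-scm \
      --control-metadata-path=/var/daos/control_meta
```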

##### Generating Configuration File Using daos_server Tool

To generate a configuration file for a single storage server, run the `daos_server config generate`
1 change: 0 additions & 1 deletion docs/admin/pool_operations.md
@@ -133,7 +133,6 @@ $ dmg pool create --help
[create command options]
-g, --group= DAOS pool to be owned by given group, format name@domain
-u, --user= DAOS pool to be owned by given user, format name@domain
-p, --label= Unique label for pool (deprecated, use positional argument)
-P, --properties= Pool properties to be set
-a, --acl-file= Access Control List file path for DAOS pool
-z, --size= Total size of DAOS pool (auto)
21 changes: 11 additions & 10 deletions docs/user/container.md
@@ -27,7 +27,7 @@ To create and then query a container labeled `mycont` on a pool
labeled `tank`:

```bash
$ daos cont create tank --label mycont
$ daos cont create tank mycont
Container UUID : daefe12c-45d4-44f7-8e56-995d02549041
Container Label: mycont
Container Type : unknown
@@ -45,7 +45,7 @@ $ daos cont query tank mycont
Snapshot Epochs :
```

While a label is not mandatory, it is highly recommended. Like pools, container
Like pools, container
labels can be up to 127 characters long and must only include alphanumeric
characters, colon (':'), period ('.'), hyphen ('-') or underscore ('\_').
Labels that can be parsed as UUID are not allowed.
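The label rules above can be sketched as a small check (an illustrative helper, not the actual
DAOS validation code):

```python
import re
import uuid

# 1-127 characters: alphanumeric, colon, period, hyphen, underscore
LABEL_RE = re.compile(r"^[A-Za-z0-9:._\-]{1,127}$")

def is_valid_label(label: str) -> bool:
    """Sketch of the pool/container label rules described above."""
    if not LABEL_RE.fullmatch(label):
        return False
    try:
        uuid.UUID(label)
        return False  # labels that parse as a UUID are rejected
    except ValueError:
        return True
```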
@@ -58,7 +58,7 @@ are stored in an extended attribute of the target file or directory that can
then be used in subsequent command invocations to identify the container.

```bash
$ daos cont create tank --label mycont --path /tmp/mycontainer --type POSIX --oclass=SX
$ daos cont create tank mycont --path /tmp/mycontainer --type POSIX --oclass=SX
Container UUID : 30e5d364-62c9-4ddf-9284-1021359455f2
Container Type : POSIX

@@ -83,7 +83,7 @@ To create a container that can support one engine failure, use a redundancy
factor of 1 as follows:

```bash
$ daos cont create tank --label mycont1 --type POSIX --properties rd_fac:1
$ daos cont create tank mycont1 --type POSIX --properties rd_fac:1
Container UUID : b396e2ca-2077-4908-9ff2-1af4b4b2fd4a
Container Label: mycont1
Container Type : unknown
@@ -174,7 +174,7 @@ By default, a container will inherit a set of default values for each property.
Those can be overridden at container creation time via the `--properties` option.

```bash
$ daos cont create tank --label mycont2 --properties cksum:sha1,dedup:hash,rd_fac:1
$ daos cont create tank mycont2 --properties cksum:sha1,dedup:hash,rd_fac:1
Container UUID : a6286ead-1952-4faa-bf87-00fc0f3785aa
Container Label: mycont2
Container Type : unknown
@@ -340,7 +340,7 @@ The redundancy factor can be set at container creation time and cannot be
modified after creation.

```bash
$ daos cont create tank --label mycont1 --type POSIX --properties rd_fac:1
$ daos cont create tank mycont1 --type POSIX --properties rd_fac:1
Container UUID : b396e2ca-2077-4908-9ff2-1af4b4b2fd4a
Container Label: mycont1
Container Type : unknown
@@ -440,7 +440,7 @@ checksum verification on the server side, one can use the following command
line:

```bash
$ daos cont create tank --label mycont --properties cksum:crc64,srv_cksum:on
$ daos cont create tank mycont --properties cksum:crc64,srv_cksum:on
Successfully created container dfa09efd-4529-482c-b7cd-748c29ef7419

$ daos cont get-prop tank mycont4 | grep cksum
@@ -464,7 +464,7 @@ default `ec_cell_sz`, which was 1MiB in DAOS 2.0 and has been reduced to
container creation time via the `--property` option:

```bash
$ daos cont create tank --label mycont5 --type POSIX --properties rd_fac:1,cell_size:131072
$ daos cont create tank mycont5 --type POSIX --properties rd_fac:1,cell_size:131072
Container UUID : 90185799-0e22-4a0b-be9d-1a20900a35ee
Container Label: mycont5
Container Type : unknown
@@ -501,9 +501,10 @@ For example:


### Checksum Background Scrubbing

A pool ULT can be configured to scan the VOS trees to discover silent data
corruption proactively. (see data_integrity.md for more details). This can be
disabled per container using the ```DAOS_PROP_CO_SCRUBBER_DISABLED``` container
disabled per container using the `DAOS_PROP_CO_SCRUBBER_DISABLED` container
property.

### Deduplication (Preview)
@@ -659,7 +660,7 @@ To create a container labeled mycont in a pool labeled tank with a custom ACL:
```bash
$ export DAOS_POOL="tank"
$ export DAOS_CONT="mycont"
$ daos cont create $DAOS_POOL --label $DAOS_CONT --acl-file=<path>
$ daos cont create $DAOS_POOL $DAOS_CONT --acl-file=<path>
```

The ACL file format is detailed in the
7 changes: 4 additions & 3 deletions docs/user/filesystem.md
@@ -268,7 +268,8 @@ the daos container using dfuse.
#### Via mount.fuse3 command

```
$ dmg pool create --scm-size=8G --nvme-size=64G --label=samirrav_pool -u samirrav@
$ dmg pool create --scm-size=8G --nvme-size=64G -u samirrav@ samirrav_pool
Creating DAOS pool with manual per-engine storage allocation: 8.0 GB SCM, 64 GB NVMe (12.50% ratio)
Pool created with 11.11%,88.89% storage tier ratio
--------------------------------------------------
@@ -327,7 +328,7 @@ $
Only root can run 'mount -a' command so this example should be run as root user.

```
$ dmg pool create --scm-size=8G --nvme-size=64G --label=admin_pool
$ dmg pool create --scm-size=8G --nvme-size=64G admin_pool
Creating DAOS pool with manual per-engine storage allocation: 8.0 GB SCM, 64 GB NVMe (12.50% ratio)
Pool created with 11.11%,88.89% storage tier ratio
--------------------------------------------------
@@ -595,7 +596,7 @@ To create a new container and link it into the namespace of an existing one,
use the following command.

```bash
$ daos container create <pool_label> --type POSIX --path <path_to_entry_point>
$ daos container create <pool_label> <cont_label> --type POSIX --path <path_to_entry_point>
```

The pool should already exist, and the path should specify a location