Merge branch 'master' into feature/vos_on_blob_p2
Required-githooks: true
NiuYawei committed Jul 10, 2024
2 parents 9f3b928 + 406f35b commit 38024fb
Showing 253 changed files with 6,713 additions and 3,819 deletions.
3 changes: 3 additions & 0 deletions SConstruct
@@ -373,6 +373,9 @@ def scons():

deps_env = Environment()

# Silence deprecation warning so it doesn't fail the build
SetOption('warn', ['no-python-version'])

add_command_line_options()

# Scons strips out the environment, however that is not always desirable so add back in
3 changes: 3 additions & 0 deletions ci/codespell.ignores
@@ -36,3 +36,6 @@ laf
cacl
chk
falloc
rin
assertIn
checkin
4 changes: 3 additions & 1 deletion ci/jira_query.py
@@ -39,7 +39,7 @@
FIELDS = 'summary,status,labels,customfield_10044,customfield_10045'

# Labels in GitHub which this script will set/clear based on the logic below.
MANAGED_LABELS = ('release-2.2', 'release-2.4', 'priority')
MANAGED_LABELS = ('release-2.2', 'release-2.4', 'release-2.6', 'priority')


def set_output(key, value):
@@ -175,6 +175,8 @@ def main():
gh_label.add('release-2.2')
if str(version) in ('2.4 Community Release'):
gh_label.add('release-2.4')
if str(version) in ('2.6 Community Release'):
gh_label.add('release-2.6')

# If a PR does not otherwise have priority then use custom values from above.
if priority is None and not pr_data['base']['ref'].startswith('release'):
2 changes: 1 addition & 1 deletion ci/provisioning/post_provision_config.sh
@@ -20,7 +20,7 @@ source ci/provisioning/post_provision_config_common_functions.sh
source ci/junit.sh


: "${MLNX_VER_NUM:=latest-5.8}"
: "${MLNX_VER_NUM:=24.04-0.6.6.0}"

: "${DISTRO:=EL_7}"
DSL_REPO_var="DAOS_STACK_${DISTRO}_LOCAL_REPO"
48 changes: 23 additions & 25 deletions docs/admin/administration.md
@@ -308,7 +308,6 @@ Usage:
...

Available commands:
device-health Query the device health
list-devices List storage devices on the server
list-pools List pools on the server
usage Show SCM & NVMe storage space utilization per storage server
@@ -372,6 +371,8 @@ Usage:
...

[list-devices command options]
-l, --host-list= A comma separated list of addresses <ipv4addr/hostname> to
connect to
-r, --rank= Constrain operation to the specified server rank
-b, --health Include device health in results
-u, --uuid= Device UUID (all devices if blank)
@@ -389,18 +390,6 @@ Usage:
-u, --uuid= Pool UUID (all pools if blank)
-v, --verbose Show more detail about pools
```
```bash
$ dmg storage scan --nvme-meta --help
Usage:
dmg [OPTIONS] storage scan [scan-OPTIONS]

...

[scan command options]
-v, --verbose List SCM & NVMe device details
-n, --nvme-health Display NVMe device health statistics
-m, --nvme-meta Display server meta data held on NVMe storage
```
The NVMe storage query list-devices and list-pools commands query the persistently
stored SMD device and pool tables, respectively. The device table maps the internal
@@ -464,14 +453,19 @@ boro-11
- Query Storage Device Health Data:
```bash
$ dmg storage query device-health --help
$ dmg storage query list-devices --health --help
Usage:
dmg [OPTIONS] storage query device-health [device-health-OPTIONS]
dmg [OPTIONS] storage query list-devices [list-devices-OPTIONS]

...

[device-health command options]
-u, --uuid= Device UUID. All devices queried if arg not set
[list-devices command options]
-l, --host-list= A comma separated list of addresses <ipv4addr/hostname> to
connect to
-r, --rank= Constrain operation to the specified server rank
-b, --health Include device health in results
-u, --uuid= Device UUID (all devices if blank)
-e, --show-evicted Show only evicted faulty devices
```
```bash
$ dmg storage scan --nvme-health --help
@@ -481,27 +475,31 @@ Usage:
...

[scan command options]
-l, --host-list= A comma separated list of addresses <ipv4addr/hostname>
to connect to
-v, --verbose List SCM & NVMe device details
-n, --nvme-health Display NVMe device health statistics
-m, --nvme-meta Display server meta data held on NVMe storage
```
The NVMe storage query device-health command queries the device health data, including
NVMe SSD health stats and in-memory I/O error and checksum error counters.
The server rank and device state are also listed.
Additionally, vendor-specific SMART stats are displayed, currently for Intel devices only.
The 'dmg storage scan --nvme-health' command queries the device health data, including
NVMe SSD health stats and in-memory I/O error and checksum error counters, and prefixes the stat
list with NVMe controller details.
The 'dmg storage query list-devices --health' command displays the same health data along with the
SMD UUID, bdev roles, server rank and device state.
Vendor-specific SMART stats are displayed, currently for Intel devices only.
Note: a reasonable timed workload of more than 60 minutes must be run for the SMART stats to
register (raw values remain 65535 until then).
The media wear percentage can be calculated by dividing the raw value by 1024 to find the
percentage of the maximum rated cycles (for example, a raw value of 2048 corresponds to 2%).
```bash
$ dmg -l boro-11 storage query device-health --uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19
$ dmg -l boro-11 storage query list-devices --health --uuid=d5ec1227-6f39-40db-a1f6-70245aa079f1
-------
boro-11
-------
Devices
UUID:5bd91603-d3c7-4fb7-9a71-76bc25690c19 [TrAddr:0000:8a:00.0]
Targets:[0 1 2 3] Rank:0 State:NORMAL
UUID:d5ec1227-6f39-40db-a1f6-70245aa079f1 [TrAddr:d70505:03:00.0 NSID:1]
Roles:NA Targets:[3 7] Rank:0 State:NORMAL LED:OFF
Health Stats:
Timestamp:2021-09-13T11:12:34.000+00:00
Temperature:289K(15C)
153 changes: 153 additions & 0 deletions docs/admin/deployment.md
@@ -100,11 +100,164 @@ Refer to the example configuration file
[`daos_server.yml`](https://github.com/daos-stack/daos/blob/master/utils/config/daos_server.yml)
for latest information and examples.

#### MD-on-SSD Configuration

To enable MD-on-SSD, the Control-Plane-Metadata ('control_metadata') global section of the
configuration file
[`daos_server.yml`](https://github.com/daos-stack/daos/blob/master/utils/config/daos_server.yml)
needs to specify a persistent location for control-plane-specific metadata (which would be stored
on PMem in non-MD-on-SSD mode). Either set 'control_metadata:path' to an existing (mounted) local
filesystem path, or set 'control_metadata:device' to a storage partition that can be mounted and
formatted by the control plane during storage format. In the latter case, when a device is
specified, the value of the path parameter is used as the mountpoint.
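
A minimal sketch of such a 'control_metadata' section is shown below; the mountpoint path and
device name are illustrative placeholders rather than values taken from this commit:

```yaml
# Global section of daos_server.yml: persistent location for control-plane metadata.
control_metadata:
  path: /var/daos/control_meta   # existing filesystem path, or mountpoint when 'device' is set
  device: /dev/nvme0n1p1         # optional: partition formatted and mounted at 'path' during storage format
```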

The MD-on-SSD code path will only be used if it is explicitly enabled by specifying the new
'bdev_roles' property for the NVMe storage tier(s) in the 'daos_server.yml' file. There are three
role types: wal, meta and data. Each role must be assigned to exactly one NVMe tier. Depending on
the number of NVMe SSDs per DAOS engine, there may be one, two or three NVMe tiers with different
'bdev_roles' assignments.

For a complete server configuration file example enabling MD-on-SSD, see
[`daos_server_mdonssd.yml`](https://github.com/daos-stack/daos/blob/master/utils/config/daos_server_mdonssd.yml).

Below are four different 'daos_server.yml' storage configuration snippets that represent scenarios
for a DAOS engine with four NVMe SSDs and MD-on-SSD enabled.


1. One NVMe tier where four SSDs are each assigned wal, meta and data roles:

```yaml
storage:
  -
    class: ram
    scm_mount: /mnt/dram1
  -
    class: nvme
    bdev_roles:
      - wal
      - meta
      - data
    bdev_list:
      - "0000:e3:00.0"
      - "0000:e4:00.0"
      - "0000:e5:00.0"
      - "0000:e6:00.0"
```

This example shows the typical use case for a DAOS server with a small number of NVMe SSDs. With
only four or five NVMe SSDs per engine, it is natural to assign all three roles to all NVMe SSDs
configured as a single NVMe tier.


2. Two NVMe tiers, one SSD assigned wal role (tier-1) and three SSDs assigned both meta and data
roles (tier-2):

```yaml
storage:
  -
    class: ram
    scm_mount: /mnt/dram1
  -
    class: nvme
    bdev_roles:
      - wal
    bdev_list:
      - "0000:e3:00.0"
  -
    class: nvme
    bdev_roles:
      - meta
      - data
    bdev_list:
      - "0000:e4:00.0"
      - "0000:e5:00.0"
      - "0000:e6:00.0"
```

This example shows a configuration where one NVMe SSD is dedicated to the wal, while the remaining
three NVMe SSDs are assigned to hold the VOS checkpoints (meta) and the user data. Using two NVMe
tiers makes it possible to use a higher-endurance, higher-performance SSD for the wal tier. It
should be noted that the performance of a single high-performance SSD may still be lower than the
aggregate performance of multiple lower-performance SSDs in the previous scenario.


3. Two NVMe tiers, one SSD assigned both wal and meta roles (tier-1) and three SSDs assigned the
data role (tier-2):

```yaml
storage:
  -
    class: ram
    scm_mount: /mnt/dram1
  -
    class: nvme
    bdev_roles:
      - wal
      - meta
    bdev_list:
      - "0000:e3:00.0"
  -
    class: nvme
    bdev_roles:
      - data
    bdev_list:
      - "0000:e4:00.0"
      - "0000:e5:00.0"
      - "0000:e6:00.0"
```

This example uses two NVMe tiers but co-locates the wal and meta blobs on the same tier. This may be
a better choice than co-locating meta and data if the endurance of the data NVMe SSDs is too low for
the relatively frequent VOS checkpointing. It should be noted that performance may be impacted by
reducing the number of NVMe SSDs available for VOS checkpointing from three to one.

The other option with two NVMe tiers would be to co-locate wal and data and dedicate the other tier
to meta, but this configuration is invalid: sharing the wal and data roles on the same tier is not
allowed (the disallowed layout is sketched below for reference).
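
A sketch of that disallowed layout, reusing the illustrative PCI addresses from the scenarios
above; this is not a working configuration and is shown only to make the restriction concrete:

```yaml
# INVALID example: the wal and data roles may not share a tier.
storage:
  -
    class: ram
    scm_mount: /mnt/dram1
  -
    class: nvme
    bdev_roles:
      - wal
      - data          # wal and data co-located on one tier -> rejected
    bdev_list:
      - "0000:e3:00.0"
  -
    class: nvme
    bdev_roles:
      - meta
    bdev_list:
      - "0000:e4:00.0"
      - "0000:e5:00.0"
      - "0000:e6:00.0"
```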


4. Three NVMe tiers, one for each role (each device is assigned a single role): one SSD assigned
the wal role (tier-1), one SSD assigned the meta role (tier-2) and two SSDs assigned the data role
(tier-3):

```yaml
storage:
  -
    class: ram
    scm_mount: /mnt/dram1
  -
    class: nvme
    bdev_roles:
      - wal
    bdev_list:
      - "0000:e3:00.0"
  -
    class: nvme
    bdev_roles:
      - meta
    bdev_list:
      - "0000:e4:00.0"
  -
    class: nvme
    bdev_roles:
      - data
    bdev_list:
      - "0000:e5:00.0"
      - "0000:e6:00.0"
```

Using three NVMe tiers (one per role) is not reasonable when there are only four NVMe SSDs per
engine, but it may be practical with a larger number of SSDs and is illustrated here for
completeness.

#### Auto Generate Configuration File

DAOS can attempt to produce a server configuration file that makes optimal use of hardware on a
given set of hosts either through the `dmg` or `daos_server` tools.

To generate an MD-on-SSD configuration, set both the '--control-metadata-path' and
'--use-tmpfs-scm' options as detailed below. Note that due to the number of variables considered
when generating a configuration automatically, the result may not be optimal in all situations.
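
As a rough sketch (not taken from this commit or its documentation), the two options might be
combined in a single invocation; the metadata path and output file below are placeholders, and it
is assumed here that the generated YAML is written to stdout:

```bash
# Hypothetical invocation: generate an MD-on-SSD server configuration using a
# RAM-disk SCM tier and control-plane metadata under an illustrative path,
# saving the result for review before installing it as daos_server.yml.
daos_server config generate \
    --use-tmpfs-scm \
    --control-metadata-path /var/daos/control_meta \
    > /tmp/daos_server_generated.yml
```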

##### Generating Configuration File Using daos_server Tool

To generate a configuration file for a single storage server, run the `daos_server config generate`
1 change: 0 additions & 1 deletion docs/admin/pool_operations.md
@@ -133,7 +133,6 @@ $ dmg pool create --help
[create command options]
-g, --group= DAOS pool to be owned by given group, format name@domain
-u, --user= DAOS pool to be owned by given user, format name@domain
-p, --label= Unique label for pool (deprecated, use positional argument)
-P, --properties= Pool properties to be set
-a, --acl-file= Access Control List file path for DAOS pool
-z, --size= Total size of DAOS pool (auto)