DAOS-16487 test: fix dmg c helper for set-faulty changes (#15151)
Fix missing reference to --host in C test helper and update some doc
references.

Signed-off-by: Tom Nabarro <[email protected]>
tanabarr committed Oct 7, 2024
1 parent c05c486 commit 0b0e860
Showing 3 changed files with 12 additions and 26 deletions.
30 changes: 8 additions & 22 deletions docs/admin/administration.md
@@ -620,21 +620,17 @@ Usage:
[nvme-faulty command options]
-u, --uuid= Device UUID to set
-f, --force Do not require confirmation
-l, --host= Single host address <ipv4addr/hostname> to connect to
```
To manually evict an NVMe SSD (auto eviction is covered later in this section),
the device state needs to be set faulty by running the following command:
```bash
$ dmg -l boro-11 storage set nvme-faulty --uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19
$ dmg storage set nvme-faulty --host=boro-11 --uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19
NOTICE: This command will permanently mark the device as unusable!
Are you sure you want to continue? (yes/no)
yes
-------
boro-11
-------
Devices
UUID:5bd91603-d3c7-4fb7-9a71-76bc25690c19 [TrAddr:]
Targets:[] Rank:0 State:EVICTED LED:ON
set-faulty operation performed successfully on the following host: boro-11:10001
```
The device state will transition from "NORMAL" to "EVICTED" (shown above), during which time the
faulty device reaction is triggered (all targets on the SSD are rebuilt).
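As a rough illustration of the updated invocation, the sketch below assembles the `set nvme-faulty` command line with `--host` placed after the subcommand, as in the new example. The function name `build_set_faulty_cmd` is hypothetical and not part of DAOS; it only prints the command and does not run `dmg`. It uses only the flags listed in the usage text above (`--host`, `--uuid`, `--force`).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of DAOS): prints the dmg command line for
# marking one NVMe device faulty on a single host. It does not execute dmg.
build_set_faulty_cmd() {
    local host="$1" uuid="$2" force="${3:-}"

    # Both a host and a device UUID are needed to build the command.
    if [ -z "$host" ] || [ -z "$uuid" ]; then
        echo "usage: build_set_faulty_cmd HOST UUID [force]" >&2
        return 1
    fi

    local cmd="dmg storage set nvme-faulty --host=${host} --uuid=${uuid}"

    # --force skips the interactive yes/no confirmation.
    if [ "$force" = "force" ]; then
        cmd="${cmd} --force"
    fi

    printf '%s\n' "$cmd"
}
```

Calling `build_set_faulty_cmd boro-11 5bd91603-d3c7-4fb7-9a71-76bc25690c19` prints the same command line as the documented example, which can then be reviewed before being run by hand.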
@@ -693,19 +689,14 @@ Usage:
[nvme command options]
--old-uuid= Device UUID of hot-removed SSD
--new-uuid= Device UUID of new device
--no-reint Bypass reintegration of device and just bring back online.
-l, --host= Single host address <ipv4addr/hostname> to connect to
```
To replace an evicted NVMe SSD with a new device and reintegrate it into use with
DAOS, run the following command:
```bash
$ dmg -l boro-11 storage replace nvme --old-uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19 --new-uuid=80c9f1be-84b9-4318-a1be-c416c96ca48b
-------
boro-11
-------
Devices
UUID:80c9f1be-84b9-4318-a1be-c416c96ca48b [TrAddr:]
Targets:[] Rank:1 State:NORMAL LED:OFF
$ dmg storage replace nvme --host=boro-11 --old-uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19 --new-uuid=80c9f1be-84b9-4318-a1be-c416c96ca48b
dev-replace operation performed successfully on the following host: boro-11:10001
```
The old, now replaced device will remain in an "EVICTED" state until it is unplugged.
The new device will transition from a "NEW" state to a "NORMAL" state (shown above).
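The replace invocation can be sketched the same way. The function below is a hypothetical helper, not part of DAOS: it only formats the command line from the three values the documented example uses (`--host`, `--old-uuid`, `--new-uuid`) and leaves running `dmg` to the administrator.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of DAOS): prints the dmg command line for
# replacing an evicted NVMe device. It does not execute dmg.
build_replace_cmd() {
    local host="$1" old_uuid="$2" new_uuid="$3"

    # All three values are needed to form the command.
    if [ -z "$host" ] || [ -z "$old_uuid" ] || [ -z "$new_uuid" ]; then
        echo "usage: build_replace_cmd HOST OLD_UUID NEW_UUID" >&2
        return 1
    fi

    printf 'dmg storage replace nvme --host=%s --old-uuid=%s --new-uuid=%s\n' \
        "$host" "$old_uuid" "$new_uuid"
}
```

For example, `build_replace_cmd boro-11 5bd91603-d3c7-4fb7-9a71-76bc25690c19 80c9f1be-84b9-4318-a1be-c416c96ca48b` prints the command used in the example above.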
@@ -716,14 +707,9 @@ In order to reuse a device that was previously set as FAULTY and evicted from th
system, an admin can run the following command (setting the old device UUID to be the
new device UUID):
```bash
$ dmg -l boro-11 storage replace nvme --old-uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19 --new-uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19
$ dmg storage replace nvme --host=boro-11 --old-uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19 --new-uuid=5bd91603-d3c7-4fb7-9a71-76bc25690c19
NOTICE: Attempting to reuse a previously set FAULTY device!
-------
boro-11
-------
Devices
UUID:5bd91603-d3c7-4fb7-9a71-76bc25690c19 [TrAddr:]
Targets:[] Rank:1 State:NORMAL LED:OFF
dev-replace operation performed successfully on the following host: boro-11:10001
```
The FAULTY device will transition from an "EVICTED" state back to a "NORMAL" state,
and will again be available for use with DAOS. The use case of this command will mainly
6 changes: 3 additions & 3 deletions src/bio/README.md
@@ -209,7 +209,7 @@ Devices:
<a id="82"></a>
- Manually Set Device State to FAULTY: **$dmg storage set nvme-faulty**
```
$ dmg storage set nvme-faulty --uuid=9fb3ce57-1841-43e6-8b70-2a5e7fb2a1d0
$ dmg storage set nvme-faulty --host=localhost --uuid=9fb3ce57-1841-43e6-8b70-2a5e7fb2a1d0
Devices
UUID:9fb3ce57-1841-43e6-8b70-2a5e7fb2a1d0 [TrAddr:0000:8d:00.0]
Targets:[0] Rank:0 State:EVICTED
@@ -219,7 +219,7 @@ Devices
<a id="83"></a>
- Replace an evicted device with a new device: **$dmg storage replace nvme**
```
$ dmg storage replace nvme --old-uuid=9fb3ce57-1841-43e6-8b70-2a5e7fb2a1d0 --new-uuid=8131fc39-4b1c-4662-bea1-734e728c434e
$ dmg storage replace nvme --host=localhost --old-uuid=9fb3ce57-1841-43e6-8b70-2a5e7fb2a1d0 --new-uuid=8131fc39-4b1c-4662-bea1-734e728c434e
Devices
UUID:8131fc39-4b1c-4662-bea1-734e728c434e [TrAddr:0000:8d:00.0]
Targets:[0] Rank:0 State:NORMAL
@@ -229,7 +229,7 @@ Devices
<a id="84"></a>
- Reuse a previously evicted device: **$dmg storage replace nvme**
```
$ dmg storage replace nvme --old-uuid=9fb3ce57-1841-43e6-8b70-2a5e7fb2a1d0 --new-uuid=9fb3ce57-1841-43e6-8b70-2a5e7fb2a1d0
$ dmg storage replace nvme --host=localhost --old-uuid=9fb3ce57-1841-43e6-8b70-2a5e7fb2a1d0 --new-uuid=9fb3ce57-1841-43e6-8b70-2a5e7fb2a1d0
Devices
UUID:9fb3ce57-1841-43e6-8b70-2a5e7fb2a1d0 [TrAddr:0000:8a:00.0]
Targets:[0] Rank:0 State:NORMAL
2 changes: 1 addition & 1 deletion src/common/tests_dmg_helpers.c
@@ -1393,7 +1393,7 @@ dmg_storage_set_nvme_fault(const char *dmg_config_file,
D_GOTO(out, rc = -DER_NOMEM);
}

args = cmd_push_arg(args, &argcount, " --host-list=%s ", host);
args = cmd_push_arg(args, &argcount, " --host=%s ", host);
if (args == NULL)
D_GOTO(out, rc = -DER_NOMEM);

