DLPX-87528 migration device removal held up by currently syncing TXG (openzfs#1132)

= Problem

The current removal code is sub-optimal for our use case of evacuating to the object store. While copying data, once `vca_outstanding_bytes` is full, we wait for the next TXG, but we don't cause it to be pushed out, so we may have to wait for the TXG timeout before making progress.

= This Patch

While the logic remains the same for block-to-block removals, evacuations to the object store now:
1] Push out a TXG so we don't have to wait for a TXG timeout.
2] Adhere to a new global tunable, `zfs_remove_max_txg_bytes`, which is compared to `svr_bytes_done` so that the limit is enforced on a per-TXG basis.

= Notes

Matt was the one who first identified this issue and came up with the fix in this patch. I just propagated some of these changes to the ZTS code.

= Perf results

On an i3en.2xlarge VM I evacuated ~22.4GB of data with and without this patch.

Without The Patch:
```
  pool: test
 state: ONLINE
remove: Evacuation of /dev/nvme2n1p1 in progress since Mon Aug 21 18:35:59 2023
        22.2G copied out of 22.4G at 15.0M/s, 99.25% done, 0h0m to go
config:

        NAME                 STATE     READ WRITE CKSUM
        test                 ONLINE       0     0     0
          nvme2n1            ONLINE       0     0     0  (removing)
          cloudburst-data-2  ONLINE       0     0     0

errors: No known data errors

  pool: test
 state: ONLINE
remove: Removal of vdev 0 copied 22.4G in 0h25m, completed on Mon Aug 21 19:01:27 2023
        36.0K memory used for removed device mappings
config:

        NAME                 STATE     READ WRITE CKSUM
        test                 ONLINE       0     0     0
          cloudburst-data-2  ONLINE       0     0     0
```

With This Patch:
```
  pool: test
 state: ONLINE
remove: Evacuation of /dev/nvme2n1p1 in progress since Mon Aug 21 19:41:38 2023
        21.6G copied out of 22.4G at 96.6M/s, 96.63% done, 0h0m to go
config:

        NAME                 STATE     READ WRITE CKSUM
        test                 ONLINE       0     0     0
          nvme2n1            ONLINE       0     0     0  (removing)
          cloudburst-data-2  ONLINE       0     0     0

errors: No known data errors

  pool: test
 state: ONLINE
remove: Removal of vdev 0 copied 22.4G in 0h3m, completed on Mon Aug 21 19:45:34 2023
        36.0K memory used for removed device mappings
config:

        NAME                 STATE     READ WRITE CKSUM
        test                 ONLINE       0     0     0
          cloudburst-data-2  ONLINE       0     0     0
```
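
= Illustrative sketch

To make the per-TXG limit described under "This Patch" concrete, here is a minimal, self-contained model of the check. It is not the code from this commit. Only the names `zfs_remove_max_txg_bytes` and `svr_bytes_done` come from the message above. The struct, the helper functions, the 64 MiB value, the `>=` comparison, and the single-counter representation of `svr_bytes_done` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the removal state. The commit message only
 * names svr_bytes_done; it is modelled here as one counter for the
 * currently open TXG, which is a simplification.
 */
typedef struct removal_state {
	uint64_t svr_bytes_done;
} removal_state_t;

/* Assumed value for illustration; the real default is not stated above. */
static uint64_t zfs_remove_max_txg_bytes = 64ULL << 20;

/*
 * Hypothetical helper: returns true when the copy loop should stop
 * queueing work in this TXG and ask for it to be pushed out.
 * Block-to-block removals never trigger it, so their logic is unchanged.
 */
static bool
removal_should_push_txg(const removal_state_t *svr, bool to_object_store)
{
	if (!to_object_store)
		return (false);
	return (svr->svr_bytes_done >= zfs_remove_max_txg_bytes);
}

/*
 * Hypothetical driver: copies `total` bytes in `chunk`-sized pieces and
 * returns how many TXG pushes the check above would have requested.
 */
static uint64_t
simulate_evacuation(uint64_t total, uint64_t chunk, bool to_object_store)
{
	removal_state_t svr = { .svr_bytes_done = 0 };
	uint64_t pushes = 0;

	assert(chunk > 0);
	for (uint64_t copied = 0; copied < total; copied += chunk) {
		svr.svr_bytes_done += chunk;
		if (removal_should_push_txg(&svr, to_object_store)) {
			/* Real code would request a TXG sync here. */
			pushes++;
			/* The next TXG starts with an empty count. */
			svr.svr_bytes_done = 0;
		}
	}
	return (pushes);
}
```

In this model, evacuating 256 MiB in 16 MiB chunks with a 64 MiB limit requests a push every four chunks, while the same copy as a block-to-block removal requests none and simply waits for the normal TXG cadence.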