Add support for autoexpand property

While the autoexpand property may seem like a small feature, it
depends on a significant amount of system infrastructure.  Enough
of that infrastructure is now in place that, with a few customizations
for Linux, the autoexpand property for whole disk configurations
can be supported.

Autoexpand works as follows: when a block device is resized, a
change event is generated by udev with the DISK_MEDIA_CHANGE key.
The ZED, which is monitoring udev events, detects the event for
disks (but not partitions) and hands it off to zfs_deliver_dle().
The zfs_deliver_dle() function appends the expected whole disk
partition suffix, and if the partition can be matched against
a known pool vdev it re-opens it.
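
The suffix step described above can be sketched in isolation. The
helper below is hypothetical and is not the libzfs
zfs_append_partition() implementation; the naming rules it encodes
("-part1" for udev-style paths under /dev/disk/, "p1" after a trailing
digit, "1" otherwise) are assumptions chosen for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: append the partition suffix expected for a
 * whole disk layout. The three naming rules are assumptions made for
 * this example, not a copy of the library routine.
 * Returns the new string length, or -1 if the name is empty or the
 * buffer is too small (the buffer is left unchanged in that case).
 */
static int
append_whole_disk_suffix(char *path, size_t max_len)
{
	const char *suffix;
	size_t len, slen;

	assert(path != NULL);
	len = strlen(path);
	if (len == 0)
		return (-1);

	if (strncmp(path, "/dev/disk/", strlen("/dev/disk/")) == 0)
		suffix = "-part1";	/* udev by-id, by-path, ... links */
	else if (isdigit((unsigned char)path[len - 1]))
		suffix = "p1";		/* e.g. nvme0n1 -> nvme0n1p1 */
	else
		suffix = "1";		/* e.g. sda -> sda1 */

	slen = strlen(suffix);
	if (len + slen + 1 > max_len)
		return (-1);

	/* Copy the suffix together with its terminating NUL. */
	memcpy(path + len, suffix, slen + 1);
	return ((int)(len + slen));
}
```

Checking the buffer size before writing keeps the caller's device name
intact when the suffixed name would not fit.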

Re-opening the vdev will trigger a re-reading of the partition
table so the maximum possible expansion size can be reported.
Next, if the autoexpand property is set to "on", a vdev expansion
will be attempted.  After performing some sanity checks on the
disk to verify it's safe to expand the ZFS partition (-part1), it
will be expanded and the partition table updated.  The partition
is then re-opened again to detect the updated size, which allows
the new capacity to be used.
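
The decision to attempt an expansion can be illustrated with a small
standalone model. In this commit vdev_online() marks each vdev from the
onlined device up to (but not including) the root as expanding when
either the explicit expand flag or the pool's autoexpand setting is
present. The struct, the function name and the flag value below are
hypothetical stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed flag value for this sketch only; not the real constant. */
#define	SKETCH_ONLINE_EXPAND	0x8

/* Hypothetical, heavily reduced stand-in for a vdev tree node. */
struct sketch_vdev {
	struct sketch_vdev *parent;
	bool expanding;
};

/*
 * Mark every vdev from 'vd' up to, but not including, 'root' as
 * expanding when expansion was requested explicitly or the pool has
 * autoexpand enabled; clear the mark otherwise.
 */
static void
mark_expanding(struct sketch_vdev *vd, const struct sketch_vdev *root,
    uint64_t flags, bool autoexpand)
{
	bool expand;

	assert(vd != NULL && root != NULL);
	expand = (flags & SKETCH_ONLINE_EXPAND) != 0 || autoexpand;

	for (struct sketch_vdev *pvd = vd; pvd != NULL && pvd != root;
	    pvd = pvd->parent)
		pvd->expanding = expand;
}
```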

Added PHYS_PATH="/dev/zvol/dataset" to vdev configuration for
ZFS volumes.  This was required for the test cases which test
expansion by layering a new pool on top of ZFS volumes.
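
As a rough illustration of how a volume's persistent name can be
picked out, the sketch below scans a list of device links for the
first one under the zvol directory, mirroring the prefix comparison in
the libzfs_import.c change. The function name, the plain string array
(standing in for the udev devlinks list) and the "/dev/zvol" prefix
value are assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed prefix for this sketch; the diff refers to it as ZVOL_ROOT. */
#define	SKETCH_ZVOL_ROOT	"/dev/zvol"

/*
 * Hypothetical helper: return the first link that starts with the
 * zvol prefix, or NULL when the device has no such link.
 */
static const char *
find_zvol_link(const char *const *links, size_t nlinks)
{
	assert(links != NULL || nlinks == 0);

	for (size_t i = 0; i < nlinks; i++) {
		if (strncmp(links[i], SKETCH_ZVOL_ROOT,
		    strlen(SKETCH_ZVOL_ROOT)) == 0)
			return (links[i]);
	}
	return (NULL);
}
```

A NULL result corresponds to the case where the caller has to report
that no physical path is available.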

Enable the zpool_expand_001_pos and zpool_expand_003_pos
test cases which exercise the autoexpand property.

Fixed zfs_zevent_wait() signal handling which could result
in the ZED spinning when a signal was not handled.
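
The reworked wait loop makes one decision after each timed wait, which
can be modelled as a pure function. This is a simplified userspace
sketch, not the kernel code: 'rc' stands in for the return value of the
timed wait (the diff treats -1 as "timed out, keep waiting"), and
'signal_pending' stands in for the kernel's pending-signal checks. The
names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum wait_step {
	WAIT_AGAIN,	/* timed out with nothing pending: wait again */
	WAIT_RETURN	/* stop waiting and return *error to the caller */
};

/*
 * Hypothetical model of the post-wait decision:
 *   - a pending signal ends the wait with EINTR,
 *   - a timeout (rc == -1) means wait again,
 *   - anything else is a normal wakeup and returns success.
 */
static enum wait_step
after_timed_wait(int rc, bool signal_pending, int *error)
{
	assert(error != NULL);

	if (signal_pending) {
		*error = EINTR;
		return (WAIT_RETURN);
	}

	*error = 0;
	return (rc == -1 ? WAIT_AGAIN : WAIT_RETURN);
}
```

Checking for a signal before looking at the timeout result is what
keeps the caller from looping on a signal it never reports.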

Removed vdev_disk_rrpart() functionality which can be abandoned
in favour of re-opening the device, which triggers a re-read of
the partition table as long as no other partitions are in use.
This will be true as long as we're working with whole disks.
As a bonus this allows us to remove two Linux kernel API checks.

Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#120
Issue openzfs#2437
Issue openzfs#5771
Issue openzfs#7582
behlendorf committed Jun 14, 2018
1 parent 6d464db commit 5cca586
Showing 11 changed files with 179 additions and 211 deletions.
15 changes: 11 additions & 4 deletions cmd/zed/agents/zfs_mod.c
@@ -751,23 +751,30 @@ zfsdle_vdev_online(zpool_handle_t *zhp, void *data)
}

/*
* This function handles the ESC_DEV_DLE event.
* This function handles the ESC_DEV_DLE (DISK_MEDIA_CHANGE) event which
* is only delivered for the disk itself, not for each partition. Presume
* that a 'wholedisk' partition exists and append the expected partition
* suffix in order to attempt a match.
*/
static int
zfs_deliver_dle(nvlist_t *nvl)
{
char *devname;
char *devname, pname[MAXPATHLEN];

if (nvlist_lookup_string(nvl, DEV_PHYS_PATH, &devname) != 0) {
zed_log_msg(LOG_INFO, "zfs_deliver_dle: no physpath");
return (-1);
}

if (zpool_iter(g_zfshdl, zfsdle_vdev_online, devname) != 1) {
strlcpy(pname, devname, MAXPATHLEN);
zfs_append_partition(pname, MAXPATHLEN);

if (zpool_iter(g_zfshdl, zfsdle_vdev_online, pname) != 1) {
zed_log_msg(LOG_INFO, "zfs_deliver_dle: device '%s' not "
"found", devname);
"found", pname);
return (1);
}

return (0);
}

32 changes: 23 additions & 9 deletions cmd/zed/zed_disk_event.c
@@ -165,11 +165,12 @@ zed_udev_monitor(void *arg)

while (1) {
struct udev_device *dev;
const char *action, *type, *part, *sectors;
const char *action, *type, *part, *sectors, *change;
const char *bus, *uuid;
const char *class, *subclass;
nvlist_t *nvl;
boolean_t is_zfs = B_FALSE;
boolean_t is_disk_media_change = B_FALSE;

/* allow a cancellation while blocked (recvmsg) */
pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);
@@ -202,14 +203,26 @@ zed_udev_monitor(void *arg)
}

/*
* if this is a disk and it is partitioned, then the
* Disk media change events are allowed for auto-expand.
* Whether the device contains a zfs_member is determined
* at the time of the attempted expansion.
*/
change = udev_device_get_property_value(dev,
"DISK_MEDIA_CHANGE");
if (change != NULL && change[0] == '1')
is_disk_media_change = B_TRUE;

/*
* If this is a disk and it is partitioned, then the
* zfs label will reside in a DEVTYPE=partition and
* we can skip passing this event
* we can skip passing this event. Unless it's a disk
* media changes event which is expected for auto-expand.
*/
type = udev_device_get_property_value(dev, "DEVTYPE");
part = udev_device_get_property_value(dev,
"ID_PART_TABLE_TYPE");
if (type != NULL && type[0] != '\0' &&
if (!is_disk_media_change &&
type != NULL && type[0] != '\0' &&
strcmp(type, "disk") == 0 &&
part != NULL && part[0] != '\0') {
/* skip and wait for partition event */
@@ -231,14 +244,15 @@ zed_udev_monitor(void *arg)
}

/*
* If the blkid probe didn't find ZFS, then a persistent
* device id string is required in the message schema
* for matching with vdevs. Preflight here for expected
* udev information.
* If the blkid probe didn't find ZFS and this is not a
* disk media change event. Then a persistent device id
* string is required in the message schema for matching
* with vdevs. Preflight here for expected udev information.
*/
bus = udev_device_get_property_value(dev, "ID_BUS");
uuid = udev_device_get_property_value(dev, "DM_UUID");
if (!is_zfs && (bus == NULL && uuid == NULL)) {
if (!is_zfs && !is_disk_media_change &&
bus == NULL && uuid == NULL) {
zed_log_msg(LOG_INFO, "zed_udev_monitor: %s no devid "
"source", udev_device_get_devnode(dev));
udev_device_unref(dev);
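
The two filters touched in this file can be summarised as a single
predicate. The sketch below is a simplified reading of the hunks shown
here, not the full zed_udev_monitor() logic (checks that sit between
the hunks are omitted). The struct and function names are hypothetical;
the string fields stand in for the udev properties named in the
comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical snapshot of the udev properties the filters look at. */
struct sketch_event {
	const char *media_change;	/* DISK_MEDIA_CHANGE */
	const char *devtype;		/* DEVTYPE */
	const char *part_table;		/* ID_PART_TABLE_TYPE */
	const char *bus;		/* ID_BUS */
	const char *dm_uuid;		/* DM_UUID */
	bool is_zfs;			/* blkid probe found a ZFS label */
};

static bool
nonempty(const char *s)
{
	return (s != NULL && s[0] != '\0');
}

/* Return true when the event should be passed on, false when skipped. */
static bool
should_forward(const struct sketch_event *ev)
{
	bool media_change;

	assert(ev != NULL);
	media_change = ev->media_change != NULL &&
	    ev->media_change[0] == '1';

	/* Partitioned whole disks are skipped unless the media changed. */
	if (!media_change && nonempty(ev->devtype) &&
	    strcmp(ev->devtype, "disk") == 0 && nonempty(ev->part_table))
		return (false);

	/* Non-ZFS, non-resize events need a persistent id source. */
	if (!ev->is_zfs && !media_change &&
	    ev->bus == NULL && ev->dm_uuid == NULL)
		return (false);

	return (true);
}
```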
19 changes: 0 additions & 19 deletions config/kernel-blkdev-get.m4

This file was deleted.

17 changes: 0 additions & 17 deletions config/kernel-get-gendisk.m4

This file was deleted.

2 changes: 0 additions & 2 deletions config/kernel.m4
@@ -43,7 +43,6 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID
ZFS_AC_KERNEL_TYPE_FMODE_T
ZFS_AC_KERNEL_3ARG_BLKDEV_GET
ZFS_AC_KERNEL_BLKDEV_GET_BY_PATH
ZFS_AC_KERNEL_OPEN_BDEV_EXCLUSIVE
ZFS_AC_KERNEL_LOOKUP_BDEV
@@ -72,7 +71,6 @@ AC_DEFUN([ZFS_AC_CONFIG_KERNEL], [
ZFS_AC_KERNEL_BLK_QUEUE_HAVE_BLK_PLUG
ZFS_AC_KERNEL_GET_DISK_AND_MODULE
ZFS_AC_KERNEL_GET_DISK_RO
ZFS_AC_KERNEL_GET_GENDISK
ZFS_AC_KERNEL_HAVE_BIO_SET_OP_ATTRS
ZFS_AC_KERNEL_GENERIC_READLINK_GLOBAL
ZFS_AC_KERNEL_DISCARD_GRANULARITY
61 changes: 47 additions & 14 deletions lib/libzfs/libzfs_import.c
@@ -145,6 +145,21 @@ zfs_device_get_devid(struct udev_device *dev, char *bufptr, size_t buflen)
return (0);
}

/*
* For volumes use the persistent /dev/zvol/dataset identifier
*/
entry = udev_device_get_devlinks_list_entry(dev);
while (entry != NULL) {
const char *name;

name = udev_list_entry_get_name(entry);
if (strncmp(name, ZVOL_ROOT, strlen(ZVOL_ROOT)) == 0) {
(void) strlcpy(bufptr, name, buflen);
return (0);
}
entry = udev_list_entry_get_next(entry);
}

/*
* NVME 'by-id' symlinks are similar to bus case
*/
@@ -187,26 +202,44 @@ int
zfs_device_get_physical(struct udev_device *dev, char *bufptr, size_t buflen)
{
const char *physpath = NULL;
struct udev_list_entry *entry;

/*
* Normal disks use ID_PATH for their physical path. Device mapper
* devices are virtual and don't have a physical path. For them we
* use ID_VDEV instead, which is setup via the /etc/vdev_id.conf file.
* ID_VDEV provides a persistent path to a virtual device. If you
* don't have vdev_id.conf setup, you cannot use multipath autoreplace.
* Normal disks use ID_PATH for their physical path.
*/
if (!((physpath = udev_device_get_property_value(dev, "ID_PATH")) &&
physpath[0])) {
if (!((physpath =
udev_device_get_property_value(dev, "ID_VDEV")) &&
physpath[0])) {
return (ENODATA);
}
physpath = udev_device_get_property_value(dev, "ID_PATH");
if (physpath != NULL && strlen(physpath) > 0) {
(void) strlcpy(bufptr, physpath, buflen);
return (0);
}

/*
* Device mapper devices are virtual and don't have a physical
* path. For them we use ID_VDEV instead, which is setup via the
* /etc/vdev_id.conf file. ID_VDEV provides a persistent path
* to a virtual device. If you don't have vdev_id.conf setup,
* you cannot use multipath autoreplace with device mapper.
*/
physpath = udev_device_get_property_value(dev, "ID_VDEV");
if (physpath != NULL && strlen(physpath) > 0) {
(void) strlcpy(bufptr, physpath, buflen);
return (0);
}

(void) strlcpy(bufptr, physpath, buflen);
/*
* For volumes use the persistent /dev/zvol/dataset identifier
*/
entry = udev_device_get_devlinks_list_entry(dev);
while (entry != NULL) {
physpath = udev_list_entry_get_name(entry);
if (strncmp(physpath, ZVOL_ROOT, strlen(ZVOL_ROOT)) == 0) {
(void) strlcpy(bufptr, physpath, buflen);
return (0);
}
entry = udev_list_entry_get_next(entry);
}

return (0);
return (ENODATA);
}

boolean_t
30 changes: 21 additions & 9 deletions module/zfs/fm.c
@@ -671,19 +671,31 @@ zfs_zevent_wait(zfs_zevent_t *ze)
int error = 0;

mutex_enter(&zevent_lock);
zevent_waiters++;

if (zevent_flags & ZEVENT_SHUTDOWN) {
error = ESHUTDOWN;
goto out;
}
while (error == 0) {
if (zevent_flags & ZEVENT_SHUTDOWN) {
error = SET_ERROR(ESHUTDOWN);
break;
}

zevent_waiters++;
cv_wait_sig(&zevent_cv, &zevent_lock);
if (issig(JUSTLOOKING))
error = EINTR;
error = cv_timedwait_sig(&zevent_cv, &zevent_lock,
ddi_get_lbolt() + hz);
if (signal_pending(current) || fatal_signal_pending(current)) {
error = SET_ERROR(EINTR);
break;
} else {
if (error == -1) {
error = 0;
continue;
} else {
error = 0;
break;
}
}
}

zevent_waiters--;
out:
mutex_exit(&zevent_lock);

return (error);
3 changes: 2 additions & 1 deletion module/zfs/vdev.c
@@ -3097,7 +3097,8 @@ vdev_online(spa_t *spa, uint64_t guid, uint64_t flags, vdev_state_t *newstate)
/* XXX - L2ARC 1.0 does not support expansion */
if (!vd->vdev_aux) {
for (pvd = vd; pvd != rvd; pvd = pvd->vdev_parent)
pvd->vdev_expanding = !!(flags & ZFS_ONLINE_EXPAND);
pvd->vdev_expanding = !!((flags & ZFS_ONLINE_EXPAND) ||
spa->spa_autoexpand);
}

vdev_reopen(tvd);