From e94d375d8a55af1d380b2f5f657a9b69045a6854 Mon Sep 17 00:00:00 2001
From: Nathaniel Clark
Date: Mon, 14 Sep 2015 12:51:48 -0400
Subject: [PATCH] LU-7153 build: Update SPL/ZFS to 0.6.5.2

ZFS/SPL 0.6.5.2
Bug Fixes
* Init script fixes zfsonlinux/zfs#3816
* Fix uioskip crash when skip to end zfsonlinux/zfs#3806
  zfsonlinux/zfs#3850
* Userspace can trigger an assertion zfsonlinux/zfs#3792
* Fix quota userused underflow bug zfsonlinux/zfs#3789
* Fix performance regression from unwanted synchronous I/O
  zfsonlinux/zfs#3780
* Fix deadlock during ARC reclaim zfsonlinux/zfs#3808
  zfsonlinux/zfs#3834
* Fix deadlock with zfs receive and clamscan zfsonlinux/zfs#3719
* Allow NFS activity to defer snapshot unmounts zfsonlinux/zfs#3794
* Linux 4.3 compatibility zfsonlinux/zfs#3799
* Zed reload fixes zfsonlinux/zfs#3773
* Fix PAX Patch/Grsec SLAB_USERCOPY panic zfsonlinux/zfs#3796
* Always remove during dkms uninstall/update zfsonlinux/spl#476

ZFS/SPL 0.6.5.1
Bug Fixes
* Fix zvol corruption with TRIM/discard zfsonlinux/zfs#3798
* Fix NULL as mount(2) syscall data parameter zfsonlinux/zfs#3804
* Fix xattr=sa dataset property not honored zfsonlinux/zfs#3787

ZFS/SPL 0.6.5
Supported Kernels
* Compatible with 2.6.32 - 4.2 Linux kernels.

New Functionality
* Support for temporary mount options.
* Support for accessing the .zfs/snapshot over NFS.
* Support for estimating send stream size when source is a bookmark.
* Administrative commands are allowed to use reserved space improving
  robustness.
* New notify ZEDLETs support email and pushbullet notifications.
* New keyword 'slot' for vdev_id.conf to control what is used for the
  slot number.
* New zpool export -a option unmounts and exports all imported pools.
* New zpool iostat -y omits the first report with statistics since boot.
* New zdb can now open the root dataset.
* New zdb can print the numbers of ganged blocks.
* New zdb -ddddd can print details of block pointer objects.
* New zdb -b performance improved.
* New zstreamdump -d prints contents of blocks.

New Feature Flags
* large_blocks - This feature allows the record size on a dataset to be
  set larger than 128KB. We currently support block sizes from 512 bytes
  to 16MB. The benefits of larger blocks, and thus larger IO, need to be
  weighed against the cost of COWing a giant block to modify one byte.
  Additionally, very large blocks can have an impact on I/O latency, and
  also potentially on the memory allocator. Therefore, we do not allow
  the record size to be set larger than zfs_max_recordsize (default
  1MB). Larger blocks can be created by changing this tuning; pools with
  larger blocks can always be imported and used, regardless of this
  setting.
* filesystem_limits - This feature enables filesystem and snapshot
  limits. These limits can be used to control how many filesystems
  and/or snapshots can be created at the point in the tree on which the
  limits are set.

Performance
* Improved zvol performance on all kernels (>50% higher
  throughput, >20% lower latency)
* Improved zil performance on Linux 2.6.39 and earlier kernels (10x
  lower latency)
* Improved allocation behavior on mostly full SSD/file pools (5% to 10%
  improvement on 90% full pools)
* Improved performance when removing large files.
* Caching improvements (ARC):
  * Better cached read performance due to reduced lock contention.
  * Smarter heuristics for managing the total size of the cache and the
    distribution of data/metadata.
  * Faster release of cached buffers due to unexpected memory pressure.

Changes in Behavior
* Default reserved space was increased from 1.6% to 3.3% of total pool
  capacity. This default percentage can be controlled through the new
  spa_slop_shift module option, setting it to 6 will restore the
  previous percentage.
* Loading of the ZFS module stack is now handled by systemd or the sysv
  init scripts. Invoking the zfs/zpool commands will not cause the
  modules to be automatically loaded.
  The previous behavior can be restored by setting the
  ZFS_MODULE_LOADING=yes environment variable but this functionality
  will be removed in a future release.
* Unified SYSV and Gentoo OpenRC initialization scripts. The previous
  functionality has been split into zfs-import, zfs-mount, zfs-share,
  and zfs-zed scripts. This allows for independent control of the
  services and is consistent with the unit files provided for a systemd
  based system. Complete details of the functionality provided by the
  updated scripts can be found here.
* Task queues are now dynamic and worker threads will be created and
  destroyed as needed. This allows the system to automatically tune
  itself to ensure the optimal number of threads are used for the active
  workload which can result in a performance improvement.
* Task queue thread priorities were correctly aligned with the default
  Linux file system thread priorities. This allows ZFS to compete fairly
  with other active Linux file systems when the system is under heavy
  load.
* When compression=on the default compression algorithm will be lz4 as
  long as the feature is enabled. Otherwise the default remains lzjb.
  Similarly lz4 is now the preferred method for compressing meta data
  when available.
* The use of mkdir/rmdir/mv in the .zfs/snapshot directory has been
  disabled by default both locally and via NFS clients. The
  zfs_admin_snapshot module option can be used to re-enable this
  functionality.
* LBA weighting is automatically disabled on files and SSDs ensuring the
  entire device is used fairly.
* iostat accounting on zvols running on kernels older than Linux 3.19 is
  no longer supported.
* The known issues preventing swap on zvols for Linux 3.9 and newer
  kernels have been resolved. However, deadlocks are still possible for
  older kernels.

Module Options
* Changed zfs_arc_c_min default from 4M to 32M to accommodate large
  blocks.
* Added metaslab_aliquot to control how many bytes are written to a
  top-level vdev before moving on to the next one.
  Increasing this may be helpful when using blocks larger than 1M.
* Added spa_slop_shift, see 'reserved space' comment in the 'Changes in
  Behavior' section.
* Added zfs_admin_snapshot, enable/disable the use of mkdir/rmdir/mv in
  .zfs/snapshot directory.
* Added zfs_arc_lotsfree_percent, throttle I/O when free system memory
  drops below this percentage.
* Added zfs_arc_num_sublists_per_state, used to allow more fine-grained
  locking.
* Added zfs_arc_p_min_shift, used to set a floor on arc_p.
* Added zfs_arc_sys_free, the target number of bytes the ARC should
  leave as free.
* Added zfs_dbgmsg_enable, used to enable the 'dbgmsg' kstat.
* Added zfs_dbgmsg_maxsize, sets the maximum size of the dbgmsg buffer.
* Added zfs_max_recordsize, used to control the maximum allowed record
  size.
* Added zfs_arc_meta_strategy, used to select the preferred ARC reclaim
  strategy.
* Removed metaslab_min_alloc_size, it was unused internally due to prior
  changes.
* Removed zfs_arc_memory_throttle_disable, replaced by
  zfs_arc_lotsfree_percent.
* Removed zvol_threads, zvols no longer require a dedicated task queue.
* See zfs-module-parameters(5) for complete details on available module
  options.

Bug Fixes
* Improved documentation with many updates, corrections, and additions.
* Improved sysv, systemd, initramfs, and dracut support.
* Improved block pointer validation before issuing IO.
* Improved scrub pause heuristics.
* Improved test coverage.
* Improved heuristics for automatic repair when zfs_recover=1 module
  option is set.
* Improved debugging infrastructure via 'dbgmsg' kstat.
* Improved zpool import performance.
* Fixed deadlocks in direct memory reclaim.
* Fixed deadlock on db_mtx and dn_holds.
* Fixed deadlock in dmu_objset_find_dp().
* Fixed deadlock during zfs rollback.
* Fixed kernel panic due to tsd_exit() in ZFS_EXIT.
* Fixed kernel panic when adding a duplicate dbuf to dn_dbufs.
* Fixed kernel panic due to security / ACL creation failure.
* Fixed kernel panic on unmount due to iput taskq.
* Fixed panic due to corrupt nvlist when running utilities.
* Fixed panic on unmount due to not waiting for all znodes to be
  released.
* Fixed panic with zfs clone from different source and target pools.
* Fixed NULL pointer dereference in dsl_prop_get_ds().
* Fixed NULL pointer dereference in dsl_prop_notify_all_cb().
* Fixed NULL pointer dereference in zfsdev_getminor().
* Fixed I/O aggregation: I/Os are now aggregated across ZIO priority
  classes.
* Fixed .zfs/snapshot auto-mounting for all supported kernels.
* Fixed 3-digit octal escapes by changing to 4-digit which disambiguates
  the output.
* Fixed hard lockup due to infinite loop in zfs_zget().
* Fixed misreported 'alloc' value for cache devices.
* Fixed spurious hung task watchdog stack traces.
* Fixed direct memory reclaim deadlocks.
* Fixed module loading in zfs import systemd service.
* Fixed intermittent libzfs_init() failure to open /dev/zfs.
* Fixed hot-disk sparing for disk vdevs.
* Fixed system spinning during ARC reclaim.
* Fixed formatting errors in zfs(8).
* Fixed zio pipeline stall by having callers invoke next stage.
* Fixed assertion failure in zrl_tryenter().
* Fixed memory leak in make_root_vdev().
* Fixed memory leak in zpool_in_use().
* Fixed memory leak in libzfs when doing rollback.
* Fixed hold leak in dmu_recv_end_check().
* Fixed refcount leak in bpobj_iterate_impl().
* Fixed misuse of input argument in traverse_visitbp().
* Fixed missing mutex_destroy() calls.
* Fixed integer overflows in dmu_read/dmu_write.
* Fixed verify() failure in zio_done().
* Fixed zio_checksum_error() to only include info for ECKSUM errors.
* Fixed -ESTALE to force lookup on missing NFS file handles.
* Fixed spurious failures from dsl_dataset_hold_obj().
* Fixed zfs compressratio when used with 4k sector size.
* Fixed spurious watchdog warnings in prefetch thread.
* Fixed unfair disk space allocation when vdevs are of unequal size.
* Fixed ashift accounting error writing to cache devices.
* Fixed zdb -d has false positive warning when
  feature@large_blocks=disabled.
* Fixed zdb -h | -i seg fault.
* Fixed force-received full stream into a dataset if it has a snapshot.
* Fixed snapshot error handling.
* Fixed 'hangs' while deleting large files.
* Fixed lock contention (rrw_exit) while running a read only load.
* Fixed error message when creating a pool to include all problematic
  devices.
* Fixed Xen virtual block device detection, partitions are now created.
* Fixed missing E2BIG error handling in zfs_setprop_error().
* Fixed zpool import assertion in libzfs_import.c.
* Fixed zfs send -nv output to stderr.
* Fixed idle pool potentially running itself out of space.
* Fixed narrow race which allowed read(2) to access beyond fstat(2)'s
  reported end-of-file.
* Fixed support for VPATH builds.
* Fixed double counting of HDR_L2ONLY_SIZE in ARC.
* Fixed 'BUG: Bad page state' warning from kernel due to writeback flag.
* Fixed arc_available_memory() to check freemem.
* Fixed arc_memory_throttle() to check pageout.
* Fixed 'zpool create' warning when using zvols in debug builds.
* Fixed loop devices layered on ZFS with 4.1 kernels.
* Fixed zvol contribution to kernel entropy pool.
* Fixed handling of compression flags in arc header.
* Substantial changes to realign code base with illumos.
* Many additional bug fixes.
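
The record size limits described under the large_blocks feature flag earlier in this message can be sketched as a small check. This is only an illustration under stated assumptions: recordsize_ok is a hypothetical helper, not a ZFS or lbuild command, and the power-of-two requirement is an assumption that the notes themselves do not spell out.

```shell
#!/bin/sh
# Hypothetical helper: checks a requested record size (in bytes) against
# the limits quoted in the large_blocks note. The power-of-two check is
# an assumption that the release notes do not state.
recordsize_ok() {
    size=$1
    max=${2:-1048576}                      # zfs_max_recordsize default (1MB)
    [ "$size" -ge 512 ] || return 1        # smallest supported block size
    [ "$size" -le 16777216 ] || return 1   # largest supported block size (16MB)
    [ "$size" -le "$max" ] || return 1     # capped by zfs_max_recordsize
    # A power of two has exactly one bit set.
    [ $(( size & (size - 1) )) -eq 0 ]
}

recordsize_ok 131072 && echo "128KB: allowed"
recordsize_ok 4194304 || echo "4MB: rejected with the default limit"
recordsize_ok 4194304 16777216 && echo "4MB: allowed once the limit is raised"
```

The optional second argument stands in for a raised zfs_max_recordsize, which is how the notes say larger blocks can be created.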

Signed-off-by: Nathaniel Clark
Change-Id: I87c012aec9ec581b10a417d699dafc7d415abf63
Reviewed-on: http://review.whamcloud.com/16399
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev
Tested-by: Maloo
Reviewed-by: Andreas Dilger
---
 contrib/lbuild/lbuild | 2 +-
 lustre/ChangeLog      | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/contrib/lbuild/lbuild b/contrib/lbuild/lbuild
index 37826fc309..07494c5b63 100755
--- a/contrib/lbuild/lbuild
+++ b/contrib/lbuild/lbuild
@@ -1050,7 +1050,7 @@ build_spl_zfs() {
     # The spl/zfs spec files expect RPM_BUILD_ROOT to point to the root of the
     # destination for the rpms
     export RPM_BUILD_ROOT=$TOPDIR
-    SPLZFSVER=${SPLZFSVER:-0.6.4.2}
+    SPLZFSVER=${SPLZFSVER:-0.6.5.2}
     SPLZFSTAG=${SPLZFSTAG:-}
 
     # The files expect a kver to be set to the kernel version .
diff --git a/lustre/ChangeLog b/lustre/ChangeLog
index c2862cba0f..fd0b72db7d 100644
--- a/lustre/ChangeLog
+++ b/lustre/ChangeLog
@@ -16,6 +16,8 @@ TBD Intel Corporation
          3.0.101-0.47.55 (SLES11 SP3)
          3.12.39-47 (SLES12)
        * Recommended e2fsprogs version: 1.42.13.wc3 or newer
+       * Recommended ZFS / SPL version: 0.6.5.2
+       * Tested with ZFS / SPL version: 0.6.5.2
        * NFS export disabled when stack size < 8192 (32-bit Lustre clients),
          since the NFSv4 export of Lustre filesystem with 4K stack may cause a
          stack overflow. For more information, please refer to bugzilla 17630.
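
The SPLZFSVER assignment changed in the lbuild hunk above uses the shell's ${VAR:-default} expansion, so 0.6.5.2 is only a fallback: a non-empty value already present when that line runs is kept. A minimal sketch of that behavior, where resolve_splzfs_ver is a hypothetical helper written for illustration and not a function in lbuild:

```shell
#!/bin/sh
# Hypothetical helper (not part of lbuild): mirrors the
# SPLZFSVER=${SPLZFSVER:-0.6.5.2} assignment from the patch.
resolve_splzfs_ver() {
    # ${VAR:-default} expands to the default when VAR is unset or empty.
    echo "${SPLZFSVER:-0.6.5.2}"
}

# With nothing set, the new default applies.
unset SPLZFSVER
echo "default:  $(resolve_splzfs_ver)"

# A value set beforehand takes precedence over the default.
echo "override: $(SPLZFSVER=0.6.4.2; resolve_splzfs_ver)"
```

The override runs inside a command substitution, so it does not leak into the rest of the script.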