Large fsyncs starve smaller fsyncs #4603
Labels: Status: Inactive, Status: Stale, Type: Performance
This was discovered with Lustre on ZFS (0.6.5.4). Users were experiencing long delays when reading log files written from other nodes: before a client is allowed to read, Lustre makes the server flush the new data to disk.
I managed to recreate what I think is the problem using plain ZFS (0.6.5.6, CentOS 6.7) on a single block device, by running ioping in one process while dd'ing a large file with conv=fsync in another. ioping just does a random synced 4 KiB write within a 1 MiB file once a second.
The ioping command (-W given three times tells ioping to do real writes to the target file):
ioping -WWW /test0/1m
The large file dd:
dd of=/test0/f0 if=/dev/zero bs=1M count=14000 conv=fsync
The server has 128 GB of memory, so the default zfs_dirty_data_max (10% of RAM) is around 12.8 GB; the 14 GB file size was chosen to exceed it in order to create a worst case.
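For reference, a minimal end-to-end reproduction along these lines might look as follows. This is my sketch, not taken verbatim from the report: it assumes /test0 is the pool's mountpoint and that the 1 MiB ioping target file has to be created beforehand (that step is not shown above).
cat /sys/module/zfs/parameters/zfs_dirty_data_max          # confirm the dirty-data limit on this box
dd if=/dev/zero of=/test0/1m bs=1M count=1                 # pre-create the 1 MiB ioping target
ioping -WWW /test0/1m &                                    # latency probe: one synced 4 KiB write per second
dd of=/test0/f0 if=/dev/zero bs=1M count=14000 conv=fsync  # ~14 GB of dirty data, fsync'ed at the end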
Output from ioping when fsync kicks in:
After 5 seconds of dd (zfs_txg_timeout), the latency goes up, but there is still progress. Once the dd's fsync kicks in, the next ping is stalled for about 1.5 minutes while the large file is flushed. The disk writes at ~160 MB/s, so flushing ~14 GB takes roughly 14000 MB / 160 MB/s ≈ 88 s, which matches the observed stall.
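One way to watch the txg pipeline while the stall happens (my addition, not part of the original report) is the per-pool txgs kstat, assuming txg history is enabled on this ZFS version:
echo 100 > /sys/module/zfs/parameters/zfs_txg_history   # keep stats for the last 100 txgs
watch -n 1 cat /proc/spl/kstat/zfs/test0/txgs           # per-txg dirty bytes, wait and sync times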