zfs list slow #450

Closed
leelists opened this issue Nov 14, 2011 · 18 comments

Labels
Type: Feature (Feature request or new feature)

Comments

@leelists

Hi,

The command zfs list -t snapshot -r pool/xxxx is very slow. I have approximately 1,200 snapshots under pool/xxxx, and the command takes about 20 minutes.

Any clues?

Thanks, Lee

@behlendorf
Contributor

That's a lot of snapshots, but 20 minutes does seem unreasonably long. Can you try running top during the zfs list and see whether any of the zfs threads are CPU bound?
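
A quick way to check (just a sketch; pool/xxxx, the 5-second interval, and the grep pattern are placeholders):

    # In one terminal, run the slow listing:
    zfs list -t snapshot -r pool/xxxx > /dev/null

    # In another terminal, watch for CPU-bound zfs/z_* threads in batch mode:
    top -b -d 5 | grep -E 'zfs|z_'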

@leelists
Author

It seems that no zfs process is CPU bound; below is an extract of iostat -kx 5:

avg-cpu: %user %nice %system %iowait %steal %idle
0,22 0,00 5,18 0,00 0,00 94,60

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sde 6,60 1,20 79,80 59,20 5392,00 232,00 80,92 1,06 7,61 2,21 30,72
sdf 3,60 2,40 93,80 48,60 5395,00 217,60 78,83 1,07 7,53 1,94 27,68

Could it have anything to do with this bug: http://wesunsolve.net/bugid/id/6755389 ?

@behlendorf
Contributor

It sounds like it may be caused by a similar issue. However, the Linux port already contains the fixes referenced in that issue. We'll need to dig into it to determine exactly what's causing the slowdown.

@behlendorf
Contributor

Pawel Jakub Dawidek of the FreeBSD port took a look at this same issue and came up with the following patch. It still needs to be reviewed and ported to Linux, but it looks very promising.

http://people.freebsd.org/~pjd/patches/zfs_list_name.2.patch

@leelists
Author

Indeed, it looks promising. Keep up the good work!

@Rudd-O
Contributor

Rudd-O commented May 4, 2012

Looks good to me too.

@Rudd-O
Contributor

Rudd-O commented May 25, 2012

My zfs list processes are I/O bound. They issue ioctls like the following a few times per second while reading data from the disk (almost a megabyte every few seconds, according to atop):

ioctl(6, 0x5a15, 0x7fffd7e20c80) = 0
ioctl(6, 0x5a15, 0x7fffd7e20c80) = 0
ioctl(6, 0x5a15, 0x7fffd7e20c80) = 0
ioctl(6, 0x5a15, 0x7fffd7e20c80) = 0
ioctl(6, 0x5a15, 0x7fffd7e20c80) = 0
ioctl(6, 0x5a15, 0x7fffd7e20c80) = 0
ioctl(6, 0x5a15, 0x7fffd7e20c80) = 0
ioctl(6, 0x5a15, 0x7fffd7e20c80) = 0
ioctl(6, 0x5a15, 0x7fffd7e20c80) = 0
ioctl(6, 0x5a15, 0x7fffd7e20c80) = 0
ioctl(6, 0x5a15, 0x7fffd7e20c80) = 0
ioctl(6, 0x5a15, 0x7fffd7e20c80) = 0
ioctl(6, 0x5a15, 0x7fffd7e20c80) = 0

@behlendorf
Contributor

That I/O would likely be ZFS prefetching the additional list data. It should be reasonably straightforward to port Pawel's patch to disable this behavior if you want to try that.

@behlendorf
Contributor

@Rudd-O Please try this patch and see if it resolves your issue. It's a port of Pawel's FreeBSD change for ZoL; note that it doesn't change the default behavior of zfs list. To see the speedup you need to request just the names: zfs list -t snapshot -o name -s name. If you don't, all the properties will be fetched as well, and that is the major slowdown.
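
For reference, the slow and fast invocations side by side (the pool name and the -r argument are illustrative; timings will vary):

    # Default listing: every property of every snapshot is fetched (slow path).
    time zfs list -t snapshot -r tank

    # Name-only listing sorted by name: with the patch this can skip the
    # per-snapshot property lookups and return much faster.
    time zfs list -t snapshot -o name -s name -r tank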

@behlendorf
Contributor

@dajhorn You might want to look at this as well for zfs-auto-snapshot. If simply the list of snapshot names is enough, then this should considerably speed things up. The change is safe enough that, assuming it tests well for others, it could be merged soon.

@dajhorn
Contributor

dajhorn commented May 29, 2012

@behlendorf The zfs-auto-snapshot for ZoL does not call zfs list in a way that satisfies the test at line 2,822 for the new ZFS_ITER_SIMPLE flag. The code needs some finesse before it will benefit from this optimization.

@timbrody

Some more background... two systems with identical specs:

$ time sudo zfs list -t snapshot | wc
    142     142    7375

real    0m6.451s
user    0m0.036s
sys 0m0.408s

# time sudo zfs list -t snapshot | wc
    148     740   14217

real    0m0.253s
user    0m0.012s
sys 0m0.188s

The first system has a lot more data in its snapshots. On it, 'zfs' sits at ~8% CPU with ~1% in sys (watching top).

gdb --args zfs list -t snapshot
(gdb) run
Ctrl+C
(gdb) bt

#0  0x00007ffff6f8f747 in ioctl () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff747b4da in ?? () from /lib/libzfs.so.1
#2  0x00007ffff747f222 in zfs_iter_snapshots () from /lib/libzfs.so.1
#3  0x00000000004047d0 in ?? ()
#4  0x00007ffff747f0a0 in zfs_iter_filesystems () from /lib/libzfs.so.1
#5  0x00000000004047e8 in ?? ()
#6  0x00007ffff747f0a0 in zfs_iter_filesystems () from /lib/libzfs.so.1
#7  0x00000000004047e8 in ?? ()
#8  0x00007ffff747b099 in zfs_iter_root () from /lib/libzfs.so.1
...

N.B. pool version 31 refers to "support for improved 'zfs list' performance":
http://hub.opensolaris.org/bin/view/Community+Group+zfs/31

@Rudd-O
Contributor

Rudd-O commented Jun 4, 2012

The patch won't work for me because I need at least one property (namely the snapshot date, which I use to sort the sync ops).

BTW, check out the zreplicate command at github.com/Rudd-O/zfs-tools. I am so proud of it.
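
For context, the kind of listing I need looks something like this (the dataset name is illustrative); because it pulls the creation property, it cannot use the name-only fast path:

    # Fetches the creation property for every snapshot, so the
    # name-only optimization does not apply.
    zfs list -H -t snapshot -o name,creation -s creation -r tank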

@leelists
Author

Thanks a lot, the patch really helps.

Now zfs list -t snapshot -o name -r pool/xxxx returns in several seconds.

Great job!

behlendorf pushed a commit to behlendorf/zfs that referenced this issue Jun 14, 2012
FreeBSD #xxx:  Dramatically optimize listing snapshots when the user
requests only snapshot names and wants to sort them by name, i.e.
when executing:

  # zfs list -t snapshot -o name -s name

Because only name is needed we don't have to read all snapshot
properties.

Below you can see how long it takes to list 34,509 snapshots
from a single-disk pool before and after this change, with cold and
warm cache:

    before:

        # time zfs list -t snapshot -o name -s name > /dev/null
        cold cache: 525s
        warm cache: 218s

    after:

        # time zfs list -t snapshot -o name -s name > /dev/null
        cold cache: 1.7s
        warm cache: 1.1s

NOTE: This patch only appears in FreeBSD.  If/when Illumos picks up
the change we may want to drop this patch and adopt their version.
However, for now this addresses a real issue.

Ported-by: Brian Behlendorf <[email protected]>
Issue openzfs#450
@byteharmony

Thanks for the above information, it really helped improve my script speeds. Getting volume snapshot information went from 7 minutes (for example, on a small system with 2,600 snapshots) down to 30 seconds. I think it's important to note, however, that this is not a ZoL (ZFS on Linux) issue; rather, it is a general ZFS issue. I see the same snapshot-listing speed on Solaris ZFS systems. When there is a large number of snapshots (over 1,000 on a pool), scripts managing the snaps take longer and overall system throughput is significantly reduced.

BK

@behlendorf
Contributor

I'd be curious to hear how your original scripts perform on the latest Solaris. They claim significant speed-ups in zpool version 31, but I've never seen any hard data.

@byteharmony

Sorry if I misled you; I should have been clearer that the Solaris machine was a Nexenta system, so it's also running pool version 28.

BK

@ryao
Contributor

ryao commented Oct 3, 2012

@byteharmony You could ask @mmatuska what the upstream status of that patch is. He tends to do the work to get FreeBSD improvements to ZFS sent back to Illumos, which would be needed before it goes into Nexenta.

tisoft added a commit to tisoft/zfs-auto-snapshot that referenced this issue Apr 16, 2013
zfs list is very slow if it needs to retrieve properties other than name. See openzfs/zfs#450 for reference.

This change uses only the snapshot names, extracting the snapshot date from each name with awk and ordering with sort (see the sketch after the list):

1. get all snapshot names, sorted by name
2. use grep to filter out any snapshot that isn't correctly prefixed (maybe a check for the correct date format could also be added here)
3. use awk to extract the date and put the date before the snapshot
4. use sort to reverse sort this, first by date, then by snapshot name
5. use awk to remove the date, so that just the snapshot name is left
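
A minimal sketch of that pipeline, assuming snapshot names of the form dataset@zfs-auto-snap_<label>-<date> (the "frequent" label and the field handling are illustrative):

    # 1. Snapshot names only (fast path), sorted by name.
    zfs list -H -t snapshot -o name -s name |
      # 2. Keep only snapshots with the expected prefix.
      grep '@zfs-auto-snap_frequent-' |
      # 3. Put the date portion in front of the full name as a sort key.
      awk -F'@zfs-auto-snap_frequent-' '{ print $2 "\t" $0 }' |
      # 4. Reverse sort, first by date, then by snapshot name.
      sort -r -k1,1 -k2,2 |
      # 5. Drop the date again, leaving just the snapshot names.
      awk -F'\t' '{ print $2 }'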

This significantly speeds it up on my system (/root/zfs-auto-snapshot is the changed version, run with -n so only the snapshot-retrieval time is measured):

root@tatooine:/root# time /root/zfs-auto-snapshot -d -v // --label=frequent --keep=4 -n
Debug: Including rpool for regular snapshot.
Debug: Including rpool/DATA for regular snapshot.
Debug: Including rpool/DATA/ldap for recursive snapshot.
Debug: Including rpool/DATA/postgresql for recursive snapshot.
Debug: Including rpool/DATA/vm for regular snapshot.
Debug: Including rpool/DATA/vm/alderaan.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/bespin.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/coruscant.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/dagobah.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/dantooine.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/dev.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/monitor.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/office.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/test.example.net for recursive snapshot.
Debug: Including rpool/ROOT for recursive snapshot.
Debug: Excluding rpool/ROOT/ubuntu-1 because rpool/ROOT includes it recursively.
Doing regular snapshots of rpool rpool/DATA rpool/DATA/vm
Doing recursive snapshots of rpool/DATA/ldap rpool/DATA/postgresql rpool/DATA/vm/alderaan.example.net rpool/DATA/vm/bespin.example.net rpool/DATA/vm/coruscant.example.net rpool/DATA/vm/dagobah.example.net rpool/DATA/vm/dantooine.example.net rpool/DATA/vm/dev.example.net rpool/DATA/vm/monitor.example.net rpool/DATA/vm/office.example.net rpool/DATA/vm/test.example.net rpool/ROOT
Doing a dry run. Not running these commands...
zfs snapshot -o com.sun:auto-snapshot-desc='-'  'rpool@zfs-auto-snap_frequent-2013-04-16-2344'
zfs destroy  'rpool@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-'  'rpool/DATA@zfs-auto-snap_frequent-2013-04-16-2344'
zfs destroy  'rpool/DATA@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-'  'rpool/DATA/vm@zfs-auto-snap_frequent-2013-04-16-2344'
zfs destroy  'rpool/DATA/vm@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/ldap@zfs-auto-snap_frequent-2013-04-16-2344'
zfs destroy -r 'rpool/DATA/ldap@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/postgresql@zfs-auto-snap_frequent-2013-04-16-2344'
zfs destroy -r 'rpool/DATA/postgresql@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/alderaan.example.net@zfs-auto-snap_frequent-2013-04-16-2344'
zfs destroy -r 'rpool/DATA/vm/alderaan.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/bespin.example.net@zfs-auto-snap_frequent-2013-04-16-2344'
zfs destroy -r 'rpool/DATA/vm/bespin.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/coruscant.example.net@zfs-auto-snap_frequent-2013-04-16-2344'
zfs destroy -r 'rpool/DATA/vm/coruscant.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/dagobah.example.net@zfs-auto-snap_frequent-2013-04-16-2344'
zfs destroy -r 'rpool/DATA/vm/dagobah.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/dantooine.example.net@zfs-auto-snap_frequent-2013-04-16-2344'
zfs destroy -r 'rpool/DATA/vm/dantooine.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/dev.example.net@zfs-auto-snap_frequent-2013-04-16-2344'
zfs destroy -r 'rpool/DATA/vm/dev.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/monitor.example.net@zfs-auto-snap_frequent-2013-04-16-2344'
zfs destroy -r 'rpool/DATA/vm/monitor.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/office.example.net@zfs-auto-snap_frequent-2013-04-16-2344'
zfs destroy -r 'rpool/DATA/vm/office.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/test.example.net@zfs-auto-snap_frequent-2013-04-16-2344'
zfs destroy -r 'rpool/DATA/vm/test.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/ROOT@zfs-auto-snap_frequent-2013-04-16-2344'
zfs destroy -r 'rpool/ROOT@zfs-auto-snap_frequent-2013-04-16-2315'
@zfs-auto-snap_frequent-2013-04-16-2344, 15 created, 15 destroyed, 0 warnings.

real	0m6.936s
user	0m0.076s
sys	0m0.184s
root@tatooine:/root# time /sbin/zfs-auto-snapshot -d -v // --label=frequent --keep=4 -n
Debug: Including rpool for regular snapshot.
Debug: Including rpool/DATA for regular snapshot.
Debug: Including rpool/DATA/ldap for recursive snapshot.
Debug: Including rpool/DATA/postgresql for recursive snapshot.
Debug: Including rpool/DATA/vm for regular snapshot.
Debug: Including rpool/DATA/vm/alderaan.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/bespin.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/coruscant.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/dagobah.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/dantooine.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/dev.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/monitor.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/office.example.net for recursive snapshot.
Debug: Including rpool/DATA/vm/test.example.net for recursive snapshot.
Debug: Including rpool/ROOT for recursive snapshot.
Debug: Excluding rpool/ROOT/ubuntu-1 because rpool/ROOT includes it recursively.
Doing regular snapshots of rpool rpool/DATA rpool/DATA/vm
Doing recursive snapshots of rpool/DATA/ldap rpool/DATA/postgresql rpool/DATA/vm/alderaan.example.net rpool/DATA/vm/bespin.example.net rpool/DATA/vm/coruscant.example.net rpool/DATA/vm/dagobah.example.net rpool/DATA/vm/dantooine.example.net rpool/DATA/vm/dev.example.net rpool/DATA/vm/monitor.example.net rpool/DATA/vm/office.example.net rpool/DATA/vm/test.example.net rpool/ROOT
Doing a dry run. Not running these commands...
zfs snapshot -o com.sun:auto-snapshot-desc='-'  'rpool@zfs-auto-snap_frequent-2013-04-16-2348'
zfs destroy  'rpool@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-'  'rpool/DATA@zfs-auto-snap_frequent-2013-04-16-2348'
zfs destroy  'rpool/DATA@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-'  'rpool/DATA/vm@zfs-auto-snap_frequent-2013-04-16-2348'
zfs destroy  'rpool/DATA/vm@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/ldap@zfs-auto-snap_frequent-2013-04-16-2348'
zfs destroy -r 'rpool/DATA/ldap@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/postgresql@zfs-auto-snap_frequent-2013-04-16-2348'
zfs destroy -r 'rpool/DATA/postgresql@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/alderaan.example.net@zfs-auto-snap_frequent-2013-04-16-2348'
zfs destroy -r 'rpool/DATA/vm/alderaan.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/bespin.example.net@zfs-auto-snap_frequent-2013-04-16-2348'
zfs destroy -r 'rpool/DATA/vm/bespin.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/coruscant.example.net@zfs-auto-snap_frequent-2013-04-16-2348'
zfs destroy -r 'rpool/DATA/vm/coruscant.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/dagobah.example.net@zfs-auto-snap_frequent-2013-04-16-2348'
zfs destroy -r 'rpool/DATA/vm/dagobah.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/dantooine.example.net@zfs-auto-snap_frequent-2013-04-16-2348'
zfs destroy -r 'rpool/DATA/vm/dantooine.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/dev.example.net@zfs-auto-snap_frequent-2013-04-16-2348'
zfs destroy -r 'rpool/DATA/vm/dev.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/monitor.example.net@zfs-auto-snap_frequent-2013-04-16-2348'
zfs destroy -r 'rpool/DATA/vm/monitor.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/office.example.net@zfs-auto-snap_frequent-2013-04-16-2348'
zfs destroy -r 'rpool/DATA/vm/office.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/DATA/vm/test.example.net@zfs-auto-snap_frequent-2013-04-16-2348'
zfs destroy -r 'rpool/DATA/vm/test.example.net@zfs-auto-snap_frequent-2013-04-16-2315'
zfs snapshot -o com.sun:auto-snapshot-desc='-' -r 'rpool/ROOT@zfs-auto-snap_frequent-2013-04-16-2348'
zfs destroy -r 'rpool/ROOT@zfs-auto-snap_frequent-2013-04-16-2315'
@zfs-auto-snap_frequent-2013-04-16-2348, 15 created, 15 destroyed, 0 warnings.

real	3m30.995s
user	0m0.152s
sys	0m0.792s
root@tatooine:/root# 

I'm not an awk god, so I tried a bit until I got a working version. There might be better ways. I also don't know if this catches every edge case, but it is a start for improvement. :)
behlendorf pushed a commit to behlendorf/zfs that referenced this issue May 21, 2018
C type coercion rules require that negative numbers be converted into
positive numbers via wraparound, so a negative -1 becomes a large
positive value. This causes vn_getf to return a file handle when it should
return NULL whenever a positive file descriptor existed with the same
value. We should check for a negative file descriptor and return NULL
instead.

This was caught by ClusterHQ's unit testing.

Reference:
http://stackoverflow.com/questions/50605/signed-to-unsigned-conversion-in-c-is-it-always-safe

Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Andriy Gapon <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#450
ahrens pushed a commit to ahrens/zfs that referenced this issue Sep 21, 2021
pcd1193182 pushed a commit to pcd1193182/zfs that referenced this issue Sep 26, 2023
Merge remote-tracking branch '6.0/stage' into 'master'