
Lingering .send-#####-1 holds #173

Closed
rottegift opened this issue May 2, 2014 · 9 comments

@rottegift
Contributor

I'm seeing stale holds.

$ sudo zfs destroy -v ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810
will destroy ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810
will reclaim 1.65G
cannot destroy snapshot ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810: dataset is busy
$ sudo zfs holds ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810
NAME                                              TAG            TIMESTAMP
ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810  .send-95118-1  Thu May  1 19:27 2014
ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810  .send-78468-1  Thu May  1 13:17 2014

(This is in a zfs send ... | ssh target zfs recv ... pipeline where the snapshot in question is received successfully:

receiving incremental stream of ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810 into OldStuff/CLA-TM/from_ssdpool/ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810
received 1.92GB stream in 1606 seconds (1.23MB/sec)

and that whole incremental send succeeded without error -- I am 80% sure, anyway; the recv host has crashed a couple of times in the past couple of days, and I cannot exclude the possibility that this snapshot was in the range of one or two incremental sends where the recv went away.)

Removing the holds by hand correctly activates a defer_destroy

$ sudo zfs destroy -d -v ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810
will destroy ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810
will reclaim 1.65G
$ zfs release  .send-95118-1 ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810  
$ zfs release .send-78468-1 ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810

I'm pretty sure this has hit other zfs implementations in the past too, but perhaps O3X's zfs send cleanup has a local bug rather than inheriting this from upstream.

@rottegift
Contributor Author

It doesn't seem to be illumos #3645 (for which we have ZoL's fix around dump_bytes).

@rottegift
Contributor Author

Oddly, the same hold on several other snapshots did not linger:

2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-04-29-2231 (20027) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@2014-04-29-225559 (20183) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-04-29-2331 (20439) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-04-30-0912 (260) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@2014-04-30-095114 (557) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@2014-04-30-135114 (647) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-04-30-1835 (4481) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-04-30-1935 (5007) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-04-30-2035 (5417) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-04-30-2135 (5880) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-04-30-2235 (6342) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-04-30-2335 (6781) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0035 (7182) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0135 (7575) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0235 (7993) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0335 (8371) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0435 (8789) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810 (1287) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0910 (1729) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-1010 (2165) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-1110 (2647) tag=.send-78468-1 temp=1 refs=1
2014-05-01.13:17:26 [txg:94689933] hold ssdpool/foo@2014-05-01-114403 (2951) tag=.send-78468-1 temp=1 refs=1
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@zfs-auto-snap_hourly-2014-04-30-2335 (6781) tag=.send-95118-1 temp=1 refs=2
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0035 (7182) tag=.send-95118-1 temp=1 refs=2
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0135 (7575) tag=.send-95118-1 temp=1 refs=2
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0235 (7993) tag=.send-95118-1 temp=1 refs=2
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0335 (8371) tag=.send-95118-1 temp=1 refs=2
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0435 (8789) tag=.send-95118-1 temp=1 refs=2
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810 (1287) tag=.send-95118-1 temp=1 refs=2
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0910 (1729) tag=.send-95118-1 temp=1 refs=2
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-1010 (2165) tag=.send-95118-1 temp=1 refs=2
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-1110 (2647) tag=.send-95118-1 temp=1 refs=2
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@2014-05-01-114403 (2951) tag=.send-95118-1 temp=1 refs=2
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-1210 (3164) tag=.send-95118-1 temp=1 refs=1
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-1310 (3562) tag=.send-95118-1 temp=1 refs=1
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-1410 (3995) tag=.send-95118-1 temp=1 refs=1
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-1510 (8657) tag=.send-95118-1 temp=1 refs=1
2014-05-01.19:27:39 [txg:94693796] hold ssdpool/foo@2014-05-01-191853 (8746) tag=.send-95118-1 temp=1 refs=1

and my subsequent by-hand zfs release

2014-05-02.14:01:16 [txg:94725259] release ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810 (1287) tag=.send-95118-1 refs=1
2014-05-02.14:01:25 [txg:94725261] release ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810 (1287) tag=.send-78468-1 refs=0

and finally

$ uptime
14:38  up 1 day,  8:28, 24 users, load averages: 4.52 4.35 4.42

@rottegift
Contributor Author

Hm, after a series of reboots, full exports and full imports, all while the zfs recv receiver is stable, I'm still seeing this. I'll try to provide a reduced test case.

@rottegift rottegift reopened this May 3, 2014
@lundman
Contributor

lundman commented May 13, 2014

Not all send operations trigger the automatic holds: "doall" and "replicate" have to be on for holds to be used, i.e. -I and -R. The send then builds a list of snapshots to hold and calls lzc_hold().
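
For reference, here is a rough userland sketch of that call path, assuming the upstream libzfs_core API of the era (lzc_hold() taking a cleanup_fd and an errlist). The snapshot name, hold tag and device path below are only placeholders, and error handling is minimal:

    /*
     * Illustrative only -- not code from this repo.  The key point is that
     * closing cleanup_fd is what is supposed to release the temporary holds.
     */
    #include <fcntl.h>
    #include <unistd.h>
    #include <libnvpair.h>
    #include <libzfs_core.h>

    int
    take_temporary_hold(void)
    {
        nvlist_t *holds, *errlist = NULL;
        int cleanup_fd, err;

        if (libzfs_core_init() != 0)
            return (-1);

        /* one pair per snapshot: snapshot name -> hold tag */
        holds = fnvlist_alloc();
        fnvlist_add_string(holds,
            "ssdpool/foo@zfs-auto-snap_hourly-2014-05-01-0810",
            ".send-95118-1");

        /* the fd whose close should trigger the kernel-side cleanup */
        cleanup_fd = open("/dev/zfs", O_RDWR);
        if (cleanup_fd < 0) {
            fnvlist_free(holds);
            libzfs_core_fini();
            return (-1);
        }

        err = lzc_hold(holds, cleanup_fd, &errlist);

        /* ... perform the send ... */

        (void) close(cleanup_fd);    /* temporary holds released here */

        fnvlist_free(holds);
        if (errlist != NULL)
            fnvlist_free(errlist);
        libzfs_core_fini();
        return (err);
    }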

The hold machinery uses the (somewhat) new zfs_onexit_ API to set up callbacks that run when the cleanup_fd is closed. On OSX this differs in that we have to match on current_proc() instead of the fd.

The setup is done in zfs_ioc_hold():

    if (nvlist_lookup_int32(args, "cleanup_fd", &cleanup_fd) == 0) {
        /* resolve cleanup_fd to its onexit minor */
        error = zfs_onexit_fd_hold(cleanup_fd, &minor);
        if (error != 0)
            return (error);
    }
    /* take the holds; a nonzero minor makes them temporary */
    error = dsl_dataset_user_hold(holds, minor, errlist);

Now, in zfs_onexit_fd_hold we already do the OSX translation to use current_proc, and in zfsdev_release we are (only) given the current_proc with which to match the correct onexit callback and call it. In this case that callback should be dsl_dataset_user_release_onexit().
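
To make the intended mechanism concrete, here is a tiny user-space model of that onexit pattern. It is not O3X or upstream code and every name in it is made up; it only shows the contract: a hold registers a release callback against the cleanup fd's minor, and closing that fd must invoke it.

    #include <stdio.h>

    typedef void (*onexit_cb_t)(void *);

    /* one callback slot per minor -- enough for a demo */
    static struct {
        onexit_cb_t cb;
        void *arg;
    } onexit_cbs[8];

    /* model of registering a callback at hold time */
    static void
    onexit_add_cb(int minor, onexit_cb_t cb, void *arg)
    {
        onexit_cbs[minor].cb = cb;
        onexit_cbs[minor].arg = arg;
    }

    /* model of the device release path: run the callback for this minor */
    static void
    dev_release(int minor)
    {
        if (onexit_cbs[minor].cb != NULL)
            onexit_cbs[minor].cb(onexit_cbs[minor].arg);
        onexit_cbs[minor].cb = NULL;
        onexit_cbs[minor].arg = NULL;
    }

    /* stands in for the real release callback */
    static void
    release_user_holds(void *arg)
    {
        printf("releasing temporary hold %s\n", (const char *)arg);
    }

    int
    main(void)
    {
        int minor = 1;    /* what the fd-to-minor lookup would hand back */

        onexit_add_cb(minor, release_user_holds, ".send-95118-1");
        dev_release(minor);    /* close(cleanup_fd) should end up here */
        return (0);
    }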

It is possible there is some case we missed in the OSX version, but I don't consider it a release blocker for the next Installer at the moment.

@rottegift
Contributor Author

This persists in master after the 0.6.3 sync.

@ilovezfs
Contributor

Yeah, we think we may know why the holds are not released. Cleanup is currently per-process, not per-open-fd, so if /dev/zfs happens to be opened, opened, closed, closed, only one of the two cleanups happens, whereas if it were opened, closed, opened, closed, both would. This is not straightforward to fix, but we'll probably need to do something like what Apple does with audit_sdevs:
http://fxr.watson.org/fxr/source/bsd/security/audit/audit_session.c?v=xnu-2050.18.24;im=excerpts#L2009
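
As a purely illustrative model of that failure mode (none of these names are the real driver code): if the softstate is keyed by the opening process rather than by the open itself, the nested opened-opened-closed-closed ordering loses one cleanup, while opened-closed-opened-closed works:

    #include <stdio.h>
    #include <stdlib.h>

    /* model softstate, keyed by process id instead of by the open fd */
    typedef struct zs {
        long pid;
    } zs_t;

    #define NSLOTS 16
    static zs_t *zs_table[NSLOTS];

    static void
    dev_open(long pid)
    {
        for (int i = 0; i < NSLOTS; i++) {
            if (zs_table[i] != NULL && zs_table[i]->pid == pid) {
                printf("zs already exists\n");    /* cf. the log below */
                return;
            }
        }
        for (int i = 0; i < NSLOTS; i++) {
            if (zs_table[i] == NULL) {
                zs_table[i] = calloc(1, sizeof (zs_t));
                zs_table[i]->pid = pid;
                printf("created zs for pid %ld\n", pid);
                return;
            }
        }
    }

    static void
    dev_close(long pid)
    {
        for (int i = 0; i < NSLOTS; i++) {
            if (zs_table[i] != NULL && zs_table[i]->pid == pid) {
                printf("pid %ld: onexit cleanup runs, holds released\n", pid);
                free(zs_table[i]);
                zs_table[i] = NULL;
                return;
            }
        }
        printf("pid %ld: no zs found, this close releases nothing\n", pid);
    }

    int
    main(void)
    {
        /* opened, opened, closed, closed: only one cleanup runs */
        dev_open(95118); dev_open(95118);
        dev_close(95118); dev_close(95118);

        /* opened, closed, opened, closed: both cleanups run */
        dev_open(78468); dev_close(78468);
        dev_open(78468); dev_close(78468);
        return (0);
    }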

@ilovezfs
Contributor

Evidence for this is in your paste:
#194 (comment)

Notice:

16/06/2014 14:56:50.000 kernel[0]: zfsdev_open, flag 03 devtype 8192, proc is 0xffffff80364db5d8: thread 0xffffff8031382590
16/06/2014 14:56:50.000 kernel[0]: created zs 0xffffff81f7f551e8
16/06/2014 14:56:50.000 kernel[0]: zfsdev_open, flag 03 devtype 8192, proc is 0xffffff80364db5d8: thread 0xffffff8031382590
16/06/2014 14:56:50.000 kernel[0]: zs already exists

@rottegift
Contributor Author

Ok, neat. I have a local workaround that cleans up the stale holds, but will happily test proposed fixes.

@ilovezfs ilovezfs added the bug label Jun 16, 2014
lundman added a commit that referenced this issue Jun 17, 2014
and patch that into zfsdev_minor_alloc(), causing a unique minor to be
allocated for each open. We then need to create softstate for ctldev so
we can differentiate between ctldev and zvol ioctls.
Issue #173
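
The direction of that change, as an illustrative counterpart to the earlier model (again, made-up names, not the actual patch): once every open gets its own minor, the softstate and its onexit callbacks are keyed per open, so each close releases exactly its own holds.

    #include <stdio.h>
    #include <stdlib.h>

    /* model softstate, now keyed by a unique per-open minor */
    typedef struct zs {
        int minor;
    } zs_t;

    static int next_minor = 1;

    static zs_t *
    dev_open(void)
    {
        zs_t *zs = calloc(1, sizeof (zs_t));

        zs->minor = next_minor++;    /* cf. a unique minor per open */
        printf("open: allocated minor %d\n", zs->minor);
        return (zs);
    }

    static void
    dev_close(zs_t *zs)
    {
        printf("close: minor %d releases its own holds\n", zs->minor);
        free(zs);
    }

    int
    main(void)
    {
        /* nested opens no longer collide */
        zs_t *a = dev_open();
        zs_t *b = dev_open();

        dev_close(a);
        dev_close(b);
        return (0);
    }
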
@lundman
Contributor

lundman commented Jun 17, 2014

A bit more work went into that; you also need to pull SPL.
