Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow zfs to send replication streams when missing snapshots in the hierarchy #5285

Closed
wants to merge 1 commit into from

Conversation

loli10K
Copy link
Contributor

@loli10K loli10K commented Oct 16, 2016

Since the integration of OpenZFS 6111 (6635624) we lost the ability to create replication send streams when we're missing even a single snapshot in the whole hierarchy.

# fallocate -l 256m /var/tmp/zfs-vdev1
# zpool create tank /var/tmp/zfs-vdev1
# zfs create tank/important-data1
# zfs create tank/important-data2
# zfs create tank/important-data3
# zfs create tank/not-so-much
# zfs snapshot -r tank@sendme
# zfs destroy tank/not-so-much@sendme
# zfs send -Rv tank@sendme > /dev/null
cannot send tank@sendme recursively: snapshot tank/not-so-much@sendme does not exist
# echo $?
1
# 

I don't know if OpenZFS 6111 intentionally removed this functionality but this could seriously break user scripts used for backups/replication (as demonstrated by the open issue) and slightly violates the principle of least astonishment.

Also add a new script to the ZFS test suite.

This should fix #5196

…ierarchy

With OpenZFS 6111 we lost the ability to create replication send streams
when we're missing even a single snapshot in the whole hierarchy: this
commit restore this functionality.

Add also relative zfs_send_008_pos.ksh script to ZFS test suite.

Signed-off-by: loli10K <[email protected]>
@mention-bot
Copy link

@loli10K, thanks for your PR! By analyzing the history of the files in this pull request, we identified @kernelOfTruth, @ahrens and @behlendorf to be potential reviewers.

@behlendorf
Copy link
Contributor

Let's ask @ahrens. This definitely looks like it impacts OpenZFS upstream and would be a regression in functionality.

@ahrens
Copy link
Member

ahrens commented Oct 17, 2016

Let's also ask @deiter, who authored the OpenZFS change (illumos bug 6111). For reference, here was the original code review: openzfs/openzfs#76

IIRC, the intent here was explicitly to make it an error if one of the snapshots is absent, so that you don't think you've sent all the data but in reality have not. However, the snapshot is not required on new filesystems, so this will still work:

zfs snapshot -r pool/A@snap
zfs create pool/A/new             # lacks @snap
zfs send -R pool/A@snap

@loli10K
Copy link
Contributor Author

loli10K commented Oct 18, 2016

From my understanding of the issue description here https://www.illumos.org/issues/6111 the problem reported was that we couldn't zfs send recursively in case we created a child dataset in the hierarchy after the snapshots we wanted to send and not included in the interval specified by -i|-I.

What was also introduced is the inability to send recursively the parent dataset and its descendants skipping the ones with no snapshot that where created before. For my own selfish use case i'm thinking rpool/swap, but there are other as well. This is almost like the problem we just solved, but it's now retroactively affecting all consolidated dataset trees we have around: either we start sending data we don't care that much about or we have to stop using -R on the parent and start to send every dataset on its own. In both cases snap -r/send -R become less appealing than before.

make it an error if one of the snapshots is absent, so that you don't think you've sent all the data but in reality have not

That was also a concern, but we emit a warning when we don't send a snapshot and the return code of zfs send in this case it's != 0, i don't think anyone could be fooled in thinking we have sent all the data.

# zfs list -t all
NAME               USED  AVAIL  REFER  MOUNTPOINT
tank               146K   208M    19K  /tank
tank/send           38K   208M    19K  none
tank/send@init        0      -    19K  -
tank/send/before    19K   208M    19K  none
# zfs send -v -R tank/send@init > file
full send of tank/send@init estimated size is 9.50K
WARNING: could not send tank/send/before@init: does not exist
total estimated size is 9.50K
TIME        SENT   SNAPSHOT
WARNING: could not send tank/send/before@init: does not exist
# echo $?
1
# zfs recv tank/recv < file
# zfs list -t all
NAME               USED  AVAIL  REFER  MOUNTPOINT
tank               184K   208M    19K  /tank
tank/recv           19K   208M    19K  none
tank/recv@init        0      -    19K  -
tank/send           38K   208M    19K  none
tank/send@init        0      -    19K  -
tank/send/before    19K   208M    19K  none

What i got wrong in this patch is the return code of zfs send when we try to send datasets created after the last snapshot (should be 0, it's 1 after this patch). If this is still of interest to be integrated i will update the code and relative test script.

@loli10K
Copy link
Contributor Author

loli10K commented Jan 10, 2017

This is getting stale, and it's also creating conflicts with other PR because of test case numbering ... any news on this? Should we close it and keep the new zfs send behaviour?

@loli10K loli10K closed this Jan 18, 2017
@behlendorf
Copy link
Contributor

Sorry about letting this one get stale, but let's keep it open for the moment. I think there's a reasonable case to be made for continuing to support the existing long standing behavior which is useful. There are almost certainly users out there who are depending on it, even if it technically wasn't intended.

What if instead of removing this check entirely we added a new command line option to zfs send which explicitly allows this behavior. This would address the concerns of users being mislead about what was sent since by default it would be fatal. Users which do depend on this functionality could easily supply the flag rather than potentially rewrite they tools. If we updated error message to reference the new option users who are upgrading would immediately be aware of the issue and have a method of handling it.

The downside to this approach would be having yet another option to zfs send which would need to be supported and might be confusing. @loli10K @ahrens what are your thoughts?

@behlendorf behlendorf reopened this Jan 18, 2017
@prometheanfire
Copy link
Contributor

I'm fine with changing my scripts if needed

@loli10K
Copy link
Contributor Author

loli10K commented Jan 20, 2017

@behlendorf i'm in favor of doing whatever will benefit the majority of users: if a new option (-f?) is required to do so then i'm all for it. That said, unfortunately i won't be able to do this kind of development.

@loli10K
Copy link
Contributor Author

loli10K commented Mar 13, 2017

Just heads up to whoever will eventually (or not) pick up this: it seems (zfs send) -r is going to be used to send "raw" streams https://github.com/tcaputi/zfs/blob/raw_sends/cmd/zfs/zfs_main.c#L3842, you'll have to come up with something else.

@prometheanfire
Copy link
Contributor

That's gonna cause (keep) things broken for those that use the old bad method, though I've moved on already...

@behlendorf
Copy link
Contributor

Closing for now. We'll stick with the upstream behavior and users depending on the previous behavior will need to update their scripts. Adding an option to explicitly allow for missing snapshots is the stream would be a nice feature if someone is interested in developing it.

@behlendorf behlendorf closed this May 10, 2017
@loli10K loli10K deleted the issue-5196 branch September 17, 2017 10:39
behlendorf pushed a commit that referenced this pull request Apr 11, 2021
A tentative implementation and discussion was done in #5285.
According to it a send --skip-missing|-s flag has been added.
In a replication stream, when there are snapshots missing in
the hierarchy, if -s is provided print a warning and ignore
dataset (and its children) instead of throwing an error

Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pablo Correa Gómez <[email protected]>
Closes #11710
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Apr 14, 2021
A tentative implementation and discussion was done in openzfs#5285.
According to it a send --skip-missing|-s flag has been added.
In a replication stream, when there are snapshots missing in
the hierarchy, if -s is provided print a warning and ignore
dataset (and its children) instead of throwing an error

Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pablo Correa Gómez <[email protected]>
Closes openzfs#11710
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021
A tentative implementation and discussion was done in openzfs#5285.
According to it a send --skip-missing|-s flag has been added.
In a replication stream, when there are snapshots missing in
the hierarchy, if -s is provided print a warning and ignore
dataset (and its children) instead of throwing an error

Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pablo Correa Gómez <[email protected]>
Closes openzfs#11710
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

recursive dataset not sendable if missing child datasets
5 participants