Fix EIO after resuming receive of new dataset over an existing one #10999

Merged
merged 1 commit into from
Oct 3, 2020

Conversation

asomers
Contributor

@asomers asomers commented Sep 28, 2020

Fixes #10995
Sponsored by: Axcient
Downtream bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=249579
Signed-off-by: Alan Somers [email protected]

Motivation and Context

When resuming an interrupted ZFS send stream that creates a new dataset
with the same name as an existing dataset, if the existing dataset is
accessed after the failed receive, then after the subsequent successful
receive it will return EIO. This happens because nothing mounts the new
dataset, leaving the old, no longer valid dataset still mounted.

#10995

Description

This commit fixes zfs receive to always unmount and remount the
destination, regardless of whether the stream is a new stream or a
resumed stream.

How Has This Been Tested?

See steps to reproduce in the issue. Tested on FreeBSD 13-CURRENT and FreeBSD 12.2-BETA3 (the latter with an older branch of ZFS). No regressions observed on 12.0-CURRENT with the FreeBSD ZFS test suite.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the OpenZFS code style requirements.
  • I have updated the documentation accordingly.
  • I have read the contributing document.
  • I have added tests to cover my changes.
  • I have run the ZFS Test Suite with this change applied. Note: I have only run the FreeBSD ZFS test suite, not the OpenZFS test suite.
  • All commit messages are properly formatted and contain Signed-off-by.

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Sep 28, 2020
@asomers
Contributor Author

asomers commented Sep 29, 2020

The test failures don't look related to this PR.

@behlendorf
Contributor

@asomers I've resubmitted those unrelated test failures. However, it does look like there were a couple of related failures to look into.

Tests with results other than PASS that are unexpected:
    FAIL redacted_send/redacted_resume (expected PASS)
    FAIL rsend/send-c_resume (expected PASS)
    FAIL rsend/send-c_verify_contents (expected PASS)

http://build.zfsonlinux.org/builders/FreeBSD%20head%20amd64%20%28TEST%29/builds/2112/steps/shell_9/logs/summary
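
For longer summary logs, the unexpected failures can be pulled out mechanically. The following is a minimal sketch, assuming the summary uses the line format quoted above (`FAIL <group>/<test> (expected PASS)`); the function name `unexpected_failures` is hypothetical and not part of the test suite.

```sh
#!/bin/sh
# Print the names of tests that failed unexpectedly, one per line.
# Assumes each relevant line looks like:
#     FAIL <group>/<test> (expected PASS)
# Reads the named files, or standard input when no file is given.
unexpected_failures() {
    awk '$1 == "FAIL" && $3 == "(expected" && $4 == "PASS)" { print $2 }' "$@"
}
```

Given a saved copy of the summary, `unexpected_failures summary.log` would print one test name per line (the file name here is only an example).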

@asomers-ax

@behlendorf how does one run those tests on FreeBSD? The README does not say.

@behlendorf
Contributor

@asomers the process is the same on Linux and FreeBSD, though again the documentation really needs to be updated to explain this. With the kernel modules loaded and the OpenZFS source tree built you can run the ZFS Test Suite in-tree with the ./scripts/zfs-tests.sh script. For example, the following command will run both the rsend and redacted_send test groups.

# Load the newly built kernel modules from the source tree.
./scripts/zfs.sh

# Run just the rsend and redacted_send test groups.
./scripts/zfs-tests.sh -T rsend,redacted_send

You can also use the -t option to run a specific test.

./scripts/zfs-tests.sh -t tests/functional/redacted_send/redacted_resume.ksh
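
To rerun several individual tests by their short names, a small wrapper can build the `-t` argument for each one. This is only a sketch: the helper names `test_path` and `rerun_tests` and the `DRY_RUN` variable are hypothetical, and the `tests/functional/<group>/<test>.ksh` layout is inferred from the single `redacted_resume` example above, so other tests are assumed rather than known to follow it.

```sh
#!/bin/sh
# Map "<group>/<test>" to the in-tree path passed to zfs-tests.sh -t.
# The path layout is an assumption based on the redacted_resume example.
test_path() {
    printf 'tests/functional/%s.ksh\n' "$1"
}

# Rerun each named test individually from the top of the source tree.
# With DRY_RUN=1 the commands are printed instead of executed.
rerun_tests() {
    for name in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "./scripts/zfs-tests.sh -t $(test_path "$name")"
        else
            ./scripts/zfs-tests.sh -t "$(test_path "$name")"
        fi
    done
}
```

Setting `DRY_RUN=1` before calling `rerun_tests rsend/send-c_resume rsend/send-c_verify_contents` shows the commands that would be run, which is a cheap way to check the paths before loading the kernel modules.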

I'm sure @freqlabs has some tips about running the tests on FreeBSD as well!

@asomers-ax

Running the tests in-tree fails with the following error. Are there some extra environment variables I need to set, or extra make targets I need to build?

Missing util(s): zed zgenhostid devname2devid mmap_libaio randfree_file user_ns_exec xattrtest base64 net pamtester

@behlendorf
Contributor

That should be enough. The "Missing util(s)" warning shouldn't be fatal; it's largely complaining about some Linux-specific utilities and tests which haven't been built (as intended).

@ghost

ghost commented Sep 29, 2020

The list of FreeBSD packages for test dependencies here should be up to date.

@asomers-ax

Ok, let's try again

@asomers-ax

Test failure summary:

  • Checkstyle: I will fix with my next push
  • Debian 10 x86_64: cli_root/zfs_destroy/zfs_destroy_dev_removal_condense failed. But it passes locally and on the other buildbot runs.
  • Fedora 32: build failure; looks unrelated
  • FreeBSD head amd64: ztest failed. AFAIK, ztest doesn't use libzfs, so this PR isn't related to the failure.
  • CentOS 7, Ubuntu 18.04, Ubuntu x86_64: zdb_block_size_histogram failed with "Expected variance < 10% observed 11%". I don't think that's related to this PR.

@behlendorf
Contributor

@asomers you're right, those test failures are all known, unrelated issues. When you refresh this to address the style issue, would you mind rebasing it on master? That'll resolve the Fedora build failure.

When resuming an interrupted ZFS send stream that creates a new dataset
with the same name as an existing dataset, if the existing dataset is
accessed after the failed receive, then after the subsequent successful
receive it will return EIO. This happens because nothing mounts the new
dataset, leaving the old, no longer valid dataset still mounted.

This commit fixes zfs receive to always unmount and remount the
destination, regardless of whether the stream is a new stream or a
resumed stream.

Fixes openzfs#10995
Sponsored by: Axcient
Downtream bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=249579
Signed-off-by: Alan Somers <[email protected]>
@asomers-ax

Rebased and squashed.

Contributor

@behlendorf behlendorf left a comment

Makes sense. It'd be nice to have a test for this, but I don't think it's critical.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Sep 30, 2020
@behlendorf behlendorf merged commit a132c2b into openzfs:master Oct 3, 2020
behlendorf pushed a commit that referenced this pull request Oct 16, 2020
When resuming an interrupted ZFS send stream that creates a new dataset
with the same name as an existing dataset, if the existing dataset is
accessed after the failed receive, then after the subsequent successful
receive it will return EIO. This happens because nothing mounts the new
dataset, leaving the old, no longer valid dataset still mounted.

This commit fixes zfs receive to always unmount and remount the
destination, regardless of whether the stream is a new stream or a
resumed stream.

Sponsored by: Axcient
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Alan Somers <[email protected]>
External-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=249579
Closes #10995
Closes #10999
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Development

Successfully merging this pull request may close these issues.

zfs receive: Input/output error accessing dataset after resuming interrupted receive
3 participants