Fast Copy Between Related Datasets #13516

Closed
Haravikk opened this issue May 28, 2022 · 2 comments
Labels
Type: Feature Feature request or new feature

Haravikk commented May 28, 2022

Describe the feature you would like to see added to OpenZFS

When copying a file from one dataset to another, it should be possible for the file to be copied instantly so long as both datasets are unencrypted, or the source and target dataset share the same encryption root.

I would propose a fastcopy setting for datasets to control whether files can be fast-copied into them. Settings would be on (always fast copy where possible), off (never fast copy), or exact (fast copy only when settings that affect the data structure, such as compression or recordsize, exactly match between the two datasets; this allows targets to force recompression/restructuring). This setting should probably default to off or exact.
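To make the proposed semantics concrete, here is a minimal sketch of the eligibility check in Python. It is illustrative only: fastcopy is the property proposed in this issue and does not exist in OpenZFS, the Dataset class and can_fast_copy function are hypothetical names, and the set of "structure-affecting" properties is assumed to be just compression and recordsize.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dataset:
    """Hypothetical, simplified view of a dataset's relevant properties."""
    name: str
    compression: str = "off"
    recordsize: int = 131072
    encryption_root: Optional[str] = None  # None means unencrypted
    fastcopy: str = "off"  # proposed property: "on", "off" or "exact"


def can_fast_copy(src: Dataset, dst: Dataset) -> bool:
    """Return True if a file in src could be fast-copied into dst."""
    # Both unencrypted (None == None) or both under the same encryption root.
    if src.encryption_root != dst.encryption_root:
        return False

    # The target dataset's setting decides whether it accepts fast copies.
    if dst.fastcopy == "off":
        return False
    if dst.fastcopy == "on":
        return True
    if dst.fastcopy == "exact":
        # Assumption: only these two properties affect the on-disk structure.
        return (src.compression, src.recordsize) == (dst.compression, dst.recordsize)
    raise ValueError(f"unknown fastcopy value: {dst.fastcopy!r}")
```

For example, a target with fastcopy=exact and compression=zstd would refuse a fast copy from an lz4 source, forcing a regular copy that recompresses the data.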

When fast-copying between datasets that do not have the same copies setting: if the target has fewer copies than the source, the additional copies can simply be ignored, whereas if the target wants more copies then the extras will need to be created (e.g. if the source has copies=2 and the target has copies=3, then two copies are linked and a new third copy is created).
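The copies rule above boils down to simple arithmetic, sketched here in Python. The function name plan_copies is hypothetical, and the sketch assumes the existing copies property range of 1 to 3.

```python
def plan_copies(source_copies: int, target_copies: int) -> tuple[int, int]:
    """Return (linked, created) block copies for a fast copy.

    linked:  existing copies from the source that the target references.
    created: extra copies that must be written to satisfy the target.
    """
    for n in (source_copies, target_copies):
        if n not in (1, 2, 3):
            raise ValueError("copies must be 1, 2 or 3")

    # The target never links more copies than it wants; surplus source
    # copies are simply ignored.
    linked = min(source_copies, target_copies)
    # If the target wants more than the source has, the shortfall is new data.
    created = max(0, target_copies - source_copies)
    return linked, created
```

With the example from the text, plan_copies(2, 3) gives two linked copies and one newly created copy, while plan_copies(3, 1) links a single copy and creates none.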

For send and receive support, the default behaviour would need to be to treat the fast-copied file as a full copy, using metadata to look up and send the blocks as if they belonged to the dataset they were fast-copied into. I'm not sure whether it would be possible to indicate to zfs send that the receiver should already have the necessary blocks, as there seems to be far too much potential to go out of sync (trying to send a fast-copy of a file that hasn't been sent from the source dataset yet, etc.), and I think the logic to handle a send/receive dependent on multiple datasets on each side is too complex (definitely so for this request).

How will this feature improve OpenZFS?

While ZFS makes it very easy to use large numbers of datasets to fine tune settings such as redundancy, record size, etc., copying/moving between them can be comparatively costly as a whole new copy of the file is created within the same pool, resulting in excess redundancy that would require deduplication to prevent (usually too costly to be worth it).

A fast copy of this type would make it possible to copy/move instantly between datasets with zero overhead.

While this may mean a file using smaller records is copied into a dataset with a larger record size (or vice versa), in general this will be preferable to creating a whole new copy. However, this will be an opt-in behaviour if the default setting for fastcopy is exact or off.

Additional context

This feature is related to/dependent upon support for cp --reflink requested by various issues such as #13349.

I initially couldn't decide whether to request this separately or simply add it as a note to other issues, but this feels like a separate feature that would need to be built on top of reflink support: while basic reflinks are (comparatively) simple, this feature is a bit more complex, so it will likely be better handled later.
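For reference, this is roughly what a reflink attempt with a fallback looks like from user space on Linux, sketched in Python. The function name copy_with_reflink is hypothetical, the FICLONE value is assumed from linux/fs.h, and whether the clone succeeds depends entirely on the filesystem; nothing here reflects how OpenZFS does or would implement it.

```python
import fcntl
import shutil

# Assumed value of the Linux FICLONE ioctl request (_IOW(0x94, 9, int)).
FICLONE = 0x40049409


def copy_with_reflink(src: str, dst: str) -> str:
    """Try to clone src into dst; fall back to a regular copy.

    Returns "reflink" if the filesystem shared the blocks, else "copy".
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # Ask the filesystem to make dst share src's blocks.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return "reflink"
        except OSError:
            # Unsupported, or source and target cannot share blocks.
            pass
    # Regular byte-for-byte copy, which writes a whole new set of blocks.
    shutil.copyfile(src, dst)
    return "copy"
```

The fallback path is the costly case described in this request: when the clone is refused, a complete second copy of the data ends up in the pool.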

@Haravikk Haravikk added the Type: Feature Feature request or new feature label May 28, 2022
@rincebrain
Contributor

I'm not really sure what it is you're looking for, here.

"Don't let me cp --reflink if the dataset settings are different" is a strange requirement, and everything else is just #13392.

@Haravikk Haravikk mentioned this issue May 28, 2022
13 tasks
@Haravikk Haravikk closed this as completed Aug 6, 2022
@Haravikk
Author

Haravikk commented Aug 6, 2022

Forgot about this issue when I posted a more focused request as #13572
