
Ability to run/trigger compression/deduplication of pool/volume manually #3013

Open
pavel-odintsov opened this issue Jan 13, 2015 · 12 comments
Labels
Status: Design Review Needed (architecture or design is under discussion), Status: Inactive (not being actively updated), Type: Feature (feature request or new feature)

Comments

@pavel-odintsov

Hello!

I have a large amount of uncompressed data in multiple pools and volumes. I want to enable compression because my data compresses very well in synthetic tests.

I enabled compression on the pool:

zfs set compression=lz4 data

But I can't find any way to compress the existing data on the pool without copying it again.

At the moment I do the following:

for i in `/bin/ls /data`; do
    echo "Process volume ${i}"
    zfs snapshot data/${i}@snap
    zfs send data/${i}@snap | zfs receive -F data/${i}_compressed
done

It works well and the data compresses perfectly.
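
For reference, one way to confirm the effect is to compare the compressratio and used properties of the source and the received copy (the dataset names below are just examples):

    zfs get -o name,property,value compressratio,used data/myvol data/myvol_compressed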

But how can I compress the data in place, without a service interruption and without creating temporary volumes?

I reviewed the zio.c code and found that the code used for compression is not hard to understand. What are the problems with in-place data compression or decompression?

This ticket may be related to #1071, but the deduplication logic is very different from compression.

@behlendorf added the "Type: Feature" and "Difficulty - Medium" labels Jan 16, 2015
@behlendorf
Contributor

But I can't find any way to compress the existing data on the pool without copying it again.

Right, at the moment doing this transparently isn't supported. You're either going to need to do what you're doing, send/recv to a temporary volume which then gets renamed, or you could write a script to do this on a per-file basis for a dataset. If compression is enabled for the dataset, newly written files will be compressed, so you would just need to do something like cp file file.tmp; unlink file; mv file.tmp file. Keep in mind that if a dataset has snapshots, the uncompressed blocks will remain part of those snapshots until they are also removed.
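
A minimal bash sketch of that per-file rewrite, assuming a hypothetical mountpoint /data/myvol, no hard links or sparse files, no concurrent writers, and a current backup:

    # Rewrite every regular file so its blocks are written again with the
    # dataset's current compression setting (old blocks stay pinned by snapshots).
    find /data/myvol -type f -print0 | while IFS= read -r -d '' f; do
        cp -p -- "$f" "$f.tmp" &&     # the copy is written compressed
        mv -- "$f.tmp" "$f"           # replace the original in place
    done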

Doing this transparently in the background is technically possible, but the same caveats regarding snapshots apply: they are immutable, period. Obviously someone would still need to write the code for this.

@pavel-odintsov
Author

Thank you very much!

I wrote a simple Perl script for this task, https://gist.github.com/pavel-odintsov/aa497b6d9b351e7b3e2b, and it works well.

@pavel-odintsov
Author

Unfortunately, file-by-file iteration over my data is extremely slow. I started file_rewrite.pl about 36+ hours ago and so far only about 6% of the data has been processed.

Processing files one by one is also not a reliable approach, because files with broken names (due to encoding issues, not related to ZFS) were not processed correctly.

Can I do the same at the block level, in place? I want to walk all used blocks of my volume and compress those blocks instead of relying on files.

@behlendorf
Contributor

Can I do the same at the block level, in place?

No. You could send/recv the pool with incremental snapshots. That would allow you to keep the downtime to a minimum.
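
A rough sketch of that incremental approach for a single dataset (names are hypothetical); the bulk copy runs while the service is up, and only the final incremental send needs the dataset to be quiesced:

    zfs snapshot data/myvol@base
    zfs send data/myvol@base | zfs receive -u data/myvol_new   # bulk copy, service still running
    # ...stop writers briefly, then ship only what changed since @base...
    zfs snapshot data/myvol@final
    zfs send -i data/myvol@base data/myvol@final | zfs receive data/myvol_new
    zfs rename data/myvol data/myvol_old
    zfs rename data/myvol_new data/myvol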

@pavel-odintsov pavel-odintsov changed the title Ability to run/trigger compression of pool/volume manually Ability to run/trigger compression/deduplication of pool/volume manually Jan 23, 2015
@pavel-odintsov
Author

This issue is even more important in the case of a ZVOL, where we can't touch every file in the filesystem (NTFS, ReFS, and other non-Linux filesystems).

@paboldin
Contributor

paboldin commented May 2, 2016

@behlendorf is it required to recreate the file or is it enough just to re-write the blocks? Can this rewriting be done at the VFS level?

As far as I can see from the source code, it should be enough. In this case one could implement a "toucher" using e.g. dsl_sync_task and dmu_traverse(?). Is that correct?

@behlendorf
Contributor

@paboldin simply re-dirtying the block is enough, given two caveats.

  1. The new bp and the original bp must have different characteristics, in this case checksum algorithm or dedup. Otherwise the write will be optimized out by zio_nop_write().

  2. This could easily result in a doubling of the space used if the filesystem/zvol has snapshots. Those blocks can never be rewritten. It would probably be wise to include a sanity check on the required free space before allowing such an operation.
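
A user-space sketch of the free-space sanity check mentioned in point 2, assuming a hypothetical dataset data/myvol and a conservative rule that the pool must be able to absorb a full second copy of the data:

    # Refuse to rewrite unless the dataset's current "used" space still fits
    # in what is "available", i.e. the worst case of every block being pinned.
    used=$(zfs get -Hp -o value used data/myvol)
    avail=$(zfs get -Hp -o value available data/myvol)
    if [ "$avail" -lt "$used" ]; then
        echo "not enough free space to safely rewrite data/myvol" >&2
        exit 1
    fi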

@rlaager
Member

rlaager commented Oct 1, 2016

See also #2554.

@dioni21
Contributor

dioni21 commented Jul 5, 2018

The very old problem of BP rewrite. AFAIR, everyone who tries it gives up, saying it is too difficult. :-(

@ghost

ghost commented Oct 11, 2019

I wrote a small shell script to replicate, verify and overwrite all files in the current working directory and all its descendant directories in order to trigger ZFS compression. Use with significant caution and make sure to have a backup beforehand.
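
The script itself is not reproduced in the thread; a hypothetical bash sketch of the same replicate-verify-overwrite idea (same caveats: have a backup, avoid hard links and concurrent writers) might look like this:

    find . -type f -print0 | while IFS= read -r -d '' f; do
        cp -p -- "$f" "$f.rewrite" &&
        cmp -s -- "$f" "$f.rewrite" &&    # verify the copy before overwriting
        mv -- "$f.rewrite" "$f" ||
        { echo "skipping $f" >&2; rm -f -- "$f.rewrite"; }
    done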

@owlshrimp

@paboldin simply re-dirtying the block is enough, given two caveats.

  1. The new bp and the original bp must have different characteristics, in this case checksum algorithm or dedup. Otherwise the write will be optimized out by zio_nop_write().
  2. This could easily result in a doubling of the space used if the filesystem/zvol has snapshots. Those blocks can never be rewritten. It would probably be wise to include a sanity check on the required free space before allowing such an operation.

So, if for example we enabled deduplication and compression at the same time, or enabled compression and changed the checksum algorithm, and then dirtied all the blocks, would that result in them all being rewritten? (I presume a combination of deduplication and a changed checksum would also work?)

What would be the best way to re-dirty a block, given a hypothetical outer loop that cycles over every block of every file? Can it be done without changing the block's contents? (is this what the above conditions ensure?) Is this something that really should be done from within ZFS itself? From the accompanying library?

Baseless speculation:

Part of me wonders if it's possible to introduce a sequence number* in the block pointers just to make data appear "different" to zio_nop_write() without altering the settings. Then it's a matter of going through the directory tree and progressively dirtying every block of every file, so long as there's space** (and maybe I/O capacity) available to accommodate it.

*a "please rewrite" flag would have to be set on everything, though perhaps that traversal wouldn't be so bad. Also maybe not, if you consider a flag to be a 2-value sequence number. Hmm.

**might be enough to say to ZFS "please leave at least 200 GB" though one would expect the space to be reclaimed if there are no snapshots pinning it

@owlshrimp

owlshrimp commented Aug 26, 2021

This is starting to remind me a little of the issue thread for raidz expansion (#12225). There were similar requests for a way to trigger the reformatting of old data to the new stripe width, though it may or may not be trickier there.
