
Able to pin or prioritize data for some datasets in l2arc #13460

Open
mailinglists35 opened this issue May 14, 2022 · 1 comment
Labels: Type: Feature (Feature request or new feature)

@mailinglists35

Describe the feature you would like to see added to OpenZFS

I would like some datasets' L2ARC-cached data to never be evicted when the cache device is full.

How will this feature improve OpenZFS?

What problem does this feature solve?

Backing up my iPhones is slow: the backups consist of many small files on HDD, and when the dataset's data is not already on the L2ARC cache device, the backup process first needs to read all existing incremental backups.

Additional context

Any additional information you can add about the proposal?

Implement a way to tell ZFS that this dataset should avoid, as much as possible, having its cached data removed from L2ARC.

What I do now is run cron jobs that read the backup files over and over, to force ZFS to feed as much of that data as possible into L2ARC. Obviously I cannot move all backups to SSD, in case you were thinking of suggesting that as a solution.

mailinglists35 added the Type: Feature label May 14, 2022
@GregorKopka (Contributor)

I vaguely remember from looking at L2ARC that it writes linearly in round-robin style (not in a random-access way with space maps etc.), thus always overwriting the data written in the previous round. The consequence is that everything that isn't in ARC when the write pointer reaches the respective on-disk location is discarded, by simply invalidating the L2ARC headers that point into the approaching write window.

One option to implement the suggestion would be to enhance the writer so that, instead of just dropping the headers from ARC, it would:

  • Read back data marked as 'keep' into ARC and directly write it back to L2. This could lead to a corner case of L2 not being big enough to hold the full contents of what is marked as 'keep', which could turn the writer into a busy loop that burns out the flash of the devices backing L2.

  • Skip over data marked as 'keep' and only write into 'free space' (L2 areas no longer pointed to by L2 headers in ARC). This would complicate the logic, as the writer would need to create a space map for the current write window and sort through the buffers eligible to be evicted into L2 to locate chunks small enough to fit. Looks doable at first glance, but could lead to some fragmentation / space loss.

But I guess it would make more sense to implement this feature as a new 'cache' vdev type (call it C2 for this mental exercise) that, while being removable at any time (like L2), employs random-access writes using space maps (similar to how normal data vdevs work), with a persistent lookup table (hosted on the C2 vdev itself) that translates from pool DVA to C2 LBA. This would need:

  • A reader process that loads the lookup table on pool import to create C2 headers in the ARC (like persistent L2ARC).

  • A writer process (triggered e.g. by changing a dataset property or a zpool subcommand) that scans a specified dataset (or all datasets marked for 'cache') to copy all eligible data from the permanent pool vdevs onto the C2, so that new cache devices can be filled with data already existing in the pool (not only new writes).

  • Extending the read path in the same places L2 is hooked, a small patch to the L2 feeder logic to ignore data that is already in C2 (to avoid double-caching), and a SPA hook (I guess) to invalidate and free deleted data.
