ARC adaptation logic is broken and fix was not imported from FreeBSD to OpenZFS. #10548

blacklion · 2020-07-09T14:42:25Z

ARC is one of the main reasons for ZFS's success, as it is a state-of-the-art cache algorithm. Its novelty is the use of additional «ghost» LRU and MRU lists, which remember evicted items and help to tune the LRU/MRU balance. The central part of the ARC algorithm is the arc_adapt() function, which tunes the LRU/MRU balance according to 4 types of cache hits (passed as the state argument): ghost LRU, LRU, MRU, ghost MRU. If this function is called with the wrong cache hit type (state), adaptation will be sub-optimal and performance will suffer.
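To make the role of the four hit types concrete, here is a minimal sketch in C of how a ghost hit can move the balance between the two lists. It is a toy model, not the OpenZFS code: the names toy_hit_t, toy_arc_t and toy_adapt() are invented for illustration, the step is simply the size of the hit, and any scaling a production implementation may apply is left out.

```c
#include <assert.h>
#include <stdint.h>

/* The four hit types named in the text; identifiers are illustrative only. */
typedef enum {
	HIT_LRU,	/* hit in the resident LRU list */
	HIT_LRU_GHOST,	/* hit in the ghost list of entries evicted from LRU */
	HIT_MRU,	/* hit in the resident MRU list */
	HIT_MRU_GHOST	/* hit in the ghost list of entries evicted from MRU */
} toy_hit_t;

typedef struct {
	uint64_t c;	/* total cache size target, in bytes */
	uint64_t p;	/* share of c reserved for the LRU side */
} toy_arc_t;

/*
 * Adjust the LRU/MRU balance after a hit of `bytes` bytes.
 * Only ghost hits move the target; resident hits leave it alone.
 */
static void
toy_adapt(toy_arc_t *arc, toy_hit_t hit, uint64_t bytes)
{
	assert(arc->p <= arc->c);

	if (hit == HIT_LRU_GHOST) {
		/* The LRU side was evicted too eagerly: grow it, up to c. */
		uint64_t room = arc->c - arc->p;
		arc->p += (bytes < room) ? bytes : room;
	} else if (hit == HIT_MRU_GHOST) {
		/* The MRU side was evicted too eagerly: shrink LRU, down to 0. */
		arc->p -= (bytes < arc->p) ? bytes : arc->p;
	}
}
```

In this model a caller that only ever reports HIT_LRU or HIT_MRU never changes p, which is the failure mode the rest of this report describes.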

Some (long) time ago upstream received this commit:

6950 ARC should cache compressed data. It changed the sequence of calls that arc_read() makes on access to a ghost buffer.

Before this commit, a hit on any ghost list was passed to arc_adapt() before the call to arc_access(), which revives the element in the cache and changes its state from ghost to real hit (this is very important!).

After this commit the order of calls was reversed, and arc_adapt() is now called only with «real» hits even if the hit was in one of the two ghost lists. This renders the ghost lists useless and breaks the ARC algorithm.
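The effect of the call order can be shown with a small C sketch. Everything here is a stand-in: demo_adapt() and demo_access() only mimic the roles of arc_adapt() and arc_access() as described in this report, and the types and the counter are invented for illustration.

```c
#include <assert.h>

typedef enum { ST_LRU, ST_LRU_GHOST, ST_MRU, ST_MRU_GHOST } demo_state_t;

typedef struct {
	demo_state_t state;	/* which list the header currently lives on */
} demo_hdr_t;

typedef struct {
	int ghost_hits_seen;	/* times adaptation actually saw a ghost hit */
} demo_arc_t;

/* Stand-in for arc_adapt(): reacts only to ghost states. */
static void
demo_adapt(demo_arc_t *arc, demo_state_t state)
{
	if (state == ST_LRU_GHOST || state == ST_MRU_GHOST)
		arc->ghost_hits_seen++;
}

/*
 * Stand-in for arc_access(): revives a ghost entry into a resident state.
 * Which resident list is chosen does not matter for this illustration.
 */
static void
demo_access(demo_hdr_t *hdr)
{
	if (hdr->state == ST_LRU_GHOST || hdr->state == ST_MRU_GHOST)
		hdr->state = ST_MRU;
}

/* Order described as correct: adapt sees the ghost state, then revive. */
static void
demo_read_adapt_first(demo_arc_t *arc, demo_hdr_t *hdr)
{
	assert(arc && hdr);
	demo_adapt(arc, hdr->state);
	demo_access(hdr);
}

/* Order described as broken: revive first, so adapt sees a resident state. */
static void
demo_read_access_first(demo_arc_t *arc, demo_hdr_t *hdr)
{
	assert(arc && hdr);
	demo_access(hdr);
	demo_adapt(arc, hdr->state);
}
```

With a header that starts in a ghost state, demo_read_adapt_first() records one ghost hit, while demo_read_access_first() records none, because the state has already been rewritten by the time adaptation looks at it.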

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

Unfortunately, this fix has not been ported to upstream, ZoL or OpenZFS.

Current OpenZFS contains the same bug, though the patch does not apply, as the low-level ARC routines were extended in ways that are not present in the FreeBSD sources.

The crucial change is the one in arc_get_data_impl(). All other changes support it, weaving the proper behavior (adapt or do not adapt before changing the status of a cache element) through the call sites of arc_hdr_alloc_pabd() (which is named arc_hdr_alloc_abd() in OpenZFS).

Without this change ARC in OpenZFS is not ARC at all, and there could be serious performance degradation in some scenarios.

I think it should be fixed before FreeBSD's transition to OpenZFS, and this fix will be beneficial for the whole OpenZFS community.

ghost · 2020-07-22T15:54:58Z

To be clear, this isn't actually in progress, so if anyone is willing to take it up please do :)

mattmacy · 2020-07-23T00:47:38Z

And to be clear, I'd asked @blacklion to file a PR, not an issue.

blacklion · 2020-07-23T00:58:46Z

> And to be clear, I'd asked @blacklion to file a PR, not an issue.

And to be honest, PR could be understood as «problem report», especially by a person from FreeBSD, and especially by one who is used to the send-pr script ;-)

Jokes aside, the codebases have diverged and the fix needs a complete rewrite. I don't have time RIGHT NOW to do it and, what is worse, I don't know how to test it properly, as I'm not the author of the original fix and know nothing about OpenZFS testing.

On the other hand, the original fix is an ugly kludge (and the author acknowledges this; I am not trying to throw shade at Slawa). Maybe the OpenZFS community could offer something better? For example, the author of all the changes in arc.c that make the original patch inapplicable. They must understand this code much better than I do.

I could try to port the original fix to OpenZFS's arc.c as-is, but it will take some more time, as I need to meet some deadlines at my $work.

blacklion · 2020-07-23T01:08:42Z

BTW, the original fix will be even worse on the new codebase: it will add a second boolean parameter to some functions (it adds the first one in the FreeBSD codebase, but OpenZFS added its own boolean parameter to the same functions!). It is a nightmare from the API's point of view: it is very easy to mix up two boolean parameters at call sites, as booleans are indistinguishable.
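A small C sketch of the API concern, using purely hypothetical names (DEMO_EXISTING_OPTION, DEMO_DO_ADAPT, demo_alloc_bool(), demo_alloc_flags()); none of them are taken from the FreeBSD or OpenZFS sources. It contrasts two positional booleans with a single flags argument that names the requested behaviour at the call site.

```c
#include <assert.h>

/* Hypothetical flag names, invented for this illustration. */
typedef enum {
	DEMO_FLAG_NONE = 0,
	DEMO_EXISTING_OPTION = 1 << 0,	/* whatever the existing boolean meant */
	DEMO_DO_ADAPT = 1 << 1		/* run adaptation before allocating */
} demo_flags_t;

typedef struct {
	int existing_option_used;
	int adapted;
} demo_result_t;

/* Boolean version: the two arguments are easy to swap unnoticed. */
static demo_result_t
demo_alloc_bool(int existing_option, int do_adapt)
{
	demo_result_t r = { existing_option != 0, do_adapt != 0 };
	return (r);
}

/* Flag version: each call site names the behaviour it asks for. */
static demo_result_t
demo_alloc_flags(unsigned flags)
{
	/* Reject bits that this sketch does not define. */
	assert((flags & ~(unsigned)(DEMO_EXISTING_OPTION | DEMO_DO_ADAPT)) == 0);

	demo_result_t r = {
		(flags & DEMO_EXISTING_OPTION) != 0,
		(flags & DEMO_DO_ADAPT) != 0
	};
	return (r);
}
```

A call such as demo_alloc_bool(1, 0) compiles just as happily as demo_alloc_bool(0, 1), and nothing at the call site says which is which; demo_alloc_flags(DEMO_DO_ADAPT) cannot be misread in the same way.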

mattmacy · 2020-07-23T02:34:19Z

@blacklion I'll update the interfaces to take a flag

The arc_adapt() function tunes the LRU/MRU balance according to 4 types of cache hits (passed as the state argument): ghost LRU, LRU, MRU, ghost MRU. If this function is called with the wrong cache hit type (state), adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received the commit «6950 ARC should cache compressed data», which changed the sequence of calls that arc_read() makes on access to a ghost buffer. Before this commit, a hit on any ghost list was passed to arc_adapt() before the call to arc_access(), which revives the element in the cache and changes its state from ghost to real hit. After this commit, the order of calls was reversed and arc_adapt() is now called only with «real» hits even if the hit was in one of the two ghost lists, which renders the ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772. This change is an adaptation of the above commit to the current arc code.

See also issue openzfs#10548.

Signed-off-by: Matt Macy <[email protected]>

blacklion · 2020-08-12T17:07:41Z

@mattmacy Thank you very much!

The arc_adapt() function tunes the LRU/MRU balance according to 4 types of cache hits (passed as the state argument): ghost LRU, LRU, MRU, ghost MRU. If this function is called with the wrong cache hit type (state), adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received the commit «6950 ARC should cache compressed data», which changed the sequence of calls that arc_read() makes on access to a ghost buffer. Before this commit, a hit on any ghost list was passed to arc_adapt() before the call to arc_access(), which revives the element in the cache and changes its state from ghost to real hit. After this commit, the order of calls was reversed and arc_adapt() is now called only with «real» hits even if the hit was in one of the two ghost lists, which renders the ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772. This change is an adaptation of the above commit to the current arc code.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes openzfs#10548
Closes openzfs#10618

shodanshok · 2020-12-07T13:42:51Z

Any chance of seeing this backported to the 0.8.x releases?

behlendorf · 2020-12-07T17:17:47Z

Possibly, if it's not too troublesome to port. The PR was added to the 0.8 tracker.


mattmacy mentioned this issue Jul 23, 2020

Restore ARC MFU/MRU pressure #10618

Merged

behlendorf closed this as completed in e111c80 Aug 12, 2020

shodanshok mentioned this issue Aug 14, 2020

Introduce ZFS module parameter l2arc_mfuonly #10710

Merged

This was referenced Aug 19, 2020

OpenZFS 6950 - ARC should cache compressed data #4768

Closed

Openzfs compressedarc abd patchset WIP #5009

Closed
