Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARC adapation logic is broken and fix was not imported from FreeBSD to OpenZFS. #10548

Closed
blacklion opened this issue Jul 9, 2020 · 8 comments

Comments

@blacklion
Copy link

ARC is one of main parts of ZFS success, as it is state-of-art cache algorithm. Its novelty is using of additional «ghost» LRU and MRU lists which remember evicted items and help to tune LRU/MRU balance. Center part of ARC algorithm is arc_adap() function which tune LRU/MLU balance according to 4 types of cache hits (which is passed as state agrument): ghost LRU, LRU, MRU, ghost MRU. If this function will be called with wrong cache hit (state) adaptation will be sub-optimal and performance will suffer.

Some (long) time ago upstream had been received this commit:

6950 ARC should cache compressed data) in arc_read() do next sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before call to arc_access() which revive element in cache and change state from ghost to real hit (it is very important!)

After this commit order of calls was reverted and arc_adapt() is now called only with «real» hits even if hit was in one of two ghost lists, which render ghost lists useless and break ARC algorithm.

FreeBSD fixes this problem locally in Change D19094 / Commit r348772.

This fix have not been ported to upstream, ZoL or OpenZFS, unfortunately.

Current OpenZFS contains same bug, though patch is not applicable, as low-level ARC routines were extended, which is not present in FreeBSD sources.

Crucial change is this one, in arc_get_data_impl. All other changes are support this one, to weave proper behavior (adapt or not adapt before changing status of cache element) from call sites of arc_hdr_alloc_pabd() (which is named arc_hdr_alloc_abd() in OpenZFS).

Without this change ARC in OpenZFS is not ARC at all, and there could be serious performance degradation in some scenarios.

I think, it should be fixed before FreeBSD transition to OpenZFS and this fix will be beneficial for whole OpenZFS community.

@ghost
Copy link

ghost commented Jul 22, 2020

To be clear, this isn't actually in progress, so if anyone is willing to take it up please do :)

@mattmacy
Copy link
Contributor

And to be clear, I'd asked @blacklion to file a PR, not an issue.

@blacklion
Copy link
Author

blacklion commented Jul 23, 2020

And to be clear, I'd asked @blacklion to file a PR, not an issue.

And to be honest, PR could be understood as «problem report», especially by person from FreeBSD, especially by one who have been used to send-pr script ;-)

Jokes aside, codebases have been diverged and fix needs complete rewrite. I don't have time RIGHT NOW to make it and, what is worse, I don't know how to properly test it, as I'm not a author of original fix and know nothing about OpenZFS testing.

On the other hand, original fix is ugly kludge (and author acknowledges this, it is not that I try to shade Slawa), maybe OpenZFS community could offer something better? For example, author of all changes in arc.c which make original patch inapplicable. They must understand this code much better than me.

I could try to port original fix to OpenZFS's arc.c as-is, but it will take some more time, as I need to meet some deadlines at my $work.

@blacklion
Copy link
Author

BTW, original fix will be even worse on new codebase: it will add second boolean parameter to some functions (it adds first one in FreeBSD codebase, but OpenZFS added its own boolean parameter to same functions!). It is nightmare from API's point of view: it is very easy to mix two boolean parameters at call sites, as booleans are indistinguishable.

@mattmacy
Copy link
Contributor

@blacklion I'll update the interfaces to take a flag

mattmacy added a commit to zfsonfreebsd/ZoF that referenced this issue Jul 23, 2020
The arc_adapt() function tunes LRU/MLU balance according to 4 types of cache
hits (which is passed as state agrument): ghost LRU, LRU, MRU, ghost MRU.
If this function is called with wrong cache hit (state), adaptation will be
sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before call
to arc_access() which revive element in cache and change state from ghost to
real hit.

After this commit, the order of calls was reverted and arc_adapt() is now
called only with «real» hits even if hit was in one of two ghost lists,
which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc code.
See also issue openzfs#10548.

Signed-off-by: Matt Macy <[email protected]>
mattmacy added a commit to zfsonfreebsd/ZoF that referenced this issue Jul 24, 2020
The arc_adapt() function tunes LRU/MLU balance according to 4 types of
cache hits (which is passed as state agrument): ghost LRU, LRU, MRU,
ghost MRU. If this function is called with wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before
call to arc_access() which revive element in cache and change state from
ghost to real hit.

After this commit, the order of calls was reverted and arc_adapt() is
now called only with «real» hits even if hit was in one of two ghost
lists, which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.
See also issue openzfs#10548.

Signed-off-by: Matt Macy <[email protected]>
ghost pushed a commit to zfsonfreebsd/ZoF that referenced this issue Jul 29, 2020
The arc_adapt() function tunes LRU/MLU balance according to 4 types of
cache hits (which is passed as state agrument): ghost LRU, LRU, MRU,
ghost MRU. If this function is called with wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before
call to arc_access() which revive element in cache and change state from
ghost to real hit.

After this commit, the order of calls was reverted and arc_adapt() is
now called only with «real» hits even if hit was in one of two ghost
lists, which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.
See also issue openzfs#10548.

Signed-off-by: Matt Macy <[email protected]>
ghost pushed a commit to zfsonfreebsd/ZoF that referenced this issue Jul 29, 2020
The arc_adapt() function tunes LRU/MLU balance according to 4 types of
cache hits (which is passed as state agrument): ghost LRU, LRU, MRU,
ghost MRU. If this function is called with wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before
call to arc_access() which revive element in cache and change state from
ghost to real hit.

After this commit, the order of calls was reverted and arc_adapt() is
now called only with «real» hits even if hit was in one of two ghost
lists, which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.
See also issue openzfs#10548.

Signed-off-by: Matt Macy <[email protected]>
ghost pushed a commit to zfsonfreebsd/ZoF that referenced this issue Jul 29, 2020
The arc_adapt() function tunes LRU/MLU balance according to 4 types of
cache hits (which is passed as state agrument): ghost LRU, LRU, MRU,
ghost MRU. If this function is called with wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before
call to arc_access() which revive element in cache and change state from
ghost to real hit.

After this commit, the order of calls was reverted and arc_adapt() is
now called only with «real» hits even if hit was in one of two ghost
lists, which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.
See also issue openzfs#10548.

Signed-off-by: Matt Macy <[email protected]>
ghost pushed a commit to zfsonfreebsd/ZoF that referenced this issue Jul 31, 2020
The arc_adapt() function tunes LRU/MLU balance according to 4 types of
cache hits (which is passed as state agrument): ghost LRU, LRU, MRU,
ghost MRU. If this function is called with wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before
call to arc_access() which revive element in cache and change state from
ghost to real hit.

After this commit, the order of calls was reverted and arc_adapt() is
now called only with «real» hits even if hit was in one of two ghost
lists, which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.
See also issue openzfs#10548.

Signed-off-by: Matt Macy <[email protected]>
ghost pushed a commit to zfsonfreebsd/ZoF that referenced this issue Aug 1, 2020
The arc_adapt() function tunes LRU/MLU balance according to 4 types of
cache hits (which is passed as state agrument): ghost LRU, LRU, MRU,
ghost MRU. If this function is called with wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before
call to arc_access() which revive element in cache and change state from
ghost to real hit.

After this commit, the order of calls was reverted and arc_adapt() is
now called only with «real» hits even if hit was in one of two ghost
lists, which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.
See also issue openzfs#10548.

Signed-off-by: Matt Macy <[email protected]>
ghost pushed a commit to zfsonfreebsd/ZoF that referenced this issue Aug 10, 2020
The arc_adapt() function tunes LRU/MLU balance according to 4 types of
cache hits (which is passed as state agrument): ghost LRU, LRU, MRU,
ghost MRU. If this function is called with wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before
call to arc_access() which revive element in cache and change state from
ghost to real hit.

After this commit, the order of calls was reverted and arc_adapt() is
now called only with «real» hits even if hit was in one of two ghost
lists, which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.
See also issue openzfs#10548.

Signed-off-by: Matt Macy <[email protected]>
@blacklion
Copy link
Author

@mattmacy Thank you very much!

DeHackEd pushed a commit to DeHackEd/zfs that referenced this issue Aug 19, 2020
The arc_adapt() function tunes LRU/MLU balance according to 4 types of
cache hits (which is passed as state agrument): ghost LRU, LRU, MRU,
ghost MRU. If this function is called with wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before
call to arc_access() which revive element in cache and change state from
ghost to real hit.

After this commit, the order of calls was reverted and arc_adapt() is
now called only with «real» hits even if hit was in one of two ghost
lists, which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes openzfs#10548
Closes openzfs#10618
DeHackEd pushed a commit to DeHackEd/zfs that referenced this issue Aug 19, 2020
The arc_adapt() function tunes LRU/MLU balance according to 4 types of
cache hits (which is passed as state agrument): ghost LRU, LRU, MRU,
ghost MRU. If this function is called with wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before
call to arc_access() which revive element in cache and change state from
ghost to real hit.

After this commit, the order of calls was reverted and arc_adapt() is
now called only with «real» hits even if hit was in one of two ghost
lists, which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes openzfs#10548
Closes openzfs#10618
@shodanshok
Copy link
Contributor

Any chances to see this backported in the 0.8.x releases?

@behlendorf
Copy link
Contributor

Possibly if it's not to troublesome to port, the PR was added to the 0.8 tracker.

behlendorf pushed a commit to behlendorf/zfs that referenced this issue Dec 9, 2020
The arc_adapt() function tunes LRU/MLU balance according to 4 types of
cache hits (which is passed as state agrument): ghost LRU, LRU, MRU,
ghost MRU. If this function is called with wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before
call to arc_access() which revive element in cache and change state from
ghost to real hit.

After this commit, the order of calls was reverted and arc_adapt() is
now called only with «real» hits even if hit was in one of two ghost
lists, which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes openzfs#10548
Closes openzfs#10618
Conflicts:
	module/zfs/arc.c
tonyhutter pushed a commit to tonyhutter/zfs that referenced this issue Dec 9, 2020
The arc_adapt() function tunes LRU/MLU balance according to 4 types of
cache hits (which is passed as state agrument): ghost LRU, LRU, MRU,
ghost MRU. If this function is called with wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before
call to arc_access() which revive element in cache and change state from
ghost to real hit.

After this commit, the order of calls was reverted and arc_adapt() is
now called only with «real» hits even if hit was in one of two ghost
lists, which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes openzfs#10548
Closes openzfs#10618
tonyhutter pushed a commit that referenced this issue Dec 14, 2020
The arc_adapt() function tunes LRU/MLU balance according to 4 types of
cache hits (which is passed as state agrument): ghost LRU, LRU, MRU,
ghost MRU. If this function is called with wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before
call to arc_access() which revive element in cache and change state from
ghost to real hit.

After this commit, the order of calls was reverted and arc_adapt() is
now called only with «real» hits even if hit was in one of two ghost
lists, which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #10548
Closes #10618
jsai20 pushed a commit to jsai20/zfs that referenced this issue Mar 30, 2021
The arc_adapt() function tunes LRU/MLU balance according to 4 types of
cache hits (which is passed as state agrument): ghost LRU, LRU, MRU,
ghost MRU. If this function is called with wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before
call to arc_access() which revive element in cache and change state from
ghost to real hit.

After this commit, the order of calls was reverted and arc_adapt() is
now called only with «real» hits even if hit was in one of two ghost
lists, which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes openzfs#10548 
Closes openzfs#10618
sempervictus pushed a commit to sempervictus/zfs that referenced this issue May 31, 2021
The arc_adapt() function tunes LRU/MLU balance according to 4 types of
cache hits (which is passed as state agrument): ghost LRU, LRU, MRU,
ghost MRU. If this function is called with wrong cache hit (state),
adaptation will be sub-optimal and performance will suffer.

Some time ago upstream received this commit:

6950 ARC should cache compressed data) in arc_read() do next
sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before
call to arc_access() which revive element in cache and change state from
ghost to real hit.

After this commit, the order of calls was reverted and arc_adapt() is
now called only with «real» hits even if hit was in one of two ghost
lists, which renders ghost lists useless and breaks the ARC algorithm.

FreeBSD fixed this problem locally in Change D19094 / Commit r348772.

This change is an adaptation of the above commit to the current arc
code.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes openzfs#10548 
Closes openzfs#10618
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants