Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP | DO NOT MERGE] zstd restructuring #9673

Closed
wants to merge 7 commits into from

Conversation

c0d3z3r0
Copy link
Contributor

@c0d3z3r0 c0d3z3r0 commented Dec 4, 2019

This is the WIP PR / branch where restructuring of #8941 and all related work happens (main parts of the code from @BrainSlayer and @allanjude).

Main goals:

  • keep zstd clean and untouched to make updates easy
  • clean up the code (drop unneeded/unused code)
  • add more documentation
  • ...
  • get this thing ready for final review (we will open a separate, clean PR for final review, so that nobody has to read our 9 000 comments :P)

@c0d3z3r0 c0d3z3r0 changed the title Zstd nomutexcrash [TEST | DO NOT MERGE]Zstd nomutexcrash Dec 4, 2019
@PrivatePuffin
Copy link
Contributor

@c0d3z3r0 Thanks for seperating those two branches! :)

@PrivatePuffin
Copy link
Contributor

PrivatePuffin commented Dec 4, 2019

Ubuntu 16 failure of zloop is unrelated.
http://build.zfsonlinux.org/builders/Ubuntu%2016.04%20x86_64%20%28TEST%29/builds/9661/steps/shell_6/logs/1.ztest.out

@BrainSlayer @c0d3z3r0 besides the coverage tests zloop (lets away those)... no mutex errors appeared.

@PrivatePuffin
Copy link
Contributor

@c0d3z3r0 Coverage test also completed zloop without mutex related errors.

@PrivatePuffin
Copy link
Contributor

PrivatePuffin commented Dec 4, 2019

@c0d3z3r0 That would also do.

if we get the mutex_exit error here now, we know this was the cause.
if we don't we have a nice, functioning branch to fall back to that passes most tests and wil could just add the fixes from brainslayer just to be sure.

@PrivatePuffin
Copy link
Contributor

@c0d3z3r0 Everything clear here except coverage test
Which is nice, because it means at least your build-in fix is fine! :)

@c0d3z3r0
Copy link
Contributor Author

c0d3z3r0 commented Dec 4, 2019

ok, then let's merge in BrainSlayers patches now

@PrivatePuffin
Copy link
Contributor

@c0d3z3r0 In that one thats fine... There is no reasonable expectation for a mutex_exit error here atm.

@c0d3z3r0
Copy link
Contributor Author

c0d3z3r0 commented Dec 4, 2019

applied all... normally they get all tested

@PrivatePuffin
Copy link
Contributor

@c0d3z3r0
No issues merging those in?

@c0d3z3r0
Copy link
Contributor Author

c0d3z3r0 commented Dec 4, 2019

No issues merging those in?

no, this branch was missing my refactoring

@PrivatePuffin
Copy link
Contributor

@c0d3z3r0 When this works, I suggest closing this draft PR and continue the work on the other one...

@c0d3z3r0
Copy link
Contributor Author

c0d3z3r0 commented Dec 4, 2019

@c0d3z3r0 When this works, I suggest closing this draft PR and continue the work on the other one...

I don't want to do too much spam... I'd close the other, continue work here and just open a new PR for final review

@PrivatePuffin
Copy link
Contributor

Lets do that indeed...
Its too much spam to click through already.

so Ack.

@PrivatePuffin
Copy link
Contributor

@c0d3z3r0 Two failures in zloop currently.
None mutex related.

@BrainSlayer
Copy link
Contributor

no. it was also no crash. looks like a irregular regular exit due io error

@PrivatePuffin
Copy link
Contributor

@BrainSlayer Thats what I just said smartass.

@PrivatePuffin
Copy link
Contributor

@c0d3z3r0 Tests finished, we can continue safely :)

@BrainSlayer
Copy link
Contributor

@c0d3z3r0 not yet. you missed a patch from my branch a967990. so ensure that really all is included. for me it looks only this is missing. this patch will trigger a assert if the memory header is not properly initalized. this is better than raising a crashdump.

@PrivatePuffin
Copy link
Contributor

Okey...
@c0d3z3r0
As discussed, the tests are both clean... so we can continue on this branch....

Before we start implementing your refactor one-by-one, 2 things needs to be done:

  1. rebase to include my test refactor (just been merged into master)
  2. implementation of @BrainSlayer his a967990 commit.

After those are merged we can readd the refactor points one by one...

@c0d3z3r0
Copy link
Contributor Author

c0d3z3r0 commented Dec 4, 2019

are you kidding me? ofc I have merged that commit.... have a close look again

@c0d3z3r0
Copy link
Contributor Author

c0d3z3r0 commented Dec 5, 2019

@BrainSlayer ping

@PrivatePuffin
Copy link
Contributor

Morning @c0d3z3r0
Lets get this rebased and the refactor im shallwe? :)

@c0d3z3r0
Copy link
Contributor Author

c0d3z3r0 commented Dec 5, 2019

Morning @c0d3z3r0
Lets get this rebased and the refactor im shallwe? :)

sure, but a bit later; day job ;)

@BrainSlayer
Copy link
Contributor

@c0d3z3r0 dont bother. i just thought it was missing since zstd_free still had zstd_kmem_unknown as case. at the end i found out that i missed it. this is why i added a new patch (the one which solves the memleak) last evening.

@BrainSlayer
Copy link
Contributor

and yes. now we may refactor everything. i can also do some work in advance. for instance i want to get rid of the real_zstd_decompress functions etc. i see no reason to split it into 2 sub functions here. could be merged etc. but not that important. its just for easier reading

@PrivatePuffin
Copy link
Contributor

@BrainSlayer
I think you should holdoff on refactoring yourself.
@c0d3z3r0 is just going to port (parts) of his previous refactor and it would be easier if you add any changes on top of that instead of changing the source of that refactor (your repo) even more.

@BrainSlayer
Copy link
Contributor

since the majority of the refactoring process he did is just style based it could be done from scratch. especially if i rewrite functions from the ground. just some thoughts

@c0d3z3r0
Copy link
Contributor Author

c0d3z3r0 commented Dec 5, 2019

since the majority of the refactoring process he did is just style based it could be done from scratch. especially if i rewrite functions from the ground. just some thoughts

well, I would start adding my changes one by one (until it breaks again \o/), but please don't let's work concurrent, as we both will have to rebase again and again. let me finish here and then let us BOTH start from THE SAME codebase :)

@c0d3z3r0
Copy link
Contributor Author

I bet it's clang :)

@c0d3z3r0 I do get A LOT of feedback from my old pal Google, telling me CLANG IS A BITCH WITH THIS.

Oh, it could be Linux kernel compile-time checks, too (CONFIG_STACK_VALIDATION)

@PrivatePuffin
Copy link
Contributor

But lets be clear... @dan-and what we are discussing is if we can hide this. It indeed seems to bea related error to the other hidden one...

@PrivatePuffin
Copy link
Contributor

Some good news:
I've found a way to ignore folders from codecov.
I still need to do some testing to find the cleanest way, but it's going somewhere.

@dan-and
Copy link

dan-and commented Dec 15, 2019

I will remove all "well known" parts, which have nothing to do with the PR. Give me a bit of time to complete this.

Updated with details to compiler and kernel version.

@BrainSlayer
Copy link
Contributor

objtool warnings can be ignored. i do compile zfs with gcc 9.1 bzw. on kernel 5.4
objtool is a kernel tool which is used to generate strack trace informations. this warning you see will only occure if the ORC unwinder is activated in kernel config. i have not seen yet warnings like unreachable instruction with latest kernel and latest gcc. but its also harmless. objtool is not a bug free tool and does only generate metadata for the orc unwinder which is used to show stack trace informations if a kernel crash occures. so if a newer gcc does generate warnings with objtool. objtool must be fixed but not our code

@BrainSlayer
Copy link
Contributor

see also #6950

@dan-and
Copy link

dan-and commented Dec 15, 2019

Thanks @BrainSlayer
Then everything is fine :-)

@PrivatePuffin
Copy link
Contributor

Okey, a headsup:
codecoverage work is going fine.

I have found a way to cleanly exclude paths from code coverage.
But said solution was a dirty patch that wasn't going to get merged.

Now I'm doing some small refactoring and syntax tests on said solution to find the cleanest and maintanable solution for the ignores.

I estimate i'll have a good PR in tomorrow, which I should be able to get merged in 1 or 2 days, as its relatively simple. Together with that PR i'll send @c0d3z3r0 a PR to add the ignore here too. We can clean the duplicates during rebase quite easily as it is just a few rows.

I'll also make sure the PR description file gets updated accordingly.

That would delay schedule a bit, but 2-3 days won't be a problem I think :)
But we should be able to put the PR up the moment the codecoverage PR gets merged 👍

@c0d3z3r0
Copy link
Contributor Author

Okey, a headsup:
codecoverage work is going fine.

I have found a way to cleanly exclude paths from code coverage.
But said solution was a dirty patch that wasn't going to get merged.

Now I'm doing some small refactoring and syntax tests on said solution to find the cleanest and maintanable solution for the ignores.

I estimate i'll have a good PR in tomorrow, which I should be able to get merged in 1 or 2 days, as its relatively simple. Together with that PR i'll send @c0d3z3r0 a PR to add the ignore here too. We can clean the duplicates during rebase quite easily as it is just a few rows.

I'll also make sure the PR description file gets updated accordingly.

That would delay schedule a bit, but 2-3 days won't be a problem I think :)
But we should be able to put the PR up the moment the codecoverage PR gets merged

Perfect! :)

@PrivatePuffin
Copy link
Contributor

PrivatePuffin commented Dec 15, 2019

Preliminary codecoverage ignore fix is ready here: #9726

It am not 100% certain the quoting when using ?= and += is going to work out completely.
But for all these issues I know the workarounds already and I think it might work the way it is.

@c0d3z3r0 I don't think I even need to write you the oneliner you need to ignore zstd if you read my changes, do I? ;)
(keep in mind the * in: "*/contrib/zstd/*" ;) )

Do remember this replaces(!) the previous ignore fix!

@c0d3z3r0
Copy link
Contributor Author

@Ornias1993 very nice :) integrated

@c0d3z3r0
Copy link
Contributor Author

oops.. newline f**+up

- Removes broken codecov ignore
- Places ignore section ax_code_coverage
- Modifies LCOV command to deal with multiple ignored paths
- Forwards users from codecov to LCOV for ignores

Signed-off-by: Kjeld Schouten-Lebbing <[email protected]>
Signed-off-by: Michael Niewöhner <[email protected]>
@c0d3z3r0
Copy link
Contributor Author

oh damn.... one shouldn't touch code, while hungry.... -.-

This imports the unmodified ZStandard source to contrib/ which will be
used by ZFS. This code shall not be modified in any way to keep it
easily updatable.

Only the required files from lib/ are imported.

contrib/zstd is excluded from codecov calculation as dependencies don't
need full codecov.

Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Kjeld Schouten-Lebbing [email protected]
Signed-off-by: Michael Niewöhner <[email protected]>
These files make ZStandard compile as a kernel module without modifying
the ZStandard source files.

Removed unexpanded SVN version tracking tags from FreeBSD

Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Michael Niewöhner <[email protected]>
Extend APIs that deals with compression to also pass the compression
level.

Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Michael Niewöhner <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Kjeld Schouten-Lebbing <[email protected]>
Signed-off-by: Michael Niewöhner <[email protected]>
This add a new build intermediary libzstd. This allows us to keep zstd
cleanly separated from libzpool and to override the frame-size compiler
option without for zstd only. (This is needed due to unused coude in
the unmodified zstd source.)

Signed-off-by: Michael Niewöhner <[email protected]>
Add zstd to the test suite
- add zstd to history_002_pos.ksh
- add random levels of zstd to history_002_pos.ksh
- add zstd-fast to history_002_pos.ksh
- add random levels of zstd-fast to history_002_pos.ksh

Add documentation
- add man page content
- add README for contrib/zstd

Fixup copyright headers of touched and new files
- Adds copyright headers for Allan Jude / Klara Inc.
- Added Ornias1993 Copyright
- Cleans copyright header formatting

Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Kjeld Schouten-Lebbing <[email protected]>
Signed-off-by: Michael Niewöhner <[email protected]>
@BrainSlayer
Copy link
Contributor

i should not code when i'm thirsty. especially on saturday nights

@PrivatePuffin
Copy link
Contributor

PrivatePuffin commented Dec 16, 2019

@c0d3z3r0
Multi line variables for this ignore didn't work out, had to oneline it to get it working...
Please follow the change I made in #9726

That PR is now out for final review too btw, so drop a quick review if you like :)
(don't start bitching about grammar please, that also distracted people in my previous PR)

@c0d3z3r0 What you could do:

  • Drop the current commit you added.
  • Just add : "/contrib/zstd/" to the line I used.
  • Remove the ignore in codecov.yml
  • Just rebase if there is something to rebase.

You should be able to do this without my PR being merged, those two do not rely on eachother anymore :)

@c0d3z3r0 and @BrainSlayer I did some CPPCHECK tests and the issues that are there are from the library and are fixed in upcoming ZSTD 1.4.5

No new relevant issues it seems

I also did additional testing on different values for pool_count = (boot_ncpus * 4);

I've seen no statistically relevant performance differences between the following values:
pool_count = (boot_ncpus * 4);
pool_count = (boot_ncpus * 8);
pool_count = (boot_ncpus * 12);
pool_count = (boot_ncpus * 16);

So the guestimated 4 value is perfectly fine! :)

@c0d3z3r0
Copy link
Contributor Author

#9735

@c0d3z3r0 c0d3z3r0 closed this Dec 16, 2019
@BrainSlayer
Copy link
Contributor

try *2 or *1 :-) i have choosen 4 but believe 2 is enough. so even 4 was already a "i'm afraid" value :-)

what issues do you refer too with CPPCHECK?

@PrivatePuffin
Copy link
Contributor

@BrainSlayer I already discussed any issues with the folks at zstd.
It has no relevance to discuss already fixed issues with someone from another project.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants