-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move hdf5plugin
to a sub-folder
#114
Conversation
I am pretty agnostic about this. A very long time ago (~10y), I had my projects in |
Apparently it is more pythonic today, so I'm in favor of going ahead.... and going back ahead in 10y. |
Smaller the |
As long as tests pass, I don't see any issue here. |
Well, to confident...it broke debian packaging: see #118 |
REVERT: 2b63814b Tag open source release 1.1.9. REVERT: 9c1be179 'size' remains unused if none of ZLIB, LZO and LZ4 are available. REVERT: 78650d12 Add project goals to CONTRIBUTING.md. REVERT: 5e7c14bd Add stubs for abseil flags. REVERT: 80a2a10c Remove unused run_microbenchmarks flag. REVERT: 453942b3 Add absl::GetFlag and absl::SetFlag to uses of flags. REVERT: ea368c2f Add AppVeyor status badge. REVERT: d1d1f486 Remove unused include in snappy_benchmark.cc. REVERT: 4ebd8b2f Split benchmarks and test tools into separate targets. REVERT: 0793e2ae Merge pull request silx-kit#117 from cmumford:disable-osx-fuzzer REVERT: ac55f842 Test stub improvements. REVERT: 6e9ae724 Disable fuzzing on OSX. REVERT: 402d8881 Fixup for adding the third_party/{benchmark, googletest} submodules. (silx-kit#115) REVERT: 6badb0a2 Merge pull request silx-kit#114 from cmumford:werror-only-clang REVERT: bc53daa7 Fixed endif clause. REVERT: e9a6a084 Matching clang. REVERT: 955a5dd1 Building with `-Werror` only with clang. REVERT: 42d1dd7e Fix CHECK_EQ to call ok() instead of CheckSuccess(). REVERT: eaaa0ed0 Fixup for adding the third_party/{benchmark, googletest} submodules. (silx-kit#111) REVERT: e1e91ee4 Rework file:: stubs. REVERT: 6aa79cb4 Wrap snappy_unittest in an anonymous namespace and remove static from functions. REVERT: bae9f9be Fixup for adding the third_party/{benchmark, googletest} submodules. (silx-kit#110) REVERT: 5f913be0 Fix unused local variable warnings. REVERT: 549685a5 Remove custom testing and benchmarking code. REVERT: 11f9a77a Add Travis-CI build status badge to README.md. REVERT: 49540965 Update Travis CI config. REVERT: 8995ffab Replace #pragma nounroll with equivalent used elsewhere. REVERT: d1daa830 Remove inline qualifier from static variables. REVERT: 3b571656 1) Improve the lookup table data to require less instructions to extract the necessary data. We now store len - offset in a signed int16, this happens to remove masking offset in the calculations and the calculations that need to be done precisely give the flags that we need for testing correctness. 2) Replace offset extraction with a lookup mask. This is less uops and is needed because we need to special case type 3 to always return 0 as to properly trigger the fallback. 3) Unroll the loop twice, this removes some loop-condition checks AND it improves the generated assembly. The loop variables tend to end up in a different register requiring mov's having two consecutive copies allows the elision of the mov's. REVERT: a9730ed5 Optimize zippy decompression by making IncrementalCopy faster. REVERT: 56c2c247 Internal change REVERT: a94be58e Optimize zippy decompression by making IncrementalCopy faster. REVERT: 01a566f8 Fix opensource version REVERT: 616b8229 Add LZ4 as a benchmark option. Snappy is starting to look really good compared to LZ4. LZ4 is considered the fastest solution by many on internet. We now see that Snappy is actually becoming very competitive with compression a little faster and decompression slower but certainly not terribly slower. REVERT: e4a6e97b Extend validate benchmarks over all types and also add a medley for validation. REVERT: 719bed0a Bug fix. Error on 0 offset copies. REVERT: 289c8a3c Make zippy decompression branchless REVERT: 3bfa265a Revert zippy optimization that causes heap buffer overflows. REVERT: 4d2dc9dc Optimize zippy unzipping by upto >10% by making IncrementalCopy faster. REVERT: 11e5165b Add a benchmark that decreased the branch prediction memorization by increasing the amount of independent branches executed per benchmark iteration. REVERT: 6835abd9 Change hash function for Compress. REVERT: 368b01c8 Merge pull request silx-kit#107 from jsteemann:bug-fix/fix-compile-warning REVERT: 1ce58af2 Fix the use of op + len when op is nullptr and len is non-zero. See https://reviews.llvm.org/D67122 for some discussion of why this can matter. I don't think this should have any noticeable effect on performance. REVERT: 0b990db2 Run clang-format REVERT: cb2b3c7e fix compile warnings due to missing override specifiers REVERT: 7ffaf77c Replace ARCH_K8 with __x86_64__. REVERT: 4dd277fe Replace the division with a constant table in IncrementalCopy REVERT: f16eda34 Correct uninitialized variable. REVERT: 837f38b3 Revise stubs for ARCH_{K8,PPC,ARM}. REVERT: e1353b9f Remove ARCH_* guards around Bits::FindLSBSetNonZero64(). REVERT: c98344f6 Fix Clang/GCC compilation warnings. REVERT: 113cd97a Tighten types on a few for loops. REVERT: abde3abb Fix Travis CI build. REVERT: e6506681 Fix accidental double std:: qualifiers. REVERT: 63620c06 Add some std:: qualifiers to types and functions. REVERT: 5417da69 Switch from C headers to C++ headers. REVERT: 251d935d Remove #include <string> from snappy-stubs-public.h. REVERT: 4f195aee Remove mismatched #endif. REVERT: 041c6080 Remove platform-dependent code for unaligned loads/stores. REVERT: 27ff130f Remove platform-dependent code for little-endian loads and stores. REVERT: a4cdb5d1 Introduce SNAPPY_ATTRIBUTE_ALWAYS_INLINE. REVERT: 231b8be0 Migrate to standard integral types. REVERT: 14bef662 Modernize memcpy() and memmove() usage. REVERT: d674348a Improve zippy with 5-10%. REVERT: 4dfcad9f assertion failure on darwin_x86_64, have to investigage REVERT: e1917874 assertion failure on darwin_x86_64, have to investigage REVERT: 0faf5637 This cl does two things 1) It shaves of a few cycles from the data dependency chain. By using "shrd" instead of a load. 2) The important loop is finding small copies (4-12) which are either "copy 1", or "copy 2" depending if the offset fits <2048. It turns out that this is a branch that is mispredicted often. Due to the long dependency chain the CPU is running with IPC~1 anyway so we can freely add instructions to instead emit copies branchfree. This reduces the branch misspredicts from 15% to 11% (for BM_ZFlat/6 txt1) and from 5.6% to 4% (for BM_ZFlat/10 or pb). REVERT: 0c7ed08a The result on protobuf benchmark is around 19%. Results vary by their propensity for compression. As the frequency of finding matches influences the amount of branch misspredicts and the amount of hashing. REVERT: 3c77e014 1) Make the output pointer a local variable such it doesn't need a load add store on it's loop carried dependency chain. 2) Reduce the input pointer loop carried dependency chain from 7 cycles to 4 cycles by using pre-loading. This is a very subtle point. 3) Just brutally copy 64 bytes which removes a difficult to predict branch from the inner most loop. There is enough bandwidth to do so in the intrinsic cycles of the loop. 4) Implement limit pointers that include the slop region. This removes unnecessary instructions from the hot path. 5) It seems the removal of the difficult to predict branch has removed the code sensitivity to alignment, so remove the asm nop's. REVERT: 9eabb7ba Cut a load from the critical dependency chain of the input pointer by speculating the uncommon case of COPY_4 is not happening. REVERT: cddd9c08 Improve comments in IncrementalCopy, add an assert. git-subtree-dir: src/snappy git-subtree-split: 537f4ad6240e586970fe554614542e9717df7902
2b63814b Tag open source release 1.1.9. 9c1be179 'size' remains unused if none of ZLIB, LZO and LZ4 are available. 78650d12 Add project goals to CONTRIBUTING.md. 5e7c14bd Add stubs for abseil flags. 80a2a10c Remove unused run_microbenchmarks flag. 453942b3 Add absl::GetFlag and absl::SetFlag to uses of flags. ea368c2f Add AppVeyor status badge. d1d1f486 Remove unused include in snappy_benchmark.cc. 4ebd8b2f Split benchmarks and test tools into separate targets. 0793e2ae Merge pull request silx-kit#117 from cmumford:disable-osx-fuzzer ac55f842 Test stub improvements. 6e9ae724 Disable fuzzing on OSX. 402d8881 Fixup for adding the third_party/{benchmark, googletest} submodules. (silx-kit#115) 6badb0a2 Merge pull request silx-kit#114 from cmumford:werror-only-clang bc53daa7 Fixed endif clause. e9a6a084 Matching clang. 955a5dd1 Building with `-Werror` only with clang. 42d1dd7e Fix CHECK_EQ to call ok() instead of CheckSuccess(). eaaa0ed0 Fixup for adding the third_party/{benchmark, googletest} submodules. (silx-kit#111) e1e91ee4 Rework file:: stubs. 6aa79cb4 Wrap snappy_unittest in an anonymous namespace and remove static from functions. bae9f9be Fixup for adding the third_party/{benchmark, googletest} submodules. (silx-kit#110) 5f913be0 Fix unused local variable warnings. 549685a5 Remove custom testing and benchmarking code. 11f9a77a Add Travis-CI build status badge to README.md. 49540965 Update Travis CI config. 8995ffab Replace #pragma nounroll with equivalent used elsewhere. d1daa830 Remove inline qualifier from static variables. 3b571656 1) Improve the lookup table data to require less instructions to extract the necessary data. We now store len - offset in a signed int16, this happens to remove masking offset in the calculations and the calculations that need to be done precisely give the flags that we need for testing correctness. 2) Replace offset extraction with a lookup mask. This is less uops and is needed because we need to special case type 3 to always return 0 as to properly trigger the fallback. 3) Unroll the loop twice, this removes some loop-condition checks AND it improves the generated assembly. The loop variables tend to end up in a different register requiring mov's having two consecutive copies allows the elision of the mov's. a9730ed5 Optimize zippy decompression by making IncrementalCopy faster. 56c2c247 Internal change a94be58e Optimize zippy decompression by making IncrementalCopy faster. 01a566f8 Fix opensource version 616b8229 Add LZ4 as a benchmark option. Snappy is starting to look really good compared to LZ4. LZ4 is considered the fastest solution by many on internet. We now see that Snappy is actually becoming very competitive with compression a little faster and decompression slower but certainly not terribly slower. e4a6e97b Extend validate benchmarks over all types and also add a medley for validation. 719bed0a Bug fix. Error on 0 offset copies. 289c8a3c Make zippy decompression branchless 3bfa265a Revert zippy optimization that causes heap buffer overflows. 4d2dc9dc Optimize zippy unzipping by upto >10% by making IncrementalCopy faster. 11e5165b Add a benchmark that decreased the branch prediction memorization by increasing the amount of independent branches executed per benchmark iteration. 6835abd9 Change hash function for Compress. 368b01c8 Merge pull request silx-kit#107 from jsteemann:bug-fix/fix-compile-warning 1ce58af2 Fix the use of op + len when op is nullptr and len is non-zero. See https://reviews.llvm.org/D67122 for some discussion of why this can matter. I don't think this should have any noticeable effect on performance. 0b990db2 Run clang-format cb2b3c7e fix compile warnings due to missing override specifiers 7ffaf77c Replace ARCH_K8 with __x86_64__. 4dd277fe Replace the division with a constant table in IncrementalCopy f16eda34 Correct uninitialized variable. 837f38b3 Revise stubs for ARCH_{K8,PPC,ARM}. e1353b9f Remove ARCH_* guards around Bits::FindLSBSetNonZero64(). c98344f6 Fix Clang/GCC compilation warnings. 113cd97a Tighten types on a few for loops. abde3abb Fix Travis CI build. e6506681 Fix accidental double std:: qualifiers. 63620c06 Add some std:: qualifiers to types and functions. 5417da69 Switch from C headers to C++ headers. 251d935d Remove #include <string> from snappy-stubs-public.h. 4f195aee Remove mismatched #endif. 041c6080 Remove platform-dependent code for unaligned loads/stores. 27ff130f Remove platform-dependent code for little-endian loads and stores. a4cdb5d1 Introduce SNAPPY_ATTRIBUTE_ALWAYS_INLINE. 231b8be0 Migrate to standard integral types. 14bef662 Modernize memcpy() and memmove() usage. d674348a Improve zippy with 5-10%. 4dfcad9f assertion failure on darwin_x86_64, have to investigage e1917874 assertion failure on darwin_x86_64, have to investigage 0faf5637 This cl does two things 1) It shaves of a few cycles from the data dependency chain. By using "shrd" instead of a load. 2) The important loop is finding small copies (4-12) which are either "copy 1", or "copy 2" depending if the offset fits <2048. It turns out that this is a branch that is mispredicted often. Due to the long dependency chain the CPU is running with IPC~1 anyway so we can freely add instructions to instead emit copies branchfree. This reduces the branch misspredicts from 15% to 11% (for BM_ZFlat/6 txt1) and from 5.6% to 4% (for BM_ZFlat/10 or pb). 0c7ed08a The result on protobuf benchmark is around 19%. Results vary by their propensity for compression. As the frequency of finding matches influences the amount of branch misspredicts and the amount of hashing. 3c77e014 1) Make the output pointer a local variable such it doesn't need a load add store on it's loop carried dependency chain. 2) Reduce the input pointer loop carried dependency chain from 7 cycles to 4 cycles by using pre-loading. This is a very subtle point. 3) Just brutally copy 64 bytes which removes a difficult to predict branch from the inner most loop. There is enough bandwidth to do so in the intrinsic cycles of the loop. 4) Implement limit pointers that include the slop region. This removes unnecessary instructions from the hot path. 5) It seems the removal of the difficult to predict branch has removed the code sensitivity to alignment, so remove the asm nop's. 9eabb7ba Cut a load from the critical dependency chain of the input pointer by speculating the uncommon case of COPY_4 is not happening. cddd9c08 Improve comments in IncrementalCopy, add an assert. git-subtree-dir: src/snappy git-subtree-split: 2b63814b15a2aaae54b7943f0cd935892fae628f
092190c38 Fix spelling errors (silx-kit#128) b241ca0d0 Fix the CMake multiconfig var and test and tools handling (silx-kit#127) 7b33e4f19 Fix some make output issues (silx-kit#126) 51d221848 Documentation fixes and fixes from testing 1.1.1 (silx-kit#125) 4ae30429c CMake: (fix) HDF5 1.14.0 (silx-kit#121) e1072ec5b Update Github actions (silx-kit#118) b021532ea Fix leak and consistify generic API with ZFP C API (silx-kit#117) 06408bb22 Enable CMake Testing (silx-kit#116) eb5440043 Fix version info in docs ad61dfa41 Revert "rework feat-cmake_tests CMake and replace scripts (silx-kit#111)" (silx-kit#115) 03b86413f rework feat-cmake_tests CMake and replace scripts (silx-kit#111) 08116fa9e Update README.md (silx-kit#114) a10bd7bdb Merge branch 'master' of github.com:LLNL/H5Z-ZFP 1660d94a4 Fix h5repack test (silx-kit#107) fd01bf1d7 mention generic interface 8d5ec2e74 Update docs (silx-kit#106) c76f847d9 Added new endianness test (silx-kit#102) 8acf8244b fix reading of byte-swapped input files (silx-kit#95) (silx-kit#101) 62ab4eb6d Minor code clean-up. (silx-kit#103) 60032f640 Bugfix/hdf5 1.14.0 (silx-kit#99) 8fad85d72 CMake: (fix) MPI parallel HDF5 1.14.0 (silx-kit#97) e81aec114 Don't flag a skipped test as a failed test. (silx-kit#94) git-subtree-dir: src/H5Z-ZFP git-subtree-split: 092190c38d5760b3212ce4d96cced767b90aa189
Idea is to avoid having a
hdf5plugin
folder in the root of the repo to avoid import from source and thus remove the runtime check of the version file to simplify a bitsetup.py
and__init__.py
.I moved
hdf5plugin
intosrc/
, but alternatives are possible to avoid mixing plugin sources andhdf5plugin
:lib/hdf5plugin
as in matplotlib or whatever sub-folder name.hdf5plugin
could be renamed tolib
or anything.What do you think?