This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Improved performance of concatenating non-aligned validities (15x) #291

Merged (2 commits) on Aug 18, 2021

@jorgecarleitao (Owner) commented Aug 16, 2021

This PR significantly improves the performance of concatenating arrays whose lengths are not a multiple of 8 by speeding up the concatenation of their bitmaps.

Before this PR, we concatenated bitmaps by iterating over the source bit by bit and setting each bit individually. However, there is a more efficient way of doing this via byte operations. Specifically, given a mutable bitmap [10101000, --101010] (length = 8 + 6 = 14, where -- marks the two unused bits of the last byte), another bitmap can be appended to it using shifts. E.g. [00000011, a, b, ..., c] can be appended to it by something like

  • shift 00000011 left by 6 and OR it into the last byte
  • merge a with 00000011 with an offset of 2 and append the resulting byte
  • merge b with a with an offset of 2 and append the resulting byte
  • ...
  • append what remains of c
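
The steps above can be sketched in a few lines of Rust. This is only an illustration of the idea, not the code in this PR: the names `extend_from_bytes` and `get_bit` are hypothetical, and the sketch assumes LSB-first bit order within each byte (the Arrow convention) and that the unused bits of the destination's last byte are zero.

```rust
/// Reads bit `i` of an LSB-first bitmap (hypothetical helper, used for checking).
fn get_bit(bytes: &[u8], i: usize) -> bool {
    (bytes[i / 8] >> (i % 8)) & 1 == 1
}

/// Appends the first `len` bits of `src` to `dst`, which currently holds
/// `dst_len` bits, and returns the new length in bits.
/// Assumption: the unused high bits of `dst`'s last byte are zero.
fn extend_from_bytes(dst: &mut Vec<u8>, dst_len: usize, src: &[u8], len: usize) -> usize {
    assert_eq!(dst.len(), (dst_len + 7) / 8);
    assert!(len <= src.len() * 8);
    if len == 0 {
        return dst_len;
    }
    // Only the bytes that contain the first `len` bits are relevant.
    let src = &src[..(len + 7) / 8];
    let used = dst_len % 8; // bits already occupied in the last byte of `dst`
    if used == 0 {
        // Aligned case: a plain byte copy.
        dst.extend_from_slice(src);
    } else {
        let free = 8 - used; // the "offset" in the description above
        // Fill the free high bits of the last byte with the low bits of src[0].
        let last = dst.len() - 1;
        dst[last] |= src[0] << used;
        // Each new byte merges the leftover high bits of the previous source
        // byte with the low bits of the next one.
        for pair in src.windows(2) {
            dst.push((pair[0] >> free) | (pair[1] << used));
        }
        // Leftover high bits of the final source byte.
        dst.push(src[src.len() - 1] >> free);
    }
    let new_len = dst_len + len;
    // Drop a trailing byte that ended up holding no valid bits, then zero
    // any bits past `new_len` so the zero-padding assumption keeps holding.
    dst.truncate((new_len + 7) / 8);
    if new_len % 8 != 0 {
        let last = dst.len() - 1;
        dst[last] &= (1u8 << (new_len % 8)) - 1;
    }
    new_len
}
```

With the bitmap from the description, calling `extend_from_bytes(&mut dst, 14, &src, 16)` on `dst = [0b10101000, 0b00101010]` first ORs `00000011 << 6` into the last byte and then pushes one merged byte per remaining source byte, so the loop does one shift-and-OR per byte instead of one lookup and one write per bit.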

This results in significantly fewer instructions, lookups, etc.

This improves the performance of almost all operations that in some way concatenate validities. It includes:

  • Growable API (concat, filter, merge-sort)
  • lower-level bitmap concatenation
git checkout afb05d2511d495075180436dcd16af2e4b6ed71a
cargo bench --no-default-features --features benchmarks,compute --bench bitmap --bench concat --bench filter_kernels -- "2\^20"
git checkout improve_perf
cargo bench --no-default-features --features benchmarks,compute --bench bitmap --bench concat --bench filter_kernels -- "2\^20"
bitmap extend aligned 2^20                                                                             
                        time:   [3.7567 us 3.7847 us 3.8217 us]
                        change: [-3.0540% +1.6941% +6.6158%] (p = 0.49 > 0.05)

bitmap extend unaligned 2^20                                                                            
                        time:   [247.46 us 248.23 us 249.13 us]
                        change: [-75.411% -75.289% -75.172%] (p = 0.00 < 0.05)

bitmap extend_constant aligned 2^20                                                                             
                        time:   [2.6766 us 2.6822 us 2.6883 us]
                        change: [-99.536% -99.534% -99.532%] (p = 0.00 < 0.05)

bitmap extend_constant unaligned 2^20                                                                             
                        time:   [2.6916 us 2.6970 us 2.7026 us]
                        change: [-99.531% -99.529% -99.527%] (p = 0.00 < 0.05)

int32 concat aligned 2^20                                                                            
                        time:   [487.53 us 488.09 us 488.75 us]
                        change: [-93.566% -93.548% -93.530%] (p = 0.00 < 0.05)

int32 concat unaligned 2^20                                                                            
                        time:   [758.76 us 759.91 us 761.29 us]
                        change: [-89.977% -89.951% -89.923%] (p = 0.00 < 0.05)

boolean concat aligned 2^20                                                                            
                        time:   [224.02 us 224.50 us 225.02 us]
                        change: [-98.193% -98.187% -98.181%] (p = 0.00 < 0.05)

boolean concat unaligned 2^20                                                                            
                        time:   [708.63 us 710.23 us 712.14 us]
                        change: [-94.305% -94.286% -94.268%] (p = 0.00 < 0.05)

filter 2^20 f32         time:   [2.5137 ms 2.5199 ms 2.5274 ms]                             
                        change: [-2.6963% -2.3576% -1.9729%] (p = 0.00 < 0.05)

filter null 2^20 f32    time:   [7.6607 ms 7.6773 ms 7.6954 ms]                                 
                        change: [-12.051% -11.757% -11.473%] (p = 0.00 < 0.05)
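
For reading the numbers above: criterion reports `change` as the relative change in running time, so a change of about -93.5% corresponds to roughly a 15x speedup and about -75.3% to roughly 4x. A minimal sketch of that conversion (the function name `speedup` is hypothetical and not part of this PR or of criterion):

```rust
/// Converts a relative change in running time, in percent (negative means
/// faster), into a speedup factor: old_time / new_time.
fn speedup(change_percent: f64) -> f64 {
    1.0 / (1.0 + change_percent / 100.0)
}
```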

@jorgecarleitao jorgecarleitao added the enhancement An improvement to an existing feature label Aug 16, 2021
@codecov

codecov bot commented Aug 17, 2021

Codecov Report

Merging #291 (b556812) into main (0742edd) will increase coverage by 0.10%.
The diff coverage is 97.12%.


@@            Coverage Diff             @@
##             main     #291      +/-   ##
==========================================
+ Coverage   77.25%   77.35%   +0.10%     
==========================================
  Files         315      315              
  Lines       20791    20911     +120     
==========================================
+ Hits        16062    16176     +114     
- Misses       4729     4735       +6     
Impacted Files                            Coverage Δ
src/bitmap/utils/chunk_iterator/mod.rs    85.91% <ø> (ø)
src/bitmap/utils/mod.rs                   100.00% <ø> (ø)
src/bitmap/mutable.rs                     89.13% <93.84%> (+1.06%) ⬆️
src/array/growable/boolean.rs             80.76% <100.00%> (ø)
src/array/growable/utils.rs               100.00% <100.00%> (ø)
tests/it/bitmap/mutable.rs                100.00% <100.00%> (ø)
src/io/json_integration/write.rs          0.00% <0.00%> (-6.25%) ⬇️
src/io/csv/write/mod.rs                   72.00% <0.00%> (-4.00%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@jorgecarleitao jorgecarleitao force-pushed the improve_perf branch 2 times, most recently from e03cde4 to 17cade2 Compare August 17, 2021 08:24
@jorgecarleitao
Owner Author

cc @ritchie46 and @Dandandan , since you like these things :)

@ritchie46
Collaborator

Love it! I see some interesting bit comments. Have you got a summary of what you do? A memcpy instead of iterators?

@Dandandan
Collaborator

> cc @ritchie46 and @Dandandan , since you like these things :)

Amazing 😎

@jorgecarleitao jorgecarleitao changed the title Improved performance of concatenating non-aligned validities (+4x) Improved performance of concatenating non-aligned validities (15x) Aug 17, 2021
@jorgecarleitao
Owner Author

> Love it! I see some interesting bit comments. Have you got a summary of what you do? A memcpy instead of iterators?

:) Updated the description with the idea 👍

@sundy-li
Collaborator

sundy-li commented Aug 24, 2021

A bug was found in #325
