This repository has been archived by the owner on Sep 18, 2023. It is now read-only.

[NSE-856] Optimize of string/binary split #918

Merged
merged 21 commits into from
May 23, 2022

Conversation

FelixYBW
Collaborator

What changes were proposed in this pull request?

Manually split the string/binary column.
Previously we used a builder to construct the string buffer. This PR creates the destination buffers, reads the source buffers, and splits the column manually.

The PR also removes some unused code.
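
To make the idea concrete, here is a minimal sketch of splitting a string/binary column by hand instead of appending row by row through a builder. It is not the code from this PR: `BinaryColumn`, `SplitBinaryColumn`, and the two-pass sizing strategy are hypothetical names and assumptions chosen for illustration, and plain `std::vector` stands in for the real buffers. It only assumes the usual offsets-plus-values layout for variable-length data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a string/binary column:
// row i occupies values[offsets[i] .. offsets[i + 1]).
struct BinaryColumn {
  std::vector<int32_t> offsets{0};  // num_rows + 1 entries, starts at 0
  std::vector<uint8_t> values;      // concatenated bytes of all rows

  size_t num_rows() const { return offsets.size() - 1; }
  std::string row(size_t i) const {
    return std::string(values.begin() + offsets[i],
                       values.begin() + offsets[i + 1]);
  }
};

// Split `src` into `num_partitions` columns; pids[i] is the partition of row i.
// Assumption: buffers are sized in a first pass, then filled in a second pass.
std::vector<BinaryColumn> SplitBinaryColumn(const BinaryColumn& src,
                                            const std::vector<uint32_t>& pids,
                                            uint32_t num_partitions) {
  const size_t rows = src.num_rows();
  assert(pids.size() == rows);

  // Pass 1: count rows and value bytes per partition.
  std::vector<size_t> row_count(num_partitions, 0);
  std::vector<size_t> byte_count(num_partitions, 0);
  for (size_t i = 0; i < rows; ++i) {
    assert(pids[i] < num_partitions);
    row_count[pids[i]] += 1;
    byte_count[pids[i]] += src.offsets[i + 1] - src.offsets[i];
  }

  // Create the destination buffers once, at their final size.
  std::vector<BinaryColumn> dst(num_partitions);
  for (uint32_t p = 0; p < num_partitions; ++p) {
    dst[p].offsets.assign(row_count[p] + 1, 0);
    dst[p].values.resize(byte_count[p]);
  }

  // Pass 2: read the source buffers and copy each row straight into place.
  std::vector<size_t> next_row(num_partitions, 0);  // write cursor per partition
  for (size_t i = 0; i < rows; ++i) {
    BinaryColumn& out = dst[pids[i]];
    size_t& r = next_row[pids[i]];
    const int32_t begin = src.offsets[i];
    const int32_t len = src.offsets[i + 1] - begin;
    const int32_t out_begin = out.offsets[r];
    if (len > 0) {
      std::memcpy(out.values.data() + out_begin, src.values.data() + begin, len);
    }
    out.offsets[r + 1] = out_begin + len;  // running end offset of this row
    ++r;
  }
  return dst;
}
```

Sizing every destination up front avoids per-row growth checks inside the copy loop; a real shuffle writer would more likely keep long-lived partition buffers and check remaining capacity instead, so treat the first pass as a simplification.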

@github-actions

#856

@FelixYBW
Collaborator Author

split 5 columns of lineitem table, 3456 reducers on CLK machine

| | 1 thread | 24 threads |
|---|---|---|
| base | 6.98 | 17.4 |
| c776c48 | 2.21 | 3.82 |

@FelixYBW
Collaborator Author

Optimizations:

| 3456 partitions | 1 thread | 24 threads |
|---|---|---|
| default | 6.98 | 17.4 |
| opt1 | 2.21 | 3.82 |
| prefetch | 1.94 | 3.71 |
| add capacity check | 2.12 | 3.9 |
| avx cpy | 1.92 | 4 |
| 16k batch | 2.44 | 5.66 |
| value_offset cached | 2.18 | 3.33 |
| remove partition_binary_buffer_idx_offset_ | 2.16 | 3.26 |
| large page, 2M | 2.08 | 2.92 |
| allocate value buffer from pool | 1.92 | 2.39 |
| 9 + prefetch dst next pid | 1.85 | 2.43 |
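
For readers unfamiliar with the "prefetch dst next pid" step, one possible reading is sketched below: while row `i` is being written, a prefetch is issued for the destination slot that row `i + 1` will write into, looked up via its partition id. This is an interpretation, not the PR's code. `PrefetchForWrite` and `ScatterWithPrefetch` are hypothetical names, fixed-width `int64_t` values are used instead of string data to keep it short, and the `__builtin_prefetch` hint is GCC/Clang-specific (it compiles to nothing elsewhere in this sketch).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hint that `p` is about to be written. Purely advisory: it never changes
// program results, only (possibly) cache behaviour.
inline void PrefetchForWrite(const void* p) {
#if defined(__GNUC__) || defined(__clang__)
  __builtin_prefetch(p, 1 /* prepare for write */, 3 /* high temporal locality */);
#else
  (void)p;
#endif
}

// Scatter src[i] into dst[pids[i]] at that partition's write cursor.
// Assumption: every dst[p] is already large enough for its rows.
void ScatterWithPrefetch(const std::vector<int64_t>& src,
                         const std::vector<uint32_t>& pids,
                         std::vector<std::vector<int64_t>>& dst,
                         std::vector<size_t>& cursor) {
  assert(src.size() == pids.size());
  for (size_t i = 0; i < src.size(); ++i) {
    if (i + 1 < src.size()) {
      // Look one row ahead and warm up the slot its partition will write next.
      const uint32_t next = pids[i + 1];
      PrefetchForWrite(dst[next].data() + cursor[next]);
    }
    const uint32_t p = pids[i];
    dst[p][cursor[p]++] = src[i];
  }
}
```

With thousands of partitions the destination write positions are scattered across memory, which is presumably why looking one row ahead helps; the table above shows the measured effect, and this sketch makes no timing claim of its own.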

@FelixYBW FelixYBW merged commit c81f8af into oap-project:main May 23, 2022
@FelixYBW FelixYBW deleted the shuffle_opt_string3 branch May 23, 2022 09:10