[Enhancement] split chunk of HashTable #51175

Merged
merged 17 commits into from
Oct 18, 2024

@murphyatwork murphyatwork commented Sep 19, 2024

Why I'm doing:

@          0x2f9d4b5  malloc
@          0x8f9c745  operator new()
@          0x2ddb2ee  std::vector<>::_M_range_insert<>()
@          0x2dde914  starrocks::BinaryColumnBase<>::append()
@          0x365fa4e  starrocks::NullableColumn::append()
@          0x37458f9  starrocks::JoinHashTable::append_chunk()
@          0x3c86e80  starrocks::HashJoinBuilder::append_chunk()
@          0x3c8100c  starrocks::HashJoiner::append_chunk_to_ht()
@          0x3ab6649  starrocks::pipeline::HashJoinBuildOperator::push_chunk()
@          0x3a6769c  starrocks::pipeline::PipelineDriver::process()
@          0x3a58b9e  starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
@          0x305ebac  starrocks::ThreadPool::dispatch_thread()
@          0x305882a  starrocks::Thread::supervise_thread()

JoinHashTable::build_chunk is a single Chunk that holds all data from the build side, so it can become very large in some cases. As a result, it can easily hit a memory allocation failure when jemalloc or the OS cannot allocate a large contiguous block of memory, as in the stack trace above.

Typical cases include:

  • a string column on the build side
  • an array column on the build side
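
To make the allocation pressure concrete, here is a back-of-the-envelope sketch in C++. The function names and the fixed bytes-per-row figure are illustrative assumptions, not StarRocks code; string and array columns are variable-length, so a fixed width is only an approximation. It compares the largest single allocation needed for one contiguous buffer with the largest one needed when the rows are split into fixed-size segments.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Largest single allocation (in bytes) when all rows live in one contiguous buffer.
// Hypothetical helper for illustration only.
inline std::uint64_t contiguous_bytes(std::uint64_t rows, std::uint64_t bytes_per_row) {
    return rows * bytes_per_row;
}

// Largest single allocation (in bytes) when rows are split into segments of
// at most `segment_rows` rows each. Hypothetical helper for illustration only.
inline std::uint64_t segmented_max_bytes(std::uint64_t rows, std::uint64_t bytes_per_row,
                                         std::uint64_t segment_rows) {
    assert(segment_rows > 0);
    return std::min(rows, segment_rows) * bytes_per_row;
}
```

For example, with 100 million rows at an assumed 32 bytes each, the contiguous layout needs one allocation of about 3.2 GB, while 131072-row segments need at most 4 MiB per allocation. A growing std::vector can need even more than the final size for a moment, because the old and new buffers coexist while the data is moved.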

What I'm doing:

Split that chunk into multiple smaller segments (usually 131072 rows each) to avoid this issue:

  • Introduce SegmentedChunk and SegmentedColumn to replace the original Chunk and Column.
  • They are not transparent replacements, but they implement most of the required interfaces, so only minimal code changes are needed.
  • To handle addressing (mapping a global offset to a segment offset), we translate the index just in time, e.g. offset % segment_size, rather than maintaining an index for it. This is efficient enough with a static segment_size.
  • We use a static segment_size rather than a dynamic one, which is easier to implement and more efficient.
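
The just-in-time index translation described above can be sketched roughly as follows. This is a minimal illustration, assuming a column of plain int values; SegmentedIntColumn and its methods are hypothetical names and do not reflect the actual SegmentedColumn interface in StarRocks.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a segmented column of int values.
// The class name and layout are illustrative, not the StarRocks classes.
class SegmentedIntColumn {
public:
    explicit SegmentedIntColumn(std::size_t segment_size) : _segment_size(segment_size) {
        assert(segment_size > 0);
    }

    // Append one value, opening a new segment when the last one is full.
    void append(int value) {
        if (_segments.empty() || _segments.back().size() == _segment_size) {
            _segments.emplace_back();
            _segments.back().reserve(_segment_size);
        }
        _segments.back().push_back(value);
        ++_size;
    }

    // Translate a global row offset into (segment id, offset inside the segment).
    // No lookup table is needed because every full segment has the same size.
    std::pair<std::size_t, std::size_t> locate(std::size_t global_offset) const {
        return {global_offset / _segment_size, global_offset % _segment_size};
    }

    // Read the value at a global row offset.
    int get(std::size_t global_offset) const {
        assert(global_offset < _size);
        auto [seg, off] = locate(global_offset);
        return _segments[seg][off];
    }

    std::size_t size() const { return _size; }
    std::size_t num_segments() const { return _segments.size(); }

private:
    std::size_t _segment_size;
    std::size_t _size = 0;
    std::vector<std::vector<int>> _segments;
};
```

With a segment size of 4 and ten appended rows, this sketch creates three segments, and global offset 9 resolves to segment 2, offset 1. If the segment size is a compile-time power of two such as 131072, compilers can typically reduce the division and modulo to a shift and a mask, which is one reason a static size is cheap.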

Potential downside and considerations of this approach:

  • When generating output for JoinHashMap, data must be copied from build_chunk in random order according to build_index. With SegmentedChunk the memory is no longer contiguous, so we have to look up the segment first and then the record within it. To reduce this overhead, we use SegmentedChunkVisitor wherever possible, which eliminates the virtual function call.
  • The key_column of JoinHashMap can no longer use the columns of build_chunk, because their memory layouts differ: key_column is a contiguous column, while build_chunk is segmented. This introduces some extra memory usage and memory copy overhead.
    • Why not make key_column segmented as well? The overhead would be relatively larger for the probe procedure, and it would require changing a lot of code, which is beyond the scope of this PR. So we chose the easy path.
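
The two-step lookup during output generation can be sketched as below. This is only an illustration under stated assumptions: SegmentedVector and gather are hypothetical names, and the template parameter merely mimics the idea of resolving the concrete column type once so that the per-row loop needs no virtual call. It is not the actual SegmentedChunkVisitor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical segmented storage: fixed-size segments of a concrete type T.
template <typename T>
struct SegmentedVector {
    std::size_t segment_size = 1;
    std::vector<std::vector<T>> segments;
};

// Gather the rows selected by build_index into one contiguous output vector.
// T is known at compile time, so the inner loop operates on concrete vectors
// and performs no virtual dispatch per row.
template <typename T>
std::vector<T> gather(const SegmentedVector<T>& src,
                      const std::vector<std::uint32_t>& build_index) {
    std::vector<T> out;
    out.reserve(build_index.size());
    for (std::uint32_t idx : build_index) {
        std::size_t seg = idx / src.segment_size;  // look up the segment first
        std::size_t off = idx % src.segment_size;  // then the record inside it
        assert(seg < src.segments.size() && off < src.segments[seg].size());
        out.push_back(src.segments[seg][off]);
    }
    return out;
}
```

Each output row may touch a different segment, which is why the access pattern is hard for the hardware to predict compared with a single contiguous buffer.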

Performance

Running ./shuffle_chunk_bench
Run on (104 X 3200.25 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x52)
  L1 Instruction 32 KiB (x52)
  L2 Unified 1024 KiB (x52)
  L3 Unified 36608 KiB (x2)
Load Average: 100.59, 89.61, 83.99
--------------------------------------------------------------------------------------
Benchmark                            Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------
bench_chunk_clone 3992519949343932 ns     21223730 ns            1 items_per_second=192.992k/s
bench_segmented_chunk_clone 3992510186082870 ns     22087674 ns            1 items_per_second=185.443k/s

bench_segmented_chunk_clone is still slower than the regular bench_chunk_clone (185.443k vs. 192.992k items per second in this run, roughly 4% lower). The gap mostly comes from unpredictable random memory access during the copy. Considering that it helps with memory allocation, I think the trade-off is worth it.

Performance could be improved further by making the memory access more sequential.

@murphyatwork murphyatwork force-pushed the murphy_opt_split_build_chunk branch 8 times, most recently from 888d040 to efd3854 Compare September 23, 2024 06:44
@murphyatwork murphyatwork force-pushed the murphy_opt_split_build_chunk branch from efd3854 to 67007f2 Compare September 24, 2024 23:37
Signed-off-by: Murphy <[email protected]>
@murphyatwork murphyatwork force-pushed the murphy_opt_split_build_chunk branch from 70ea039 to 96fe3e9 Compare October 12, 2024 02:38
Signed-off-by: Murphy <[email protected]>
@murphyatwork murphyatwork force-pushed the murphy_opt_split_build_chunk branch from bc548fd to 59c88a4 Compare October 14, 2024 07:06
Signed-off-by: Murphy <[email protected]>
@murphyatwork murphyatwork force-pushed the murphy_opt_split_build_chunk branch from 59c88a4 to adf5fde Compare October 15, 2024 08:13
Signed-off-by: Murphy <[email protected]>
@murphyatwork murphyatwork force-pushed the murphy_opt_split_build_chunk branch 2 times, most recently from 1367a69 to 5888092 Compare October 17, 2024 11:13
Signed-off-by: Murphy <[email protected]>
@murphyatwork murphyatwork force-pushed the murphy_opt_split_build_chunk branch from 5888092 to 80fd5dc Compare October 18, 2024 00:59

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

[FE Incremental Coverage Report]

pass : 0 / 0 (0%)

[BE Incremental Coverage Report]

pass : 280 / 296 (94.59%)

file detail

path                                                                  covered_line  new_line  coverage  not_covered_line_detail
be/src/storage/chunk_helper.h                                         1             2         50.00%    [186]
be/src/exec/join_hash_map.cpp                                         13            14        92.86%    [662]
be/src/storage/chunk_helper.cpp                                       234           248       94.35%    [761, 795, 842, 843, 844, 845, 847, 952, 957, 958, 992, 993, 999, 1000]
be/src/exec/spill/mem_table.cpp                                       1             1         100.00%   []
be/src/column/binary_column.cpp                                       6             6         100.00%   []
be/src/exec/pipeline/hashjoin/spillable_hash_join_build_operator.cpp  1             1         100.00%   []
be/src/column/column_helper.cpp                                       15            15        100.00%   []
be/src/exec/join_hash_map.tpp                                         8             8         100.00%   []
be/src/exec/join_hash_map.h                                           1             1         100.00%   []

@meegoo meegoo merged commit 5dd0cc5 into StarRocks:main Oct 18, 2024
50 checks passed

@Mergifyio backport branch-3.3

@github-actions github-actions bot removed the 3.3 label Oct 18, 2024

@Mergifyio backport branch-3.2

@github-actions github-actions bot removed the 3.2 label Oct 18, 2024

mergify bot commented Oct 18, 2024

backport branch-3.3

✅ Backports have been created

mergify bot commented Oct 18, 2024

backport branch-3.2

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Oct 18, 2024
Signed-off-by: Murphy <[email protected]>
(cherry picked from commit 5dd0cc5)

# Conflicts:
#	be/src/column/binary_column.h
#	be/src/exec/join_hash_map.cpp
#	be/src/exec/join_hash_map.h
#	be/src/exec/pipeline/hashjoin/spillable_hash_join_build_operator.cpp
#	be/src/exec/pipeline/hashjoin/spillable_hash_join_build_operator.h
mergify bot pushed a commit that referenced this pull request Oct 18, 2024
Signed-off-by: Murphy <[email protected]>
(cherry picked from commit 5dd0cc5)

# Conflicts:
#	be/src/column/binary_column.h
#	be/src/exec/join_hash_map.cpp
#	be/src/exec/join_hash_map.h
#	be/src/exec/join_hash_map.tpp
#	be/src/exec/pipeline/hashjoin/spillable_hash_join_build_operator.cpp
#	be/src/exec/pipeline/hashjoin/spillable_hash_join_build_operator.h
#	be/src/exec/spill/mem_table.cpp
ZiheLiu pushed a commit to ZiheLiu/starrocks that referenced this pull request Oct 31, 2024
renzhimin7 pushed a commit to renzhimin7/starrocks that referenced this pull request Nov 7, 2024
@github-actions github-actions bot added the 3.3 label Nov 9, 2024
4 participants