Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GLUTEN-1434] [VL] Fix shuffle read unalign buffer which origins from netty #2159

Merged
merged 1 commit into from
Jul 3, 2023

Conversation

jinchengchenghh
Copy link
Contributor

@jinchengchenghh jinchengchenghh commented Jun 30, 2023

For small size ColumnarBatch, this batch will not be compressed, the buffer which origins from netty is aligned, but the actual buffer used in RecordBatch is SliceBuffer(buffer, offset, size), this buffer cannot guarantee align. SIMD instruction movdqa required the address 16B aligned, so it will core dump at velox function copyValuesAndNulls and left potential coredump. This copy is expensive but essential.

BufferReader::DoReadAt(int64_t position, int64_t nbytes) {
return SliceBuffer(buffer_, position, nbytes); // buffer_ is netty buffer
}

For most batch which is not tiny batch and compress codec use default lz4, shuffle read will decompress the buffer to an aligned address which meets SIMD instrictions requirement.

Relevant issue: facebookincubator/velox#2388

@github-actions
Copy link

#1434

@github-actions
Copy link

Run Gluten Clickhouse CI

@zhouyuan
Copy link
Contributor

zhouyuan commented Jul 3, 2023

@jinchengchenghh can you please make a short notes in the PR description?

-yuan

@github-actions
Copy link

github-actions bot commented Jul 3, 2023

Run Gluten Clickhouse CI

values = convertToVeloxBuffer(arrowValueBuffer);
} else {
values = AlignedBuffer::allocate<char>(arrowValueBuffer->size(), pool);
memcpy(values->asMutable<char>(), arrowValueBuffer->data(), arrowValueBuffer->size());
Copy link
Member

@zhztheplayer zhztheplayer Jul 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to tweak on Java-side Netty library usages to make them conform to 16 byte alignment? So that we can avoid memcopy fundamentally

@github-actions
Copy link

github-actions bot commented Jul 3, 2023

Run Gluten Clickhouse CI

@zhouyuan zhouyuan merged commit bfc3a6a into apache:main Jul 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants