-
Notifications
You must be signed in to change notification settings - Fork 842
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Typed Buffers in Arrays (#1811) (#1176) #3743
Conversation
@@ -162,12 +183,15 @@ impl Buffer { | |||
/// Panics iff `(offset + length)` is larger than the existing length. | |||
pub fn slice_with_length(&self, offset: usize, length: usize) -> Self { | |||
assert!( | |||
offset + length <= self.len(), | |||
offset.saturating_add(length) <= self.length, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was actually a bug, now covered by https://github.com/apache/arrow-rs/pull/3743/files#diff-24dc7184f64fe7a484db137ef61b2ec31090bb959fc9013beefea7865cdfea59R628
@@ -603,7 +603,7 @@ mod tests { | |||
let record_batch = | |||
RecordBatch::try_new(Arc::new(schema), vec![Arc::new(a), Arc::new(b)]) | |||
.unwrap(); | |||
assert_eq!(record_batch.get_array_memory_size(), 592); | |||
assert_eq!(record_batch.get_array_memory_size(), 640); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a slight regression in the size of the arrays themselves, typically they are passed around as references so I am inclined to think this doesn't really matter, it is also likely temporary
unsafe { | ||
std::slice::from_raw_parts( | ||
self.buffer.as_ptr() as *const T, | ||
self.buffer.len() / std::mem::size_of::<T>(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found caching this value doesn't impact performance, this is likely because the "hot" codepaths are either iterating the slice amortising this cost over multiple values, or using unchecked addressing where this length isn't ever consulted
@@ -637,7 +637,7 @@ pub fn new_null_array(data_type: &DataType, length: usize) -> ArrayRef { | |||
} | |||
|
|||
// Helper function for printing potentially long arrays. | |||
pub(crate) fn print_long_array<A, F>( | |||
fn print_long_array<A, F>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a drive-by cleanup
/// | ||
/// We store a pointer instead of an offset to avoid pointer arithmetic | ||
/// which causes LLVM to fail to vectorise code correctly | ||
ptr: *const u8, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seemingly inconsequential change makes a substantial difference to the benchmarks, LLVM seems to fail to realise that the offset doesn't change in the body of the loop and so fails to lift it out, which in turn causes the loop itself to fail to vectorise correctly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @tustvold for moving this forward! I plan to review this in next few days.
// Soundness | ||
// pointer alignment & location is ensured by RawPtrBox | ||
// buffer bounds/offset is ensured by the ArrayData instance. | ||
unsafe { | ||
std::slice::from_raw_parts( | ||
self.value_offsets.as_ptr().add(self.data.offset()), | ||
self.len() + 1, | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that's good. 👍
arrow-array/src/array/list_array.rs
Outdated
let value_offsets = match data.is_empty() && data.buffers()[0].is_empty() { | ||
true => OffsetBuffer::new_empty(), | ||
false => { | ||
let buffer = ScalarBuffer::new( | ||
data.buffers()[0].clone(), | ||
data.offset(), | ||
data.len() + 1, | ||
); | ||
// Safety: | ||
// ArrayData is valid | ||
unsafe { OffsetBuffer::new_unchecked(buffer) } | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems occurring more than once. Maybe we can have a function for it?
|
||
/// A non-empty buffer of monotonically increasing, positive integers | ||
#[derive(Debug, Clone)] | ||
pub struct OffsetBuffer<O: ArrowNativeType>(ScalarBuffer<O>); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub struct OffsetBuffer<O: ArrowNativeType>(ScalarBuffer<O>); | |
pub struct OffsetBuffer<O: OffsetSizeTrait>(ScalarBuffer<O>); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OffsetSizeTrait
is defined in arrow-array and so can't be used here
Self { buffer, ptr, len } | ||
let byte_offset = offset.checked_mul(size).expect("offset overflow"); | ||
let byte_len = len.checked_mul(size).expect("length overflow"); | ||
buffer.slice_with_length(byte_offset, byte_len).into() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to check alignment anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alignment gets checked in the From conversion
/// # Safety | ||
/// | ||
/// - ArrayData must contain a valid [`OffsetBuffer`] as its first buffer | ||
unsafe fn get_offsets<O: ArrowNativeType>(data: &ArrayData) -> OffsetBuffer<O> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will eventually get removed as we make ArrayData an enumeration of typed buffers
Benchmark runs are scheduled for baseline = ebe6f53 and contender = 47e4b61. 47e4b61 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Part of #1176
Closes #1811
Rationale for this change
The first part of moving towards a fully typed ArrayData as part of #1176
As an added bonus the change to precompute the offset for
ByteArray
improves the performance of the string comparison kernels by up to 25%. The arithmetic kernels show a very slight performance regression of ~1-2%, I'm inclined to think this does not matter, we are talking microseconds here.What changes are included in this PR?
Are there any user-facing changes?